Question

(Short version of my question: is there an elegant way in numpy to emulate tf.sequence_mask from tensorflow?)

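I have a 2D array a (each row represents a sequence of a different length) and a 1D array b (holding the sequence lengths). Is there an elegant way to get a (flattened) array that contains only those elements of a that belong to the sequences, as specified by their lengths in b: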

a = np.array([
    [1, 2, 3, 2, 1],  # I want just [:3] from this row
    [4, 5, 5, 5, 1],  # [:2] from this row
    [6, 7, 8, 9, 0]   # [:4] from this row
])
b = np.array([3,2,4])  # 3 elements from the 1st row, 2 from the 2nd, 4 from the 3rd row

Desired result:

[1, 2, 3, 4, 5, 6, 7, 8, 9]

By "elegant way" I mean something that avoids loops.

Answer 1

Use broadcasting to create a boolean mask with the same shape as the 2D array, then index a with that mask to extract the valid elements:

a[b[:,None] > np.arange(a.shape[1])]

Here b[:,None] has shape (3, 1) and np.arange(a.shape[1]) has shape (5,), so the comparison broadcasts to a (3, 5) mask that is True wherever the column index is smaller than that row's length.
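A minimal sample run of the masking approach, assuming the a and b from the question. The helper name sequence_mask is my own (hypothetical), chosen to mirror tf.sequence_mask; it is not a NumPy function.

```python
import numpy as np

a = np.array([
    [1, 2, 3, 2, 1],
    [4, 5, 5, 5, 1],
    [6, 7, 8, 9, 0],
])
b = np.array([3, 2, 4])


def sequence_mask(lengths, maxlen):
    """Hypothetical helper: a NumPy analogue of tf.sequence_mask.

    lengths[:, None] has shape (n, 1) and np.arange(maxlen) has shape
    (maxlen,), so the comparison broadcasts to an (n, maxlen) boolean array.
    """
    return lengths[:, None] > np.arange(maxlen)


mask = sequence_mask(b, a.shape[1])
print(mask.astype(int))
# [[1 1 1 0 0]
#  [1 1 0 0 0]
#  [1 1 1 1 0]]

# Boolean indexing walks the mask in row-major order and returns a 1D array.
result = a[mask]
print(result)
# [1 2 3 4 5 6 7 8 9]
```

Because boolean indexing always returns a flattened result, no extra ravel() call is needed to get the 1D output asked for in the question.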

Numpy indexing: first (varying) number of elements from each row in a 2D array
