Slicing a multidimensional NumPy array whose elements have variable size

Date: 2020-03-23 11:55:05

Tags: python arrays list numpy numpy-slicing

Suppose I have the following code:

import numpy as np

# The values equal to 1 inside this nested list indicate where the data need to be loaded. a = [7 x 6]
a = [
    [0, 1, 0, 1, None, None],
    [0, 0, 0, 0, None, 0],
    [0, 0, 1, 0, None, 0],
    [0, 1, 0, 1, None, 1],
    [0, 0, 0, 1, None, 0],
    [0, 0, 0, 0, None, 0],
    [1, 0, 0, 0, None, None]
]
# The list "a" cannot be modified for a number of reasons, so I create a np.array copy, named "b"
b = np.array(a)

N = int(1E7)  # Number of samples

# The loop below retrieves the positions inside "b" in which data need to be loaded
row = []
col = []
for i in range(len(b)):
    col.append([])
    if any(b[i] == 1):
        row.append(i)
    for j in range(len(a[i])):
        if b[i][j] == 1:  # compare by value; "is" tests object identity
            b[i][j] = np.zeros((N, 1))
            col[i].append(j)


# Loading the data inside the selected positions of "b". "mydata" is a numpy array, whose shape is (N, 6)
for i in row:
    mydata = np.random.randn(N, len(a[0])).reshape(N, len(a[0])) # Generation of dummy data
    b[i, col[i]] = mydata[:, col[i]]  # This instruction returns a ValueError

However, the following error occurs: ValueError: shape mismatch: value array of shape (10000000,2) could not be broadcast to indexing result of shape (2,)

Why does this slicing not work? Is it because the array elements inside "b" have variable sizes?

Thank you in advance.

2 Answers:

Answer 0: (score: 0)

The slicing fails for rows into which more than one zero array was inserted, because of the use of b[i, col[i]].

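Consider just your first row. It gives you row=[0] and col=[[1,3]]. This means that b[0, col[0]] refers to the zero arrays in columns 1 and 3. You should iterate over the rows and columns with a nested for loop, as you did before: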

for i in row:
    for j in col[i]:
        mydata = np.random.randn(N, len(a[0])).reshape(N, len(a[0]))
        b[i, j] = mydata[:, col[i]]

Answer 1: (score: 0)

Let's reduce N to a reasonable level and add some prints:

print(row)
print(col)
print(a)
print(b)

Run:

0942:~/mypy$ python3 stack60813103.py 
[0, 2, 3, 4, 6]
[[1, 3], [], [2], [1, 3, 5], [3], [], [0]]
[[0, 1, 0, 1, None, None], [0, 0, 0, 0, None, 0], [0, 0, 1, 0, None, 0], [0, 1, 0, 1, None, 1], [0, 0, 0, 1, None, 0], [0, 0, 0, 0, None, 0], [1, 0, 0, 0, None, None]]
[[0 array([[0.],
       [0.],
       [0.],
       [0.],
       [0.]]) 0
  array([[0.],
       [0.],
       [0.],
       [0.],
       [0.]]) None
  None]
 [0 0 0 0 None 0]
 [0 0 array([[0.],
       [0.],
       [0.],
       [0.],
       [0.]]) 0
  None 0]
 ....
 [0 0 0 0 None 0]
 [array([[0.],
       [0.],
       [0.],
       [0.],
       [0.]]) 0 0 0
  None None]]
Traceback (most recent call last):
  File "stack60813103.py", line 38, in <module>
    b[i, col[i]] = mydata[:, col[i]]  # This instruction returns a ValueError
ValueError: shape mismatch: value array of shape (5,2) could not be broadcast to indexing result of shape (2,)

row, col and a are lists; b is an object dtype array (because of all the None values). Your loop has inserted a bunch of np.zeros((N,1)) arrays into it.
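A minimal sketch of the point about object dtype, using only the first row of "a" and a small, hypothetical sample count of 3. It shows that a list containing None becomes an object array, and that a single slot of such an array can hold a whole ndarray:

```python
import numpy as np

# First row of "a" from the question; the None entries force object dtype.
row_of_a = [0, 1, 0, 1, None, None]
b_row = np.array(row_of_a)
dtype_name = b_row.dtype.name

# A single slot of an object array can store an entire array.
# The length 3 is an arbitrary stand-in for N.
b_row[1] = np.zeros((3, 1))

print(dtype_name)      # dtype of the outer array
print(b_row.shape)     # the outer shape does not change
print(b_row[1].shape)  # shape of the array stored in slot 1
```

The outer array keeps its shape of six slots; only the content of one slot changes.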

mydata is an (N,6) float array (one column for each column of a).

mydata[:, col[i]] for col[0], which is [1, 3], will be (N,2); for other i it could be (N,0), (N,1) or (N,3).

b[i, col[i]] is (2,) (or (0,), (1,), (3,)). There is a pretty obvious mismatch in shapes: you cannot put an (N,2) array into a (2,) slot.
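The mismatch can be reproduced in isolation. The sketch below assumes N = 5 and uses only the first row of "a"; the names target_shape and value_shape are introduced here for illustration:

```python
import numpy as np

N = 5  # small sample count, as in the run above
b = np.array([[0, 1, 0, 1, None, None]], dtype=object)  # first row of "a" only
cols = [1, 3]
mydata = np.random.randn(N, b.shape[1])

target_shape = b[0, cols].shape      # two object slots
value_shape = mydata[:, cols].shape  # N samples for each of the two columns

try:
    b[0, cols] = mydata[:, cols]     # same kind of assignment as in the question
    failed = False
except ValueError as exc:
    failed = True
    print(exc)

print(target_shape, value_shape, failed)
```

The left-hand side addresses two slots, while the right-hand side supplies an N-by-2 block of floats, so NumPy has no way to broadcast one onto the other.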

Why are you constructing an array like this? A mix of None, numbers, and arrays of shape (N,1) and (N,2)?


I think you need to add one more iteration:

for j in col[i]:
    b[i, j] = mydata[:, j]

This should assign an (N,) array to the b[i,j] element of b.
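Putting the pieces together, a self-contained sketch of the per-element assignment might look like the following. It assumes N = 5 and builds the column lists with a list comprehension instead of the question's loop; that rewrite is a choice made here, not part of either answer:

```python
import numpy as np

N = 5  # reduced sample count for illustration
a = [
    [0, 1, 0, 1, None, None],
    [0, 0, 0, 0, None, 0],
    [0, 0, 1, 0, None, 0],
    [0, 1, 0, 1, None, 1],
    [0, 0, 0, 1, None, 0],
    [0, 0, 0, 0, None, 0],
    [1, 0, 0, 0, None, None],
]
b = np.array(a, dtype=object)  # object array copy; "a" itself is not modified

# For every row, the columns whose entry equals 1.
cols = [[j for j, v in enumerate(r) if v == 1] for r in a]

for i, row_cols in enumerate(cols):
    if not row_cols:
        continue  # nothing to load in this row
    mydata = np.random.randn(N, len(a[0]))  # dummy data, shape (N, 6)
    for j in row_cols:
        b[i, j] = mydata[:, j]  # one (N,) array per object slot

print(cols)
print(b[0, 1].shape)
```

Each b[i, j] is addressed with two scalar indices, so the whole (N,) array is stored as a single object in that slot and no broadcasting is attempted.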
