Creating numpy arrays from a list gives the wrong shape

Time: 2016-11-16 22:39:51

Tags: python numpy machine-learning

I am creating several numpy arrays from a list of numpy arrays, like this:

import numpy as np

seq_length = 1500
seq_diff = 200  # difference between start of two sequences
# x and y are 2D numpy arrays
x_seqs = [x[i:i+seq_length,:] for i in range(0, seq_diff*(len(x) // seq_diff), seq_diff)]
y_seqs = [y[i:i+seq_length,:] for i in range(0, seq_diff*(len(y) // seq_diff), seq_diff)]
boundary1 = int(0.7 * len(x_seqs))   # 70% is training set
boundary2 = int(0.85 * len(x_seqs))  # 15% validation, 15% test
x_train = np.array(x_seqs[:boundary1])
y_train = np.array(y_seqs[:boundary1])
x_valid = np.array(x_seqs[boundary1:boundary2])
y_valid = np.array(y_seqs[boundary1:boundary2])
x_test = np.array(x_seqs[boundary2:])
y_test = np.array(y_seqs[boundary2:])

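I want to end up with 6 arrays of shape (n, 1500, 300), where n is 70%, 15%, or 15% of the data for the training, validation, and test arrays respectively.

This is where it goes wrong: the _train and _valid arrays come out fine, but the _test arrays are one-dimensional arrays of arrays. That is: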

  • x_train.shape is (459, 1500, 300)
  • x_valid.shape is (99, 1500, 300)
  • x_test.shape is (99,)

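However, printing x_test confirms that it contains the right elements: it is a 99-element array whose items are (1500, 300) arrays.

Why do the _test matrices get the wrong shape, while the _train and _valid matrices do not?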

1 Answer:

Answer 0: (score: 2)

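The items in x_seqs vary in length. When they all have the same length, np.array can build a 3D array from them; when they differ, it builds a 1D object array whose elements are the individual 2D arrays. Look at the dtype of x_test, and look at:

[len(i) for i in x_test]

I took your code and added:

x = np.zeros((2000, 10))
y = x.copy()
...
print([len(i) for i in x_seqs])
print(x_train.shape)
print(x_valid.shape)
print(x_test.shape)

With this shorter input the printed lengths are not all equal: any slice x[i:i+seq_length,:] that starts fewer than seq_length rows before the end of x is cut short, so the trailing sequences are shorter than the leading ones. Those short sequences land at the end of x_seqs, which is why they affect the _test slice in your case.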
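One possible fix, sketched under assumptions rather than taken from the original answer, is to generate only windows that fit entirely inside the array, so every sequence has exactly seq_length rows. The helper name full_windows and the (2000, 10) stand-in data are hypothetical. np.stack is used instead of np.array because it requires all inputs to have the same shape and raises an error otherwise. Note also that recent NumPy releases raise an error when np.array is given ragged input, rather than returning an object array.

```python
import numpy as np


def full_windows(a, seq_length, seq_diff):
    """Stack every window of exactly seq_length rows, stepping by seq_diff.

    Hypothetical helper. np.stack raises ValueError if no window fits,
    that is, when len(a) < seq_length.
    """
    # Stop early enough that the last window still fits inside `a`.
    starts = range(0, len(a) - seq_length + 1, seq_diff)
    return np.stack([a[i:i + seq_length, :] for i in starts])


# Small stand-in data (assumed sizes, not the asker's real data).
x = np.zeros((2000, 10))
seq_length, seq_diff = 1500, 200

# The original slicing: windows near the end of x come out short.
ragged = [x[i:i + seq_length, :]
          for i in range(0, seq_diff * (len(x) // seq_diff), seq_diff)]
lengths = [len(s) for s in ragged]
print(lengths)  # [1500, 1500, 1500, 1400, 1200, 1000, 800, 600, 400, 200]

# Only full-length windows: the result is already a 3D array.
x_seqs = full_windows(x, seq_length, seq_diff)
print(x_seqs.shape)  # (3, 1500, 10)

# Slicing a 3D array keeps it 3D, so the test split has the expected shape.
boundary2 = int(0.85 * len(x_seqs))
x_test = x_seqs[boundary2:]
print(x_test.shape)  # (1, 1500, 10)
```

This drops the partial sequences at the end of the data. If those rows matter, padding the short windows to seq_length would be an alternative, at the cost of introducing filler values.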