Question

I am trying to export a 2D NumPy array to CSV while inserting an extra column built from a variable that lives outside the array.

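The end goal is to loop over a series of files with the Python lasio library, select certain 1D arrays, flatten them into a 2D array, and export that to a CSV file ready to be loaded into a database.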

The ndarrays might be:

File 1:

1.0, 3
1.5, 4
2.0, 56

File 2:

1.0, 76
1.5, 3
2.0, 45
2.5, 45.6

The desired output would be:

F1, 1.0, 3
F1, 1.5, 4
F1, 2.0, 56
F2, 1.0, 76
F2, 1.5, 3
F2, 2.0, 45
F2, 2.5, 45.6

Answer 1

This can be done with NumPy as follows:

import numpy as np

# Read file 1 as strings and prepend its label as the first column
f1 = np.genfromtxt('./file1.csv', delimiter=',', dtype=str)
label = np.array(["F1"] * f1.shape[0])
f1 = np.insert(f1, 0, label, axis=1)

# Read file 2 as strings and prepend its label as the first column
f2 = np.genfromtxt('./file2.csv', delimiter=',', dtype=str)
label = np.array(["F2"] * f2.shape[0])
f2 = np.insert(f2, 0, label, axis=1)

# Stack the two labelled arrays and write them out as strings
fout = np.vstack((f1, f2))
np.savetxt("fout.csv", fout, delimiter=",", fmt="%s")

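np.insert requires all the data to share one dtype when the arrays are combined. The files file1.csv and file2.csv contain the arrays from your question, so they are read as strings. The label array is built from the number of rows (f1.shape[0]) and inserted as the first column. The two arrays are then stacked vertically and written out. You also need to tell np.savetxt to write strings (fmt="%s"). The resulting file fout.csv is then: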

F1,1.0, 3
F1,1.5, 4
F1,2.0, 56
F2,1.0, 76
F2,1.5, 3
F2,2.0, 45
F2,2.5, 45.6
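
Since the question is about looping over a series of files, the same idea can be wrapped in a small helper. The following is only a sketch under some assumptions: the function name label_and_stack and the label-to-path mapping are hypothetical, and autostrip=True is added so the space after each comma is dropped (the output above keeps it). np.hstack is used instead of np.insert because it promotes to the widest string dtype, whereas np.insert casts the labels to the dtype of the data array.

```python
import numpy as np


def label_and_stack(files):
    """Read each CSV as strings, prepend its label as a first column, and stack.

    `files` is a hypothetical mapping of label (e.g. "F1") to CSV path.
    """
    blocks = []
    for label, path in files.items():
        # Read everything as text so labels and numbers can share one array
        data = np.genfromtxt(path, delimiter=",", dtype=str, autostrip=True)
        # A file with a single row comes back 1-D; make it (1, ncols)
        data = np.atleast_2d(data)
        # One label per row, shaped as a column
        labels = np.full((data.shape[0], 1), label)
        blocks.append(np.hstack((labels, data)))
    return np.vstack(blocks)


def write_labelled_csv(files, out_path):
    """Stack the labelled blocks and write them as plain strings."""
    np.savetxt(out_path, label_and_stack(files), delimiter=",", fmt="%s")
```

A call such as write_labelled_csv({"F1": "./file1.csv", "F2": "./file2.csv"}, "fout.csv") would then cover the two files above; more files only need more entries in the mapping.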

Answer 2

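If I understand your question correctly, you can add an id column to each array and then concatenate all the arrays. Here is the code for just two arrays: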

import numpy as np

A1 = np.array([[1.0, 3],
               [1.5, 4],
               [2.0, 56]])
A2 = np.array([[1.0, 76],
               [1.5, 3],
               [2.0, 45],
               [2.5, 45.6]])

A1id = 0
A2id = 1
A1 = np.hstack((A1id*np.ones((A1.shape[0], 1)), A1))
A2 = np.hstack((A2id*np.ones((A2.shape[0], 1)), A2))

result = np.vstack((A1, A2))
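
The code above stops at the stacked array, so here is a hedged sketch of the export step. It assumes numeric ids (0 and 1) are acceptable in place of the "F1"/"F2" labels, since a float array cannot hold strings. The helper name with_id_column and the per-column format list are illustrative choices, not part of the original answer. The text is written to an in-memory buffer here; a file name works the same way.

```python
import io

import numpy as np


def with_id_column(arr, file_id):
    """Return `arr` with a constant numeric id prepended as the first column."""
    ids = np.full((arr.shape[0], 1), file_id, dtype=float)
    return np.hstack((ids, arr))


A1 = np.array([[1.0, 3],
               [1.5, 4],
               [2.0, 56]])
A2 = np.array([[1.0, 76],
               [1.5, 3],
               [2.0, 45],
               [2.5, 45.6]])

result = np.vstack((with_id_column(A1, 0), with_id_column(A2, 1)))

# One format per column: integer id, one-decimal first value, compact second value
buf = io.StringIO()
np.savetxt(buf, result, delimiter=",", fmt=["%d", "%.1f", "%g"])
csv_text = buf.getvalue()
```

If the database needs the textual labels rather than numeric ids, the string-based approach from Answer 1 is the simpler route.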

Adding a column to the CSV when exporting a NumPy ndarray