Question

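I have a CSV file with a few hundred rows. I want to select every 3 rows and export them to a new CSV file, with each output file named after the first row of its selection.

For example, in the following CSV file....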

1980 10 12            
1  2  3  4  5  6  7       
4  6  8  1  0  8  6  
1981 10 12
2  4  9  7  5  4  1  
8  9  3  8  3  7  3

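I want to select the first 3 rows and export them to a new CSV named "1980 10 12" after the first row, then select the next 3 rows and export them to a new CSV named "1981 10 12" after the first row of those 3. I would like to do this in Python.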

Answer 1

Use the csv module and itertools.islice() to select 3 rows at a time:

import csv
import os.path
from itertools import islice


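# inputfilename and directory are assumed to be defined elsewhere
with open(inputfilename, 'rb') as infh:
    reader = csv.reader(infh)
    for row in reader:
        # name the output file after the first row of each 3-row block
        filename = row[0].replace(' ', '_') + '.csv'
        filename = os.path.join(directory, filename)
        with open(filename, 'wb') as outfh:
            writer = csv.writer(outfh)
            writer.writerow(row)
            # copy the next two rows from the same reader
            writer.writerows(islice(reader, 2))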

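The writer.writerows(islice(reader, 2)) line takes the next two rows from the reader and copies them to the output CSV, after the current row (the one with the date) has been written to the output file first.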
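
To see why this works, here is a small sketch of the same pattern on a plain list instead of a file. The helper name group_in_threes and the sample values are made up for illustration. The for loop and islice() both pull from one shared iterator, so each pass consumes three items in total:

```python
from itertools import islice


def group_in_threes(items):
    """Yield lists of up to three items, pulling from one shared iterator."""
    it = iter(items)
    for first in it:                # the for loop consumes one item...
        rest = list(islice(it, 2))  # ...and islice consumes the next two
        yield [first] + rest


# Hypothetical sample values standing in for CSV rows.
groups = list(group_in_threes(["1980 10 12", "a", "b", "1981 10 12", "c", "d"]))
```

If the input length is not a multiple of three, the last group is simply shorter, which matches what the file-based version would write.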

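You may need to adjust the delimiter argument for the csv.reader() and csv.writer() objects; the default is a comma, but you didn't specify the exact format, and you may need to set it to '\t' (a tab) instead.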
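
As a rough illustration of the delimiter argument, the sketch below reads and rewrites tab-separated data held in memory. The sample string is an assumption, since the question does not say how the real file is delimited:

```python
import csv
import io

# Hypothetical tab-separated sample, standing in for the real file.
sample = "1980 10 12\n1\t2\t3\n4\t6\t8\n"

# Pass the same delimiter to both the reader and the writer.
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

out = io.StringIO()
csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
```

Here the first row comes back as a single field because it contains spaces but no tabs, which is convenient when that row is used as the file name.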

If you are using Python 3, open the files in 'r' and 'w' text mode and set newline='' for both: open(inputfilename, 'r', newline='') and open(filename, 'w', newline='').
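
Putting those Python 3 notes together, one possible version looks like the sketch below. The function name split_every_three, the delimiter parameter, the blank-line check, and the returned list of paths are additions for illustration and are not part of the original answer:

```python
import csv
import os
from itertools import islice


def split_every_three(inputfilename, directory, delimiter=","):
    """Write each block of three rows to its own CSV; return the paths written."""
    written = []
    with open(inputfilename, "r", newline="") as infh:
        reader = csv.reader(infh, delimiter=delimiter)
        for row in reader:
            if not row:  # assumption: ignore blank lines before a block
                continue
            # Name the file after the first field of the block's first row.
            name = row[0].strip().replace(" ", "_") + ".csv"
            path = os.path.join(directory, name)
            with open(path, "w", newline="") as outfh:
                writer = csv.writer(outfh, delimiter=delimiter)
                writer.writerow(row)
                writer.writerows(islice(reader, 2))  # the next two rows
            written.append(path)
    return written
```

Calling split_every_three("in.csv", "out") would, under these assumptions, create files such as out/1980_10_12.csv, provided the out directory already exists.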

Answer 2

import csv
with open("in.csv") as f:
    reader = csv.reader(f)
    chunks = []
    for ind, row in enumerate(reader, 1):
        chunks.append(row)
        if ind % 3 == 0: # if we have three new rows, create a file using the first row as the name
            with open("{}.csv".format(chunks[0][0].strip()), "w") as f1:
                wr = csv.writer(f1) 
                wr.writerows(chunks) # write all rows
            chunks = [] # reset chunks to an empty list

Answer 3

Using a slight iterator trick:

with open('in.csv', 'r') as infh:
    for block in zip(*[infh]*3):
        filename = block[0].strip() + '.csv'
        with open(filename, 'w') as outfh:
            outfh.writelines(block)

On Python 2.x, you would use itertools.izip. The docs actually mention izip(*[iter(s)]*n) as an idiom for clustering a data series.
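
A minimal Python 3 sketch of that idiom on an in-memory list may make the trick easier to follow. The helper name chunks_of and the sample lines are hypothetical:

```python
def chunks_of(iterable, n):
    """Group items into n-tuples using the zip(*[iter(s)]*n) idiom."""
    it = iter(iterable)
    # The same iterator appears n times, so each tuple takes n consecutive items.
    return zip(*[it] * n)


# Five hypothetical lines: one full block of three plus two leftovers.
lines = ["1980 10 12\n", "1 2 3\n", "4 6 8\n", "1981 10 12\n", "2 4 9\n"]
blocks = list(chunks_of(lines, 3))
```

Note that zip() stops at the first incomplete group, so the two leftover lines are dropped here. If a trailing partial block should be kept, itertools.zip_longest is one alternative worth considering.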
