Reading a CSV file that contains multiple datasets, each with its own header

Time: 2020-05-08 23:00:03

Tags: python pandas csv parsing readfile

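I have CSV files that are output generated by an instrument. Each file contains multiple datasets; each dataset starts with a "condition" line, followed by a header and the data. I want to read the file with the "condition" added as a column of the corresponding dataset. The output can be a single file or one file per dataset. The conditions, headers, and data are all tab-separated in the CSV file.

I don't even know where to start. I have screenshots of example input and output. Any insight or pointers on how to approach this would be greatly appreciated. Thanks! [Image: example input and desired output]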

2 answers:

Answer 0: (score: 0)

Here is one possible solution:


# Read the input file and split it on the line breaks
with open('file.csv', 'r') as mfile:
    lines = mfile.read().split('\n')

# CAUTION: if your CSV file uses ";" instead of ",", change it in the code!

condition = ''
new_lines = []
for line in lines:
    if not line.strip():
        # Skip blank lines (e.g. the empty string left after the final newline)
        continue
    if len(line.split(',')) == 1:
        # A line with a single field is a condition line
        condition = line
    elif line == 'header1,header2':
        # Change 'header1,header2' to your own header; repeated headers are skipped
        pass
    else:
        new_lines.append(line + ',' + condition)

# Write the header once (with the extra condition column), then the data rows
with open('outfile.csv', 'w') as mfile:
    mfile.write('header1,header2,condition\n')
    for line in new_lines:
        mfile.write(line + '\n')

I used this as the contents of file.csv (input):

condidtion1
header1,header2
2,3
2,3
2,3
2,3
condidtion2
header1,header2
3,4
3,4
3,4
3,4
3,4
3,4

After running the code, outfile.csv looks like this (output):

header1,header2,condition
2,3,condidtion1
2,3,condidtion1
2,3,condidtion1
2,3,condidtion1
3,4,condidtion2
3,4,condidtion2
3,4,condidtion2
3,4,condidtion2
3,4,condidtion2
3,4,condidtion2
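
If you would rather not hard-code the header text, the same idea can be sketched as a small function. This is only a sketch under a few assumptions that are not in the question: the line right after each condition line is treated as that dataset's header, every dataset shares the same columns, and data rows have at least two non-empty fields. The name flatten_sections and the delimiter parameter are made up for illustration.

```python
import csv
import io


def flatten_sections(text, delimiter=","):
    """Return all rows with the dataset's condition appended as a last column.

    Hypothetical helper: assumes the line following a condition line is the
    header, and that real data rows have at least two non-empty fields.
    """
    rows = []
    header = None
    condition = None
    expect_header = False
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Drop empty trailing fields, such as the one left by "Condition 1\t"
        while fields and fields[-1].strip() == "":
            fields.pop()
        if not fields:
            continue  # blank line
        if len(fields) == 1:
            # A single field marks the start of a new dataset
            condition = fields[0]
            expect_header = True
        elif expect_header:
            # Keep only the first header; later copies are skipped
            if header is None:
                header = fields + ["condition"]
                rows.append(header)
            expect_header = False
        else:
            rows.append(fields + [condition])
    return rows


sample = "condidtion1\nheader1,header2\n2,3\n2,3\ncondidtion2\nheader1,header2\n3,4\n"
for row in flatten_sections(sample):
    print(",".join(row))
```

For the tab-separated files described in the question you would call it with delimiter="\t". Note that a data row whose only non-empty field is the first one would be mistaken for a condition line, so check that against your real files.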

Answer 1: (score: 0)

This should solve your problem:

import csv

with open('test.tsv', 'r') as file:
    lines = file.readlines()
# lines = ['Condition 1\t\n', 'Header 1\tHeader 2\n', '2\t3\n', '2\t3\n', '2\t3\n', 'Condition 2\t\n', 'Header 1\tHeader 2\n', '2\t3\n', '2\t3\n', '2\t3\n']
current_condition = ''
final_output = [['Header 1', 'Header 2', 'condition']]
for line in lines:
    # rstrip() also removes the trailing tab on the condition lines
    row = line.rstrip().split('\t')
    if row == ['']:
        # Skip blank lines
        continue
    if len(row) == 1:
        # A single field marks the start of a new condition
        current_condition = row[0]
    elif row[:2] != ['Header 1', 'Header 2']:
        # Skip the repeated header rows and keep the data rows
        final_output.append([
            row[0],
            row[1],
            current_condition
        ])

# newline='' keeps the csv module from writing extra blank lines on Windows
with open('output.csv', 'w', newline='') as fout:
    writer = csv.writer(fout)
    writer.writerows(final_output)
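
The question also mentions that one file per dataset would be acceptable. A rough sketch of that variant is below: it takes rows shaped like final_output (header first, condition in the last column) and writes one CSV per condition. The function name write_per_condition and the file naming scheme are assumptions for illustration, and the sample rows mirror the commented-out example above rather than real instrument data.

```python
import csv
import os
import re
import tempfile


def write_per_condition(rows, out_dir):
    """Write one CSV per condition; rows[0] is the header, last column the condition."""
    header, data = rows[0], rows[1:]
    groups = {}
    for row in data:
        # Group the data rows by their condition (the last column)
        groups.setdefault(row[-1], []).append(row)
    paths = []
    for condition, group in groups.items():
        # Hypothetical naming scheme: replace characters that are unsafe in file names
        safe = re.sub(r"[^A-Za-z0-9_-]+", "_", condition).strip("_") or "unnamed"
        path = os.path.join(out_dir, safe + ".csv")
        with open(path, "w", newline="") as fout:
            writer = csv.writer(fout)
            writer.writerow(header)
            writer.writerows(group)
        paths.append(path)
    return paths


final_output = [
    ["Header 1", "Header 2", "condition"],
    ["2", "3", "Condition 1"],
    ["2", "3", "Condition 1"],
    ["2", "3", "Condition 2"],
]
out_dir = tempfile.mkdtemp()
for path in write_per_condition(final_output, out_dir):
    print(path)
```

Two conditions that reduce to the same sanitized name would overwrite each other here, so adjust the naming if your condition strings differ only in punctuation.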