Python: converting single-column data into multiple columns

Date: 2014-03-11 17:23:30

Tags: python

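I have a .txt file containing simple numeric data. The data are repeated measurements of the same thing, written out as one long column. I would like a script that reads the file, recognizes the delimiter that separates one experiment from the next, and writes each experiment to its own column in a txt or csv file.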

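At the moment the data are separated by marker lines of the form '# row = X', where X runs from 0 to about 128. So I want a script that opens the file, reads down to '# row = 0', and copies the next ~1030 lines of data into some list/array as "column 0". Then, when it hits '# row = 1', it copies the next ~1030 lines of numbers into "column 1", and so on. Finally it should write everything out as multiple columns. The input data file looks like this: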

# row = 0
9501.7734375
9279.390625
[..and so on for about 1030 lines...]
8836.5
8615.1640625
# row = 1
4396.1953125
4197.1796875
[..and so on for about 1030 lines...]
3994.4296875
# row = 2
9088.046875
8680.6953125
[..and so on for about 1030 lines...]
8253.0546875

The final file should look like this:

row0          row1         row2       row3
9501.7734375  4396.1953125 etc        etc
9279.390625   4197.1796875
[..snip...]   [...snip...]
8836.5        3994.4296875
8615.1640625  3994.4347453

Preferably in Python, since I have some experience with it from a few years ago! Thanks everyone, Jon

1 answer:

Answer 0 (score: 1)

from io import StringIO
from collections import OrderedDict

datastring = StringIO(u"""\
# row = 0
9501.7734375
9279.390625
8615.1640625
# row = 1
4396.1953125
4197.1796875
3994.4296875
# row = 2
9088.046875
8680.6953125
8253.0546875
""")

content = datastring.readlines()
out = OrderedDict()
final = []

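for line in content:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    if line.startswith('# row'):
        # Marker line: start a new column; "# row = 0" becomes the header " row = 0"
        header = line.strip('#')
        out[header] = []
    else:
        # Data line: keep every value, including repeated measurements
        out[header].append(line)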


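# Build one list per column: the header followed by its values.
# items() works on Python 2 and 3; iteritems() exists only on Python 2.
for k, v in out.items():
    final.append([k] + v)

# Transpose so each header and its values become one output column.
# Note: zip() stops at the shortest column.
final = zip(*final)
with open("C:/temp/output.csv", 'w') as fout:
    for item in final:
        # Tab-separated values, one output row per line
        fout.write('\t'.join(item) + '\n')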

Output:

 row = 0         row = 1        row = 2
9501.7734375    4396.1953125    9088.046875
9279.390625     4197.1796875    8680.6953125
8615.1640625    3994.4296875    8253.0546875
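
The question mentions roughly 1030 lines per block, so the blocks may not all be the same length, and a plain zip() would silently drop the extra values of longer columns. The following is a minimal sketch (Python 3) of one way to handle that, assuming that padding short columns with an empty cell is acceptable. The names parse_columns, write_columns, marker and fill are hypothetical helpers introduced here for illustration; they are not part of the answer above.

```python
import csv
import io
from itertools import zip_longest


def parse_columns(lines, marker="# row"):
    """Group data lines under the most recent marker line (hypothetical helper)."""
    columns = {}  # plain dicts keep insertion order on Python 3.7+
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(marker):
            # "# row = 0" -> "row = 0"
            current = line.lstrip("# ").strip()
            columns[current] = []
        elif current is not None:
            columns[current].append(line)
    return columns


def write_columns(columns, fout, fill=""):
    """Write one CSV column per marker, padding short columns with `fill`."""
    writer = csv.writer(fout)
    writer.writerow(columns.keys())
    # zip_longest transposes the columns into rows without truncating
    writer.writerows(zip_longest(*columns.values(), fillvalue=fill))


# Small in-memory sample with columns of unequal length
sample = io.StringIO(
    "# row = 0\n9501.7734375\n9279.390625\n8615.1640625\n"
    "# row = 1\n4396.1953125\n4197.1796875\n"
)
columns = parse_columns(sample)
buffer = io.StringIO()
write_columns(columns, buffer)
result = buffer.getvalue()
```

With a real file, the same two functions could be called on open(input_path) and open(output_path, "w", newline=""), where both paths are placeholders; the csv module documentation recommends newline="" for files passed to csv.writer.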