Reformatting a file with one record per line

Time: 2018-07-31 16:26:01

Tags: python format

I want to convert a file formatted like this:

1182659 Sample05 22
1182659 Sample33 14
4758741 Sample05 74
4758741 Sample33 2
3652147 Sample05 8
3652147 Sample33 34

into this:

       Sample05 Sample33 
1182659 22 14
4758741 74 2
3652147 8 34

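One approach I have seen is to use a doubly indexed dictionary (a dict of dicts), but before going down that road I would like to know whether there is a simpler way.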
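For reference, the doubly indexed dictionary approach mentioned above could look roughly like the sketch below. The function name pivot_records and the sample list lines are made up for illustration, and the code assumes whitespace-separated fields with exactly three columns per record.

```python
from collections import defaultdict


def pivot_records(lines):
    """Pivot 'id sample value' records into one row per id (illustrative helper)."""
    table = defaultdict(dict)  # table[record_id][sample] = value
    samples = []               # column order, by first appearance
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        record_id, sample, value = parts
        table[record_id][sample] = value
        if sample not in samples:
            samples.append(sample)
    header = [""] + samples
    # A missing (id, sample) pair becomes an empty cell instead of shifting columns.
    rows = [[record_id] + [cells.get(s, "") for s in samples]
            for record_id, cells in table.items()]
    return header, rows


lines = [
    "1182659 Sample05 22",
    "1182659 Sample33 14",
    "4758741 Sample05 74",
    "4758741 Sample33 2",
    "3652147 Sample05 8",
    "3652147 Sample33 34",
]

header, rows = pivot_records(lines)
print("\t".join(header))
for row in rows:
    print("\t".join(row))
```

Because the lookup is keyed by both ID and sample name, this version does not depend on the records being sorted or on every ID having every sample.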

2 answers:

Answer 0: (score: 1)

Without pandas, but using groupby from itertools:

from itertools import groupby

data = """
1182659 Sample05 22
1182659 Sample33 14
4758741 Sample05 74
4758741 Sample33 2
3652147 Sample05 8
3652147 Sample33 34
"""

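# Split each non-empty line into fields and group consecutive rows by ID (first field).
groups = groupby((line.split() for line in data.splitlines() if line), key=lambda v: v[0])

rows = []
headers = []
for g, v in groups:
    v = list(v)
    # Collect sample names in order of first appearance for the header row.
    for i in v:
        if i[1] not in headers:
            headers.append(i[1])
    # One output row per ID: the ID followed by its values.
    rows.append([g] + [i[-1] for i in v])

# Print the header (the leading tab leaves the ID column blank), then each row.
print('\t'+ '\t'.join(headers))
for row in rows:
    for value in row:
        print(value, end='\t')
    print()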
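One caveat worth keeping in mind: itertools.groupby only merges adjacent items that share a key, so the code above relies on the file already being ordered by ID. If that is not guaranteed, sorting by ID first is one option. A minimal sketch, with a small made-up records list standing in for the parsed lines:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input in which rows for the same ID are not adjacent.
records = [
    ["4758741", "Sample05", "74"],
    ["1182659", "Sample05", "22"],
    ["4758741", "Sample33", "2"],
    ["1182659", "Sample33", "14"],
]

# Without sorting, every change of key starts a new group.
unsorted_keys = [key for key, _ in groupby(records, key=itemgetter(0))]

# sorted() is stable, so rows sharing an ID keep their original relative order.
records_sorted = sorted(records, key=itemgetter(0))
grouped = {
    key: [row[2] for row in group]
    for key, group in groupby(records_sorted, key=itemgetter(0))
}

print(unsorted_keys)
print(grouped)
```

Note that sorting changes the order of the output rows to sorted ID order, which may or may not matter for the target file.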

Answer 1: (score: 0)

Using pandas:

import pandas as pd

# if the delimiter is a space
df = pd.read_csv("<path to file>.txt", sep=" ", header=None)
df.set_index([0, 1])[2].unstack()

Output:

1        Sample05  Sample33
0
1182659        22        14
3652147         8        34
4758741        74         2