Changing the format of the output table

Time: 2017-11-05 03:18:16

Tags: python python-2.7 python-3.x pandas

I want to change the output format of the following code.

import pandas as pd

x= pd.read_csv('x.csv')
y= pd.read_csv('y.csv')
z= pd.read_csv('z.csv')
list = pd.merge(x, y, how='left', on=['xx'])
list = pd.merge(list, z, how='left', on=['xx'])
columns_to_keep =  ['yy','zz', 'uu']
list = list.set_index(['xx'])
list = list[columns_to_keep]
# The `by` argument is dropped because newer pandas releases no longer accept it
# in sort_index; the remaining arguments were already the defaults.
list = list.sort_index(axis=0, ascending=True)
with open('write.csv','w') as f:
    list.to_csv(f,header=True, index=True, index_label='xx')

From:

id  date        user_id  user_name
1   8/13/2007   1        a1
2   1/8/2007    2        a2
2   1/8/2007    3        a3
3   12/14/2007  4        a4
4   3/6/2008    5        a5
4   4/14/2009   6        a6
4   5/30/2008   7        a7
4   5/30/2008   8        a8
5   6/17/2007   9        a9

To:

id  date        user_id     user_name
1   8/13/2007   1           a1
2   1/8/2007    2; 3        a2; a3
3   12/14/2007  4           a4
4   3/6/2008    5; 6; 7; 8  a5; a6; a7; a8
5   6/17/2007   9           a9

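1 answer:

Answer 0 (score: 0)

I think the following should work on the final DataFrame (list), but I would advise against using "list" as a name: it is a Python built-in, and you may want to call that function elsewhere. So in my code I use "df" instead of "list":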

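# Unique index values in order of appearance (get_values() is gone from newer pandas).
ind = df.index.unique()
finaldf = pd.DataFrame(columns=list(df.columns))
for val in ind:
    # df.loc[[val]] always returns a DataFrame, even when only one row matches.
    tempDF = df.loc[[val]]
    print(tempDF)
    for i in range(tempDF.shape[0]):
        for jloc, j in enumerate(list(df.columns)):
            if i != 0 and j != 'date':
                # Later rows: append to the value already stored for this id.
                # "; " (with a space) matches the separator in the desired output.
                finaldf.loc[val, j] += "; " + str(tempDF.iloc[i, jloc])
            elif i == 0:
                # First row: start the cell; the date is taken from this row only.
                finaldf.loc[val, j] = str(tempDF.iloc[i, jloc])
print(finaldf)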
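
As an alternative to the explicit loop, the same reshaping can probably be expressed with groupby and agg. This is only a sketch under some assumptions: the column names (id, date, user_id, user_name) come from the sample tables in the question rather than from the yy/zz/uu names in its code, the data is rebuilt inline instead of being read from the CSV files, and join_values is a hypothetical helper introduced here for illustration. As in the loop, the date is taken from the first row of each id.

```python
import pandas as pd

# Sample data copied from the "From" table in the question.
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 4, 4, 4, 4, 5],
    "date": ["8/13/2007", "1/8/2007", "1/8/2007", "12/14/2007", "3/6/2008",
             "4/14/2009", "5/30/2008", "5/30/2008", "6/17/2007"],
    "user_id": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "user_name": ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9"],
})


def join_values(values):
    """Hypothetical helper: join all values of one group with '; '."""
    return "; ".join(str(v) for v in values)


# One output row per id: keep the first date, concatenate the other columns.
result = (
    df.groupby("id", sort=True)
      .agg({"date": "first", "user_id": join_values, "user_name": join_values})
      .reset_index()
)

print(result.to_string(index=False))
```

If the real data is indexed by the key column (as after set_index in the question), grouping with df.groupby(level=0) instead of df.groupby("id") should give the same result without resetting the index first.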