pandas: columns are missing after groupby

Date: 2018-06-13 07:00:54

Tags: python json pandas dataframe

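I am parsing two different JSON files and writing the data to two Excel files. I then merge the data from the two Excel files on a column. But when I try to run a groupby on the merged data, it drops two columns. Here is the sample output: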

   Ep_sg_id           Ep_ip       Ep_netmask      Uuid  \
0  36bc01bf  10.202.221.133  255.255.255.255       NaN   
1  36bc01bf  10.202.220.141  255.255.255.255       NaN   
2  cf564ff3      17.39.68.0  255.255.255.128       NaN   
3  001d2bd5   17.176.253.64  255.255.255.192  001d2bd5   
4       NaN             NaN              NaN  0448d01f   
5       NaN             NaN              NaN  0d928eff   
6       NaN             NaN              NaN  06306991   
7       NaN             NaN              NaN  11003dc5   
8       NaN             NaN              NaN  0a7509ea   

                            Name  
0                            NaN  
1                            NaN  
2                            NaN  
3                            VIP  
4                    ADMIN_HOSTS  
5                    DB-EXTERNAL  
6                           CORP  
7                        POD1-DB  
8                            UAT  

   Ep_sg_id           Ep_ip       Ep_netmask
0  36bc01bf  10.202.221.133  255.255.255.255
1  36bc01bf  10.202.220.141  255.255.255.255
2  cf564ff3      17.39.68.0  255.255.255.128
3  001d2bd5   17.176.253.64  255.255.255.192

       Uuid                           Name
0  001d2bd5                            VIP
1  0448d01f                    ADMIN_HOSTS
2  0d928eff                    DB-EXTERNAL
3  06306991                           CORP
4  11003dc5                        POD1-DB
5  0a7509ea                            UAT

                                  Ep_ip                       Ep_netmask
Ep_sg_id                                                                
001d2bd5                  17.176.253.64                  255.255.255.192
36bc01bf  10.202.221.133,10.202.220.141  255.255.255.255,255.255.255.255
cf564ff3                     17.39.68.0                  255.255.255.128

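The first block is the merged data from both files. The second and third are the individual data frames. The last one is the result after I run the groupby. Uuid and Name are gone. I don't know how to override the nuisance-column behavior.

Here is my code: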

#!/usr/bin/python
# -*- coding: utf-8 -*-
import xlwt
import json
from xlutils.copy import copy
import xlrd
import pandas as pd
import numpy as np

with open('ep1.txt', 'r') as f:
    js = json.loads(f.read())

with open('sc1.txt', 'r') as f1:
    js2 = json.loads(f1.read())

book = xlwt.Workbook(encoding="utf-8")
book1 = xlwt.Workbook(encoding="utf-8")

sheet1 = book.add_sheet("Sheet 1", cell_overwrite_ok=True)
sheet2 = book1.add_sheet("Sheet 1", cell_overwrite_ok=True)
sheet1.write(0, 0, 'Ep_sg_id')
sheet1.write(0, 1, 'Ep_ip')
sheet1.write(0, 2, 'Ep_netmask')
sheet2.write(0, 0, 'Uuid')
sheet2.write(0, 1, 'Name')
p = 1

for i, j in js.items():
    sg_id = js[i]['Ep_sg_id']
    ip = js[i]['Ep_ip']
    netmask = js[i]['Ep_netmask']

    sheet1.write(p, 0, sg_id)
    sheet1.write(p, 1, ip)
    sheet1.write(p, 2, netmask)
    p = p + 1

q = 1
for i, j in js2.items():
    uuid = js2[i]['Sg']['Uuid']
    name = js2[i]['Sg']['Name']

    sheet2.write(q, 0, uuid)
    sheet2.write(q, 1, name)
    q = q+1

book.save('new.xls')
book1.save('new1.xls')

df = pd.read_excel('new.xls')
df1 = pd.read_excel('new1.xls')
mergedDf = df.merge(df1, how='outer', left_on='Ep_sg_id', right_on='Uuid')
print mergedDf
mergedDf['Uuid'] = mergedDf['Uuid'].replace("", np.nan)
mergedDf['Name'] = mergedDf['Name'].replace("", np.nan)
mergedDf = mergedDf.groupby('Ep_sg_id').agg(','.join)
print df
print
print df1
print
print mergedDf
mergedDf.to_excel('final_excel.xls', index=False)
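
For context, a likely reason the two columns vanish is that `','.join` cannot handle the NaN values the outer merge puts into Uuid and Name, so pandas treats those columns as nuisance columns. The sketch below is written for Python 3 and uses a small made-up frame modeled on the output above; `join_non_null` is a hypothetical helper, not part of pandas, and is only one possible way to aggregate such columns.

```python
import numpy as np
import pandas as pd


def join_non_null(values):
    """Join the non-missing entries of one group with commas (hypothetical helper)."""
    return ",".join(values.dropna().astype(str))


# Small illustrative frame; the values are taken from the sample output.
frame = pd.DataFrame({
    "Ep_sg_id": ["36bc01bf", "36bc01bf", "001d2bd5"],
    "Ep_ip": ["10.202.221.133", "10.202.220.141", "17.176.253.64"],
    "Name": [np.nan, np.nan, "VIP"],
})

# A plain ",".join fails on NaN because NaN is a float, not a string.
try:
    ",".join(frame["Name"])
    plain_join_failed = False
except TypeError:
    plain_join_failed = True

# The NaN-safe helper keeps every column in the aggregated result.
result = frame.groupby("Ep_sg_id").agg(join_non_null)
print(result)
```

Groups that contain only missing values end up as empty strings here. Rows whose grouping key (Ep_sg_id) is itself NaN are still left out, because groupby skips NaN keys by default.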

1 answer:

Answer 0 (score: 0)

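Automatic exclusion of nuisance columns is the default behavior, so you can first copy the columns you want to keep into a separate data frame, for example:

extra = mergedDf[['Ep_sg_id', 'Uuid', 'Name']].copy()

Then perform the groupby as you are doing currently:

mergedDf = mergedDf.groupby('Ep_sg_id').agg(','.join)

And then finally merge the data frames.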
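
The three steps described in this answer (copy the extra columns, group, merge back) could be sketched end to end as follows. This is a Python 3 sketch under assumptions: the two frames are rebuilt by hand from the sample output instead of being read from Excel, the final merge on `Ep_sg_id` is a guess at what the answer intends, and the names `merged`, `grouped`, and `final` are illustrative. The groupby selects only the two string columns, so it does not rely on pandas silently dropping the others.

```python
import pandas as pd

# Frames rebuilt from the sample output in the question.
df = pd.DataFrame({
    "Ep_sg_id": ["36bc01bf", "36bc01bf", "cf564ff3", "001d2bd5"],
    "Ep_ip": ["10.202.221.133", "10.202.220.141",
              "17.39.68.0", "17.176.253.64"],
    "Ep_netmask": ["255.255.255.255", "255.255.255.255",
                   "255.255.255.128", "255.255.255.192"],
})
df1 = pd.DataFrame({
    "Uuid": ["001d2bd5", "0448d01f", "0d928eff",
             "06306991", "11003dc5", "0a7509ea"],
    "Name": ["VIP", "ADMIN_HOSTS", "DB-EXTERNAL", "CORP", "POD1-DB", "UAT"],
})

merged = df.merge(df1, how="outer", left_on="Ep_sg_id", right_on="Uuid")

# Step 1: keep a copy of the columns the aggregation would lose.
extra = merged[["Ep_sg_id", "Uuid", "Name"]].copy()

# Step 2: aggregate only the columns that contain strings in every group.
grouped = (
    merged.groupby("Ep_sg_id")[["Ep_ip", "Ep_netmask"]]
    .agg(",".join)
    .reset_index()
)

# Step 3 (assumed): one row of extra data per key, merged back onto the groups.
extra = extra.dropna(subset=["Ep_sg_id"]).drop_duplicates(subset="Ep_sg_id")
final = grouped.merge(extra, on="Ep_sg_id", how="left")
print(final)
```

Only keys that have an Ep_sg_id survive, so the Uuid rows without a matching Ep_sg_id (ADMIN_HOSTS, DB-EXTERNAL, and so on) do not appear in `final`. Newer pandas releases may raise a TypeError instead of quietly dropping nuisance columns, which is another reason to select the columns explicitly before aggregating.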