Simple Pandas DataFrame read_csv, then GroupBy with Count / KeyError

Time: 2021-02-24 04:04:08

Tags: pandas dataframe counting

I just want to get the number of rows for each value in a given column, for example:

CSV data:

'Occupation','data'
'Carpenter','data1'
'Carpenter','data2'
'Carpenter','data3'
'Painter','data1'
'Painter','data2'
'Programmer','data1'
'Programmer','data2'
'Programmer','data3'
'Programmer','data4'

Program:

import pandas as pd

filename = "./data/TestGroup.csv"

df = pd.read_csv(filename)
print(df.head())

print("Computing stats by HandRank... ")
df_stats = df[['data']].groupby(['Occupation']).agg(['count'])
# also tried:  df_stats = df[['Occupation']].groupby(['Occupation']).agg(['count'])
print(df_stats.head())

How do I get the counts into a variable? Do .groupby and .agg return another DataFrame?

Output/error:

  'Occupation'   'data'
0  'Carpenter'  'data1'
1  'Carpenter'  'data2'
2  'Carpenter'  'data3'
3    'Painter'  'data1'
4    'Painter'  'data2'
Computing stats by HandRank... 
Traceback (most recent call last):
  File "C:\Apps\PokerHandGenerator_Copy_not_Source\Server\TestPandasGroupBy.py", line 17, in <module>
    df_stats = df.groupby(['Occupation']).agg(['count'])
  File "C:\Apps\ProcessData\venv\lib\site-packages\pandas\core\frame.py", line 6714, in groupby
    return DataFrameGroupBy(
  File "C:\Apps\ProcessData\venv\lib\site-packages\pandas\core\groupby\groupby.py", line 560, in __init__
    grouper, exclusions, obj = get_grouper(
  File "C:\Apps\ProcessData\venv\lib\site-packages\pandas\core\groupby\grouper.py", line 811, in get_grouper
    raise KeyError(gpr)
KeyError: 'Occupation'

df.head() shows that it is using 'Occupation' as my column name.

1 answer:

Answer 0: (score: 1)

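Pandas is treating the name of the first column as 'Occupation', with the single quotes included, rather than Occupation. read_csv only strips double quotes by default (quotechar='"'), so the single quotes in the file stay in the column names and in the values. That is why looking up the key Occupation raises a KeyError.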
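One way to confirm this is to look at the raw column labels instead of the printed table. The sketch below is only an illustration: it assumes a short inline copy of the CSV from the question (loaded through io.StringIO rather than from ./data/TestGroup.csv), and the variable names are made up for the example.

```python
import io

import pandas as pd

# Assumption: a shortened inline copy of the question's CSV, with the same
# single-quote style, used here instead of the real file on disk.
csv_text = "'Occupation','data'\n'Carpenter','data1'\n'Painter','data1'\n"

df = pd.read_csv(io.StringIO(csv_text))

# repr() makes the single quotes embedded in each label visible.
column_labels = [repr(c) for c in df.columns]
print(column_labels)

# The bare name is missing; the name with the quotes is present.
has_plain_name = "Occupation" in df.columns
has_quoted_name = "'Occupation'" in df.columns
print(has_plain_name, has_quoted_name)
```

If the second check is the one that succeeds, the quotes are part of the label and any groupby key has to include them.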

Use this:

df_stats = df.groupby("'Occupation'").agg(['count'])

instead of this:

df_stats = df[['data']].groupby(['Occupation']).agg(['count'])
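As an alternative that is not part of the answer above, the quotes can be removed at load time by passing quotechar="'" to read_csv, after which the plain name Occupation works as a key. The sketch below also touches on the question about getting the counts into a variable: groupby(...).size() returns a Series indexed by the group key, while .agg(['count']) returns another DataFrame (with two-level column labels). It assumes an inline copy of the question's data; the names counts, carpenters and counts_df are hypothetical.

```python
import io

import pandas as pd

# Assumption: inline copy of the CSV from the question instead of the file.
csv_text = """'Occupation','data'
'Carpenter','data1'
'Carpenter','data2'
'Carpenter','data3'
'Painter','data1'
'Painter','data2'
'Programmer','data1'
'Programmer','data2'
'Programmer','data3'
'Programmer','data4'
"""

# quotechar="'" tells the parser that single quotes delimit fields,
# so they are stripped from the column names and the values.
df = pd.read_csv(io.StringIO(csv_text), quotechar="'")

# size() counts the rows per group and returns a Series keyed by Occupation.
counts = df.groupby("Occupation").size()

# A single count as a plain Python int.
carpenters = int(counts["Carpenter"])

# The same counts as an ordinary two-column DataFrame.
counts_df = counts.reset_index(name="count")

print(counts)
print(carpenters)
print(counts_df)
```

size() counts every row in a group, whereas count() counts non-null values per column, so the two can differ when the data has missing values.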