Error when creating a pivot table in pandas

Time: 2015-03-02 14:53:40

Tags: python csv pandas

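I've searched and can't find anyone else with this problem. I'm trying to create a pivot table that summarizes a CSV file, and then email that pivot to myself. I've built code that performs this process, but it doesn't work for every file. I keep getting a KeyError on my column names, yet if I delete all the columns and rows that aren't part of the table, it miraculously works.

Here is my code: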

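import pandas

# Read the source CSV (the path is a placeholder)
df = pandas.read_csv('/path/to/file', encoding='utf-8')
# Count email addresses per client and branch, with totals
pivot = pandas.pivot_table(df, index=['ClientID', 'ClientName', 'Branch'],
                           values=['EmailAddress'], aggfunc='count', margins=True)
pivotlocation = '/path/to/save'
pivot.to_csv(pivotlocation)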

For the life of me, I can't figure out what's wrong, or why this works for some files but not others.

Also, here is the error that is thrown:

Traceback (most recent call last):
File "C:\Users\rfulton\Desktop\Automation\Reports\UniversalUpload.py", line 86, in create_pivot
  pivot = pandas.pivot_table(df,index=columns,values=aggvalue,aggfunc='count',margins=True)
File "C:\Python34\lib\site-packages\pandas\util\decorators.py", line 88, in wrapper
  return func(*args, **kwargs)
File "C:\Python34\lib\site-packages\pandas\util\decorators.py", line 88, in wrapper
  return func(*args, **kwargs)
File "C:\Python34\lib\site-packages\pandas\tools\pivot.py", line 114, in pivot_table
  grouped = data.groupby(keys)
File "C:\Python34\lib\site-packages\pandas\core\generic.py", line 2898, in groupby
  sort=sort, group_keys=group_keys, squeeze=squeeze)
File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 1193, in groupby
  return klass(obj, by, **kwds)
File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 383, in __init__
  level=level, sort=sort)
File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 2131, in _get_grouper
  in_axis, name, gpr = True, gpr, obj[gpr]
File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 1780, in __getitem__
  return self._getitem_column(key)
File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 1787, in _getitem_column
  return self._get_item_cache(key)
File "C:\Python34\lib\site-packages\pandas\core\generic.py", line 1068, in _get_item_cache
  values = self._data.get(item)
File "C:\Python34\lib\site-packages\pandas\core\internals.py", line 2849, in get
  loc = self.items.get_loc(item)
File "C:\Python34\lib\site-packages\pandas\core\index.py", line 1402, in get_loc
  return self._engine.get_loc(_values_from_object(key))
File "pandas\index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas\index.c:3807)
File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:3687)
File "pandas\hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12310)
File "pandas\hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12261)
KeyError: 'ClientID'

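As mentioned above, if I delete all the cells outside the table's boundaries, the error is no longer thrown. However, I don't know how to do that with the csv or pandas modules.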

1 answer:

Answer 0 (score: 0)

It turned out the problem was the file's encoding. Setting the encoding to utf-8-sig fixed it.
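The answer doesn't say why the codec matters. A likely explanation, assuming the failing files start with a UTF-8 byte order mark (BOM): decoding with plain utf-8 keeps the BOM as the first character of the first column name, so a lookup for 'ClientID' can fail, while utf-8-sig strips it. The sketch below shows this with the standard csv module only; the sample data, the temporary file, and the header() helper are made up for illustration.

```python
import csv
import os
import tempfile

# Hypothetical sample data, saved with a UTF-8 BOM as some exports do.
rows = "ClientID,ClientName,Branch,EmailAddress\n1,Acme,North,a@example.com\n"
path = os.path.join(tempfile.mkdtemp(), "with_bom.csv")
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    f.write(rows)  # the utf-8-sig codec writes the BOM before the data


def header(path, encoding):
    """Return the first row of the CSV as a list of column names."""
    with open(path, encoding=encoding, newline="") as f:
        return next(csv.reader(f))


plain = header(path, "utf-8")      # BOM stays attached to the first name
fixed = header(path, "utf-8-sig")  # BOM is stripped while decoding

print(repr(plain[0]))
print(repr(fixed[0]))
print("ClientID" in plain, "ClientID" in fixed)
```

Printing the column names with repr() is a quick way to spot this kind of invisible character. pandas.read_csv accepts the same codec name through its encoding parameter, which is the change the answer describes.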