Python / Pandas DataFrame.drop does not recognize column names in Chinese characters

Time: 2018-01-14 20:29:00

Tags: python

This is a Jupyter script. Any suggestions as to why the line marked "Does NOT work" below fails?

import pandas as pd
df = pd.read_csv('hw1.csv', encoding='utf-8', skipinitialspace=True )
df.drop(['序号'], axis=1, inplace=True) # <= Works
#df.drop(['年度'], axis=1, inplace=True) # <= Does NOT work
df

----- hw1.csv file -----
序号,年度,直接排放,间接排放,直接排放间接排放,一般烟煤,汽油,柴油,液化石油气,炼厂干气,天然气
1,2016,4647.09,4843.06,9490.15,2004.98,136.08,13.9,45.1816
2,2016,2496.72,3668.16,6164.879999999999,1368.83 ,,, 28.02,10.593
3,2016,10729.74,4042.2,14771.94,6681.8 ,,, 20.6 ,,
4,2016,231163.34,206918.68,438082.02,52330.48,13758.75,997.81,4690.22
5,2016,7373.27,4994.84,12368.11,3566.25 ,,, 123.6,60.9229
6,2016,62619.53,3324.15,65943.68 ,,,,,, 2896.1175

1 answer:

Answer 0 (score: 2)

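All column headers except the first start with an invisible byte order mark (BOM), '\ufeff'. The second column is therefore actually named '\ufeff年度', which does not match '年度'. Remove the BOM before attempting any column-related operation: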

'年度' in df.columns
# False
df.columns = [s.replace(u'\ufeff', '') for s in df.columns]
'年度' in df.columns
# True
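
For anyone who wants to reproduce this without the original file, here is a minimal sketch of the same fix. It assumes a small in-memory CSV (the `csv_text` string is a made-up stand-in for hw1.csv, with a BOM placed in front of the second header) and uses `repr()` to make the hidden character visible before stripping it:

```python
import io

import pandas as pd

# Hypothetical stand-in for hw1.csv: a stray BOM sits in front of the second header.
csv_text = "序号,\ufeff年度,直接排放\n1,2016,4647.09\n2,2016,2496.72\n"
df = pd.read_csv(io.StringIO(csv_text))

# repr() shows the escape sequence that a normal display hides.
print([repr(c) for c in df.columns])

# Strip the BOM from every header, then drop the column by its clean name.
df.columns = df.columns.str.replace("\ufeff", "", regex=False)
df = df.drop(columns=["年度"])
print(list(df.columns))
```

Printing the `repr()` of each header is a quick way to check for invisible characters whenever a column name that looks correct is reported as missing.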