Question

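My DataFrame has column names that contain ( and ) (they come from a .csv file), and I want to replace those characters with _.

How can I do this for all columns?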

Answer 1

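Use str.replace with a regular expression (on pandas 2.0 and later, regex=True must be passed explicitly):

df.columns = df.columns.str.replace(r"[()]", "_", regex=True)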

Sample:

import pandas as pd

df = pd.DataFrame({'(A)':[1,2,3],
                   '(B)':[4,5,6],
                   'C)':[7,8,9]})

print(df)
   (A)  (B)  C)
0    1    4   7
1    2    5   8
2    3    6   9

# Replace every "(" or ")" with "_"; regex=True is needed on pandas >= 2.0
df.columns = df.columns.str.replace(r"[()]", "_", regex=True)
print(df)
   _A_  _B_  C_
0    1    4   7
1    2    5   8
2    3    6   9
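
Note that from pandas 2.0 on, str.replace defaults to regex=False, so a character class such as [()] only takes effect when regex=True is passed. As a minimal sketch of a regex-free alternative, assuming the same sample frame as above, the two characters can also be mapped with str.translate. The variable names here are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'(A)': [1, 2, 3],
                   '(B)': [4, 5, 6],
                   'C)': [7, 8, 9]})

# Option 1: regex character class; regex=True is required on pandas >= 2.0
regex_cols = df.columns.str.replace(r"[()]", "_", regex=True)

# Option 2: no regex at all, map each character through a translation table
table = str.maketrans({"(": "_", ")": "_"})
translate_cols = df.columns.str.translate(table)

df.columns = regex_cols
print(list(df.columns))
```

Both options should give the same column names; the translation table may be easier to read when only single characters are being swapped.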

Answer 2

Square brackets delimit the set of characters you want to match. For example:

In [131]: arr
Out[131]: 
array([(0, 0., 0, 0), (0, 0., 0, 0), (0, 0., 0, 0)],
     dtype=[('w', '<i8'), ('x', '<f8'), ('y', '<i8'), ('z', '<i8')])

In [132]: arr[['w','y']]
Out[132]: 
array([(0, 0), (0, 0), (0, 0)],
     dtype={'names':['w','y'], 'formats':['<i8','<i8'], 'offsets':[0,16], 'itemsize':32})

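will match both cases of the word, written with an uppercase or a lowercase initial (presumably "National" and "national"), i.e. it matches either N or n.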
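
As a small illustration of such a character class, here is a sketch using Python's re module. The sample string and the word "National" are assumptions chosen for the example, not taken from the original answer:

```python
import re

# Hypothetical sample text, not from the original answer
text = "National parks and national forests"

# [Nn] matches exactly one character: either "N" or "n"
matches = re.findall(r"[Nn]ational", text)
print(matches)
```

The same idea applies to [()] in the first answer: the brackets list the individual characters that may appear at that position.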

Pandas: replace characters in all column names
