Groupby on values starting with a given prefix

Time: 2014-08-07 09:50:15

Tags: pandas

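Is there a way to aggregate all the columns that share the same prefix, as in

df.groupby(['MONTH'])[['VAL_M1', 'VAL_M2', 'VAL_FULL', 'VAL_VER']].agg(['sum'])

without having to list 'VAL_M1', 'VAL_M2', 'VAL_FULL', 'VAL_VER' explicitly?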

In SAS, you only need to write VAL: (the trailing colon acts as a prefix wildcard).

How can I do this with pandas?

1 answer:

Answer 0 (score: 1)

There is no built-in prefix shorthand for this, but you can easily build the list of the columns you need:

In [346]:
# build a list of existing columns
columns = list(df)
prefix='VAL'
# keep only the columns whose names start with the prefix
cols = [x for x in columns if x.startswith(prefix)]
In [347]:
cols
Out[347]:
['VAL_M1', 'VAL_M2', 'VAL_FULL', 'VAL_VER']
In [348]:
# the groupby call is now a little shorter
df.groupby(['MONTH'])[cols].agg(['sum'])
# (my df contains dummy data)
Out[348]:
             VAL_M1    VAL_M2  VAL_FULL   VAL_VER
                sum       sum       sum       sum
MONTH                                            
-1.532558  0.868693 -0.302502 -0.434885  1.508662
-0.998384 -0.123799 -0.040477  1.014650 -0.783075
-0.684523  2.320911  2.000733  0.274961  0.126873
-0.414702  1.392947 -0.171937 -0.051815 -0.887229
-0.219279 -0.418810 -1.460006 -1.310480 -0.546437
 0.225726 -1.431633 -1.701184 -1.182562 -1.013886
 0.692964  1.478887  3.255294 -0.083931  0.204652
 0.818273 -1.645403 -1.774919  0.329704  0.192604
 0.914160 -1.036230 -1.662280 -1.154687  0.108503
 1.820811  0.300040  0.441961 -0.029089 -1.907390
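
For anyone who wants to try this without the original data, here is a minimal, self-contained sketch of the same idea. The sample DataFrame and the extra OTHER column are made up for illustration; only the VAL_* column names and MONTH come from the question. It also shows two alternative ways of picking the columns (`Index.str.startswith` and `DataFrame.filter` with an anchored regex), which should give the same list as the list comprehension as long as all column names are strings.

```python
import re

import pandas as pd

# Hypothetical sample data; only the VAL_* / MONTH names come from the question.
df = pd.DataFrame({
    "MONTH": [1, 1, 2, 2],
    "VAL_M1": [1.0, 2.0, 3.0, 4.0],
    "VAL_M2": [10.0, 20.0, 30.0, 40.0],
    "VAL_FULL": [0.5, 0.5, 1.5, 1.5],
    "VAL_VER": [1.0, 1.0, 1.0, 1.0],
    "OTHER": [9.0, 9.0, 9.0, 9.0],  # should be excluded by the prefix filter
})

prefix = "VAL"

# Option 1: list comprehension, as in the answer above
cols = [c for c in df.columns if c.startswith(prefix)]

# Option 2: vectorized string method on the column Index
cols_str = df.columns[df.columns.str.startswith(prefix)].tolist()

# Option 3: DataFrame.filter with a regex anchored at the start of the name;
# re.escape guards against prefixes containing regex metacharacters
cols_filter = df.filter(regex="^" + re.escape(prefix)).columns.tolist()

# Sum every prefixed column per month
result = df.groupby("MONTH")[cols].sum()
print(result)
```

Which option to use is mostly a matter of taste; the list comprehension has the advantage of not depending on any pandas string accessor and of being easy to extend with additional conditions.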