Pandas df: summing rows based on the index column

Time: 2016-01-20 00:45:37

Tags: python pandas indexing dataframe

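I have a pandas DataFrame (see below) and I want to sum values based on the index column. My index column contains string values. In the example below, I am trying to combine Moving, Playing, and Using Phone into a single "active time" row and add up their corresponding values, while keeping the other index values as they are. Any suggestions on how to handle this kind of scenario?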

**Activity  AverageTime**
Moving      0.000804367 
Playing     0.001191772 
Stationary  0.320701558 
Using Phone 0.594305473 
Unknown     0.060697612 
Idle        0.022299218 

2 answers:

Answer 0 (score: 2)

I'm sure there must be a simpler way, but here is one possible solution.

import pandas as pd

# Filters for active and inactive rows
active_row_names = ['Moving','Playing','Using Phone']
active_filter = [row in active_row_names for row in df.index]
inactive_filter = [not row for row in active_filter]

active = df.loc[active_filter].sum()       # Sum of 'active' rows as a Series
active  = pd.DataFrame(active).transpose() # as a dataframe, and fix orientation
active.index=["active"]                    # Assign new index name

# Keep the inactive rows as they are, and replace the active rows with the
# newly defined row that is the sum of the previous active rows.
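# pd.concat stacks the two frames and keeps their index labels
# (DataFrame.append no longer exists in pandas 2.0+).
df = pd.concat([df.loc[inactive_filter], active])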

**Output**

Activity       AverageTime
Stationary     0.320702
Unknown        0.060698
Idle           0.022299
active         0.596302

This works even when only a subset of the active row names is present in the DataFrame.
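For comparison, here is a shorter sketch of the same idea. It is a swapped-in variation, not the answerer's method: it relabels the active rows with `rename` and then sums rows that share a label with `groupby(level=0)`. The sample frame is rebuilt from the question's table, and the names `relabeled` and `result` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question; 'Activity' is the index.
df = pd.DataFrame(
    {"AverageTime": [0.000804367, 0.001191772, 0.320701558,
                     0.594305473, 0.060697612, 0.022299218]},
    index=pd.Index(["Moving", "Playing", "Stationary",
                    "Using Phone", "Unknown", "Idle"], name="Activity"),
)

active_row_names = ["Moving", "Playing", "Using Phone"]

# Relabel every active row as "active"; all other labels stay unchanged.
# Labels in the mapping that are absent from the index are simply ignored.
relabeled = df.rename(index={name: "active" for name in active_row_names})

# Rows that now share a label are summed; sort=False keeps first-seen order.
result = relabeled.groupby(level=0, sort=False).sum()
print(result)
```

Unlike the answer above, the combined row ends up where the first active label appeared rather than at the end of the frame.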

Answer 1 (score: 0)

I would add a new boolean column named "active" and then group by that column:

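df['active'] = False
# Set the flag with a single .loc call; chained assignment
# (df['active'][...] = True) may fail to update df.
df.loc[['Moving', 'Playing', 'Using Phone'], 'active'] = True
df.groupby('active').AverageTime.sum()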
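A self-contained sketch of this boolean-column idea, assuming the sample data from the question. Here the flag is built with `Index.isin` instead of label assignment, a small substitution that also tolerates labels missing from the index. The name `totals` is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question, indexed by activity name.
df = pd.DataFrame(
    {"AverageTime": [0.000804367, 0.001191772, 0.320701558,
                     0.594305473, 0.060697612, 0.022299218]},
    index=["Moving", "Playing", "Stationary",
           "Using Phone", "Unknown", "Idle"],
)

# True for rows whose index label is one of the active activities.
df["active"] = df.index.isin(["Moving", "Playing", "Using Phone"])

# One total per flag value: False (all inactive rows) and True (active rows).
totals = df.groupby("active")["AverageTime"].sum()
print(totals)
```

Note that grouping on a boolean column yields only two rows, so Stationary, Unknown, and Idle are also merged into a single `False` total instead of being kept as separate rows.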