Pandas groupby aggregation that keeps equal values

Time: 2019-08-08 09:44:26

Tags: python pandas

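I am trying to build an aggregator that returns a value if all values in the group's column are equal to it, and returns NaN otherwise.

The goal is to preserve meta-information when aggregating sensor data.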

I am getting a strange KeyError...

The expected output is:

   v1   v2   v3  v4
0   1  NaN    1   2
1   2  NaN  NaN   3

But I get a KeyError:

Traceback (most recent call last):
  File "pandas\_libs\index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 998, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0

2 Answers:

Answer 0 (score: 1)

You need to use iloc, which selects by position rather than by index label:
import pandas as pd
import numpy as np

df = pd.DataFrame.from_dict({'v1' : [1,1,1,2,2,2],
                             'v2' : [1,2,3,4,5,6],
                             'v3' : [1,1,1,2,3,2],
                             'v4' : [2,2,2,3,3,3]})

def keep_equal(x):
    # Return the group's first value if every value equals it, else NaN.
    if (x == x.iloc[0]).all():
        return x.iloc[0]
    else:
        return np.nan

df = df.groupby(df["v1"], as_index=False, observed=True).agg(keep_equal)
print(df)

Output:

   v1  v2   v3  v4
0   1 NaN  1.0   2
1   2 NaN  NaN   3
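
To illustrate why positional access matters here, the following is a minimal sketch of what happens inside the second group. It assumes the failing code looked up the first element by label (the question's original code is not shown, so that is a guess); the variable names `group`, `label_lookup_failed`, and `first_value` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'v1': [1, 1, 1, 2, 2, 2],
                   'v3': [1, 1, 1, 2, 3, 2]})

# Take the second group (v1 == 2). It keeps the original row labels.
group = df.groupby('v1')['v3'].get_group(2)
print(list(group.index))  # [3, 4, 5]

# A label-based lookup for 0 fails, because no row in this group is labelled 0.
try:
    group.loc[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True
print(label_lookup_failed)  # True

# A positional lookup finds the first element regardless of its label.
first_value = group.iloc[0]
print(first_value)  # 2
```

The first group happens to contain the label 0, so the error only shows up once the aggregator reaches a later group.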

Answer 1 (score: 1)

If performance matters, use Series.iat here to select the first value of the Series:

import pandas as pd
import numpy as np

df = pd.DataFrame.from_dict({'v1' : [1,1,1,2,2,2],
                             'v2' : [1,2,3,4,5,6],
                             'v3' : [1,1,1,2,3,2],
                             'v4' : [2,2,2,3,3,3]})

def keep_equal(x):
    # iat is a fast scalar accessor that selects by position.
    if (x == x.iat[0]).all():
        return x.iat[0]
    else:
        return np.nan

Or use the underlying 1-D NumPy array:

def keep_equal(x):
    if (x == x.values[0]).all():
        return x.values[0]
    else:
        return np.nan

df = df.groupby(df["v1"], as_index=False).agg(keep_equal)
print(df)

   v1  v2   v3  v4
0   1 NaN  1.0   2
1   2 NaN  NaN   3
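
As a quick sanity check, the three variants from the answers (iloc, iat, and the NumPy array) can be run side by side on the same data. This is only a sketch: the function names `keep_equal_iloc`, `keep_equal_iat`, and `keep_equal_values` are made up for the comparison, and the grouping uses the column name `'v1'` as the index instead of `as_index=False`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'v1': [1, 1, 1, 2, 2, 2],
                   'v2': [1, 2, 3, 4, 5, 6],
                   'v3': [1, 1, 1, 2, 3, 2],
                   'v4': [2, 2, 2, 3, 3, 3]})

def keep_equal_iloc(x):
    # Positional access through iloc.
    return x.iloc[0] if (x == x.iloc[0]).all() else np.nan

def keep_equal_iat(x):
    # Positional scalar access through iat.
    return x.iat[0] if (x == x.iat[0]).all() else np.nan

def keep_equal_values(x):
    # Work on the underlying NumPy array, which has no index labels.
    first = x.values[0]
    return first if (x.values == first).all() else np.nan

# Aggregate once per variant; 'v1' becomes the index of each result.
results = [df.groupby('v1').agg(func)
           for func in (keep_equal_iloc, keep_equal_iat, keep_equal_values)]

for result in results:
    print(result)
```

All three select the first element by position, so none of them depends on the index labels inside a group.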