Question

I am aggregating my Pandas DataFrame, data. Specifically, I want the mean and sum of amount for each tuple of [origin, type]. To compute the mean and sum, I tried the functions below:

import numpy as np
import pandas as pd
result = data.groupby(groupbyvars).agg({'amount': [ pd.Series.sum, pd.Series.mean]}).reset_index()

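My problem is that the amount column contains NaNs, so the result of the code above has a lot of NaN means and sums.

I know that pd.Series.sum and pd.Series.mean both default to skipna=True, so why am I still getting NaN here?

I also tried this, which obviously does not work: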

data.groupby(groupbyvars).agg({'amount': [ pd.Series.sum(skipna=True), pd.Series.mean(skipna=True)]}).reset_index()

Edit: following @Korem's suggestion, I also tried using partial, as follows:

from functools import partial
s_na_mean = partial(pd.Series.mean, skipna=True)
data.groupby(groupbyvars).agg({'amount': [ np.nansum, s_na_mean ]}).reset_index()

But I get this error:

error: 'functools.partial' object has no attribute '__name__'

Answer 1

Use numpy's nansum and nanmean:

from numpy import nansum
from numpy import nanmean
data.groupby(groupbyvars).agg({'amount': [ nansum, nanmean]}).reset_index()
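
For comparison, here is a minimal sketch of the same aggregation using pandas' built-in "sum" and "mean" aggregations by name, which skip NaN by default. The sample data and the column names origin and type are assumptions made up for illustration, not the asker's real data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
data = pd.DataFrame({
    "origin": ["a", "a", "b", "b", "c"],
    "type": ["x", "x", "y", "y", "z"],
    "amount": [1.0, np.nan, 3.0, 5.0, np.nan],
})
groupbyvars = ["origin", "type"]

# "sum" and "mean" ignore NaN values inside each group.
result = data.groupby(groupbyvars).agg({"amount": ["sum", "mean"]}).reset_index()
print(result)
```

Note that a group whose amount values are all NaN (the c/z group here) still has a NaN mean, because there is nothing left to average. In recent pandas versions its sum is 0.0.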

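As a workaround for older versions of numpy, and also a fix for your last attempt:

When you write pd.Series.sum(skipna=True), you are actually calling the method rather than passing it. If you want to use it this way, you need to define a partial. So if you do not have nanmean, define s_na_mean and use it: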

from functools import partial
s_na_mean = partial(pd.Series.mean, skipna = True)
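
The sketch below illustrates both points under stated assumptions. Writing pd.Series.sum(skipna=True) runs the method immediately, without a Series, so it fails before agg is ever reached. If a partial triggers the missing `__name__` error, a small named wrapper function is one possible alternative, since it carries its own `__name__`. The sample data and the helper names nan_sum and nan_mean are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
data = pd.DataFrame({
    "origin": ["a", "a", "b", "b", "c"],
    "type": ["x", "x", "y", "y", "z"],
    "amount": [1.0, np.nan, 3.0, 5.0, np.nan],
})
groupbyvars = ["origin", "type"]

# Calling the method with only skipna executes it right away and fails,
# because no Series is supplied as `self`.
try:
    pd.Series.sum(skipna=True)
    call_failed = False
except TypeError:
    call_failed = True

# Named wrappers (hypothetical helpers) that agg can call once per group.
def nan_sum(series):
    return series.sum(skipna=True)

def nan_mean(series):
    return series.mean(skipna=True)

result = data.groupby(groupbyvars).agg({"amount": [nan_sum, nan_mean]}).reset_index()
print(result)
```

The resulting columns are labelled with the wrapper names, so they appear as ("amount", "nan_sum") and ("amount", "nan_mean").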

Answer 2

This may be too late, but it could still be useful to others.

Try using the apply function.
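
A minimal sketch of what an apply-based version could look like, not necessarily the code this answer had in mind. The sample data, the column names, and the helper summarize are all assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
data = pd.DataFrame({
    "origin": ["a", "a", "b", "b", "c"],
    "type": ["x", "x", "y", "y", "z"],
    "amount": [1.0, np.nan, 3.0, 5.0, np.nan],
})
groupbyvars = ["origin", "type"]

def summarize(group):
    # `group` is the sub-DataFrame for one (origin, type) combination.
    amount = group["amount"]
    return pd.Series({
        "amount_sum": amount.sum(skipna=True),
        "amount_mean": amount.mean(skipna=True),
    })

# Select only the amount column so apply sees just the data being summarized.
result = data.groupby(groupbyvars)[["amount"]].apply(summarize).reset_index()
print(result)
```

Compared with agg, apply gives full control over how each group is summarized and produces flat column names, at the cost of running a Python function per group.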

Pandas aggregation ignoring NaNs

2 answers: