Get normalized value counts weighted by another column?

Time: 2016-08-04 16:46:07

Tags: python pandas

I have a DataFrame like this in Pandas:

import pandas as pd

df = pd.DataFrame({
  'org': ['A1', 'B1', 'A1', 'B2'],
  'DIH': [True, False, True, False],
  'Quantity': [10, 20, 10, 20],
  'Items': [1, 2, 3, 4]
})

Now I want to get the value counts and the modal value of Quantity, but weighted by the number of Items.

I know I can do this:

df.groupby('Quantity').agg({'Items': 'sum'}).sort_values('Items', ascending=False)

which gives:

Quantity    Items
20          6
10          4

But how do I get it as percentages, like this?

Quantity    Items
20          60
10          40

4 Answers:

Answer 0: (score: 2)

This works for me:

df.groupby('Quantity').agg({'Items': 'sum'}).sort_values('Items', ascending=False)/df['Items'].sum()*100

Answer 1: (score: 1)

In case it is of interest, here is a function that takes a DataFrame as input and returns the weighted value counts (normalized or not).

def weighted_value_counts(x, *args, **kwargs):
    # x: DataFrame whose first column holds the values and whose second column holds the weights
    normalize = kwargs.get('normalize', False)
    c0 = x.columns[0]
    c1 = x.columns[1]
    # Sum the weights per distinct value, largest total first
    xtmp = x[[c0, c1]].groupby(c0).agg({c1: 'sum'}).sort_values(c1, ascending=False)
    s = pd.Series(index=xtmp.index, data=xtmp[c1], name=c0)
    if normalize:
        # Divide by the total weight so the result sums to 1
        s = s / x[c1].sum()
    return s

Using the example from the question, where the weights are in the Items column, you can get the weighted normalized value counts by doing:

weighted_value_counts(df[['Quantity', 'Items']], normalize=True)

Answer 2: (score: 0)

Just add one more line to your code:

df2 = df.groupby('Quantity').agg({'Items': 'sum'}).sort_values('Items', ascending=False)
df2['Items']=(df2['Items']*100)/df2['Items'].sum()

print (df2)
Output :
              Items
Quantity       
20         60.0
10         40.0

Answer 3: (score: 0)

Try this (one line):

df.groupby('Quantity').agg({'Items': 'sum'}).sort_values('Items', ascending=False).apply(lambda x: 100*x/float(x.sum()))
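
The question also asks for the modal value of Quantity weighted by Items, which the answers above do not cover. A minimal sketch, assuming the weighted mode is taken to mean the Quantity with the largest summed Items (the names weights, shares and weighted_mode are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'org': ['A1', 'B1', 'A1', 'B2'],
    'DIH': [True, False, True, False],
    'Quantity': [10, 20, 10, 20],
    'Items': [1, 2, 3, 4],
})

# Sum the weights (Items) for each distinct Quantity.
weights = df.groupby('Quantity')['Items'].sum()

# Convert the summed weights to percentage shares, largest first.
shares = (weights / weights.sum() * 100).sort_values(ascending=False)

# Assumed definition: the weighted mode is the Quantity with the largest total weight.
weighted_mode = weights.idxmax()

print(shares)
print(weighted_mode)
```

Selecting the Items column before summing yields a Series rather than a one-column DataFrame, so idxmax returns the Quantity label directly. If two Quantity values tie on total weight, idxmax returns only the first of them.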