Pandas SettingWithCopyWarning when reordering a column's categorical values

Date: 2018-12-21 08:48:29

Tags: python pandas warnings

This is a custom function I am writing:

"""
Returns a seaborn barplot of the top/bottom n elements
Parameters:
n        -- Number of elements to use.
top      -- True gets the top n elements, False gets the bottom n elements. Default: True.
**kwargs -- Dictionary of values that satisfy the parameters of the barplot. Minimum necessary is data, x, y, estimator.
Return:
seaborn.barplot object
"""
import seaborn as sns
import numpy as np
import pandas as pd

def top_n_barplot(n, top = True, **kwargs):
    if n < 1:
        raise ValueError("n cannot be smaller than 1.")

    #Get necessary data
    data = kwargs["data"]
    x = kwargs["x"]
    y = kwargs["y"]
    estimator = kwargs["estimator"]

    #Get all the results and sort
    result = data.groupby(x)[y].apply(estimator).sort_values()

    if top:
        result = result[-n:]
    else:
        result = result[:n]

    #Filter according to necessary data
    top_x = result.index    
    newdata = data.loc[data[x].isin(top_x.values)]
    #Make values categorical and order them
    newdata[x] = pd.Categorical(newdata[x], categories=top_x, ordered=True)
    #Assign new data to use for plot
    kwargs["data"] = newdata

    return sns.barplot(**kwargs)

Line 38 of my script, the categorical assignment, gives me a SettingWithCopyWarning:

newdata[x] = pd.Categorical(newdata[x], categories=top_x, ordered=True)

Here is the full warning:

C:\...\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py:543: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s
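
For reference, the same pattern can be isolated without seaborn. This is only a minimal sketch: the DataFrame, its column names, and the category order are made up for illustration, and whether the warning is actually emitted depends on the pandas version (releases with Copy-on-Write enabled no longer raise it). The filtered frame comes from boolean indexing on another DataFrame, and the column assignment on that result is what pandas flags.

```python
import warnings

import pandas as pd

# Hypothetical data, only for illustration.
data = pd.DataFrame({"group": ["a", "b", "c", "a"], "value": [1, 2, 3, 4]})

# Boolean filtering returns a new DataFrame that pandas still tracks
# as being derived from `data`.
subset = data.loc[data["group"].isin(["a", "b"])]

# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Same kind of assignment as in the function above.
    subset["group"] = pd.Categorical(
        subset["group"], categories=["b", "a"], ordered=True
    )

# Names of the warning classes raised by the assignment (may be empty
# on pandas versions that use Copy-on-Write).
warning_names = [w.category.__name__ for w in caught]
print(warning_names)

# The assignment changed `subset` only; `data` keeps its original dtype.
print(subset["group"].dtype)
print(data["group"].dtype)
```

In this sketch the assignment itself succeeds and only the filtered frame is modified, which matches the observation that the function works as expected despite the warning.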

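I also tried other approaches to see what I was doing wrong, so I changed line 38 to newdata.loc[:,x] = newdata.loc[:,x].astype("category"), hoping to reorder the categories in a later step. However, I still got exactly the same warning.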

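Any help would be greatly appreciated. My function works as expected, but I would really like to understand why I am getting this warning.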
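
One commonly suggested way to avoid this warning is to take an explicit copy of the filtered frame before assigning to it, so pandas no longer treats it as derived from the original. The following is a minimal sketch under that assumption, not the original function: the helper name filter_and_order and the sample data are hypothetical, and the plotting step is left out.

```python
import warnings

import pandas as pd


def filter_and_order(data, x, ordered_values):
    """Hypothetical helper: keep rows whose `x` value is in `ordered_values`
    and turn column `x` into an ordered categorical with that order."""
    # .copy() makes the filtered frame an independent object, so the
    # assignment below is no longer flagged as setting on a copy.
    newdata = data.loc[data[x].isin(ordered_values)].copy()
    newdata[x] = pd.Categorical(
        newdata[x], categories=ordered_values, ordered=True
    )
    return newdata


# Hypothetical data, only for illustration.
data = pd.DataFrame({"group": ["a", "b", "c", "a"], "value": [1, 2, 3, 4]})

# Record warnings so we can check whether SettingWithCopyWarning appears.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = filter_and_order(data, "group", ["b", "a"])

copy_warnings = [
    w for w in caught if w.category.__name__ == "SettingWithCopyWarning"
]
print(len(copy_warnings))
print(list(result["group"].cat.categories))
```

The trade-off is an extra copy of the filtered rows in memory, which is usually small here because only the top or bottom n groups are kept.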

0 answers:

No answers yet.