Question

Even if this is a duplicate question, I could not find an exact solution to my problem.

I have a pandas DataFrame named "data1", and I want to get the number of unique categories in each column whose dtype is "object". This is the code I used:

for col in data1.columns:
    if data1[col].dtypes =='object':
        unique_category = len(data1[col].unique())
        print("feature '{col}' has '{unique_category}' unique catogories".format(col=col,unique_category=unique_category))

This code worked fine in other programs, but this time it gives the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-178-03999268fffa> in <module>()
      1 for col in data1.columns:
----> 2     if data1[col].dtypes =='object':
      3         unique_category = len(data1[col].unique())
      4         print("feature '{col}' has '{unique_category}' unique catogories".format(col=col,unique_category=unique_category))
      5 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1571         raise ValueError("The truth value of a {0} is ambiguous. "
   1572                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1573                          .format(self.__class__.__name__))
   1574 
   1575     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Is there a reason it gives this error message?

Answer 1

Here is an example:

import pandas as pd

# Create data set
d = {'foo':[100, 111, 222], 
     'bar':['333', '444', '555']}
df = pd.DataFrame(d)
df

#     bar   foo
# 0   333   100
# 1   444   111
# 2   555   222

df.info()

# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 3 entries, 0 to 2
# Data columns (total 2 columns):
# bar    3 non-null object             # <- object type column
# foo    3 non-null int64
# dtypes: int64(1), object(1)
# memory usage: 128.0+ bytes

for col in range(len(df.dtypes)):
    # .iloc makes the positional lookup explicit; df.dtypes is labelled by column name
    if df.dtypes.iloc[col] == 'O':      # <- can also use `O`
        unique_category = len(df.loc[:,df.columns[col]].unique())
        print("feature '{col}' has '{unique_category}' unique categories".format(col=df.columns[col],unique_category=unique_category))
# feature 'bar' has '3' unique categories

Answer 2

You can just use select_dtypes:

for col in data1.select_dtypes('object'):
    print(f'feature {col} has {data1[col].nunique()} unique categories')

It will automatically select the object columns for you.
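As a rough sketch of how this behaves, and of one way the original error can arise: the frames and column names below are made up for illustration and are not the asker's data1. The duplicate-column explanation is an assumption, since the question does not show data1's columns, but it is consistent with the traceback, where `data1[col].dtypes` evaluates to a Series.

```python
import pandas as pd

# Hypothetical data for illustration; dtype=object is forced so the
# example does not depend on how a pandas version infers string columns.
df = pd.DataFrame({
    "city": pd.Series(["NY", "LA", "NY"], dtype=object),
    "n": [1, 2, 3],
})

# select_dtypes keeps only the object columns; nunique counts distinct values.
counts = {col: df[col].nunique() for col in df.select_dtypes("object")}
print(counts)

# Assumed cause of the ValueError: two columns share one name, so dup["x"]
# is a DataFrame and its .dtypes is a Series rather than a single dtype.
dup = pd.DataFrame([["a", "b"], ["c", "d"]], columns=["x", "x"], dtype=object)
dtypes_of_x = dup["x"].dtypes

try:
    if dtypes_of_x == "object":   # boolean Series used as a condition
        pass
    raised = False
except ValueError:
    raised = True
print(raised)
```

If duplicate names do turn out to be the cause, `data1.columns[data1.columns.duplicated()]` lists the repeated labels.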

Python Pandas ValueError: The truth value of a Series is ambiguous
