I'm getting this error; can anyone tell me the solution?

Asked: 2014-06-10 12:00:51

Tags: python

Here is my code:

import pandas as pd
import numpy as np

# read dataframe
df = pd.read_csv("loc-brightkite_totalCheckins.txt", usecols=["location_id", "user"], delim_whitespace=True, names=["user", "check_in_time", "latitude", "longitude", "location_id"])

# remove duplicates (regarding location and user)
df = df.drop_duplicates(subset=["user", "location_id"])

#group by the locations, make each a series of users, count users
distinct_location_users = df.groupby('location_id')['user'].agg(lambda user_series: len(user_series))

# print top 10 locations
top_10 = distinct_location_users.order().tail(11)

print top_10

top_10.plot(kind="bar")

I get this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-7-5c9c8115e794> in <module>()
      6 
      7 # remove duplicates (regarding location and user)
----> 8 df = df.drop_duplicates(subset=["user", "location_id"])
      9 
     10 #group by the locations, make each a series of users, count users

TypeError: drop_duplicates() got an unexpected keyword argument 'subset'

2 Answers:

Answer 0 (score: 5)

As you can see here: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.drop_duplicates.html

"subset" is not an accepted keyword argument of the "drop_duplicates" method in the pandas version you are running.

I think you can use "cols" instead of "subset".
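
If the same script has to run on machines with different pandas versions, one option is to try "subset" first and fall back to "cols" when the keyword is rejected. This is only a minimal sketch: the helper name `drop_duplicates_compat` and the sample data are made up for illustration, and catching `TypeError` this broadly could also hide an unrelated `TypeError` raised inside pandas.

```python
import pandas as pd


def drop_duplicates_compat(frame, columns):
    """Drop rows that are duplicated on `columns`, using whichever
    keyword the installed pandas accepts (hypothetical helper)."""
    try:
        # Releases that know the `subset` keyword take this path.
        return frame.drop_duplicates(subset=columns)
    except TypeError:
        # Releases that raise the error from the question only accept `cols`.
        return frame.drop_duplicates(cols=columns)


# Made-up sample data standing in for the check-in file.
sample = pd.DataFrame({
    "user": [0, 0, 1, 1],
    "location_id": ["a", "a", "a", "b"],
})

deduped = drop_duplicates_compat(sample, ["user", "location_id"])
print(len(deduped))
```

In the question's code, the failing line would then read `df = drop_duplicates_compat(df, ["user", "location_id"])`.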

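Answer 1 (score: 1)

You are calling drop_duplicates the wrong way. Take a look at which arguments pandas' drop_duplicates actually accepts.

A quick search for pandas drop_duplicates turns up the documentation for one of the two drop_duplicates methods in pandas (the other one belongs to the Series class):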

DataFrame.drop_duplicates(cols=None, take_last=False, inplace=False)
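
Instead of relying on online documentation, which may describe a different release, you can ask the installed pandas what it accepts. The snippet below is a rough sketch that assumes Python 3, where `inspect.signature` is available; on Python 2, `help(pd.DataFrame.drop_duplicates)` shows the same information. The variable names are illustrative only.

```python
import inspect

import pandas as pd

# Parameter names accepted by DataFrame.drop_duplicates in the installed pandas.
params = list(inspect.signature(pd.DataFrame.drop_duplicates).parameters)

print(pd.__version__)
print(params)

# Pick whichever column-selection keyword this installation offers.
keyword = "subset" if "subset" in params else "cols"
print(keyword)
```

Whatever keyword is printed is the one to pass to `df.drop_duplicates(...)` on that machine.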
