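Pandas newbie here, so apologies if this is old hat. What I'm trying to accomplish is similar to what is covered in "grouping rows in list in pandas groupby", but I have more than two columns and can't figure out how to display all of my columns along with the grouped value. Here is what I'm trying to do.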
import pandas as pd
data = [{'ip': '192.168.1.1', 'make': 'Dell', 'model': 'UltraServ9000'},
        {'ip': '192.168.1.3', 'make': 'Dell', 'model': 'MiniServ'},
        {'ip': '192.168.1.5', 'make': 'Dell', 'model': 'UltraServ9000'},
        {'ip': '192.168.1.6', 'make': 'HP', 'model': 'Thinger3000'},
        {'ip': '192.168.1.8', 'make': 'HP', 'model': 'Thinger3000'}]
In [2]: df = pd.DataFrame(data)
In [3]: df
Out[3]:
            ip  make          model
0  192.168.1.1  Dell  UltraServ9000
1  192.168.1.3  Dell       MiniServ
2  192.168.1.5  Dell  UltraServ9000
3  192.168.1.6    HP    Thinger3000
4  192.168.1.8    HP    Thinger3000
<magic>
Out[?]:
                         ip  make          model
0  192.168.1.1, 192.168.1.5  Dell  UltraServ9000
1               192.168.1.3  Dell       MiniServ
3  192.168.1.6, 192.168.1.8    HP    Thinger3000
Thanks in advance :)
Answer 0 (score: 2)
groupby takes a parameter, by, through which you can specify a list of columns to group over. So the answer to that question can be modified as follows:
df.groupby(by = ["a", "c"])["b"].apply(list).reset_index()
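Applied to the data from the question, the same pattern might look like the sketch below. It assumes that make and model are the grouping columns and that ip is the column to collect; ", ".join is used in place of list because the desired output shows comma-separated strings, and sort=False is an optional choice that keeps groups in order of first appearance.

```python
import pandas as pd

data = [
    {"ip": "192.168.1.1", "make": "Dell", "model": "UltraServ9000"},
    {"ip": "192.168.1.3", "make": "Dell", "model": "MiniServ"},
    {"ip": "192.168.1.5", "make": "Dell", "model": "UltraServ9000"},
    {"ip": "192.168.1.6", "make": "HP", "model": "Thinger3000"},
    {"ip": "192.168.1.8", "make": "HP", "model": "Thinger3000"},
]
df = pd.DataFrame(data)

# Group on the columns that identify a machine type (assumed: make, model)
# and join the IPs of each group into one comma-separated string.
grouped = (
    df.groupby(by=["make", "model"], sort=False)["ip"]
    .apply(", ".join)
    .reset_index()
)
print(grouped)
```

Note that reset_index() puts the grouping columns first, so ip ends up as the last column rather than the first.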
EDIT: Looking at your comment: since all columns other than a (and b, the one being collected) have the same values within each group, you can simply add them to the by parameter, because they won't affect the result. To save you from typing out all the names, you could do something like this:
df.groupby(by = [c for c in df.columns if c != "b"])["b"].apply(list).reset_index()
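As a rough illustration, the "group by everything except one column" idea can be wrapped in a small helper. The name collect_column and the toy frame below are made up for this sketch and are not part of pandas or of the question.

```python
import pandas as pd


def collect_column(df, column):
    """Group by every column except `column` and gather its values into lists.

    Hypothetical helper for illustration only.
    """
    # A list comprehension keeps the original column order.
    keys = [c for c in df.columns if c != column]
    return df.groupby(by=keys, sort=False)[column].apply(list).reset_index()


# Toy data: c is constant within each value of a, b is the column to collect.
df = pd.DataFrame({
    "a": [1, 1, 2],
    "b": ["x", "y", "z"],
    "c": ["p", "p", "q"],
})
result = collect_column(df, "b")
print(result)
```

This only gives one row per value of a when the remaining columns really are constant within each group; otherwise every distinct combination becomes its own row.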
Alternatively, you could use the agg function, passing a dictionary that takes the max of every other column and returns a list for b:
# "max" for every column that is neither the key (a) nor the collected column (b)
aggregate_functions = {x: "max" for x in df.columns if x not in ("a", "b")}
aggregate_functions["b"] = list
df.groupby(by = "a").agg(aggregate_functions)
Which one you prefer is up to you; the latter is probably more readable.
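For reference, here is a minimal runnable sketch of the agg variant on a made-up three-column frame (the data is an assumption for illustration). It passes the string "max" rather than the builtin, which newer pandas versions prefer, and calls reset_index() so the key comes back as a regular column.

```python
import pandas as pd

# Toy data: a is the key, b is collected, c is constant within each a.
df = pd.DataFrame({
    "a": [1, 1, 2],
    "b": ["x", "y", "z"],
    "c": ["p", "p", "q"],
})

# One aggregation per column: "max" for passthrough columns, list for b.
aggregate_functions = {col: "max" for col in df.columns if col not in ("a", "b")}
aggregate_functions["b"] = list

result = df.groupby(by="a").agg(aggregate_functions).reset_index()
print(result)
```

Unlike grouping by every column, this version always yields exactly one row per value of a, silently picking the maximum if a passthrough column turns out not to be constant.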