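I'm new to programming and would appreciate help with the following in Python / Pandas. I have a dictionary whose values are lists. I want to group together the keys that have the same values. I have seen similar questions here, but the catch in my case is that I want to ignore the order of the values, for example: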
classmates={'jack':['20','male','soccer'],'brian':['26','male','tennis'],'charles':['male','soccer','20'],'zulu':['19','basketball','male']}
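jack and charles have the same values, but in a different order. I want an output that groups the keys by their values regardless of order. In this case, the output would be written to a csv: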
['20','male','soccer']: jack, charles
['26','male','tennis']: brian
['19','basketball','male']: zulu
Answer 0 (score: 2)
Use frozenset, apply, and groupby + agg:
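import pandas as pd

# One row per classmate; a frozenset of the row ignores order and is hashable.
s = pd.DataFrame(classmates).T.apply(frozenset, axis=1)
# Make the frozensets the index, then collect the names that share each one.
s2 = pd.Series(s.index.values, index=s).groupby(level=0).agg(list)
s2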
(soccer, 20, male) [charles, jack]
(26, male, tennis) [brian]
(basketball, male, 19) [zulu]
dtype: object
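One caveat worth checking against your own data: a frozenset also discards repeated values, so two lists that differ only in how often a value occurs end up in the same group. The small sketch below uses made-up lists (not from the question) to show the difference from a sorted tuple, which keeps duplicates:

```python
# Hypothetical lists that contain the same distinct values
# but a different number of repeats.
a = ['male', 'male', '20']
b = ['male', '20', '20']

# frozenset ignores both order and repeats, so these compare equal.
same_as_sets = frozenset(a) == frozenset(b)

# A sorted tuple ignores order but keeps repeats, so these differ.
same_as_tuples = tuple(sorted(a)) == tuple(sorted(b))

print(same_as_sets, same_as_tuples)
```

If your lists never repeat a value, as in the question, both keys give the same grouping.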
Answer 1 (score: 1)
You can invert the dictionary the way you want with the following code:
classmates={'jack':['20','male','soccer'],'brian':['26','male','tennis'],'charles':['male','soccer','20'],'zulu':['19','basketball','male']}
out_dict = {}
for key, value in classmates.items():
    # Sorting makes the key independent of the original order; a tuple is hashable.
    sorted_key = tuple(sorted(value))
    out_dict.setdefault(sorted_key, []).append(key)
print(out_dict)
This prints (on Python 3.7+, where dicts keep insertion order):
{('20', 'male', 'soccer'): ['jack', 'charles'], ('26', 'male', 'tennis'): ['brian'], ('19', 'basketball', 'male'): ['zulu']}
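The question also asks for the result to be written to a csv. Here is a minimal sketch of one way to do that with the standard csv module, assuming a dict shaped like out_dict above. The function name write_groups_csv, the column headers, and the space separator inside each cell are my own choices, not something from the question:

```python
import csv
import io


def write_groups_csv(groups, fileobj):
    """Write one CSV row per group: the shared values, then the names."""
    writer = csv.writer(fileobj)
    writer.writerow(['values', 'names'])  # hypothetical header names
    for values, names in groups.items():
        # Join with spaces so each group fits in a single cell.
        writer.writerow([' '.join(values), ' '.join(names)])


out_dict = {
    ('20', 'male', 'soccer'): ['jack', 'charles'],
    ('26', 'male', 'tennis'): ['brian'],
    ('19', 'basketball', 'male'): ['zulu'],
}

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_groups_csv(out_dict, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass a file object opened with open(path, 'w', newline='') instead of the StringIO buffer.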
Answer 2 (score: 1)
from collections import defaultdict
ans = defaultdict(list)
classmates = {'jack': ['20', 'male', 'soccer'],
              'brian': ['26', 'male', 'tennis'],
              'charles': ['male', 'soccer', '20'],
              'zulu': ['19', 'basketball', 'male']
              }
for k, v in classmates.items():
    # A sorted tuple is an order-independent, hashable key.
    sorted_tuple = tuple(sorted(v))
    ans[sorted_tuple].append(k)
# ans is now the dict you wanted:
# defaultdict(<class 'list'>, {('20', 'male', 'soccer'): ['jack', 'charles'],
# ('26', 'male', 'tennis'): ['brian'], ('19', 'basketball', 'male'): ['zulu']})
for k, v in ans.items():
    print(k, ':', v)
# output:
# ('20', 'male', 'soccer') : ['jack', 'charles']
# ('26', 'male', 'tennis') : ['brian']
# ('19', 'basketball', 'male') : ['zulu']
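Answer 3 (score: 0)
First convert the dictionary to a pandas DataFrame.
import pandas as pd
df = pd.DataFrame.from_dict(classmates, orient='index')
Then sort by the first column in ascending order. Note that this column holds the age only when the age comes first in the list; for charles it holds 'male'.
df = df.sort_values(by=0, ascending=True)
Here 0 is the default column name. You can rename this column.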
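Sorting the rows alone does not group the names yet. As a rough sketch of how the DataFrame could be taken one step further, you could build an order-independent key per row and group on it. This assumes every list has the same length (so there are no NaN cells) and that all values are strings. The names key and grouped are mine:

```python
import pandas as pd

classmates = {'jack': ['20', 'male', 'soccer'],
              'brian': ['26', 'male', 'tennis'],
              'charles': ['male', 'soccer', '20'],
              'zulu': ['19', 'basketball', 'male']}

df = pd.DataFrame.from_dict(classmates, orient='index')

# Sort the values within each row and join them into one string,
# so rows with the same values in any order get the same key.
key = df.apply(lambda row: ' '.join(sorted(row)), axis=1)

# Group the row labels (the names) by that key and collect them in lists.
grouped = df.index.to_series().groupby(key).agg(list)
print(grouped)
```

The result is a Series indexed by the joined values, which can then be written out with its to_csv method.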
Answer 4 (score: 0)
You can do this in one line:
print({tuple(sorted(v)): [k for k, vv in classmates.items() if sorted(vv) == sorted(v)] for v in classmates.values()})
Or, here is a more detailed solution:
dict_1 = {'jack': ['20', 'male', 'soccer'], 'brian': ['26', 'male', 'tennis'], 'charles': ['male', 'soccer', '20'],
          'zulu': ['19', 'basketball', 'male']}
sorted_dict = {}
for key, value in dict_1.items():
    sorted_1 = sorted(value)
    sorted_dict[key] = sorted_1
tracking_of_duplicate = []
final_dict = {}
for key1, value1 in sorted_dict.items():
    if value1 not in tracking_of_duplicate:
        tracking_of_duplicate.append(value1)
        final_dict[tuple(value1)] = [key1]
    else:
        final_dict[tuple(value1)].append(key1)
print(final_dict)