Why doesn't itertools.groupby() work?

Time: 2018-05-06 10:21:26

Tags: python python-3.x group-by itertools

I have looked at several threads about groupby(), but I still can't figure out what is wrong with my example:

import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]

key_func = lambda student: student['mail']

for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))

It prints each student separately. Why can't I get just 3 groups: @gmail.com, @yahoo.com and @something.com?

2 answers:

Answer 0: (score: 2)

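For starters, some mails are gmail.com and others are @gmail.com, which is why they are treated as separate groups.

groupby also expects the data to be pre-sorted by the same key function, which explains why @something.com shows up twice.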

From the docs:

... Generally, the iterable needs to already be sorted on the same key function. ...
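To make the quoted sentence concrete, here is a minimal sketch that runs groupby over just the mail values from the question, once in their original order and once sorted. The variable names are only illustrative.

```python
import itertools

# Mail values in the same order as the question's list (unsorted).
mails = ['@gmail.com', '@yahoo.com', 'gmail.com',
         '@something.com', '@gmail.com', '@something.com']

# groupby starts a new group every time the key changes, so equal values
# that are not adjacent end up in separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(mails)]

# After sorting, equal values sit next to each other and each key appears once.
sorted_keys = [key for key, _ in itertools.groupby(sorted(mails))]

print(unsorted_keys)
print(sorted_keys)
```

On the unsorted input no two neighbours are equal, so every element becomes its own group, which matches what the question observed.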

import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]

key_func = lambda student: student['mail']

# Sort by the same key function that is later passed to groupby
students.sort(key=key_func)

for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))

#  @gmail.com
#  [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Gregory', 'mail': '@gmail.com'}]
#  @something.com
#  [{'name': 'Jules', 'mail': '@something.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]
#  @yahoo.com
#  [{'name': 'Tom', 'mail': '@yahoo.com'}]
#  gmail.com
#  [{'name': 'Jim', 'mail': 'gmail.com'}]

After fixing both the sorting and the gmail.com / @gmail.com mismatch, we get the expected output:

import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': '@gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]

key_func = lambda student: student['mail']

students.sort(key=key_func)

for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))

#  @gmail.com
#  [{'name': 'Paul', 'mail': '@gmail.com'},
#   {'name': 'Jim', 'mail': '@gmail.com'},
#   {'name': 'Gregory', 'mail': '@gmail.com'}]
#  @something.com
#  [{'name': 'Jules', 'mail': '@something.com'},
#   {'name': 'Kathrin', 'mail': '@something.com'}]
#  @yahoo.com
#  [{'name': 'Tom', 'mail': '@yahoo.com'}]
# (lists wrapped here for readability; print() emits each one on a single line)
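If editing the data by hand is not an option, one possible alternative is to normalize the mail inside the key function. The sketch below assumes that a missing leading '@' is the only inconsistency; normalized_mail is a hypothetical helper, not something from the question.

```python
import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]

def normalized_mail(student):
    # Hypothetical helper: drop any leading '@' so both spellings compare equal.
    return student['mail'].lstrip('@')

# Sort and group with the same normalizing key function.
ordered = sorted(students, key=normalized_mail)
groups = {
    key: [student['name'] for student in group]
    for key, group in itertools.groupby(ordered, key=normalized_mail)
}

print(groups)
```

Because sorted() is stable, names inside each group keep their original relative order.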

Answer 1: (score: -1)

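itertools.groupby() relies on the order of the data: it only groups consecutive items that share a key. Your list is not sorted.

So if you have ["gmail.com", "something.com", "gmail.com"], itertools will create three groups. This is different from groupby in some functional languages (or, say, pandas in Python).

You need to sort the list of dictionaries first.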

import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]


key_func = lambda student: student['mail']

# Sort with the same key that groupby uses, so equal mails are adjacent
for key, group in itertools.groupby(sorted(students, key=key_func), key=key_func):
    print(key)
    print(list(group))

# @gmail.com
# [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Gregory', 'mail': '@gmail.com'}]
# @something.com
# [{'name': 'Jules', 'mail': '@something.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]
# @yahoo.com
# [{'name': 'Tom', 'mail': '@yahoo.com'}]
# gmail.com
# [{'name': 'Jim', 'mail': 'gmail.com'}]
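For comparison with the pandas-style grouping mentioned above, here is a rough sketch of grouping without sorting, using collections.defaultdict from the standard library. This is an alternative technique, not what itertools.groupby() does internally, and it collects only the names to keep the output short.

```python
from collections import defaultdict

students = [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'}, {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]

# Each new mail value gets an empty list on first access.
groups = defaultdict(list)
for student in students:
    groups[student['mail']].append(student['name'])

for mail, names in groups.items():
    print(mail, names)
```

Items with the same mail are collected together regardless of their position in the list, and the groups appear in order of first occurrence. Note that 'gmail.com' and '@gmail.com' are still distinct keys here.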