Summary report using itertools.groupby

Time: 2015-09-23 04:58:18

Tags: python python-3.x

Can someone help me with groupby? I want to group by the first and second columns and sum the third column.

from itertools import groupby
from operator import itemgetter

things = [('2009-09-02','j', 12),
          ('2009-09-02','j', 3),
          ('2009-09-03','k',10),
          ('2009-09-03','k',4),
          ('2009-09-03','u', 22),
          ('2009-09-06','m',33)]

for k, items in groupby(things, itemgetter(1)):    
    for subitem in items:
        print(subitem)

I got this result:

('2009-09-02', 'j', 12)
('2009-09-02', 'j', 3)
('2009-09-03', 'k', 10)
('2009-09-03', 'k', 4)
('2009-09-03', 'u', 22)
('2009-09-06', 'm', 33)

I expected this result:

('2009-09-02', 'j', 15)
('2009-09-03', 'k', 14)
('2009-09-03', 'u', 22)
('2009-09-06', 'm', 33)

========================================================================

sales = [('Scotland', 'Edinburgh', 20000),
         ('Scotland', 'Glasgow', 12500),
         ('Wales', 'Cardiff', 29700),
         ('Wales', 'Bangor', 12800),
         ('England', 'London', 90000),
         ('England', 'Manchester', 45600),
         ('England', 'London', 29700)]

3 answers:

Answer 0: (score: 2)

>>> for a, b in groupby(things, itemgetter(0, 1)):
...     print(a, sum(lst[2] for lst in b))

('2009-09-02', 'j') 15
('2009-09-03', 'k') 14
('2009-09-03', 'u') 22
('2009-09-06', 'm') 33
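
One caveat worth adding: groupby only merges adjacent items that share a key, so the loop above works because things is already ordered by date and letter. The sketch below applies the same idea to the sales list from the question, where the two London rows are not adjacent, by sorting on the grouping key first. The names key and totals are illustrative, not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter

sales = [('Scotland', 'Edinburgh', 20000),
         ('Scotland', 'Glasgow', 12500),
         ('Wales', 'Cardiff', 29700),
         ('Wales', 'Bangor', 12800),
         ('England', 'London', 90000),
         ('England', 'Manchester', 45600),
         ('England', 'London', 29700)]

# Group on (country, city); the same key is used for sorting and grouping.
key = itemgetter(0, 1)

# groupby only merges adjacent rows, so sort by the grouping key first.
totals = [k + (sum(row[2] for row in rows),)
          for k, rows in groupby(sorted(sales, key=key), key)]

for row in totals:
    print(row)
```

Without the sorted call, the two ('England', 'London') rows would come out as two separate groups.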

Answer 1: (score: 0)

You don't need groupby here. As a more efficient alternative, you can use a dictionary with the dict.setdefault method:

>>> d={}
>>> 
>>> for date,char,val, in things:
...       d.setdefault((date,char),[]).append(val)
... 
>>> [(i,j,sum(k)) for (i,j),k in d.items()]
[('2009-09-02', 'j', 15), ('2009-09-03', 'u', 22), ('2009-09-06', 'm', 33), ('2009-09-03', 'k', 14)]
>>> 
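
A possible variant of the same dictionary idea, assuming only the totals are needed and not the individual values: collections.defaultdict(int) keeps a running sum per key instead of building lists. The names totals and result are illustrative.

```python
from collections import defaultdict

things = [('2009-09-02', 'j', 12),
          ('2009-09-02', 'j', 3),
          ('2009-09-03', 'k', 10),
          ('2009-09-03', 'k', 4),
          ('2009-09-03', 'u', 22),
          ('2009-09-06', 'm', 33)]

# Missing keys start at 0, so each value can be added directly.
totals = defaultdict(int)
for date, char, val in things:
    totals[(date, char)] += val

# Flatten each (date, char) key and its total into a single tuple.
result = [(date, char, total) for (date, char), total in totals.items()]
print(result)
```

Unlike groupby, this does not require the input to be sorted. On Python 3.7 and later, dicts preserve insertion order, so the result follows the order in which each key first appears.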

If you do want to use groupby, here is a hint: you need to pass both column indices to the itemgetter function:

itemgetter(0, 1)

Answer 2: (score: 0)

If you want a sum, you have to compute it yourself; just printing the items will not magically add up the values for you.

Also, based on your example, it looks like you should group by the first and second columns. Example:

for k, items in groupby(things, itemgetter(0, 1)):
    print(k + (sum(x[2] for x in items),))