SQLAlchemy: group_by date and aggregate count, how to fill in missing dates

Time: 2018-03-09 19:29:06

Tags: python sql sqlalchemy

I have a SQLAlchemy query that looks like this.

I am grouping the Pomo model by its timestamp, specifically by the date on which each Pomo was created.

db.session.query(Pomo.timestamp, sa.func.count(Pomo.id))\
               .group_by(sa.func.date(Pomo.timestamp)).all()

It returns data that looks like this:

[(datetime.datetime(2018, 3, 2, 0, 0), 1),
(datetime.datetime(2018, 3, 7, 0, 0), 1),
(datetime.datetime(2018, 3, 8, 0, 0), 6)]

How can I fill in the missing dates so that the output looks like this?

[(datetime.datetime(2018, 3, 2, 0, 0), 1),
(datetime.datetime(2018, 3, 3, 0, 0), 0),
(datetime.datetime(2018, 3, 4, 0, 0), 0),
(datetime.datetime(2018, 3, 5, 0, 0), 0),
(datetime.datetime(2018, 3, 6, 0, 0), 0),
(datetime.datetime(2018, 3, 7, 0, 0), 1),
(datetime.datetime(2018, 3, 8, 0, 0), 6)]

1 answer:

Answer 0: (score: 2)

Use generate_series() to generate every date in the desired range, then left join your data against that series, coalescing the missing counts to 0:

In [24]: series = db.session.query(
    ...:         db.func.generate_series(db.func.min(Pomo.timestamp),
    ...:                                 db.func.max(Pomo.timestamp),
    ...:                                 timedelta(days=1)).label('ts')).\
    ...:     subquery()
    ...:                             

In [25]: values = db.session.query(Pomo.timestamp,
    ...:                           db.func.count(Pomo.id).label('cnt')).\
    ...:     group_by(Pomo.timestamp).\
    ...:     subquery()

In [26]: db.session.query(series.c.ts,
    ...:                  db.func.coalesce(values.c.cnt, 0)).\
    ...:     outerjoin(values, values.c.timestamp == series.c.ts).\
    ...:     order_by(series.c.ts).\
    ...:     all()
    ...: 
Out[26]: 
[(datetime.datetime(2018, 3, 2, 0, 0), 1),
 (datetime.datetime(2018, 3, 3, 0, 0), 0),
 (datetime.datetime(2018, 3, 4, 0, 0), 0),
 (datetime.datetime(2018, 3, 5, 0, 0), 0),
 (datetime.datetime(2018, 3, 6, 0, 0), 0),
 (datetime.datetime(2018, 3, 7, 0, 0), 1),
 (datetime.datetime(2018, 3, 8, 0, 0), 6)]
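
If generate_series() is not available in your database, the same gap filling can be done on the client after running the original grouped query. The following is only a minimal sketch of that alternative, not the approach from the answer above: the helper name fill_missing_days is hypothetical, and it assumes the rows are (datetime, count) tuples like the ones shown in the question.

```python
from datetime import datetime, timedelta


def fill_missing_days(rows):
    """Return one (day, count) pair per day between the first and last row.

    `rows` is assumed to be a list of (datetime, count) tuples, as returned
    by the grouped query in the question. Days with no row get a count of 0.
    """
    if not rows:
        return []

    # Truncate every timestamp to midnight and sum counts that share a day,
    # so the helper also works if the timestamps carry a time component.
    counts = {}
    for ts, cnt in rows:
        day = datetime(ts.year, ts.month, ts.day)
        counts[day] = counts.get(day, 0) + cnt

    first, last = min(counts), max(counts)
    n_days = (last - first).days

    # Walk the range one day at a time, defaulting missing days to 0.
    filled = []
    for offset in range(n_days + 1):
        day = first + timedelta(days=offset)
        filled.append((day, counts.get(day, 0)))
    return filled


# Sample rows copied from the question's query output.
rows = [
    (datetime(2018, 3, 2, 0, 0), 1),
    (datetime(2018, 3, 7, 0, 0), 1),
    (datetime(2018, 3, 8, 0, 0), 6),
]
filled = fill_missing_days(rows)
```

Doing this in Python keeps the SQL simple, at the cost of transferring the grouped rows and building the date range in application code. For large ranges, the database-side join shown above is likely the better fit.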