User

Time: 2016-01-06 10:09:21

Tags: sql postgresql sum greatest-n-per-group

I have this table in a PostgreSQL database:

purchase

userid |    date    | price
---------------------------
     1 | 2016-01-06 |    10
     1 | 2016-01-05 |     5
     2 | 2016-01-06 |    12
     2 | 2016-01-05 |    15

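I want the sum of the last purchase price across all users. For user 1, the last purchase was on 2016-01-06 at a price of 10. For user 2, the last purchase was on 2016-01-06 at a price of 12. So the result of the SQL query should be 22.

How can I do this in SQL?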

4 answers:

Answer 0: (score: 4)

You can use a window function to rank each user's purchases, then a normal aggregation with SUM:

WITH cte AS
(
   SELECT *, RANK() OVER(PARTITION BY userid ORDER BY "date" DESC) AS r
   FROM purchase
)
SELECT SUM(price) AS total
FROM cte
WHERE r = 1;

SqlFiddleDemo

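Keep in mind that this solution counts ties: if a user has several purchases with the same latest date, all of them are summed. To get only one purchase per user you need a column that is unique within each group (for example a datetime), and even then a tie is still possible.

EDIT

Handling ties: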

CREATE TABLE purchase(
   userid INTEGER  NOT NULL 
  ,date   timestamp  NOT NULL
  ,price  INTEGER  NOT NULL
);
INSERT INTO purchase(userid,date,price) VALUES 
(1, timestamp'2016-01-06 12:00:00',10),
(1,timestamp'2016-01-05',5),
(2,timestamp'2016-01-06 13:00:00',12),
(2,timestamp'2016-01-05',15),
(2,timestamp'2016-01-06 13:00:00',1000);

Note the difference between RANK() and ROW_NUMBER():

SqlFiddleDemo_RANK SqlFiddleDemo_ROW_NUMBER SqlFiddleDemo_ROW_NUMBER_2

Output:

╔════════╦══════════════╦══════════════╗
║ RANK() ║ ROW_NUMBER() ║ ROW_NUMBER() ║
╠════════╬══════════════╬══════════════╣
║   1022 ║           22 ║         1010 ║
╚════════╩══════════════╩══════════════╝
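
The totals in the table can be reproduced with queries along these lines. This is only a sketch: the demos' exact queries are not shown here, so the ROW_NUMBER() variant is an assumption, and the five sample rows are inlined as a CTE named purchase so that the block runs without creating anything.

```sql
-- RANK(): rows tied on "date" all get rank 1, so both of user 2's
-- 13:00 purchases are summed (10 + 12 + 1000).
WITH purchase(userid, "date", price) AS (
   VALUES (1, timestamp '2016-01-06 12:00:00', 10),
          (1, timestamp '2016-01-05', 5),
          (2, timestamp '2016-01-06 13:00:00', 12),
          (2, timestamp '2016-01-05', 15),
          (2, timestamp '2016-01-06 13:00:00', 1000)
), cte AS (
   SELECT *, RANK() OVER (PARTITION BY userid ORDER BY "date" DESC) AS r
   FROM purchase
)
SELECT SUM(price) AS total
FROM cte
WHERE r = 1;

-- ROW_NUMBER(): exactly one row per user gets number 1, but which of the
-- tied rows wins is not defined, so the total is either 10 + 12 or 10 + 1000.
WITH purchase(userid, "date", price) AS (
   VALUES (1, timestamp '2016-01-06 12:00:00', 10),
          (1, timestamp '2016-01-05', 5),
          (2, timestamp '2016-01-06 13:00:00', 12),
          (2, timestamp '2016-01-05', 15),
          (2, timestamp '2016-01-06 13:00:00', 1000)
), cte AS (
   SELECT *, ROW_NUMBER() OVER (PARTITION BY userid ORDER BY "date" DESC) AS rn
   FROM purchase
)
SELECT SUM(price) AS total
FROM cte
WHERE rn = 1;
```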

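Without a UNIQUE index on userid/date there is always a (probably small) chance of a tie. Any solution based on ORDER BY needs a deterministic ordering to give stable results.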
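
One way to get a stable result, as a sketch, is to extend the ORDER BY until it breaks every tie. Using price as the tie-breaker is purely an assumption for illustration; a unique key such as a purchase id (hypothetical, the table has none) would be the more usual choice. The sample rows are inlined as a CTE so the block runs on its own.

```sql
WITH purchase(userid, "date", price) AS (
   VALUES (1, timestamp '2016-01-06 12:00:00', 10),
          (1, timestamp '2016-01-05', 5),
          (2, timestamp '2016-01-06 13:00:00', 12),
          (2, timestamp '2016-01-05', 15),
          (2, timestamp '2016-01-06 13:00:00', 1000)
), cte AS (
   SELECT *,
          -- price DESC breaks the tie between the two 13:00 rows of user 2
          ROW_NUMBER() OVER (PARTITION BY userid
                             ORDER BY "date" DESC, price DESC) AS rn
   FROM purchase
)
SELECT SUM(price) AS total   -- 10 + 1000 = 1010 on this data
FROM cte
WHERE rn = 1;
```

Two rows with the same date and the same price would still be tied, but then either choice gives the same sum.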

Answer 1: (score: 3)

To get the "latest" price per user, you can use distinct on () in Postgres:

select distinct on (userid) userid, date, price
from the_table
order by userid, date desc

Now you only need to sum all the prices returned by the statement above:

select sum(price)
from (
   select distinct on (userid) userid, price
   from the_table
   order by userid, date desc
) t;
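
As a quick check against the data from the question, here is the same query with the four rows inlined as a CTE. This is a sketch: the CTE is named purchase here rather than the_table, and the date column is assumed to be of type date.

```sql
WITH purchase(userid, "date", price) AS (
   VALUES (1, date '2016-01-06', 10),
          (1, date '2016-01-05', 5),
          (2, date '2016-01-06', 12),
          (2, date '2016-01-05', 15)
)
SELECT sum(price) AS total   -- 10 + 12 = 22
FROM (
   -- keeps the first row per userid in ORDER BY order, i.e. the latest one
   SELECT DISTINCT ON (userid) userid, price
   FROM purchase
   ORDER BY userid, "date" DESC
) t;
```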

Answer 2: (score: 1)

In this case you can use a LATERAL join.

Demo: http://sqlfiddle.com/#!15/5569b/5
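
A LATERAL join for this problem could look roughly like the sketch below. It is not necessarily the query behind the demo link; the aliases u and l and the derived list of distinct users are assumptions, and the question's rows are inlined as a CTE so the block runs on its own.

```sql
WITH purchase(userid, "date", price) AS (
   VALUES (1, date '2016-01-06', 10),
          (1, date '2016-01-05', 5),
          (2, date '2016-01-06', 12),
          (2, date '2016-01-05', 15)
)
SELECT sum(l.price) AS total   -- 10 + 12 = 22
FROM (SELECT DISTINCT userid FROM purchase) u
CROSS JOIN LATERAL (
   -- evaluated once per user: that user's most recent purchase
   SELECT p.price
   FROM purchase p
   WHERE p.userid = u.userid
   ORDER BY p."date" DESC
   LIMIT 1
) l;
```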

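Answer 3: (score: 1)

All the proposed solutions are good and they work, but since my table contains millions of records I had to find a more efficient way to do what I want. The better way seems to be to use the foreign key between the tables purchase and user (which I did not mention in my question, my apologies): purchase.user -> user.id. Knowing this, I can run the following query: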

select sum(t.price) from (
    select (select price from purchase p where p.userid = u.id order by date desc limit 1) as price 
    from "user" u  -- user is a reserved word in PostgreSQL, so the table name must be quoted
) t; 

EDIT

In reply to @a_horse_with_no_name, here is the explain analyse verbose output for his solution and for mine.
His solution:

Aggregate  (cost=64032401.30..64032401.31 rows=1 width=4) (actual time=566101.129..566101.129 rows=1 loops=1)
    Output: sum(purchase.price)
    ->  Unique  (cost=62532271.89..64032271.89 rows=10353 width=16) (actual time=453849.494..566087.948 rows=12000 loops=1)
          Output: purchase.userid, purchase.price, purchase.date
          ->  Sort  (cost=62532271.89..63282271.89 rows=300000000 width=16) (actual time=453849.492..553060.789 rows=300000000 loops=1)
                Output: purchase.userid, purchase.price, purchase.date
                Sort Key: purchase.userid, purchase.date
                Sort Method: external merge  Disk: 7620904kB
                ->  Seq Scan on public.purchase  (cost=0.00..4910829.00 rows=300000000 width=16) (actual time=0.457..278058.430 rows=300000000 loops=1)
                      Output: purchase.userid, purchase.price, purchase.date
Planning time: 0.076 ms
Execution time: 566433.215 ms

My solution:

Aggregate  (cost=28366.33..28366.34 rows=1 width=4) (actual time=53914.690..53914.690 rows=1 loops=1)
    Output: sum((SubPlan 1))
    ->  Seq Scan on public.user2 u  (cost=0.00..185.00 rows=12000 width=4) (actual time=0.021..3.816 rows=12000 loops=1)
          Output: u.id, u.name
    SubPlan 1
      ->  Limit  (cost=0.57..2.35 rows=1 width=12) (actual time=4.491..4.491 rows=1 loops=12000)
            Output: p.price, p.date
            ->  Index Scan Backward using purchase_user_date on public.purchase p  (cost=0.57..51389.67 rows=28977 width=12) (actual time=4.490..4.490 rows=1 loops=12000)
                  Output: p.price, p.date
                  Index Cond: (p.userid = u.id)
Planning time: 0.115 ms
Execution time: 53914.730 ms

My table contains 300 million records. I don't know whether it is relevant, but I also have an index on purchase (userid, date).
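
That index does look relevant: the second plan shows an Index Scan Backward using purchase_user_date, executed once per user (loops=12000) and stopped after one row by the Limit. A matching definition might look like the sketch below. The index name is taken from the plan and the column order from the sentence above, but the actual DDL is not shown, so treat it as an assumption; the user id 1 in the lookup is only an example value.

```sql
-- Composite index: all purchases of one user sit together, ordered by date.
CREATE INDEX purchase_user_date ON purchase (userid, "date");

-- The per-user lookup the subquery performs; with the index in place the
-- planner can read it backward and stop at the first (latest) entry.
EXPLAIN
SELECT price
FROM purchase
WHERE userid = 1
ORDER BY "date" DESC
LIMIT 1;
```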