Count consecutive records based on a condition

Time: 2018-03-07 08:10:29

Tags: sql postgresql gaps-and-islands postgresql-9.6

I have a table whose rows are ordered by timestamp, newest first:

+----+------------+-----+
| id | date       | foo |
+----+------------+-----+
| 1  | 2017-12-28 | abc |
+----+------------+-----+
| 1  | 2017-12-27 | abc |
+----+------------+-----+
| 2  | 2017-12-27 | xyz |
+----+------------+-----+
| 2  | 2017-12-26 | xyz |
+----+------------+-----+
| 2  | 2017-12-25 | abc |
+----+------------+-----+
| 1  | 2017-12-25 | abc |
+----+------------+-----+
| 2  | 2017-12-25 | abc |
+----+------------+-----+

I want to count the consecutive records that share the same id and the same foo:

+----+-----+-------+
| id | foo | count |
+----+-----+-------+
| 1  | abc | 2     |
+----+-----+-------+
| 2  | xyz | 2     |
+----+-----+-------+
| 2  | abc | 1     |
+----+-----+-------+
| 1  | abc | 1     |
+----+-----+-------+
| 2  | abc | 1     |
+----+-----+-------+

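So, here is a sqlfiddle with the schema already built in.

Window functions look like the key to this kind of problem, but I could not get them to work the way I was using them.

I would be glad to get any help, or at least some useful hints. There are a few related questions about MySQL, but they were not very helpful.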

1 answer:

Answer 0 (score: 1)

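First of all, thanks a lot for the sqlfiddle.

This is a gaps-and-islands problem, and the standard approach (Tabibitosan) with row_number() solves it. The difference between a row's overall position and its position within its own id stays constant across a run of consecutive rows with the same id, so that difference can serve as a grouping key.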
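To make the row_number() trick concrete, here is a minimal sketch that exposes both row numbers and their difference for every row. It assumes the sample rows from the fiddle's bar table, inlined as a CTE so the query runs on its own; the aliases rn_all and rn_id are illustrative names, not part of the answer's query.

```sql
-- Sample rows from the fiddle's schema setup, inlined so the sketch is self-contained.
WITH bar (id, date, foo) AS (
  VALUES
    (1, TIMESTAMP '2017-12-28 17:54:02', 'abc'),
    (1, TIMESTAMP '2017-12-28 17:53:30', 'abc'),
    (2, TIMESTAMP '2017-12-28 17:50:13', 'xyz'),
    (2, TIMESTAMP '2017-12-28 17:44:35', 'xyz'),
    (2, TIMESTAMP '2017-12-28 17:30:00', 'abc'),
    (1, TIMESTAMP '2017-12-28 17:25:15', 'abc'),
    (2, TIMESTAMP '2017-12-28 17:21:20', 'abc')
)
SELECT id,
       date,
       foo,
       -- Position of the row in the whole result, newest first.
       ROW_NUMBER() OVER (ORDER BY date DESC) AS rn_all,
       -- Position of the row among the rows with the same id.
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn_id,
       -- The difference is the candidate group key.
         ROW_NUMBER() OVER (ORDER BY date DESC)
       - ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS seq
  FROM bar
 ORDER BY date DESC;
```

On this data seq comes out as 0, 0, 2, 2, 2, 3, 3 from newest to oldest. Rows with a different foo or a different id can share a seq value, which is why the final query groups by foo and id as well as seq.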

SQL Fiddle

PostgreSQL 9.6 Schema Setup

create table bar (
  id   bigint not null,
  date timestamp without time zone,
  foo  text
);

insert into bar (id, date, foo) values
  (1, '2017-12-28 17:54:02', 'abc'),
  (1, '2017-12-28 17:53:30', 'abc'),
  (2, '2017-12-28 17:50:13', 'xyz'),
  (2, '2017-12-28 17:44:35', 'xyz'),
  (2, '2017-12-28 17:30:00', 'abc'),
  (1, '2017-12-28 17:25:15', 'abc'),
  (2, '2017-12-28 17:21:20', 'abc');

Query 1

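SELECT MAX (id) AS id,
       foo,
       COUNT (*) AS "count"
  FROM (SELECT b.*,
               -- Overall position minus per-id position: constant within
               -- a run of consecutive rows that share the same id.
                 ROW_NUMBER () OVER (ORDER BY date DESC)
               - ROW_NUMBER () OVER (PARTITION BY id ORDER BY date DESC)
                 AS seq
          FROM bar b) t
-- seq alone is not unique, so foo and id are part of the group key too.
GROUP BY foo, seq, id
ORDER BY MAX (date) DESC;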

Results

| id | foo | count |
|----|-----|-------|
|  1 | abc |     2 |
|  2 | xyz |     2 |
|  2 | abc |     1 |
|  1 | abc |     1 |
|  2 | abc |     1 |
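One caveat, offered as a sketch and not as part of the original answer: because the second row_number() partitions by id only, an unbroken run of one id whose foo goes xyz, abc, xyz would give all three rows the same seq, and the two xyz rows would then be counted as one group. Partitioning by both id and foo should avoid that. The variant below again inlines the fiddle's sample rows as a CTE so it runs on its own.

```sql
-- Sample rows from the fiddle's schema setup, inlined so the sketch is self-contained.
WITH bar (id, date, foo) AS (
  VALUES
    (1, TIMESTAMP '2017-12-28 17:54:02', 'abc'),
    (1, TIMESTAMP '2017-12-28 17:53:30', 'abc'),
    (2, TIMESTAMP '2017-12-28 17:50:13', 'xyz'),
    (2, TIMESTAMP '2017-12-28 17:44:35', 'xyz'),
    (2, TIMESTAMP '2017-12-28 17:30:00', 'abc'),
    (1, TIMESTAMP '2017-12-28 17:25:15', 'abc'),
    (2, TIMESTAMP '2017-12-28 17:21:20', 'abc')
)
SELECT id,
       foo,
       COUNT(*) AS "count"
  FROM (SELECT b.*,
               -- Partition by (id, foo) so a change in foo also starts a new island.
                 ROW_NUMBER() OVER (ORDER BY date DESC)
               - ROW_NUMBER() OVER (PARTITION BY id, foo ORDER BY date DESC)
                 AS seq
          FROM bar b) t
 GROUP BY id, foo, seq
 ORDER BY MAX(date) DESC;
```

On the sample data this returns the same five rows as Query 1. Both versions assume the date values are unique; with ties, the row order inside the window is not deterministic.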