Question

Suppose my data looks like this:

    create table tab(id smallint, nums int4range);
    insert into tab values (1, int4range(1,10)), (2, int4range(1,20)), (3,int4range(3,8)), (4,int4range(15,25)), (5,int4range(3,8));

Then select * from tab returns:

 id |  nums
----+---------
  1 | [1,10)
  2 | [1,20)
  3 | [3,8)
  4 | [15,25)
  5 | [3,8)

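I want a query that finds the ranges formed by the intersections of these ranges, together with the ids that belong to each of those sub-ranges. So the output would look something like: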

  nums  | ids
--------+------------
[1,3)   | 1, 2
[3,8)   | 1, 2, 3, 5
[8,10)  | 1, 2
[10,15) | 2
[15,20) | 2, 4
[20,25) | 4

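I am agnostic about the format of the 'ids' column: an array seems logical, but I would be perfectly happy with separate columns for the first, second, third ... nth id in a given range.

I know there will never be more than five ids with overlapping ranges, so a fixed number of columns, padded with nulls as needed, is perfectly fine. I also know, if it matters, that there are no ranges without an id.

Thanks for any help you can provide.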

Answer 1

Overlapping ranges

If you want the overlapping ranges:

WITH all_intersections
AS
(
SELECT
    t1.id AS id1, 
    t2.id AS id2, 
    t1.nums * /* intersection */ t2.nums AS nums 
FROM
    tab t1 CROSS JOIN tab t2
WHERE
    t1.id <= t2.id  /* Need only 1/2 + diagonal */
),
unique_nums AS
(
SELECT DISTINCT
    nums
FROM
    all_intersections
WHERE 
    nums <> 'empty' 
)
SELECT 
    nums, 
    array(SELECT DISTINCT id1 AS id 
            FROM all_intersections a1 
           WHERE a1.nums = a0.nums
          UNION
          SELECT DISTINCT id2 AS id 
            FROM all_intersections a2 
           WHERE a2.nums = a0.nums
          ORDER BY id
         ) AS ids
FROM
    unique_nums a0 
ORDER BY
    nums ;

The result is:

|    nums |     ids |
|---------|---------|
|  [1,10) |     1,2 |
|  [1,20) |       2 |
|   [3,8) | 1,2,3,5 |
| [15,20) |     2,4 |
| [15,25) |       4 |

You can check it at http://sqlfiddle.com/#!15/f83d5/5/0

Non-overlapping ranges

If you want the non-overlapping ranges (as in your example), you can get them with the following CTEs:

WITH bounds AS         /* all bounds */
(
SELECT DISTINCT
    lower(nums) AS b
FROM
    tab
UNION
SELECT DISTINCT
    upper(nums) AS b
FROM 
    tab
),
range_bounds AS        /* pairs of consecutive bounds */
(
SELECT
    b, lead(b) OVER (ORDER BY b) AS next_b 
FROM
    bounds
),
ranges AS              /* convert the pairs to ranges */
(
SELECT
    int4range(b, next_b) AS nums
FROM
    range_bounds 
WHERE
    next_b is not null  -- ignore last
)
SELECT                 /* take every range and find intersection with originals */
    nums, 
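    ARRAY
      (SELECT id 
        FROM tab
       WHERE tab.nums && ranges.nums
       ORDER BY id             -- keep the ids in a stable order
      ) AS ids
FROM 
    ranges
ORDER BY                       -- row order is not guaranteed without this
    nums ;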

The result is:

|    nums |     ids |
|---------|---------|
|   [1,3) |     1,2 |
|   [3,8) | 1,2,3,5 |
|  [8,10) |     1,2 |
| [10,15) |       2 |
| [15,20) |     2,4 |
| [20,25) |       4 |

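This is the result from your example.

This assumes:

All ranges are built with an inclusive lower bound [ and an exclusive upper bound ). [In other cases, it will not produce correct results.]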
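
As a side note on this assumption: PostgreSQL stores its built-in discrete range types (int4range, int8range, daterange) in canonical [) form, whatever bounds were given at construction. For the int4range column used here, lower() and upper() should therefore already return bounds in that form. A small check, independent of the tab table:

```sql
-- A closed integer range is canonicalized to the equivalent [) range.
SELECT int4range(1, 10, '[]')        AS stored,   -- displayed as [1,11)
       lower(int4range(1, 10, '[]')) AS lo,       -- 1
       upper(int4range(1, 10, '[]')) AS hi;       -- 11
```

The caveat is more likely to matter for continuous range types such as numrange, which keep '[]' or '(]' bounds as written, and for empty or unbounded ranges, where lower() and upper() return NULL.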

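The idea is:

1. Take all the bounds of the ranges (both lower and upper)
2. Sort them
3. Build a range from every two consecutive bounds
4. Look up the original ranges that overlap each new range to build ids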
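
To look at the first three steps in isolation, the pairs of consecutive bounds can be listed before they are turned into ranges. This is only a sketch that reuses the bounds and range_bounds logic from the query above. Since UNION already removes duplicates, the DISTINCT keywords are left out.

```sql
-- Collect every bound, sort, and pair each bound with the next one.
SELECT b,
       lead(b) OVER (ORDER BY b) AS next_b
FROM (
    SELECT lower(nums) AS b FROM tab
    UNION                      -- UNION (without ALL) deduplicates the bounds
    SELECT upper(nums) AS b FROM tab
) AS bounds
ORDER BY b;
-- With the sample data the pairs are
-- (1,3), (3,8), (8,10), (10,15), (15,20), (20,25), (25,NULL).
```

The last pair has no successor, which is why the full query filters on next_b is not null.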

See http://sqlfiddle.com/#!15/f83d5/10/0

Note: if you want to avoid the CTEs by plain substitution, the query can be compressed further:

SELECT 
    nums, ARRAY
          (SELECT id 
             FROM tab
            WHERE tab.nums && ranges.nums
           ) AS ids
FROM 
    (SELECT
        int4range(b, next_b) AS nums
    FROM
        (SELECT
            b, lead(b) OVER (ORDER BY b) AS next_b 
        FROM
            (SELECT DISTINCT lower(nums) AS b FROM tab
             UNION
             SELECT DISTINCT upper(nums) AS b FROM tab
            ) AS bounds
        ) AS range_bounds 
    WHERE
        next_b is not null
    ) AS ranges 
ORDER BY
  nums ;

See http://sqlfiddle.com/#!15/f83d5/15/0

Answer 2

SELECT uniquenums.nums, array_agg(id) ids
FROM (
        SELECT numsgroup, int4range(min(boundary), max(boundary)) nums
        FROM (
                SELECT boundary, row_number() OVER (ORDER BY boundary, seriesvalue) / 2 AS numsgroup
                FROM (
                        SELECT DISTINCT upper(nums) AS boundary FROM tab
                        UNION
                        SELECT DISTINCT lower(nums) AS boundary FROM tab
                ) AS A
                JOIN (
                        SELECT generate_series(1, 2) AS seriesvalue
                ) AS B ON true
        ) AS A
        GROUP BY numsgroup
        HAVING COUNT(*) > 1
) AS uniquenums
JOIN tab ON tab.nums && uniquenums.nums
GROUP BY uniquenums.nums
ORDER BY uniquenums.nums

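How does it work?

1. Extract all distinct boundaries, both lower and upper
2. Duplicate each boundary by joining with a helper table expression that has two rows
3. Assign a group number to each resulting row so that two consecutive boundaries get the same group number
4. Group by these numbers and build the new ranges from the consecutive boundaries
5. Find the ranges in tab that overlap the ranges just computed
6. Aggregate the ids of the ranges found into an array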
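
The grouping trick may be easier to follow by running only the inner part and exposing the row number next to the group number. The sketch below is a rearrangement of the subquery above for inspection, not a replacement for it. The rn column and the b(seriesvalue) alias are added here for illustration.

```sql
-- Each boundary appears twice; integer division of the row number by 2
-- puts the second copy of one boundary and the first copy of the next
-- boundary into the same group.
SELECT boundary,
       seriesvalue,
       row_number() OVER (ORDER BY boundary, seriesvalue)     AS rn,
       row_number() OVER (ORDER BY boundary, seriesvalue) / 2 AS numsgroup
FROM (
    SELECT upper(nums) AS boundary FROM tab
    UNION
    SELECT lower(nums) AS boundary FROM tab
) AS a
CROSS JOIN generate_series(1, 2) AS b(seriesvalue)
ORDER BY boundary, seriesvalue;
```

With the sample data there are 7 distinct boundaries and so 14 rows. The first copy of 1 sits alone in group 0 and the second copy of 25 sits alone in group 7, which is what HAVING COUNT(*) > 1 removes. Groups 1 to 6 each hold two consecutive boundaries, from which min and max build [1,3) through [20,25).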

Answer 3

select rng as nums, array_agg(id) as ids
from (  
    select int4range(n, lead(n) over (order by n)) as rng
    from (  
        select distinct lower(nums) n
        from tab
        union
        select distinct upper(nums) n
        from tab
        ) s
    ) s
join tab on rng && nums
group by 1
order by 1;

  nums   |    ids    
---------+-----------
 [1,3)   | {1,2}
 [3,8)   | {1,2,3,5}
 [8,10)  | {1,2}
 [10,15) | {2}
 [15,20) | {2,4}
 [20,25) | {4}
(6 rows)
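
The question also allows a fixed number of id columns instead of an array. A minimal sketch of that variant wraps the query above in a subquery and reads the array by subscript. It assumes at most five ids per range, as the question states. The column names id1 to id5 and the aliases r and q are made up for illustration. A subscript past the end of a PostgreSQL array yields NULL, which gives the null padding.

```sql
SELECT nums,
       ids[1] AS id1,
       ids[2] AS id2,
       ids[3] AS id3,
       ids[4] AS id4,
       ids[5] AS id5          -- NULL when the range has fewer ids
FROM (
    SELECT rng AS nums,
           array_agg(id ORDER BY id) AS ids   -- fixed order so id1 is the smallest
    FROM (
        SELECT int4range(n, lead(n) OVER (ORDER BY n)) AS rng
        FROM (
            SELECT lower(nums) AS n FROM tab
            UNION
            SELECT upper(nums) AS n FROM tab
        ) AS s
    ) AS r
    JOIN tab ON r.rng && tab.nums
    GROUP BY rng
) AS q
ORDER BY nums;
```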
