I am trying to query the following data:
Student_ID  Site  Start   End      Primary_or_Secondary
1           A     1/1/19  2/28/19  Primary
1           B     2/1/19  6/30/19  Secondary
1           C     3/1/19  6/30/19  Primary
and get a result like this:
Student_ID  Primary  Secondary  Start   End
1           A        null       1/1/19  1/31/19
1           A        B          2/1/19  2/28/19
1           C        B          3/1/19  6/30/19
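So basically, a site can be either a student's primary or secondary site, and I'd like to see every time range during which the student was enrolled, split so that none of the ranges overlap.
I can't figure out how to do this in PostgreSQL. I even looked at the crosstab functionality, but the dates are making my brain hurt :-)
Any help with a query or a set of queries (including CTEs) would be much appreciated!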
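Answer 0 (score: 0)
This is not trivial. It is a crosstab over overlapping and intersecting ranges, with corner cases on top (merging identical start / end dates). That is hard to solve with set-based operations, i.e. pure SQL.
I suggest a procedural solution in PL/pgSQL instead. It should also perform decently, since it needs only a single (bitmap index) scan of the table: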
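To illustrate where set-based SQL gets awkward, here is a rough sketch that covers only the simple case. It assumes a table tbl with date columns start_date and end_date and a text column primary_or_secondary; the CTE names bounds and slices are made up for this example. It cuts each student's timeline at every start date and at every end date + 1, then looks up which sites cover each slice.

```sql
WITH bounds AS (                      -- every date where something can change
   SELECT student_id, start_date AS d FROM tbl
   UNION                              -- UNION (not ALL) folds duplicate dates
   SELECT student_id, end_date + 1 FROM tbl
   )
, slices AS (                         -- consecutive, non-overlapping slices
   SELECT student_id
        , d AS start_date
        , lead(d) OVER (PARTITION BY student_id ORDER BY d) - 1 AS end_date
   FROM   bounds
   )
SELECT s.student_id
     , max(t.site) FILTER (WHERE t.primary_or_secondary = 'Primary')   AS "primary"
     , max(t.site) FILTER (WHERE t.primary_or_secondary = 'Secondary') AS secondary
     , s.start_date
     , s.end_date
FROM   slices s
JOIN   tbl t ON  t.student_id = s.student_id   -- inner join drops gaps
             AND t.start_date <= s.start_date
             AND t.end_date   >= s.end_date
WHERE  s.end_date IS NOT NULL                  -- last boundary opens no slice
GROUP  BY s.student_id, s.start_date, s.end_date
ORDER  BY s.student_id, s.start_date;
```

The sketch does not merge adjacent slices that end up with the same pair of sites, and max() silently picks one site if a student has two overlapping primary (or two overlapping secondary) ranges. Handling those corner cases is where the pure SQL version gets complicated.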
CREATE OR REPLACE FUNCTION f_student_xtab(VARIADIC _student_ids int[])
  RETURNS TABLE (
      student_id int
    , "primary"  text
    , secondary  text
    , start_date date
    , end_date   date) AS
$func$
DECLARE
   r record;
BEGIN
   student_id := -1;  -- init with impossible value
   FOR r IN
      SELECT t.student_id, t.site, t.primary_or_secondary = 'Primary' AS prim, l.range_end, l.date
      FROM   tbl t
      CROSS  JOIN LATERAL (
         VALUES (false, t.start_date)
              , (true , t.end_date)
         ) AS l(range_end, date)
      WHERE  t.student_id = ANY (_student_ids)
      ORDER  BY t.student_id, l.date, range_end  -- start of range first
   LOOP
      IF r.student_id <> student_id THEN
         student_id := r.student_id;
         IF r.prim THEN "primary" := r.site;
                   ELSE secondary := r.site;
         END IF;
         start_date := r.date;
      ELSIF r.range_end THEN
         IF r.date < start_date THEN
            -- range already reported
            IF r.prim THEN "primary" := NULL;
                      ELSE secondary := NULL;
            END IF;
            start_date := NULL;
         ELSE
            end_date := r.date;
            RETURN NEXT;
            IF r.prim THEN
               "primary" := NULL;
               IF secondary IS NULL THEN start_date := NULL;
                                    ELSE start_date := r.date + 1;
               END IF;
            ELSE
               secondary := NULL;
               IF "primary" IS NULL THEN start_date := NULL;
                                    ELSE start_date := r.date + 1;
               END IF;
            END IF;
            end_date := NULL;
         END IF;
      ELSE  -- range starts
         IF r.date > start_date THEN
            -- range already running
            end_date := r.date - 1;
            RETURN NEXT;
         END IF;
         start_date := r.date;
         end_date := NULL;
         IF r.prim THEN "primary" := r.site;
                   ELSE secondary := r.site;
         END IF;
      END IF;
   END LOOP;
END
$func$  LANGUAGE plpgsql;
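To see what the loop iterates over, the driving query can be run on its own. This is the same SELECT as in the function, with the student filter hard-coded to 1 for illustration and assuming tbl is already populated. Each row of tbl is unpivoted into two events: one for the start of its range (range_end = false) and one for its end (range_end = true).

```sql
SELECT t.student_id, t.site
     , t.primary_or_secondary = 'Primary' AS prim
     , l.range_end, l.date
FROM   tbl t
CROSS  JOIN LATERAL (
   VALUES (false, t.start_date)   -- event: range starts
        , (true , t.end_date)     -- event: range ends
   ) AS l(range_end, date)
WHERE  t.student_id = 1           -- hard-coded here for illustration only
ORDER  BY t.student_id, l.date, l.range_end;  -- false sorts before true
```

Because false sorts before true, a range starting on a given date is processed before a range ending on that same date, which is what the "start of range first" comment refers to.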
Call the function:
SELECT * FROM f_student_xtab(1,2,3);
Or:
SELECT * FROM f_student_xtab(VARIADIC '{1,2,3}');
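To try this against the sample data from the question, a minimal setup might look like the following. The table definition is an assumption: the question does not show one, so the column names and types are inferred from the identifiers the function uses.

```sql
-- Assumed table definition; names and types inferred from the function body
CREATE TABLE tbl (
   student_id           int
 , site                 text
 , start_date           date
 , end_date             date
 , primary_or_secondary text
);

-- Sample rows from the question, written as ISO dates
INSERT INTO tbl (student_id, site, start_date, end_date, primary_or_secondary)
VALUES
   (1, 'A', '2019-01-01', '2019-02-28', 'Primary')
 , (1, 'B', '2019-02-01', '2019-06-30', 'Secondary')
 , (1, 'C', '2019-03-01', '2019-06-30', 'Primary');

SELECT * FROM f_student_xtab(1);
```

With these three rows, the call should return the three ranges asked for in the question.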
db<>fiddle here - with extended test cases
About VARIADIC:
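As a small illustration of the two call styles, here is a throwaway function (f_count_args is a made-up name for this example) that reports how many elements arrive in a VARIADIC parameter:

```sql
-- Hypothetical helper, only to show how VARIADIC arguments arrive
CREATE FUNCTION f_count_args(VARIADIC _ids int[])
  RETURNS int
  LANGUAGE sql AS
$$ SELECT cardinality(_ids) $$;

SELECT f_count_args(1, 2, 3);             -- separate arguments are collected into one array
SELECT f_count_args(VARIADIC '{1,2,3}');  -- an existing array is passed through as-is
```

Both calls hand the function the same three-element array, so either form works with f_student_xtab() as well.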