I'm running into a consecutive/non-consecutive dates problem on SQL Server 2012.
I have a table that looks like this:

Article | Creation date |
---|---|
1234 | 04/01/2021 |
1234 | 05/01/2021 |
1234 | 06/01/2021 |
1234 | 07/01/2021 |
1234 | 10/01/2021 |
1234 | 12/01/2021 |
12345 | 02/01/2021 |
12345 | 03/01/2021 |
12345 | 17/01/2021 |
123456 | 01/01/2021 |
123456 | 03/01/2021 |
123456 | 05/01/2021 |
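
The problem: for each article, I want the count of rows in each run of consecutive dates, together with the earliest date of that run. It's a bit hard to explain what I want, so here is an example of the expected result:

Article | Creation date | Count |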
---|---|---|
1234 | 04/01/2021 | 4 |
1234 | 10/01/2021 | 1 |
1234 | 12/01/2021 | 1 |
12345 | 02/01/2021 | 2 |
12345 | 17/01/2021 | 1 |
123456 | 01/01/2021 | 1 |
123456 | 03/01/2021 | 1 |
123456 | 05/01/2021 | 1 |

I started with this:
```sql
;WITH CTE AS (
    SELECT Article, [Creation date],
           StartDate = DATEADD(day, -ROW_NUMBER() OVER (ORDER BY [Creation date]), [Creation date])
    FROM MyTable
)
SELECT Article, MIN([Creation date]) AS [Creation date], COUNT(Article) AS count
FROM CTE
GROUP BY StartDate, Article, [Creation date]
ORDER BY Article, [Creation date]
```
Output:

Article | Creation date | Count |
---|---|---|
1234 | 04/01/2021 | 1 |
1234 | 05/01/2021 | 1 |
1234 | 06/01/2021 | 1 |
1234 | 07/01/2021 | 1 |
1234 | 10/01/2021 | 1 |
1234 | 12/01/2021 | 1 |
12345 | 02/01/2021 | 1 |
12345 | 03/01/2021 | 1 |
12345 | 17/01/2021 | 1 |
123456 | 01/01/2021 | 1 |
123456 | 03/01/2021 | 1 |
123456 | 05/01/2021 | 1 |

But the result is wrong, and I really don't know how to approach this. If anyone can enlighten me, I'd appreciate it.
Thanks

Answer 0 (score: 1)
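This is an example of a gaps-and-islands problem. In this case, the simplest solution is to subtract an increasing sequence of values from the dates and aggregate. This works because, within a run of consecutive dates, the difference between each date and its sequence number is constant: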
```sql
select article, min(creation_date), max(creation_date), count(*)
from (select t.*,
             row_number() over (partition by article order by creation_date) as seqnum
      from mytable t
     ) t
group by article, dateadd(day, -seqnum, creation_date)
order by article, min(creation_date);
```
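
To make the mechanics concrete, here is a self-contained sketch run against the sample data from the question. The table variable `@t`, the CTE name `numbered`, and the output aliases `grp`, `[First date]` and `[Count]` are my own naming for illustration; `Article` and `[Creation date]` are the column names from the question. Compared with the attempt in the question, the two differences that seem to matter are that the row number is partitioned by article, and that `[Creation date]` itself is not in the `GROUP BY` (grouping by it puts every row in its own group).

```sql
-- Sample data from the question, loaded into a hypothetical table variable.
DECLARE @t TABLE (Article int, [Creation date] date);

INSERT INTO @t (Article, [Creation date]) VALUES
    (1234,   '20210104'), (1234,   '20210105'), (1234,   '20210106'),
    (1234,   '20210107'), (1234,   '20210110'), (1234,   '20210112'),
    (12345,  '20210102'), (12345,  '20210103'), (12345,  '20210117'),
    (123456, '20210101'), (123456, '20210103'), (123456, '20210105');

-- Step 1: look at the intermediate values.
-- seqnum restarts at 1 for each article; grp = date minus seqnum days.
-- For article 1234 the first four rows all get grp = 2021-01-03,
-- while 10 Jan and 12 Jan get 2021-01-05 and 2021-01-06.
SELECT Article,
       [Creation date],
       ROW_NUMBER() OVER (PARTITION BY Article ORDER BY [Creation date]) AS seqnum,
       DATEADD(day,
               -ROW_NUMBER() OVER (PARTITION BY Article ORDER BY [Creation date]),
               [Creation date]) AS grp
FROM @t
ORDER BY Article, [Creation date];

-- Step 2: aggregate per (Article, grp) to get one row per run of consecutive dates.
WITH numbered AS (
    SELECT Article,
           [Creation date],
           ROW_NUMBER() OVER (PARTITION BY Article ORDER BY [Creation date]) AS seqnum
    FROM @t
)
SELECT Article,
       MIN([Creation date]) AS [First date],
       COUNT(*)             AS [Count]
FROM numbered
GROUP BY Article, DATEADD(day, -seqnum, [Creation date])
ORDER BY Article, [First date];
```

Against this sample data, the second query should return the eight rows listed as the desired result in the question (4, 1, 1 for article 1234; 2, 1 for 12345; 1, 1, 1 for 123456). Note that this assumes at most one row per article per date; if duplicates are possible, `DENSE_RANK()` in place of `ROW_NUMBER()` is the usual adjustment.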