Count consecutive days where a value is greater than 0

Asked: 2017-07-25 15:55:13

Tags: sql-server rank gaps-and-islands

Using SQL Server 2012, I am trying to write a query that gives me the top 10 longest wet (or dry) periods in a climate database.

My temp table returns the following data:

select monthid as [id], date, rain_today 
from #raindays
order by monthid asc, date asc

Output:

id  date            rain_today
-------------------------------
1   24 Dec 2014     2.4
1   25 Dec 2014     0
1   26 Dec 2014     8.7
1   27 Dec 2014     1.8
1   28 Dec 2014     0.3
1   29 Dec 2014     0
1   30 Dec 2014     0
1   31 Dec 2014     0.3
2   01 Jan 2015     0.3
2   02 Jan 2015     0.3
2   03 Jan 2015     18.3
2   04 Jan 2015     0.3

etc.

I would like to return a ranked table that counts the periods where rain_today > 0 (or rain_today = 0), i.e.:

Rank Start_Date   End_Date    Wet Period
----------------------------------------
1    31 Dec 2014  04 Jan 2015 5
2    26 Dec 2014  28 Dec 2014 3

...

The closest I have got, after reviewing other similar queries, is this (for dry days):

 select 
     #raindays.monthid as id,
     min(#raindays.date) as [FirstDryDay],
     max(#raindays.date) as [LatestDryDay],
     count(*) as countdays
 from 
     (select 
          monthid, 
          coalesce(max(case 
                          when rain_today > '0' 
                             then #raindays.date end), '19000101') as latestdry
     from 
         #raindays
     group by 
         monthid) g
join 
    #raindays on #raindays.monthid = g.monthid
              and #raindays.date > g.latestdry
group by 
    #raindays.monthid
order by 
    countdays desc

Output:

id  FirstDryDay     LatestDryDay    countdays
-----------------------------------------------
23  21 Oct 2016     31 Oct 2016     11
21  23 Aug 2016     31 Aug 2016     9
**15    23 Feb 2016     29 Feb 2016     7**
10  25 Sep 2015     30 Sep 2015     6
8   28 Jul 2015     31 Jul 2015     4
24  28 Nov 2016     30 Nov 2016     3
29  29 Apr 2017     30 Apr 2017     2
30  30 May 2017     31 May 2017     2
31  29 Jun 2017     30 Jun 2017     2
20  30 Jul 2016     31 Jul 2016     2
7   29 Jun 2015     30 Jun 2015     2
5   30 Apr 2015     30 Apr 2015     1
11  31 Oct 2015     31 Oct 2015     1
17  30 Apr 2016     30 Apr 2016     1
22  30 Sep 2016     30 Sep 2016     1

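As you can see, I really don't want to group by ID, because I want periods to be able to span different months, and this query also misses other periods that occurred earlier in each month. The counts themselves look correct; checking the period highlighted above: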

id  date    rain_today
15  22 Feb 2016     3.9
15  23 Feb 2016     0
15  24 Feb 2016     0
15  25 Feb 2016     0
15  26 Feb 2016     0
15  27 Feb 2016     0
15  28 Feb 2016     0
15  29 Feb 2016     0
16  01 Mar 2016     3

Thanks in advance for your help!

2 Answers:

Answer 0 (score: 2)

Is this what you are looking for?

IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL 
DROP TABLE #TestData;

CREATE TABLE #TestData (
    id INT NOT NULL ,
    [Date] DATE NOT NULL,
    Rain_Today DECIMAL(9,2) NOT NULL 
    );

INSERT #TestData (id, Date, Rain_Today) VALUES 
    (1, '24 Dec 2014', 2.4),
    (1, '25 Dec 2014', 0),
    (1, '26 Dec 2014', 8.7),
    (1, '27 Dec 2014', 1.8),
    (1, '28 Dec 2014', 0.3),
    (1, '29 Dec 2014', 0),
    (1, '30 Dec 2014', 0),
    (1, '31 Dec 2014', 0.3),
    (2, '01 Jan 2015', 0.3),
    (2, '02 Jan 2015', 0.3),
    (2, '03 Jan 2015', 18.3),
    (2, '04 Jan 2015', 0.3);

--======================================

WITH 
    cte_AddRankGroup AS (
        SELECT 
            td.id,
            td.Date,
            td.Rain_Today,
            hr.HasRain,
            RankGroup = DENSE_RANK() OVER (PARTITION BY td.id ORDER BY td.Date) -
                DENSE_RANK() OVER (PARTITION BY td.id, hr.HasRain ORDER BY td.Date)
        FROM 
            #TestData td
            CROSS APPLY ( VALUES (IIF(td.Rain_Today = 0, 0, 1)) ) hr (HasRain)
        )
SELECT 
    arg.id,
    BegDate = MIN(arg.Date),
    EndDate = MAX(arg.Date),
    WetPeriod = IIF(arg.HasRain = 1, 'Wet', 'Dry'),
    ConsecutiveDays = COUNT(1)
FROM 
    cte_AddRankGroup arg
GROUP BY 
    arg.id,
    arg.HasRain,
    arg.RankGroup
ORDER BY 
    arg.id,
    MIN(arg.Date);

Results...

id          BegDate     EndDate     WetPeriod ConsecutiveDays
----------- ----------  ----------  --------- ---------------
1           2014-12-24  2014-12-24  Wet       1
1           2014-12-25  2014-12-25  Dry       1
1           2014-12-26  2014-12-28  Wet       3
1           2014-12-29  2014-12-30  Dry       2
1           2014-12-31  2014-12-31  Wet       1
2           2015-01-01  2015-01-04  Wet       4
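
To see why subtracting one ranking from the other isolates each run, it can help to look at the intermediate columns. The following is only a sketch: the table name #RainSample and the column names SeqAll and SeqWithinType are made up for illustration, and ROW_NUMBER is used in place of DENSE_RANK, which gives the same numbers as long as each date appears once.

```sql
-- Hypothetical sample table so the sketch runs on its own
IF OBJECT_ID('tempdb..#RainSample', 'U') IS NOT NULL
    DROP TABLE #RainSample;

CREATE TABLE #RainSample (
    [date] DATE NOT NULL PRIMARY KEY,
    rain_today DECIMAL(9,2) NOT NULL
    );

INSERT #RainSample ([date], rain_today) VALUES
    ('20141224', 2.4), ('20141225', 0),   ('20141226', 8.7), ('20141227', 1.8),
    ('20141228', 0.3), ('20141229', 0),   ('20141230', 0),   ('20141231', 0.3);

SELECT
    rs.[date],
    rs.rain_today,
    hr.HasRain,
    -- position of the day among all days
    SeqAll = ROW_NUMBER() OVER (ORDER BY rs.[date]),
    -- position of the day among days of the same type (wet or dry)
    SeqWithinType = ROW_NUMBER() OVER (PARTITION BY hr.HasRain ORDER BY rs.[date]),
    -- both sequences advance by 1 inside a run, so their difference stays constant
    RankGroup = ROW_NUMBER() OVER (ORDER BY rs.[date]) -
        ROW_NUMBER() OVER (PARTITION BY hr.HasRain ORDER BY rs.[date])
FROM
    #RainSample rs
    CROSS APPLY ( VALUES (CASE WHEN rs.rain_today = 0 THEN 0 ELSE 1 END) ) hr (HasRain)
ORDER BY
    rs.[date];
```

On these eight rows the dry day 25 Dec and the wet run 26 to 28 Dec both get RankGroup = 1, which is why the outer query has to group by HasRain as well as RankGroup.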

Edit: a version of the code that uses CASE expressions instead of IIF...

--======================================

WITH 
    cte_AddRankGroup AS (
        SELECT 
            td.id,
            td.Date,
            td.Rain_Today,
            hr.HasRain,
            RankGroup = DENSE_RANK() OVER (PARTITION BY td.id ORDER BY td.Date) -
                DENSE_RANK() OVER (PARTITION BY td.id, hr.HasRain ORDER BY td.Date)
        FROM 
            #TestData td
            CROSS APPLY ( VALUES (CASE WHEN td.Rain_Today = 0 THEN 0 ELSE 1 END) ) hr (HasRain)
        )
SELECT top 10
    arg.id,
    BegDate = MIN(arg.Date),
    EndDate = MAX(arg.Date),
    WetPeriod = CASE WHEN arg.HasRain = 1 THEN 'Wet' ELSE 'Dry' END,
    ConsecutiveDays = COUNT(1)
FROM 
    cte_AddRankGroup arg
WHERE 
    arg.HasRain = 0 -- Top 10 Dry
    --arg.HasRain = 1 -- Top 10 Wet
GROUP BY 
    arg.id,
    arg.HasRain,
    arg.RankGroup
ORDER BY 
    ConsecutiveDays desc, MIN(arg.Date);

I modified the original script to produce the top 10 for each period type, which was my end goal (output is from the full data set):

id  BegDate         EndDate         WetPeriod   ConsecutiveDays
31  10 Jun 2017     26 Jun 2017     Dry         17
4   02 Mar 2015     14 Mar 2015     Dry         13
5   12 Apr 2015     24 Apr 2015     Dry         13
20  15 Jul 2016     26 Jul 2016     Dry         12
29  01 Apr 2017     11 Apr 2017     Dry         11
26  17 Jan 2017     27 Jan 2017     Dry         11
23  21 Oct 2016     31 Oct 2016     Dry         11
25  01 Dec 2016     09 Dec 2016     Dry         9
21  10 Aug 2016     18 Aug 2016     Dry         9
21  23 Aug 2016     31 Aug 2016     Dry         9
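
Because both rankings above are partitioned by id, a run is cut wherever the month changes (31 Dec 2014 is reported separately from 01 to 04 Jan 2015). If periods should be allowed to span months, as the question asks, one possible variation is to leave id out of the partitions entirely. This is a sketch under assumptions, not part of the original answer: #RainSample is a made-up table name, ROW_NUMBER stands in for DENSE_RANK, and it assumes one row per date with no missing dates.

```sql
-- Hypothetical sample table so the sketch runs on its own
IF OBJECT_ID('tempdb..#RainSample', 'U') IS NOT NULL
    DROP TABLE #RainSample;

CREATE TABLE #RainSample (
    [date] DATE NOT NULL PRIMARY KEY,
    rain_today DECIMAL(9,2) NOT NULL
    );

INSERT #RainSample ([date], rain_today) VALUES
    ('20141224', 2.4), ('20141225', 0),   ('20141226', 8.7),  ('20141227', 1.8),
    ('20141228', 0.3), ('20141229', 0),   ('20141230', 0),    ('20141231', 0.3),
    ('20150101', 0.3), ('20150102', 0.3), ('20150103', 18.3), ('20150104', 0.3);

WITH 
    cte_AddRankGroup AS (
        SELECT 
            rs.[date],
            hr.HasRain,
            -- no PARTITION BY id, so a run can continue across a month boundary
            RankGroup = ROW_NUMBER() OVER (ORDER BY rs.[date]) -
                ROW_NUMBER() OVER (PARTITION BY hr.HasRain ORDER BY rs.[date])
        FROM 
            #RainSample rs
            CROSS APPLY ( VALUES (CASE WHEN rs.rain_today = 0 THEN 0 ELSE 1 END) ) hr (HasRain)
        )
SELECT TOP (10)
    [Rank] = ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, MIN(arg.[date])),
    Start_Date = MIN(arg.[date]),
    End_Date = MAX(arg.[date]),
    WetPeriod = COUNT(*)
FROM 
    cte_AddRankGroup arg
WHERE 
    arg.HasRain = 1 -- use 0 for the longest dry periods
GROUP BY 
    arg.RankGroup
ORDER BY 
    [Rank];
```

On the twelve sample rows this ranks 31 Dec 2014 to 04 Jan 2015 first with 5 days and 26 to 28 Dec 2014 second with 3 days, matching the output requested in the question.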

Answer 1 (score: 0)

This problem can be solved recursively:

-- this variable is needed to stop the recursion
declare @numrows int=(select count(1) from #raindays)

-- add a row number to the table creating a new table as "tabseq"
;WITH tabseq as (select row_number() over(order by date) as rownum, * from #raindays),
-- apply recursion to tabseq keeping a toggle running totals of wet and dry periods
CTE as
(
select *, 
 (case when rain_today=0 then 1 else 0 end) as dry,
 (case when rain_today>0 then 1 else 0 end) as wet 
from tabseq where rownum=1
union all 
select s.*,
 (case when s.rain_today=0 then cte.dry+1 else 0 end) as dry,
 (case when s.rain_today>0 then cte.wet+1 else 0 end) as wet
from tabseq s
join cte on s.rownum=cte.rownum+1
where s.rownum<=@numrows
)
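select * from cte
option (maxrecursion 0) -- the default limit of 100 recursion levels is too low here: one level is used per row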

Once you have the table (cte) with the dry/wet accumulators, you can order it and select from it to meet your output requirements.
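
One possible way to do that final step, sketched here under assumptions rather than taken from the answer: a wet period ends on a row whose wet counter is positive while the next row's counter is back at 0, and its start date follows from the counter itself. The table #RainSample, the CTE name ends and the column next_wet are made up for illustration, and the sketch assumes one row per date with no missing dates.

```sql
-- Hypothetical sample table so the sketch runs on its own
IF OBJECT_ID('tempdb..#RainSample', 'U') IS NOT NULL
    DROP TABLE #RainSample;

CREATE TABLE #RainSample (
    [date] DATE NOT NULL PRIMARY KEY,
    rain_today DECIMAL(9,2) NOT NULL
    );

INSERT #RainSample ([date], rain_today) VALUES
    ('20141224', 2.4), ('20141225', 0),   ('20141226', 8.7),  ('20141227', 1.8),
    ('20141228', 0.3), ('20141229', 0),   ('20141230', 0),    ('20141231', 0.3),
    ('20150101', 0.3), ('20150102', 0.3), ('20150103', 18.3), ('20150104', 0.3);

WITH tabseq AS (
    SELECT ROW_NUMBER() OVER (ORDER BY [date]) AS rownum, [date], rain_today
    FROM #RainSample
),
cte AS (
    -- first day starts both counters
    SELECT rownum, [date], rain_today,
           CASE WHEN rain_today = 0 THEN 1 ELSE 0 END AS dry,
           CASE WHEN rain_today > 0 THEN 1 ELSE 0 END AS wet
    FROM tabseq
    WHERE rownum = 1
    UNION ALL
    -- each later day extends its own counter and resets the other one
    SELECT s.rownum, s.[date], s.rain_today,
           CASE WHEN s.rain_today = 0 THEN cte.dry + 1 ELSE 0 END,
           CASE WHEN s.rain_today > 0 THEN cte.wet + 1 ELSE 0 END
    FROM tabseq s
    JOIN cte ON s.rownum = cte.rownum + 1
),
ends AS (
    -- next_wet is 0 when the following day is dry or when there is no following day
    SELECT [date], wet, LEAD(wet, 1, 0) OVER (ORDER BY [date]) AS next_wet
    FROM cte
)
SELECT TOP (10)
    DATEADD(DAY, 1 - wet, [date]) AS Start_Date, -- step back (wet - 1) days from the last day
    [date] AS End_Date,
    wet AS WetDays
FROM ends
WHERE wet > 0 AND next_wet = 0
ORDER BY wet DESC, [date]
OPTION (MAXRECURSION 0);
```

Swapping wet for dry throughout the last two queries gives the longest dry periods instead.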

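Note that this assumes every consecutive day is present in the table. If there are gaps, then instead of adding +1 in the case statements, add a datediff on one side or the other, depending on how you want to treat the missing dates (as wet or as dry).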
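
As a rough illustration of that suggestion, the sketch below treats missing dates as dry: a dry day extends the dry counter by the full date difference, while a wet run only continues when the previous row is exactly one day earlier. The table #RainGaps and its four rows are invented for the example, and treating missing dates as dry is an assumption, not something the answer prescribes.

```sql
-- Hypothetical sample table; 27 and 28 Dec are deliberately missing
IF OBJECT_ID('tempdb..#RainGaps', 'U') IS NOT NULL
    DROP TABLE #RainGaps;

CREATE TABLE #RainGaps (
    [date] DATE NOT NULL PRIMARY KEY,
    rain_today DECIMAL(9,2) NOT NULL
    );

INSERT #RainGaps ([date], rain_today) VALUES
    ('20141225', 0), ('20141226', 0), ('20141229', 0), ('20141230', 1.2);

WITH tabseq AS (
    SELECT ROW_NUMBER() OVER (ORDER BY [date]) AS rownum, [date], rain_today
    FROM #RainGaps
),
cte AS (
    SELECT rownum, [date], rain_today,
           CASE WHEN rain_today = 0 THEN 1 ELSE 0 END AS dry,
           CASE WHEN rain_today > 0 THEN 1 ELSE 0 END AS wet
    FROM tabseq
    WHERE rownum = 1
    UNION ALL
    SELECT s.rownum, s.[date], s.rain_today,
           -- missing dates are assumed dry, so the run grows by the whole gap
           CASE WHEN s.rain_today = 0
                THEN cte.dry + DATEDIFF(DAY, cte.[date], s.[date])
                ELSE 0 END,
           -- a wet run continues only if the previous row is the previous calendar day
           CASE WHEN s.rain_today > 0
                THEN CASE WHEN DATEDIFF(DAY, cte.[date], s.[date]) = 1
                          THEN cte.wet + 1 ELSE 1 END
                ELSE 0 END
    FROM tabseq s
    JOIN cte ON s.rownum = cte.rownum + 1
)
SELECT [date], rain_today, dry, wet
FROM cte
ORDER BY [date]
OPTION (MAXRECURSION 0);
```

With these rows the dry counter reads 1, 2 and then 5 on 29 Dec, because the two missing days are counted as dry. Treating missing dates as wet would move the DATEDIFF to the wet counter instead.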