SQL 2012: comparing dates across multiple rows

Time: 2016-10-12 14:53:44

Tags: sql sql-server

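I have a problem where I need to compare dates across several rows. The requirement is that the data be grouped by Region/Area, combining the lowest StartDate with the highest EndDate, unless there is a gap of more than 1 day between the previous row's EndDate and the next row's StartDate.

StartDate will always be the first day of a month, and EndDate will always be the last day of a month.

Given a simplified table: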

Region | Area |   StartDate   |   EndDate
-------|------|---------------|-------------
   A   |   1  |   01/01/2016  |  03/31/2016
   A   |   1  |   04/01/2016  |  05/31/2016
   A   |   1  |   07/01/2016  |  09/30/2016
   A   |   1  |   10/01/2016  |  01/31/2017
   A   |   1  |   02/01/2017  |  12/31/2017
   B   |   2  |   01/01/2016  |  04/30/2016
   B   |   2  |   05/01/2016  |  09/30/2016
   A   |   4  |   01/01/2016  |  05/31/2016
   A   |   4  |   06/01/2016  |  12/31/2016

I need the result to look like this:

Region | Area |   StartDate  |  EndDate  
-------|------|--------------|-----------
   A   |   1  |  01/01/2016  |  05/31/2016
   A   |   1  |  07/01/2016  |  12/31/2017
   B   |   2  |  01/01/2016  |  09/30/2016
   A   |   4  |  01/01/2016  |  12/31/2016

I have tried using GROUP BY with MIN and MAX dates, but I can't seem to figure out the logic for it.
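To illustrate why that falls short, here is a minimal sketch of what such an attempt might look like. It is a guess at the query, not the one actually run, and it uses inline copies of only the Region A / Area 1 rows from the table above. The derived-table alias `s`, the DATE casts, and the ISO date literals are assumptions made for the example.

```sql
-- Sketch only: a plain GROUP BY over the Region A / Area 1 sample rows.
SELECT   Region
        ,Area
        ,MIN(StartDate) AS StartDate
        ,MAX(EndDate)   AS EndDate
FROM    (VALUES
            ('A', 1, CAST('20160101' AS DATE), CAST('20160331' AS DATE)),
            ('A', 1, CAST('20160401' AS DATE), CAST('20160531' AS DATE)),
            ('A', 1, CAST('20160701' AS DATE), CAST('20160930' AS DATE)),
            ('A', 1, CAST('20161001' AS DATE), CAST('20170131' AS DATE)),
            ('A', 1, CAST('20170201' AS DATE), CAST('20171231' AS DATE))
        ) AS s (Region, Area, StartDate, EndDate)
GROUP BY Region, Area;
-- Returns a single row: A | 1 | 2016-01-01 | 2017-12-31
```

Grouping on Region and Area alone merges every row into one range, so the June 2016 gap between 05/31/2016 and 07/01/2016 is lost.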

Any ideas or suggestions are greatly appreciated.

1 answer:

Answer 0: (score: 2)

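This looks like a data islands problem (often called "gaps and islands"). You can solve it with window functions. LAG, introduced in SQL Server 2012, lets you check whether the gap between the previous record's EndDate and the current record's StartDate is greater than one day. A running SUM with an OVER clause on that flag then generates a grouping ID for each data island.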

DECLARE @SourceData TABLE
(
     Region         NVARCHAR(10)
    ,Area           INT
    ,StartDate      DATETIME
    ,EndDate        DATETIME
);

INSERT INTO @SourceData
VALUES
('A', 1, '01/01/2016', '03/31/2016'),
('A', 1, '04/01/2016', '05/31/2016'),
('A', 1, '07/01/2016', '09/30/2016'),
('A', 1, '10/01/2016', '01/31/2017'),
('A', 1, '02/01/2017', '12/31/2017'),
('B', 2, '01/01/2016', '04/30/2016'),
('B', 2, '05/01/2016', '09/30/2016'),
('A', 4, '01/01/2016', '05/31/2016'),
('A', 4, '06/01/2016', '12/31/2016');

;WITH CTE_DataIslands  -- First CTE: flags the row that starts each new data island
AS
(
    SELECT           Region
                    ,Area
                    ,StartDate
                    ,EndDate
                    ,(
                        CASE
                            WHEN DATEADD(DAY, 1, LAG(EndDate, 1) OVER  (PARTITION BY Region, Area ORDER BY StartDate ASC)) < (StartDate) THEN 1 -- If the previous record's end date + 1 day is earlier than the current record's start date, there is a gap, so this row starts a new data island. For the first row of a partition LAG returns NULL, so the ELSE branch applies.
                            ELSE 0
                        END
                     ) AS [IsNewDataIsland]
    FROM            @SourceData 
)
, CTE_GenerateGroupingID
AS
(
    SELECT  Region
            ,Area
            ,StartDate
            ,EndDate
            ,SUM([IsNewDataIsland]) OVER (PARTITION BY Region, Area ORDER BY StartDate ASC ROWS UNBOUNDED PRECEDING) AS GroupingID  -- Running total of IsNewDataIsland; rows in the same island share the same value, which gives us an ID to group on
    FROM    CTE_DataIslands
)
SELECT      Region  
            ,Area
            ,MIN(StartDate) AS StartDate
            ,MAX(EndDate) AS EndDate
FROM        CTE_GenerateGroupingID
GROUP BY    Region, Area, GroupingID
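
To see how the two CTEs above arrive at the grouping, it can help to look at the intermediate columns before the final GROUP BY. The following is a minimal sketch, not part of the original answer. It inlines only the Region A / Area 1 rows, and the names `SampleRows`, `Flagged`, and `PrevEndDate`, the DATE casts, and the ISO date literals are assumptions made for illustration.

```sql
-- Sketch only: shows the per-row flag and running total for Region A / Area 1.
;WITH SampleRows AS
(
    SELECT  Region, Area, StartDate, EndDate
    FROM    (VALUES
                ('A', 1, CAST('20160101' AS DATE), CAST('20160331' AS DATE)),
                ('A', 1, CAST('20160401' AS DATE), CAST('20160531' AS DATE)),
                ('A', 1, CAST('20160701' AS DATE), CAST('20160930' AS DATE)),
                ('A', 1, CAST('20161001' AS DATE), CAST('20170131' AS DATE)),
                ('A', 1, CAST('20170201' AS DATE), CAST('20171231' AS DATE))
            ) AS v (Region, Area, StartDate, EndDate)
)
, Flagged AS
(
    -- Expose the previous row's EndDate so the comparison is visible.
    SELECT  Region, Area, StartDate, EndDate
            ,LAG(EndDate) OVER (PARTITION BY Region, Area ORDER BY StartDate) AS PrevEndDate
    FROM    SampleRows
)
SELECT      Region, Area, StartDate, EndDate, PrevEndDate
            ,CASE WHEN DATEADD(DAY, 1, PrevEndDate) < StartDate THEN 1 ELSE 0 END AS IsNewDataIsland
            ,SUM(CASE WHEN DATEADD(DAY, 1, PrevEndDate) < StartDate THEN 1 ELSE 0 END)
                OVER (PARTITION BY Region, Area ORDER BY StartDate ROWS UNBOUNDED PRECEDING) AS GroupingID
FROM        Flagged
ORDER BY    Region, Area, StartDate;
-- IsNewDataIsland per row: 0, 0, 1, 0, 0
-- GroupingID per row:      0, 0, 1, 1, 1
```

Only the row starting 07/01/2016 is flagged, because the previous EndDate plus one day (06/01/2016) falls before it. The running total therefore splits the five rows into island 0 (01/01/2016 to 05/31/2016) and island 1 (07/01/2016 to 12/31/2017), matching the expected result for Region A / Area 1.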