SQL Server: Creating a unique key across multiple rows for contiguous dates

Time: 2018-07-12 22:21:11

Tags: sql sql-server sql-server-2012

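I have a set of data recording when users start and stop a program. I need to determine the total run time of each instance. However, if the program is stopped and started again on the same day, I need to treat it as one continuous run.

The final result should be: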

User    Start      End        EventId
--------------------------------------
X       1/1/2016   1/1/2016   1
X       1/1/2016   1/5/2016   1
X       1/5/2016   1/10/2016  1
X       1/10/2016  1/13/2016  1
X       12/20/2016 12/26/2016 2
Y       01/01/2016 01/01/2016 3
Y       01/01/2016 01/02/2016 3
Y       01/04/2016 01/10/2016 4

Or:

User   EventId   DurationDays
------------------------------
  X       1         13
  X       2          7
  Y       3          2
  Y       4          7

But I think that if someone can help me group the rows correctly, I can work out the rest easily.

The table below shows what I have so far:

User    Start       End         LagStart    LagStop
---------------------------------------------------
X       1/1/2016    1/1/2016    StartGroup
X       1/1/2016    1/5/2016    Follow
X       1/5/2016    1/10/2016   Follow
X       1/10/2016   1/13/2016   Follow      StopGroup
X       12/20/2016  12/26/2016  StartGroup  StopGroup
X       12/26/2016  12/30/2016  StartGroup  StopGroup
Y       01/01/2016  01/01/2016  StartGroup
Y       01/01/2016  01/02/2016  StartGroup  StopGroup
Y       01/04/2016  01/10/2016  StartGroup  StopGroup

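I am stuck on creating a new unique ID that begins at each "StartGroup" and ends at each "StopGroup".

In case it helps to see how this data set was built, here is the query: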

select
    a.user_start_key as firstStartKey, 
    a.user_end_key as firstEndKey, 
    a.start_dt as firstStartDate,
    a.end_dt as firstDisch, 
    a.rnkkey as firstRank,
    nextRec.user_start_key as nextStart,
    nextRec.start_dt,
    nextRec.max_rank,
    case 
       when Lag(nextRec.max_rank, 1) over (order by a.rnkkey) is null 
          then 'StartGroup'
       when Lag(nextRec.max_rank, 1) over (order by a.rnkkey) in (a.rnkkey)  
          then 'Follow' 
       else 'Start' 
    end as LagStart,
    case 
       when lead(a.rnkkey, 1) over (order by a.rnkkey) is null 
          then 'StopGroup' 
       when lead(a.rnkkey, 1) over (order by a.rnkkey) <> nextRec.max_rank 
          then 'StopGroup' 
       else Null 
    end as Lagstop
from 
    #rnk1 a
inner join 
    (Select Distinct 
         user_start_key, 
         start_dt,
         --dschrg_dt,
         max(rnkkey) over (partition by user_start_key order by end_dt desc) max_rank
     from 
         #rnk1) nextRec on a.user_end_key = nextRec.user_start_key

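The user_[state]_key fields are just unique keys I built for each user_id by date, because there are multiple users and I need to group each of them separately.

Let me know if further clarification is needed. Thanks to anyone who can help.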

1 Answer:

Answer 0 (score: 0)

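Here is an example that uses a cumulative sum to calculate a rank.
Each row is flagged with 1 when its start date differs from the end date of the same user's previous row, and with 0 otherwise. The running total of those flags stays constant within a run of contiguous rows, so it can be used for grouping.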

-- Using a table variable for easy testing
declare @T table (id int identity(1,1) primary key, [User] varchar(8), startdate date, enddate date);
-- Sample data
insert into @T ([User], startdate, enddate) values
 ('X','2018-01-01','2018-01-01')
,('X','2018-01-01','2018-01-05')
,('X','2018-01-05','2018-01-10')
,('X','2018-01-10','2018-01-13')
,('X','2018-12-20','2018-12-26')
,('Y','2018-01-01','2018-01-01')
,('Y','2018-01-01','2018-01-02')
,('Y','2018-01-04','2018-01-10')
;

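select 
 [User], 
 cumm_sum_rank as EventId, 
 -- +1 so that both the first and the last day are counted
 datediff(day, min(startdate), max(enddate))+1 as DurationDays
 , min(startdate) as [Start]
 , max(enddate) as [End]
from
(
    select *, 
     -- running total of the flags: it increases by 1 at every row that starts a new group
     sum(startdate_diff_prev_enddate) over (order by [User], startdate, enddate) as cumm_sum_rank
    from
    (
        select [User], startdate, enddate, 
         -- 0 when the row starts on the day the same user's previous row ended, else 1
         -- (a user's first row has no previous row, so lag() is NULL and the flag is 1)
         iif(startdate = lag(enddate) over (partition by [User] order by startdate, enddate),0,1) as startdate_diff_prev_enddate
        from @T
    ) as q1
) as q2
group by [User], cumm_sum_rank;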
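
The query above returns one row per event. For the per-row layout that the question lists first (every original row labelled with its EventId), a minimal sketch along the same lines could keep the running total as a window column instead of grouping by it. The CTE name flagged and the column is_group_start are illustrative names chosen here, not part of the original answer; the sample data is copied from the answer.

```sql
-- Sketch only: same flag + running-total idea, but without the final GROUP BY.
declare @T table ([User] varchar(8), startdate date, enddate date);

insert into @T ([User], startdate, enddate) values
 ('X','2018-01-01','2018-01-01')
,('X','2018-01-01','2018-01-05')
,('X','2018-01-05','2018-01-10')
,('X','2018-01-10','2018-01-13')
,('X','2018-12-20','2018-12-26')
,('Y','2018-01-01','2018-01-01')
,('Y','2018-01-01','2018-01-02')
,('Y','2018-01-04','2018-01-10')
;

with flagged as
(
    select [User], startdate, enddate,
     -- 1 = this row opens a new group, 0 = it continues the previous row of the same user
     -- (hypothetical column name; a NULL lag() on a user's first row falls through to 1)
     case when startdate = lag(enddate) over (partition by [User] order by startdate, enddate)
          then 0 else 1 end as is_group_start
    from @T
)
select [User],
 startdate as [Start],
 enddate as [End],
 is_group_start,
 -- running total over all users, so EventId never repeats between users
 sum(is_group_start) over (order by [User], startdate, enddate) as EventId
from flagged
order by [User], startdate, enddate;

-- Result for the sample data:
-- User  Start       End         is_group_start  EventId
-- X     2018-01-01  2018-01-01  1               1
-- X     2018-01-01  2018-01-05  0               1
-- X     2018-01-05  2018-01-10  0               1
-- X     2018-01-10  2018-01-13  0               1
-- X     2018-12-20  2018-12-26  1               2
-- Y     2018-01-01  2018-01-01  1               3
-- Y     2018-01-01  2018-01-02  0               3
-- Y     2018-01-04  2018-01-10  1               4
```

Because the running total is ordered by [User] first and is not partitioned, the numbering continues from one user to the next, which matches the EventId column in the question's expected result. Both lag() and sum() over (order by ...) need SQL Server 2012 or later, which fits the sql-server-2012 tag.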