Aggregating date range groups

Time: 2014-06-12 09:47:57

Tags: sql

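Suppose a simple query such as:

SELECT name, role, placeOfWork, startDate, endDate FROM SampleTable

It shows each employee's name and the role they held at a place of work from the start date to the end date. When the job assignment is current, the end date is NULL.

I have a result set from such a query, and a sample looks like this: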

Jack      Cook       Jimmy's Burger Joint    01-01-2010     21-01-2010
Jack      Cook       Jimmy's Burger Joint    21-01-2010     31-03-2010
Jack      Cook       Jimmy's Burger Joint    31-03-2010     24-12-2010
Ronald    Marketing  McDonald's              01-01-2010     22-01-2010
Ronald    Marketing  McDonald's              22-01-2010     06-06-2010
Ronald    Marketing  McDonald's              06-06-2010     NULL
Jack      Cosmonaut  NASA                    01-01-2011     NULL
...

I would like to merge the job assignments into single consolidated entries, for example:

Jack      Cook       Jimmy's Burger Joint    01-01-2010     24-12-2010
Ronald    Marketing  McDonald's              01-01-2010     NULL
Jack      Cosmonaut  NASA                    01-01-2011     NULL
...

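I would like to avoid temporary tables as far as possible, because I need to run the query from various places. I have not been able to do it with inner joins or GROUP BY.

3 answers:

Answer 0 (score: 2)

My approach would be to first expand the ranges into rows (using a numbers or calendar table), so that this row, for example: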

 StartDate  |   EndDate 
------------+------------
 2010-01-01 | 2010-01-03

becomes

    Date    
------------
 2010-01-01
 2010-01-02
 2010-01-03

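Since many date functions are DBMS-specific, I am using SQL Server syntax, but it should be easy to adapt to Sybase (which I am not familiar with at all). The following expands a simple table of start and end dates into all the dates within each range: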

-- Numbers is assumed to hold consecutive integers starting at 0
SELECT  DATEADD(DAY, n.Number, t.StartDate) AS Date
FROM    T t
        INNER JOIN Numbers n
            ON  DATEADD(DAY, n.Number, t.StartDate) <= t.EndDate
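To try the expansion step without a SQL Server instance, here is a minimal sketch that runs the same join from Python's built-in sqlite3 module. The table names T and Numbers follow the answer; the helper name expand_ranges, the max_days limit, and the use of SQLite's date() modifier in place of DATEADD are assumptions made for illustration only.

```python
import sqlite3

def expand_ranges(rows, max_days=366):
    """Expand (start, end) ISO-date pairs into one row per day, end inclusive."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE T (StartDate TEXT, EndDate TEXT)")
    con.executemany("INSERT INTO T VALUES (?, ?)", rows)
    # Hypothetical numbers table: consecutive integers 0 .. max_days - 1
    con.execute("CREATE TABLE Numbers (Number INTEGER PRIMARY KEY)")
    con.executemany("INSERT INTO Numbers VALUES (?)",
                    [(i,) for i in range(max_days)])
    sql = """
        SELECT date(t.StartDate, '+' || n.Number || ' days') AS Date
        FROM   T t
               INNER JOIN Numbers n
                   ON date(t.StartDate, '+' || n.Number || ' days') <= t.EndDate
        ORDER BY Date
    """
    result = [r[0] for r in con.execute(sql)]
    con.close()
    return result

dates = expand_ranges([("2010-01-01", "2010-01-03")])
print(dates)
```

ISO-formatted text dates compare correctly as strings in SQLite, which is why the join condition can stay a plain <= comparison in this sketch.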

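You now have a set that can be solved with gaps-and-islands logic. After expanding your ranges you need to identify the gaps and islands; for this I use DENSE_RANK, which is supported in both Sybase and SQL Server. It produces the GroupingSet column below. The last step is to aggregate by island: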

WITH Expanded AS
(   SELECT  Name,
            Job,
            Company,
            StartDate,
            DATEADD(DAY, n.Number, t.StartDate) AS Date,
            CASE WHEN EndDate IS NULL THEN 1 ELSE 0 END AS EndDateIsNull
    FROM    T t
            INNER JOIN Numbers n
                ON  DATEADD(DAY, n.Number, t.StartDate) <= ISNULL(t.EndDate, t.StartDate)
), Grouped AS
(   SELECT  Name,
            Job,
            Company,
            Date,
            DATEADD(DAY, -DENSE_RANK() OVER(PARTITION BY Name, Job, Company ORDER BY Date), Date) AS GroupingSet,
            EndDateIsNull
    FROM    Expanded
)
SELECT  Name, 
        Job, 
        Company,
        MIN(Date) AS StartDate, 
        CASE WHEN MAX(EndDateIsNull) = 0 THEN MAX(Date) END AS EndDate
FROM    Grouped
GROUP BY Name, Job, Company, GroupingSet
ORDER BY Name, Job, StartDate;

Example on SQL Fiddle
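The reason the date-minus-DENSE_RANK expression works as a grouping key can be seen on a handful of dates. The sketch below is an illustration under stated assumptions, not the answer's own code: it uses SQLite (3.25 or later, for window functions) through Python's sqlite3 module, and the sample dates are made up. Consecutive dates, including a duplicate such as the shared boundary day of two adjacent assignments, map to the same GroupingSet value, while a gap shifts it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Expanded (Date TEXT)")
# Made-up dates: one duplicate (2010-01-02) and one gap (01-03 -> 01-10)
sample = ["2010-01-01", "2010-01-02", "2010-01-02",
          "2010-01-03", "2010-01-10", "2010-01-11"]
con.executemany("INSERT INTO Expanded VALUES (?)", [(d,) for d in sample])

islands = con.execute("""
    WITH Grouped AS (
        SELECT Date,
               -- date minus its dense rank stays constant within an island
               date(Date, '-' || DENSE_RANK() OVER (ORDER BY Date) || ' days')
                   AS GroupingSet
        FROM   Expanded
    )
    SELECT MIN(Date), MAX(Date)
    FROM   Grouped
    GROUP BY GroupingSet
    ORDER BY MIN(Date)
""").fetchall()
con.close()

print(islands)
```

DENSE_RANK rather than ROW_NUMBER matters here: duplicates receive the same rank, so a repeated boundary date does not break an island apart.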

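Answer 1 (score: 2)

I would approach this with simple logic. An assignment starts a new period when no previous assignment leads into it. We can then give each assignment a value: the number of period starts up to and including it. This is easiest with lag() and a cumulative sum. Here is a version without those: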

with stp as (
      select name, role, placeOfWork, startDate, endDate,
             (case when exists (select 1
                                from SampleTable st2
                                where st2.name = st.name and st2.role = st.role and
                                      st2.placeOfWork = st.placeOfWork and
                                      st2.endDate = st.StartDate
                               )
                   then 0
                   else 1
              end) as PeriodStart
      from SampleTable st 
     ),
     stpg as (
      select stp.*,
             (select sum(PeriodStart)
              from stp stp2
              where stp2.name = stp.name and stp2.role = stp.role and
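                    stp2.placeOfWork = stp.placeOfWork and
                    stp2.StartDate <= stp.StartDate
             ) as grp
      from stp
     )
select name, role, placeOfWork, min(StartDate) as StartDate,
       -- keep endDate NULL while any merged assignment is still open
       (case when count(endDate) = count(*) then max(endDate) end) as endDate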
from stpg
group by grp, name, role, placeOfWork;
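For comparison, the lag()-and-cumulative-sum variant mentioned in this answer could look roughly like the sketch below. It is an assumption-laden illustration rather than the answerer's code: it runs on SQLite (3.25 or later) through Python's sqlite3 module, the sample dates from the question are rewritten in ISO format so that they sort as text, and the output aliases periodStart and periodEnd are invented for the example.

```python
import sqlite3

# Sample rows from the question, with dates converted to ISO format
ROWS = [
    ("Jack", "Cook", "Jimmy's Burger Joint", "2010-01-01", "2010-01-21"),
    ("Jack", "Cook", "Jimmy's Burger Joint", "2010-01-21", "2010-03-31"),
    ("Jack", "Cook", "Jimmy's Burger Joint", "2010-03-31", "2010-12-24"),
    ("Ronald", "Marketing", "McDonald's", "2010-01-01", "2010-01-22"),
    ("Ronald", "Marketing", "McDonald's", "2010-01-22", "2010-06-06"),
    ("Ronald", "Marketing", "McDonald's", "2010-06-06", None),
    ("Jack", "Cosmonaut", "NASA", "2011-01-01", None),
]

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE SampleTable
               (name TEXT, role TEXT, placeOfWork TEXT,
                startDate TEXT, endDate TEXT)""")
con.executemany("INSERT INTO SampleTable VALUES (?, ?, ?, ?, ?)", ROWS)

merged = con.execute("""
    WITH stp AS (
        SELECT st.*,
               -- 1 when the previous row does not end where this one starts
               CASE WHEN LAG(endDate) OVER (PARTITION BY name, role, placeOfWork
                                            ORDER BY startDate) = startDate
                    THEN 0 ELSE 1 END AS PeriodStart
        FROM   SampleTable st
    ),
    stpg AS (
        SELECT stp.*,
               -- running count of period starts identifies each period
               SUM(PeriodStart) OVER (PARTITION BY name, role, placeOfWork
                                      ORDER BY startDate) AS grp
        FROM   stp
    )
    SELECT name, role, placeOfWork,
           MIN(startDate) AS periodStart,
           -- stay NULL while any merged assignment is still open
           CASE WHEN COUNT(endDate) = COUNT(*) THEN MAX(endDate) END AS periodEnd
    FROM   stpg
    GROUP BY name, role, placeOfWork, grp
    ORDER BY periodStart, name
""").fetchall()
con.close()

for row in merged:
    print(row)
```

Unlike the EXISTS version, LAG only looks at the immediately preceding row within each name, role and placeOfWork partition, so this sketch assumes the assignments in a partition do not overlap one another.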

Answer 2 (score: 0)

How about this:

SELECT name, role, placeOfWork,
       MIN (startDate),
       CASE WHEN COUNT(endDate) = COUNT(startDate) THEN MAX(endDate) ELSE NULL END
FROM SampleTable
GROUP BY name, role, placeOfWork;
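A quick way to see what this query does, and where it may fall short, is to run it on the question's sample and then on the same sample plus a later, non-contiguous stint. The sketch below uses Python's sqlite3 module; the ISO-formatted dates, the run helper, the added ORDER BY, and the extra 2012 row are assumptions made for the illustration.

```python
import sqlite3

ROWS = [
    ("Jack", "Cook", "Jimmy's Burger Joint", "2010-01-01", "2010-01-21"),
    ("Jack", "Cook", "Jimmy's Burger Joint", "2010-01-21", "2010-03-31"),
    ("Jack", "Cook", "Jimmy's Burger Joint", "2010-03-31", "2010-12-24"),
    ("Ronald", "Marketing", "McDonald's", "2010-01-01", "2010-01-22"),
    ("Ronald", "Marketing", "McDonald's", "2010-01-22", "2010-06-06"),
    ("Ronald", "Marketing", "McDonald's", "2010-06-06", None),
    ("Jack", "Cosmonaut", "NASA", "2011-01-01", None),
]

QUERY = """
    SELECT name, role, placeOfWork,
           MIN(startDate),
           CASE WHEN COUNT(endDate) = COUNT(startDate)
                THEN MAX(endDate) ELSE NULL END
    FROM   SampleTable
    GROUP BY name, role, placeOfWork
    ORDER BY MIN(startDate), name
"""

def run(rows):
    """Load rows into an in-memory SampleTable and run the GROUP BY query."""
    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE SampleTable
                   (name TEXT, role TEXT, placeOfWork TEXT,
                    startDate TEXT, endDate TEXT)""")
    con.executemany("INSERT INTO SampleTable VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result

contiguous = run(ROWS)

# Hypothetical extra stint: Jack returns to the same job after a one-year gap
with_gap = run(ROWS + [("Jack", "Cook", "Jimmy's Burger Joint",
                        "2012-01-01", "2012-06-30")])

print(contiguous)
print(with_gap)
```

On the question's sample the query yields the desired three rows. With the extra stint, both of Jack's cook periods collapse into a single row, so this approach fits only when each name, role and placeOfWork combination has one contiguous run of assignments.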