SQL query to return the most recent record per group, then by a different group

Asked: 2017-01-27 14:54:04

Tags: sql sql-server

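I have a query that returns the results I want, but it takes a long time to run. Does anyone know a better way to write it?

I need one row per Prgno / Prgdate group. First, I need to determine each employee's status by selecting that employee's most recent record. Then, if any employee is active, the whole group is active.

Selecting only the 'A' records is not enough, because a given employee's inactive record may be more recent than the active one.

Here is the query: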

SELECT X_Prgno,X_Prgdate,X_Status 
FROM (
    -- sq2 choose the 1st record when ordering by status, this will choose Active before Inactive
    SELECT X_Prgno,X_Prgdate,X_Status,
           ROW_NUMBER() OVER (PARTITION BY X_Prgno,X_Prgdate ORDER BY X_Status) AS rn
    FROM (
        -- sq1 choose the most recently updated record per empno, prgno, prgdate
        SELECT X_Empno,X_Prgno,X_Prgdate,X_Status,
               X_Upddate AS Updated_datetime,
               MAX(X_Upddate) OVER (PARTITION BY X_Empno,X_Prgno,X_Prgdate) AS Max_Updated_datetime
        FROM X_demo
        ) sq1
    WHERE Updated_datetime = Max_Updated_datetime) sq2
WHERE rn = 1

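I first select the most recent record per group of 3 columns (prgno, prgdate, employee). Then, partitioning by only 2 columns (prgno, prgdate), I pick an active record first if one exists.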
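
The second step can also be expressed as an aggregate instead of a second ROW_NUMBER pass. This is a minimal sketch, assuming the only status values are 'A' and 'I', so that MIN(X_Status) yields 'A' whenever at least one employee's latest row is active. The alias `latest` is just an illustrative name, and whether this is actually faster would need to be checked against the real table:

```sql
-- Sketch: latest row per employee, then collapse to one row per Prgno/Prgdate.
-- Assumes X_Status is only ever 'A' or 'I', so MIN() prefers 'A' over 'I'.
SELECT X_Prgno, X_Prgdate, MIN(X_Status) AS X_Status
FROM (
    SELECT X_Prgno, X_Prgdate, X_Status,
           ROW_NUMBER() OVER (PARTITION BY X_Prgno, X_Prgdate, X_Empno
                              ORDER BY X_Upddate DESC, X_Status) AS rn
    FROM X_demo
) latest
WHERE rn = 1
GROUP BY X_Prgno, X_Prgdate;
```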

Example: (I hope this helps; you can run the query above against this sample set)

create table X_demo(
X_Prgno char(6),
X_Prgdate char(8),
X_Empno int,
X_Status char(1),
X_Upddate datetime);
insert into X_demo values ('P43','20170124',1033,'A','2015-07-06 23:05:32.000');
insert into X_demo values ('P43','20170124',1033,'I','2015-07-06 23:05:07.000');
insert into X_demo values ('P43','20170124',1033,'I','2015-07-06 23:03:58.000');
insert into X_demo values ('P43','20170124',1034,'A','2015-06-03 09:29:46.000');
insert into X_demo values ('P43','20170124',1029,'I','2015-06-03 07:26:36.000');
insert into X_demo values ('P43','20170124',1033,'I','2015-06-02 14:52:53.000');
insert into X_demo values ('P43','20170124',1010,'I','2015-06-02 14:52:12.000');
insert into X_demo values ('P43','20170124',1029,'I','2015-08-29 13:27:35.000');
insert into X_demo values ('P43','20170124',1074,'I','2015-05-19 01:20:06.000');

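If we group by Prgno, Prgdate and Empno, we should get back 5 rows, the most recent row for each of the 5 employees. Grouping those again by Prgno and Prgdate should then return 'A' for the group.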
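
To inspect that intermediate step on its own, something like the following sketch could be run against the sample table (the alias `latest` and the final ORDER BY are only there for readability):

```sql
-- Sketch: the most recent row for each employee within a Prgno/Prgdate group.
SELECT X_Prgno, X_Prgdate, X_Empno, X_Status, X_Upddate
FROM (
    SELECT X_Prgno, X_Prgdate, X_Empno, X_Status, X_Upddate,
           ROW_NUMBER() OVER (PARTITION BY X_Prgno, X_Prgdate, X_Empno
                              ORDER BY X_Upddate DESC) AS rn
    FROM X_demo
) latest
WHERE rn = 1
ORDER BY X_Empno;
```

On the sample rows above this should return one row each for employees 1010, 1029, 1033, 1034 and 1074, with 1033 and 1034 showing 'A' and the others 'I'.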

Desired result:

X_Prgno X_Prgdate   X_Status
P43     20170124    A

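Thanks for your help.

If I insert 2 additional records, more recent ones that make the two active employees inactive, then the result for the group should be inactive.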

insert into X_demo values ('P43','20170124',1033,'I','2017-01-27 09:30:00.000');
insert into X_demo values ('P43','20170124',1034,'I','2017-01-27 09:30:00.000');

Result:

X_Prgno X_Prgdate   X_Status
P43     20170124    I

Update - 2017-01-30

I changed the MAX OVER PARTITION part of the subquery to use the ROW_NUMBER function. This cut a few seconds off the query's run time, but it still runs too long.

SELECT X_Prgno,X_Prgdate,X_Status 
FROM (
    -- sq2 choose the 1st record when ordering by status, this will choose Active before Inactive
    SELECT X_Prgno,X_Prgdate,X_Status,
           ROW_NUMBER() OVER (PARTITION BY X_Prgno,X_Prgdate ORDER BY X_Status) AS sq2_rn
    FROM (
        -- sq1 choose the most recently updated record per prgno, prgdate, empno
        SELECT X_Prgno,X_Prgdate,X_Empno,X_Status,X_upddate,
               ROW_NUMBER() OVER (PARTITION BY X_Prgno,X_Prgdate,X_Empno
                        ORDER BY X_Upddate DESC,X_Status) AS sq1_rn
        FROM X_demo) sq1
    WHERE sq1_rn = 1) sq2
WHERE sq2_rn = 1

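1 answer:

Answer 0 (score: 1)

You seem to want one record per date/prgno for the employees, with preference given to active records.

Understanding what you are trying to do helps simplify the query.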

SELECT x.*
FROM (SELECT x.*,
             ROW_NUMBER() OVER (PARTITION BY X_Prgno, X_PrgDate
                                ORDER BY X_Upddate DESC, X_Status
                               ) as seqnum
      FROM X_demo x
     ) x
WHERE seqnum = 1;

For this query, an index on (X_Prgno, X_Prgdate, X_Status, X_Upddate) can help performance.
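
As a sketch, such an index could be created as follows. The index name is hypothetical, and the status column is assumed to be X_Status, since that is the only status column in the sample table. The key order follows the suggestion above, so it is worth comparing the execution plan before and after:

```sql
-- Hypothetical index name; key columns in the order suggested in the answer.
CREATE INDEX IX_X_demo_Prgno_Prgdate_Status_Upddate
    ON X_demo (X_Prgno, X_Prgdate, X_Status, X_Upddate);
```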