SQL query with a self-join

Asked: 2011-08-05 04:37:52

Tags: sql-server-2005 self-join

Given a table (TableA) containing the following data:

Id    Date    Status    RecordId
1    01/06/11    2      REC001
2    01/06/11    2      REC002
3    01/06/11    2      REC003
4    01/07/11    1      REC001

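How can I return all records with a status of 2, except those for a given RecordId where the status-2 record is followed by a record with a status of 1 (and no later record has a status of 2)?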

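So, for example, the query should return REC002 and REC003, because REC001 had a status of 2 in the past but was later superseded by the record with Id 4, which has a status of 1. If another status-2 record were added for REC001 at some later point, REC001 should appear in the result set again (assuming no subsequent record has a status of 1).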

My feeble attempt at this is:

DECLARE @TableA TABLE
(
    Id INT,
    Dt DATETIME,
    Stat INT,
    RecId VARCHAR(6)
)

INSERT INTO @TableA 

SELECT   1,    DATEADD(day, -5, current_timestamp),  2,   'REC001'
UNION
SELECT   2,    DATEADD(day, -4, current_timestamp),  2,   'REC002'
UNION
SELECT   3,    DATEADD(day, -3, current_timestamp),  2,   'REC003'
UNION
SELECT   4,    DATEADD(day, -2, current_timestamp),  1,   'REC001'

   SELECT * 
     FROM @TableA t1
LEFT JOIN @TableA t2 ON t1.RecId = t2.RecId 
    WHERE t1.Stat = 2 
      AND (t1.Dt >= t2.Dt 
      AND t2.Stat != 1)

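This works after a fashion, but it also returns rows where t1.Id = t2.Id. I know I could exclude those in my WHERE clause, but if I add more records to the table it fails again. For example: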

INSERT INTO @TableA 
SELECT   1,    DATEADD(day, -15, current_timestamp),  2,   'REC004'
UNION
SELECT   2,    DATEADD(day, -14, current_timestamp),  2,   'REC002'
UNION
SELECT   3,    DATEADD(day, -13, current_timestamp),  1,   'REC003'
UNION
SELECT   4,    DATEADD(day, -12, current_timestamp),  1,   'REC001'
UNION
SELECT   11,    DATEADD(day, -5, current_timestamp),  2,   'REC004'
UNION
SELECT   21,    DATEADD(day, -4, current_timestamp),  2,   'REC002'
UNION
SELECT   31,    DATEADD(day, -3, current_timestamp),  1,   'REC003'
UNION
SELECT   41,    DATEADD(day, -2, current_timestamp),  1,   'REC001'

Any ideas are appreciated.

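Edit: I tried both answers, and while neither gave me exactly what I needed, they certainly pointed me in the right direction. Using the answers given, I came up with the following, which seems to do what I want: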

;WITH lastSuccess(recid, dt) AS (
    select recid, max(dt) from @tableA
    where stat = 1
    group by recid
),
lastFailure(recid, dt) AS (
    select recid, max(dt) from @tableA
    where stat = 2
    group by recid
)
select a.* from @tablea a
-- Limit results to those that include a failure
INNER JOIN lastFailure lf ON lf.recid = a.recid AND lf.dt = a.dt
-- If the recid also has a success, show this along with its latest success date
LEFT JOIN lastSuccess ls ON ls.recid = lf.recid 
-- Limit records to where last failure is > last success or where there is no last success.
WHERE (lf.dt > ls.dt OR ls.dt IS NULL)

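The only downside I see here is that if two records have exactly the same timestamp, the RecId will appear twice in the result set. For example, if the row with Id 21 were duplicated as Id 22, it would appear twice. This is not really a problem because, in practice, the timestamps will always be unique.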
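If tied timestamps ever did become a concern, one possible way around the duplicate would be to rank the rows per RecId and break ties on Id. The following is only a sketch, not a tested replacement for the query above. It assumes that a higher Id means a later row when two timestamps are equal, which the question does not state, and the fixed dates are made-up sample values.

```sql
-- Sketch only: assumes a higher Id means a later row when timestamps tie.
DECLARE @TableA TABLE (Id INT, Dt DATETIME, Stat INT, RecId VARCHAR(6));

-- Hypothetical sample data: Ids 21 and 22 share the same timestamp.
INSERT INTO @TableA
SELECT 21, '20110601', 2, 'REC002'
UNION ALL
SELECT 22, '20110601', 2, 'REC002'
UNION ALL
SELECT  1, '20110531', 2, 'REC001'
UNION ALL
SELECT 41, '20110603', 1, 'REC001';

WITH ranked AS (
    SELECT Id, Dt, Stat, RecId,
           -- Latest row per RecId gets rn = 1; Id breaks ties on Dt.
           ROW_NUMBER() OVER (PARTITION BY RecId
                              ORDER BY Dt DESC, Id DESC) AS rn
    FROM @TableA
)
SELECT Id, Dt, Stat, RecId
FROM ranked
WHERE rn = 1      -- only the most recent row for each RecId
  AND Stat = 2;   -- keep it only if that row is a status 2
```

With this sample data, REC002 should come back once (the row with Id 22), and REC001 should be excluded because its most recent row has a status of 1.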

2 Answers:

Answer 0: (score: 1)

Is this along the lines of what you're after?

declare @tableB table (recid varchar(6), dt datetime)
insert @tableB
select recid, max(dt) from @tableA
where stat = 1
group by recid

This creates a table containing every RecId that has a status-1 row, together with the latest date of such a row.

select * from @tableA a
left join @tableB b on a.recid = b.recid
where a.stat = 2 and a.dt > isnull(b.dt, '1753-01-01')

This shows all status-2 records from A except those with a later (or equally dated) entry in B.
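The same filter can probably also be written without the intermediate table variable, using NOT EXISTS. This is a sketch under the same assumptions as the query above: it keeps every status-2 row that has no status-1 row for the same RecId dated at or after it. The fixed dates are illustrative stand-ins for the question's sample data.

```sql
-- Sketch only: the question's four sample rows, with made-up fixed dates.
DECLARE @TableA TABLE (Id INT, Dt DATETIME, Stat INT, RecId VARCHAR(6));

INSERT INTO @TableA
SELECT 1, '20110601', 2, 'REC001'
UNION ALL
SELECT 2, '20110601', 2, 'REC002'
UNION ALL
SELECT 3, '20110601', 2, 'REC003'
UNION ALL
SELECT 4, '20110701', 1, 'REC001';

SELECT a.Id, a.Dt, a.Stat, a.RecId
FROM @TableA a
WHERE a.Stat = 2
  AND NOT EXISTS (
        -- Any status-1 row for the same RecId at or after this row
        -- supersedes it.
        SELECT 1
        FROM @TableA b
        WHERE b.RecId = a.RecId
          AND b.Stat = 1
          AND b.Dt >= a.Dt
      );
```

For these four rows this should return the rows for REC002 and REC003. Like the LEFT JOIN version, it returns every qualifying status-2 row for a RecId, not just the most recent one.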

Answer 1: (score: 1)

WITH ranked AS (
  SELECT
    *,
    rn = ROW_NUMBER() OVER (PARTITION BY RecId ORDER BY Dt DESC)
  FROM TableA
)
SELECT
  r1.Id,
  r1.Dt,
  r1.Stat,
  r1.RecId
FROM ranked r1
  INNER JOIN ranked r2 ON r1.RecId = r2.RecId AND r2.rn = 1
WHERE r1.Stat = 2

UPDATE (after the question was updated):

WITH ranked AS (
  SELECT
    *,
    rn = ROW_NUMBER() OVER (PARTITION BY RecId ORDER BY Dt DESC)
  FROM TableA
)
SELECT
  Id,
  Dt,
  Stat,
  RecId
FROM ranked
WHERE Stat = 2 AND rn = 1