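How to select one record per group based on the sums of other columns

Time: 2015-01-08 21:29:20

Tags: sql-server

I have a set of SQL Server data containing records for one or more groups, along with some Boolean flags. I want to select one record from each group such that, across the selected records, the sum of each flag field is either 0 or 1. If more than one result is possible, I want to select the record with the smallest RecordNo value from each group.

Here is a sample data set: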

GroupCode RecordNo Flag1 Flag2 Flag3 Flag4  
   A         1       1     1     0     0  
   A         3       0     0     1     1  
   B         1       1     0     0     0  
   B         2       0     1     0     0  
   B         3       0     0     1     0  
   B         4       0     0     0     1  
   C         1       1     0     0     0  
   C         2       0     1     0     0  
   C         3       0     0     1     0  
   C         4       0     0     0     1  
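
To make the sample data easy to reproduce, here is a minimal setup sketch. The temp table name `#Records` and the column types are assumptions (neither is given in the question); the flags are declared as int so that they can be summed directly.

```sql
-- Hypothetical temp table; the name and column types are assumed for illustration.
CREATE TABLE #Records
(
    GroupCode char(1) NOT NULL,
    RecordNo  int     NOT NULL,
    Flag1     int     NOT NULL,
    Flag2     int     NOT NULL,
    Flag3     int     NOT NULL,
    Flag4     int     NOT NULL,
    PRIMARY KEY (GroupCode, RecordNo)
);

-- Load the ten sample rows from the question.
INSERT INTO #Records (GroupCode, RecordNo, Flag1, Flag2, Flag3, Flag4)
VALUES ('A', 1, 1, 1, 0, 0),
       ('A', 3, 0, 0, 1, 1),
       ('B', 1, 1, 0, 0, 0),
       ('B', 2, 0, 1, 0, 0),
       ('B', 3, 0, 0, 1, 0),
       ('B', 4, 0, 0, 0, 1),
       ('C', 1, 1, 0, 0, 0),
       ('C', 2, 0, 1, 0, 0),
       ('C', 3, 0, 0, 1, 0),
       ('C', 4, 0, 0, 0, 1);

-- Show the loaded rows in the same order as the listing above.
SELECT GroupCode, RecordNo, Flag1, Flag2, Flag3, Flag4
FROM #Records
ORDER BY GroupCode, RecordNo;
```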

My expected result set is:

GroupCode RecordNo Flag1 Flag2 Flag3 Flag4  
   A         1       1     1     0     0  
   B         3       0     0     1     0  
   C         4       0     0     0     1  

(the sum of each of the 4 flag fields is 1)
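
The selection rule can be checked against the sample data with a brute-force query. This is only a sketch and it assumes exactly the three groups A, B and C: it tries every combination of one record per group, keeps the combinations in which no flag column sums to more than 1, and returns the one with the smallest RecordNo values, compared group by group. The `Records` CTE and the output column names are made up for illustration.

```sql
WITH Records (GroupCode, RecordNo, Flag1, Flag2, Flag3, Flag4) AS
(
    -- Inline copy of the sample data from the question.
    SELECT v.GroupCode, v.RecordNo, v.Flag1, v.Flag2, v.Flag3, v.Flag4
    FROM (VALUES ('A', 1, 1, 1, 0, 0),
                 ('A', 3, 0, 0, 1, 1),
                 ('B', 1, 1, 0, 0, 0),
                 ('B', 2, 0, 1, 0, 0),
                 ('B', 3, 0, 0, 1, 0),
                 ('B', 4, 0, 0, 0, 1),
                 ('C', 1, 1, 0, 0, 0),
                 ('C', 2, 0, 1, 0, 0),
                 ('C', 3, 0, 0, 1, 0),
                 ('C', 4, 0, 0, 0, 1)
         ) AS v (GroupCode, RecordNo, Flag1, Flag2, Flag3, Flag4)
)
SELECT TOP (1)
       a.RecordNo AS RecordNoA,
       b.RecordNo AS RecordNoB,
       c.RecordNo AS RecordNoC
FROM Records AS a
CROSS JOIN Records AS b
CROSS JOIN Records AS c
WHERE a.GroupCode = 'A'
  AND b.GroupCode = 'B'
  AND c.GroupCode = 'C'
  -- Each flag may be set on at most one of the three chosen records.
  AND a.Flag1 + b.Flag1 + c.Flag1 <= 1
  AND a.Flag2 + b.Flag2 + c.Flag2 <= 1
  AND a.Flag3 + b.Flag3 + c.Flag3 <= 1
  AND a.Flag4 + b.Flag4 + c.Flag4 <= 1
-- Prefer the smallest RecordNo in A, then in B, then in C.
ORDER BY a.RecordNo, b.RecordNo, c.RecordNo;
```

For the sample data this returns RecordNo 1 for A, 3 for B and 4 for C, which matches the expected result set. Because the groups are hard-coded, the query does not generalize to an arbitrary number of groups.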

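I would appreciate any help or suggestions.

1 Answer:

Answer 0: (score: 0)

Based on your replies to the questions above, I was able to put something together. This assumes that each flag column stores an int value.

It's dirty, but it's the best I could come up with at the end of a long day :)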

WITH UnpivotStep AS
(
    -- Unpivot the flag columns so that each (GroupCode, RecordNo) pair yields one row per flag
    SELECT GroupCode, RecordNo, Flags, Value
    FROM (SELECT GroupCode, RecordNo, Flag1, Flag2, Flag3, Flag4
          FROM derp.dbo.Table_3) p
    UNPIVOT (Value FOR Flags IN (Flag1, Flag2, Flag3, Flag4)) AS unpvt
),

PartitionStep AS
(
    -- Pass the unpivoted rows through unchanged; the filter on set flags is applied in SolutionPart1
    SELECT GroupCode, RecordNo, Flags, Value
    FROM UnpivotStep
),

SolutionPart1 AS
(
    -- We're interested in the smallest RecordNo in each GroupCode where each flag is set to 1
    SELECT *
    FROM PartitionStep
    PIVOT (MIN(RecordNo) FOR GroupCode IN (A, B, C)) C
    WHERE Value = 1
),

SolutionPart2 AS
(
    -- To do the cross apply in the next step we need all of the group codes in the same column
    SELECT *
    FROM SolutionPart1
    UNPIVOT (RecordNum FOR GroupCode IN (A, B, C)) C
),

SolutionPart3 as (
    -- A bit ugly, but use cross apply and our where clause to find our true answer
    SELECT A1.GroupCode as Flag1GroupCode,
            A1.RecordNum as Flag1RecordNum,
            A1.Flags as Flag1,
            A1.Value as Flag1Value,
            A2.GroupCode as Flag2GroupCode,
            A2.RecordNum as Flag2RecordNum,
            A2.Flags as Flag2,
            A2.Value as Flag2Value,
            A3.GroupCode as Flag3GroupCode,
            A3.RecordNum as Flag3RecordNum,
            A3.Flags as Flag3,
            A3.Value as Flag3Value,
            A4.GroupCode as Flag4GroupCode,
            A4.RecordNum as Flag4RecordNum,
            A4.Flags as Flag4,
            A4.Value as Flag4Value
    FROM SolutionPart2 as A1
    CROSS APPLY SolutionPart2 A2
    CROSS APPLY SolutionPart2 A3
    CROSS APPLY SolutionPart2 A4
    WHERE A1.Flags = 'Flag1' AND ((A1.GroupCode = A2.GroupCode AND A1.RecordNum = A2.RecordNum) OR (A1.GroupCode < A2.GroupCode))
    AND A2.Flags = 'Flag2' AND ((A2.GroupCode = A3.GroupCode AND A2.RecordNum = A3.RecordNum) OR (A2.GroupCode < A3.GroupCode))
    AND A3.Flags = 'Flag3' AND ((A3.GroupCode = A4.GroupCode AND A3.RecordNum = A4.RecordNum) OR (A3.GroupCode < A4.GroupCode))
    AND A4.Flags = 'Flag4'
)

-- This takes the answer we found in SolutionPart3 and returns it into the rowset we want.
SELECT DISTINCT GroupCode, RecordNum, MAX(c.Flag1) as Flag1, MAX(c.Flag2) as Flag2, MAX(c.Flag3) as Flag3, MAX(c.Flag4) as Flag4
FROM SolutionPart3
CROSS APPLY
(
    SELECT Flag1GroupCode, Flag1RecordNum, 1 as Flag1, 0 as Flag2, 0 as Flag3, 0 as Flag4
    UNION
    SELECT Flag2GroupCode, Flag2RecordNum, 0, 1, 0, 0
    UNION
    SELECT Flag3GroupCode, Flag3RecordNum, 0, 0, 1, 0
    UNION
    SELECT Flag4GroupCode, Flag4RecordNum, 0, 0, 0, 1
) c (GroupCode, RecordNum, Flag1, Flag2, Flag3, Flag4)
GROUP BY GroupCode, RecordNum
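
The assumption stated above, that each flag column stores an int value, matters as soon as the flags are aggregated: SQL Server's SUM, MIN and MAX do not accept the bit data type. If the real columns are bit, one option is to cast them to int first. The sketch below uses a hypothetical table variable `@Flags` with only two flag columns to keep it short.

```sql
-- Hypothetical table variable with bit flags (name and shape assumed for illustration).
DECLARE @Flags TABLE
(
    GroupCode char(1) NOT NULL,
    RecordNo  int     NOT NULL,
    Flag1     bit     NOT NULL,
    Flag2     bit     NOT NULL
);

INSERT INTO @Flags (GroupCode, RecordNo, Flag1, Flag2)
VALUES ('A', 1, 1, 1),
       ('A', 3, 0, 0),
       ('B', 1, 1, 0);

-- SUM(Flag1) would raise an error because SUM does not accept bit;
-- casting each flag to int makes it summable.
SELECT GroupCode,
       SUM(CAST(Flag1 AS int)) AS Flag1Total,
       SUM(CAST(Flag2 AS int)) AS Flag2Total
FROM @Flags
GROUP BY GroupCode
ORDER BY GroupCode;
```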