SQL to filter out less specific rows

Date: 2014-01-28 15:54:55

Tags: sql oracle

My table data looks like this:

 Col1  | Col2 | Col3
    1  |   2  | NULL
    1  |   2  | 3
    1  | NULL | NULL
    1  |   5  | NULL
    2  | NULL | NULL

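I want to write a query that returns only the most specific entries. That is, in the example above, row 1 is more specific than row 3 because the value of Col1 is the same in both, but Col2 has a more specific (non-null) value in row 1. Similarly, row 2 is more specific than row 1.

For the data set above, the result should look like this: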

Col1 | Col2 | Col3
  1  |  2   |  3
  1  |  5   | NULL
  2  | NULL | NULL

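Note: the columns can be of any data type.

2 answers:

Answer 0 (score: 4)

I am assuming the columns are "ordered" in the query, so you never have a case where col2 is null and col3 is not null: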

select col1, col2, col3
from table t
where (col3 is not null) or
      (col3 is null and col2 is not null and
       not exists (select 1
                   from table t2
                   where t2.col1 = t.col1 and t2.col2 = t.col2 and t2.col3 is not null
                  )
      ) or
      (col2 is null and col1 is not null and
       not exists (select 1
                   from table t2
                   where t2.col1 = t.col1 and t2.col2 is not null
                  )
      );

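The logic behind this is:

  1. Take all rows where col3 is not null.
  2. Take rows where col3 is null but col2 is not, provided no row with the same col1 and col2 has a value in col3.
  3. Take rows where col2 is null but col1 is not, provided no row with the same col1 has a value in col2.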
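To check the query above against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. Using SQLite instead of Oracle, the table name demo_rows, and the integer column types are all assumptions made for the demo. The predicate itself is copied from the query.

```python
import sqlite3

# Sample rows from the question; None is stored as SQL NULL.
SAMPLE = [(1, 2, None), (1, 2, 3), (1, None, None), (1, 5, None), (2, None, None)]

# Same predicate as the answer's query; "demo_rows" is a hypothetical table name.
QUERY = """
select col1, col2, col3
from demo_rows t
where (col3 is not null) or
      (col3 is null and col2 is not null and
       not exists (select 1
                   from demo_rows t2
                   where t2.col1 = t.col1 and t2.col2 = t.col2
                     and t2.col3 is not null)) or
      (col2 is null and col1 is not null and
       not exists (select 1
                   from demo_rows t2
                   where t2.col1 = t.col1 and t2.col2 is not null))
order by col1, col2
"""

def most_specific_rows(rows):
    """Load rows into a throwaway table and return what the query keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table demo_rows (col1 integer, col2 integer, col3 integer)")
    conn.executemany("insert into demo_rows values (?, ?, ?)", rows)
    kept = conn.execute(QUERY).fetchall()
    conn.close()
    return kept

print(most_specific_rows(SAMPLE))
```

On the sample data this keeps the three rows the question asks for. It still relies on the "ordered columns" assumption stated at the top of the answer.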
Edit:

    In Oracle, you can do this more simply:

    select col1, col2, col3
    from (select t.*,
                 max(col3) over (partition by col1, col2) as maxcol3,
                 max(col2) over (partition by col1) as maxcol2
          from table t
         ) t
    where (col3 is not null) or
          (col2 is not null and maxcol3 is null) or
          (col1 is not null and maxcol2 is null);
    

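    Edit II: (the definition of "more specific" was clarified.)

    I think this is the extrapolation of the logic. It needs to look at all the combinations: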

    select col1, col2, col3
    from (select t.*,
                 -- maxcolX_YZ: largest colX among rows sharing this row's colY (and colZ)
                 max(col3) over (partition by col1, col2) as maxcol3_12,
                 max(col2) over (partition by col1, col3) as maxcol2_13,
                 max(col1) over (partition by col2, col3) as maxcol1_23,
                 max(col1) over (partition by col2) as maxcol1_2,
                 max(col1) over (partition by col3) as maxcol1_3,
                 max(col2) over (partition by col1) as maxcol2_1,
                 max(col2) over (partition by col3) as maxcol2_3,
                 max(col3) over (partition by col1) as maxcol3_1,
                 max(col3) over (partition by col2) as maxcol3_2
          from table t
         ) t
    where (col1 is not null and col2 is not null and col3 is not null) or
          (col1 is not null and col2 is not null and maxcol3_12 is null) or
          (col1 is not null and col3 is not null and maxcol2_13 is null) or
          (col2 is not null and col3 is not null and maxcol1_23 is null) or
          (col1 is not null and maxcol2_1 is null and maxcol3_1 is null) or
          (col2 is not null and maxcol1_2 is null and maxcol3_2 is null) or
          (col3 is not null and maxcol1_3 is null and maxcol2_3 is null);
    

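    The first condition says "keep this row if all of its values are non-null". The second says "keep this row if col1 and col2 are non-null and col3 never has a value for that pair". And so on, down to the last, which says "keep this row if col3 is non-null and col1 and col2 never have a value".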

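    This might simplify to the following, where maxcol1, maxcol2 and maxcol3 are presumably shorthand for the relevant window maxima rather than aliases defined in the query above: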

    where not ((col1 is null and maxcol1 is not null) or
               (col2 is null and maxcol2 is not null) or
               (col3 is null and maxcol3 is not null)
              );
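    Another way to state the clarified definition directly, without window functions, is to drop a row whenever some other row agrees with all of its non-null values and also has a value in a column where it is NULL. This is a different technique from the queries above, and that reading of "more specific" is an assumption. The sketch below runs it in an in-memory SQLite database from Python; the table name demo_rows is hypothetical.

```python
import sqlite3

# Sample rows from the question; None is stored as SQL NULL.
SAMPLE = [(1, 2, None), (1, 2, 3), (1, None, None), (1, 5, None), (2, None, None)]

# Assumed reading of "less specific": some other row o matches every non-null
# value of t and also has a value in at least one column where t is NULL.
QUERY = """
select col1, col2, col3
from demo_rows t
where not exists (
    select 1
    from demo_rows o
    where (t.col1 is null or o.col1 = t.col1)
      and (t.col2 is null or o.col2 = t.col2)
      and (t.col3 is null or o.col3 = t.col3)
      and ((t.col1 is null and o.col1 is not null) or
           (t.col2 is null and o.col2 is not null) or
           (t.col3 is null and o.col3 is not null))
)
order by col1, col2, col3
"""

def most_specific_rows(rows):
    """Load rows into a throwaway table and return the rows no other row refines."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table demo_rows (col1, col2, col3)")
    conn.executemany("insert into demo_rows values (?, ?, ?)", rows)
    kept = conn.execute(QUERY).fetchall()
    conn.close()
    return kept

print(most_specific_rows(SAMPLE))
```

    Because the comparison is symmetric across columns, it does not depend on the columns being "ordered". A row such as (1, NULL, 3) and a row such as (1, 2, NULL) are both kept, since neither fills in the other's gaps without contradicting it.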
    

Answer 1 (score: 0)

A divide-and-conquer approach!

Demo: SQL Fiddle

SELECT col1,col2,MAX(col3)
FROM test
WHERE col1 is NOT NULL AND col2 is NOT NULL
GROUP BY col1,col2
UNION
SELECT col1,MAX(col2),col3
FROM test
WHERE col1 is NOT NULL AND col3 is NOT NULL
GROUP BY col1,col3
UNION
SELECT MAX(col1),col2,col3
FROM test
WHERE col2 is NOT NULL AND col3 is NOT NULL
GROUP BY col2,col3
UNION
SELECT col1,NULL,NULL
FROM test
GROUP BY COL1
HAVING COUNT(COL2) = 0 AND COUNT(COL3) = 0
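
As a quick check of the UNION query above, here is a minimal sketch that runs it against the sample data from the question in an in-memory SQLite database, using Python's sqlite3 module. SQLite in place of Oracle, the integer column types, and the added ORDER BY (only there to make the output order predictable) are assumptions for the demo.

```python
import sqlite3

# Sample rows from the question; None is stored as SQL NULL.
SAMPLE = [(1, 2, None), (1, 2, 3), (1, None, None), (1, 5, None), (2, None, None)]

# The answer's query, with an ORDER BY appended for a stable row order.
QUERY = """
SELECT col1, col2, MAX(col3)
FROM test
WHERE col1 IS NOT NULL AND col2 IS NOT NULL
GROUP BY col1, col2
UNION
SELECT col1, MAX(col2), col3
FROM test
WHERE col1 IS NOT NULL AND col3 IS NOT NULL
GROUP BY col1, col3
UNION
SELECT MAX(col1), col2, col3
FROM test
WHERE col2 IS NOT NULL AND col3 IS NOT NULL
GROUP BY col2, col3
UNION
SELECT col1, NULL, NULL
FROM test
GROUP BY col1
HAVING COUNT(col2) = 0 AND COUNT(col3) = 0
ORDER BY 1, 2, 3
"""

def run_union_query(rows):
    """Create the test table, load the rows, and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

print(run_union_query(SAMPLE))
```

UNION (as opposed to UNION ALL) removes the duplicate copies of a fully populated row that the first three branches each produce.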