Unique pairs of columns in SQL

Time: 2017-02-02 08:48:41

Tags: sql vertica

I have 4 columns, as shown below:

COL1  COL1_TIME  COL2  COL2_TIME
A     09:20:00   E     09:35:00
A     09:20:00   F     09:36:00
A     09:20:00   G     09:40:00
A     09:20:00   H     09:59:00
B     09:25:00   E     09:35:00
B     09:25:00   F     09:36:00
B     09:25:00   G     09:40:00
B     09:25:00   H     09:59:00
C     09:30:00   E     09:35:00
C     09:30:00   F     09:36:00
C     09:30:00   G     09:40:00
C     09:30:00   H     09:59:00
D     09:50:00   H     09:59:00

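I need to select unique pairs of values from COL1 and COL2. To find a pair, take the COL2 value whose COL2_TIME is closest to the COL1_TIME of the COL1 value.

So for A, the closest time belongs to E. For B it is F, because E has already been taken, and so on.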

The result should look like this:

A E
B F
C G
D H
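
The rule described above is a greedy matching, and it may help to see it spelled out procedurally before turning to SQL. The following is a minimal sketch in Python, not part of the original question. It assumes that COL1 values are processed in ascending COL1_TIME order and that ties are broken by the alphabetically smaller COL2 value; the names `greedy_pairs`, `seconds` and `ROWS` are made up for illustration.

```python
# Times from the question, keyed by column value.
COL1_TIMES = {"A": "09:20:00", "B": "09:25:00", "C": "09:30:00", "D": "09:50:00"}
COL2_TIMES = {"E": "09:35:00", "F": "09:36:00", "G": "09:40:00", "H": "09:59:00"}

# Rebuild the 13 input rows: A, B and C each pair with E..H, D only with H.
ROWS = [
    (c1, COL1_TIMES[c1], c2, COL2_TIMES[c2]) for c1 in "ABC" for c2 in "EFGH"
] + [("D", COL1_TIMES["D"], "H", COL2_TIMES["H"])]


def seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = map(int, hms.split(":"))
    return h * 3600 + m * 60 + s


def greedy_pairs(rows):
    """Give each col1 the closest-in-time col2 that is not yet taken.

    Assumption: col1 values are handled in ascending col1_time order.
    """
    taken = set()
    pairs = []
    # Distinct (time, col1) tuples, earliest first.
    for t1, c1 in sorted({(seconds(t1), c1) for c1, t1, _, _ in rows}):
        candidates = [
            (abs(seconds(t2) - t1), c2)
            for r1, _, c2, t2 in rows
            if r1 == c1 and c2 not in taken
        ]
        if not candidates:
            continue  # every possible partner for this col1 is already taken
        _, best = min(candidates)  # smallest time gap; ties go to the smaller col2
        taken.add(best)
        pairs.append((c1, best))
    return pairs


print(greedy_pairs(ROWS))
```

Because each choice depends on the choices made before it, the rule does not map onto a single set-based query without recursion or a hard-coded chain of steps.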

Any ideas?

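2 answers:

Answer 0: (score: 0)

If the cardinality of the distinct COL1 and COL2 values is always 1-to-1 and there are no other special cases or exceptions, you could do the following: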

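-- my_table is a placeholder: the question does not name the source table.
with temp1 as (
    -- rank the distinct COL1 values by their time
    select col1
          ,col1_time
          ,row_number() over (order by col1_time) as rownum1
    from (select distinct col1, col1_time from my_table) t1
), temp2 as (
    -- rank the distinct COL2 values by their time
    select col2
          ,col2_time
          ,row_number() over (order by col2_time) as rownum2
    from (select distinct col2, col2_time from my_table) t2
)
-- pair the i-th COL1 value with the i-th COL2 value
select temp1.col1
      ,temp2.col2
from temp1
join temp2 on temp1.rownum1 = temp2.rownum2
order by temp1.col1;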
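
The idea behind this answer is positional pairing: the i-th distinct COL1 value is matched with the i-th distinct COL2 value. Here is a minimal sketch of that idea in Python. It assumes both sides are ranked by their time column, which is an assumption and not something the answer states; `rank_pairs` and `rows` are illustrative names.

```python
def rank_pairs(rows):
    """Pair the i-th earliest distinct col1 with the i-th earliest distinct col2."""
    # Zero-padded 'HH:MM:SS' strings sort correctly as plain text.
    left = sorted({(t1, c1) for c1, t1, _, _ in rows})   # distinct (time, col1)
    right = sorted({(t2, c2) for _, _, c2, t2 in rows})  # distinct (time, col2)
    # zip stops at the shorter side, so a non 1-to-1 input drops values silently.
    return [(c1, c2) for (_, c1), (_, c2) in zip(left, right)]


col1_times = {"A": "09:20:00", "B": "09:25:00", "C": "09:30:00", "D": "09:50:00"}
col2_times = {"E": "09:35:00", "F": "09:36:00", "G": "09:40:00", "H": "09:59:00"}
# The question's 13 rows: A, B and C each pair with E..H, D only with H.
rows = [
    (c1, col1_times[c1], c2, col2_times[c2]) for c1 in "ABC" for c2 in "EFGH"
] + [("D", col1_times["D"], "H", col2_times["H"])]

print(rank_pairs(rows))
```

This gives the expected pairs for the sample data only because the two sides happen to line up one to one. It never checks which COL2 time is actually closest.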

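Answer 1: (score: 0)

Well, without a recursive WITH common table expression, you need to hard-wire some things. If you have more than 4 COL1 values, it gets more tedious; if this is a really important business problem, consider writing a UDx for it.

Otherwise, here is a query that works. The input is included as the first common table expression of the WITH clause: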

WITH 
input(col1,col1_time,col2,col2_time) AS (
          SELECT 'A',TIME '09:20:00','E',TIME '09:35:00'
UNION ALL SELECT 'A',TIME '09:20:00','F',TIME '09:36:00'
UNION ALL SELECT 'A',TIME '09:20:00','G',TIME '09:40:00'
UNION ALL SELECT 'A',TIME '09:20:00','H',TIME '09:59:00'
UNION ALL SELECT 'B',TIME '09:25:00','E',TIME '09:35:00'
UNION ALL SELECT 'B',TIME '09:25:00','F',TIME '09:36:00'
UNION ALL SELECT 'B',TIME '09:25:00','G',TIME '09:40:00'
UNION ALL SELECT 'B',TIME '09:25:00','H',TIME '09:59:00'
UNION ALL SELECT 'C',TIME '09:30:00','E',TIME '09:35:00'
UNION ALL SELECT 'C',TIME '09:30:00','F',TIME '09:36:00'
UNION ALL SELECT 'C',TIME '09:30:00','G',TIME '09:40:00'
UNION ALL SELECT 'C',TIME '09:30:00','H',TIME '09:59:00'
UNION ALL SELECT 'D',TIME '09:50:00','H',TIME '09:59:00'
)
,
col1_A AS (
SELECT 
  col1
, col2
FROM input
WHERE col1='A'
ORDER BY ABS(TIMESTAMPDIFF('SECOND',col1_time::TIMESTAMP,col2_time::TIMESTAMP))
LIMIT 1
)
,
col1_B AS (
SELECT 
  col1
, col2
FROM input
WHERE col1='B'
  AND col2 NOT IN (
    SELECT col2 FROM col1_A
  )
ORDER BY ABS(TIMESTAMPDIFF('SECOND',col1_time::TIMESTAMP,col2_time::TIMESTAMP))
LIMIT 1
)
,
col1_C AS (
SELECT 
  col1
, col2
FROM input
WHERE col1='C'
  AND col2 NOT IN (
              SELECT col2 FROM col1_A 
    UNION ALL SELECT col2 FROM col1_B
  )
ORDER BY ABS(TIMESTAMPDIFF('SECOND',col1_time::TIMESTAMP,col2_time::TIMESTAMP))
LIMIT 1
)
,
col1_D AS (
SELECT 
  col1
, col2
FROM input
WHERE col1='D'
  AND col2 NOT IN (
              SELECT col2 FROM col1_A 
    UNION ALL SELECT col2 FROM col1_B 
    UNION ALL SELECT col2 FROM col1_C
  )
ORDER BY ABS(TIMESTAMPDIFF('SECOND',col1_time::TIMESTAMP,col2_time::TIMESTAMP))
LIMIT 1
)
          SELECT * FROM col1_A 
UNION ALL SELECT * FROM col1_B 
UNION ALL SELECT * FROM col1_C 
UNION ALL SELECT * FROM col1_D
;

I wouldn't be surprised if it's not what you were hoping for...

Happy playing...

Marco the Sane