UNION ALL tables with different column names

Date: 2018-02-09 19:13:25

Tags: sql amazon-redshift

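I have two similar tables that I'm trying to combine with a UNION and then possibly group. Where the tables don't overlap, I'd like to add a NULL or 0 column.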

SELECT count(traffic_volume_1) as traffic_volume_1, 
       traffic_source, 
       timestamp
FROM table_1
UNION ALL 
SELECT count(traffic_volume_2) as traffic_volume_2, 
       traffic_source, 
       timestamp
FROM table_2
...?

I'm looking for a result that looks like this:

traffic_volume_1, traffic_volume_2, timestamp, traffic_source
77777           , 0               , 2018-02-09, US
0               , 928320          , 2018-02-09, EU

Any ideas?

5 Answers:

Answer 0 (score: 1)

Add placeholders for the 0-value columns in both halves of the UNION:

SELECT count(traffic_volume_1) as traffic_volume_1, 
       0 as traffic_volume_2,
       traffic_source, 
       timestamp
FROM table_1
GROUP BY traffic_source, 
         timestamp
UNION ALL 
SELECT 0 as traffic_volume_1,
       count(traffic_volume_2) as traffic_volume_2, 
       traffic_source, 
       timestamp
FROM table_2
GROUP BY traffic_source, 
         timestamp
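
Since the question mentions possibly grouping afterwards, here is a minimal sketch of one way to do that: wrap the UNION ALL in a subquery and sum the placeholder columns, so each traffic_source and timestamp pair collapses into a single row. The alias combined is a made-up name, and depending on the platform the column name timestamp may need to be quoted.

```sql
-- Sketch only: "combined" is a hypothetical alias for the unioned rows.
SELECT SUM(traffic_volume_1) AS traffic_volume_1,
       SUM(traffic_volume_2) AS traffic_volume_2,
       traffic_source,
       timestamp
FROM (
    -- Rows from table_1, with a 0 placeholder for the other table's count
    SELECT count(traffic_volume_1) AS traffic_volume_1,
           0 AS traffic_volume_2,
           traffic_source,
           timestamp
    FROM table_1
    GROUP BY traffic_source, timestamp
    UNION ALL
    -- Rows from table_2, with a 0 placeholder for the other table's count
    SELECT 0 AS traffic_volume_1,
           count(traffic_volume_2) AS traffic_volume_2,
           traffic_source,
           timestamp
    FROM table_2
    GROUP BY traffic_source, timestamp
) combined
GROUP BY traffic_source, timestamp;
```

Without the outer GROUP BY, a source and timestamp present in both tables would come back as two rows, one per table.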

Answer 1 (score: 1)

Move the queries into the FROM clause. You can combine them using a full outer join:

SELECT COALESCE(t1.traffic_source, t2.traffic_source) as traffic_source,
       COALESCE(t1.timestamp, t2.timestamp) as timestamp,
       t1.traffic_volume_1, t2.traffic_volume_2
FROM (SELECT count(traffic_volume_1) as traffic_volume_1, traffic_source, timestamp
      FROM table_1
      GROUP BY traffic_source, timestamp
     ) t1 FULL OUTER JOIN
     (SELECT count(traffic_volume_2) as traffic_volume_2, traffic_source, timestamp
      FROM table_2
      GROUP BY traffic_source, timestamp
     ) t2
     ON t1.traffic_source = t2.traffic_source AND t1.timestamp = t2.timestamp
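
The full outer join leaves NULL in a volume column when only one table has a row for a given source and timestamp. If 0 is preferred, as in the question's desired output, a sketch along these lines wraps the counts in COALESCE (same tables and columns as above, nothing else assumed):

```sql
SELECT COALESCE(t1.traffic_source, t2.traffic_source) AS traffic_source,
       COALESCE(t1.timestamp, t2.timestamp) AS timestamp,
       -- Replace the NULL produced by a missing side with 0
       COALESCE(t1.traffic_volume_1, 0) AS traffic_volume_1,
       COALESCE(t2.traffic_volume_2, 0) AS traffic_volume_2
FROM (SELECT count(traffic_volume_1) AS traffic_volume_1, traffic_source, timestamp
      FROM table_1
      GROUP BY traffic_source, timestamp
     ) t1
FULL OUTER JOIN
     (SELECT count(traffic_volume_2) AS traffic_volume_2, traffic_source, timestamp
      FROM table_2
      GROUP BY traffic_source, timestamp
     ) t2
     ON t1.traffic_source = t2.traffic_source
    AND t1.timestamp = t2.timestamp;
```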

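Answer 2 (score: 0)

If you know before running the query how many extra columns you need, the answer is simple: you don't need a UNION. Instead, query each table in its own subquery in the FROM clause and join the subqueries on traffic_source and timestamp.

However, if you don't know how many columns you will have before running the query, what you need is a crosstab (pivot) query.

A pivot query turns distinct row values into extra columns. You can think of it as rotating the query's record set 90 degrees: the engine generates new columns instead of new rows.

Pivot query syntax is platform-specific because it is not part of standard SQL. I'm not sure which platform you are using, but check whether it supports crosstab/pivot queries.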
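
As a rough illustration of what pivoting means, the sketch below uses conditional aggregation (a CASE expression inside an aggregate), which is plain SQL rather than any platform's dedicated PIVOT or crosstab syntax. It assumes the values 'US' and 'EU' from the question are the only ones of interest, and the output column names us_volume and eu_volume are invented for the example. Because the values are hard-coded, it does not cover the case where the columns are unknown in advance.

```sql
-- Sketch only: turns the traffic_source values 'US' and 'EU' into columns.
-- us_volume and eu_volume are hypothetical output names.
SELECT timestamp,
       -- COUNT ignores NULL, so only rows matching the CASE are counted
       COUNT(CASE WHEN traffic_source = 'US' THEN traffic_volume_1 END) AS us_volume,
       COUNT(CASE WHEN traffic_source = 'EU' THEN traffic_volume_1 END) AS eu_volume
FROM table_1
GROUP BY timestamp;
```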

Answer 3 (score: 0)

I think what you really want here is a join, like this:

SELECT t1.traffic_volume_1, t2.traffic_volume_2,
CASE WHEN t1.traffic_source IS NOT NULL THEN t1.traffic_source
     ELSE t2.traffic_source END AS traffic_source,
CASE WHEN t1.timestamp IS NOT NULL THEN t1.timestamp
     ELSE t2.timestamp END AS timestamp
FROM t1
FULL JOIN t2
ON t1.timestamp = t2.timestamp AND t1.traffic_volume_1 = t2.traffic_volume_2

Here is a simple example:

CREATE TABLE t1 (traffic_volume_1 int, traffic_source varchar(2), timestamp date);
CREATE TABLE t2 (traffic_volume_2 int, traffic_source varchar(2), timestamp date);

INSERT INTO t1 VALUES (500, 'US', '2018-01-01'), (250, 'US', '2018-01-02');
INSERT INTO t2 VALUES (400, 'US', '2018-01-01'), (250, 'US', '2018-01-03');

The query above will give you:

    traffic_volume_1    traffic_volume_2    traffic_source  timestamp
1   500                 NULL                US              01.01.2018
2   250                 NULL                US              02.01.2018
3   NULL                400                 US              01.01.2018
4   NULL                250                 US              03.01.2018

Test it here: http://rextester.com/XNV59775
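
Note that the join condition above compares the two volume columns, which is why 500 and 400 on 2018-01-01 end up in separate rows. If one row per source and date is wanted instead, a variant could join on traffic_source and use COALESCE in place of the CASE expressions. This is an assumption about the intent, not part of the original answer:

```sql
-- Sketch only: same t1/t2 sample tables as above, joined on source and date.
SELECT t1.traffic_volume_1,
       t2.traffic_volume_2,
       COALESCE(t1.traffic_source, t2.traffic_source) AS traffic_source,
       COALESCE(t1.timestamp, t2.timestamp) AS timestamp
FROM t1
FULL JOIN t2
  ON t1.timestamp = t2.timestamp
 AND t1.traffic_source = t2.traffic_source
ORDER BY timestamp;
```

With the sample data above, this should return three rows: 500 / 400 for 2018-01-01, 250 / NULL for 2018-01-02, and NULL / 250 for 2018-01-03.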

Answer 4 (score: 0)

Here is an example of how to do it:

kubectl get nodes,pods,svc --all-namespaces -o wide