Hive query with an efficient join

Time: 2014-01-30 19:10:22

Tags: sql hive

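I would like to know whether the following query can be rewritten so that the first WHERE condition produces a subset of the big table, and that subset is then joined to the small table and filtered further.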

SELECT *
FROM big_table x
JOIN small_table y 
ON trim(x.ip_adress) = trim(y.ip_address)
WHERE eventdate = '2013-09-01'
AND unix_timestamp(cast(x.date AS TIMESTAMP)) - unix_timestamp(cast(y.date AS TIMESTAMP)) < 100
LIMIT 5;

1 answer:

Answer 0: (score: 1)

SELECT *
FROM ( SELECT *
       FROM big_table
       WHERE eventdate = '2013-09-01') x
JOIN small_table y ON trim(x.ip_adress) = trim(y.ip_address) AND 
                      unix_timestamp(cast(x.date AS TIMESTAMP)) - 
                          unix_timestamp(cast(y.date AS TIMESTAMP)) < 100
LIMIT 5;
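
A possible variant, offered as a sketch rather than a tested query: Hive releases before 2.2.0 accept only equality conditions in the ON clause, so on such a version the time-difference test may need to move into WHERE. For an inner join this returns the same rows. The tables and columns are the ones from the question; the `hive.auto.convert.join` line is an assumption that small_table is small enough to be held in memory as a map-side join.

```sql
-- Assumption: small_table fits in memory, so Hive may convert this to a map join.
SET hive.auto.convert.join=true;

SELECT *
FROM ( SELECT *
       FROM big_table
       WHERE eventdate = '2013-09-01') x      -- filter the big table first
JOIN small_table y
  ON trim(x.ip_adress) = trim(y.ip_address)   -- equality condition only
WHERE unix_timestamp(cast(x.date AS TIMESTAMP)) -
      unix_timestamp(cast(y.date AS TIMESTAMP)) < 100
LIMIT 5;
```

Running EXPLAIN on both forms should show whether the eventdate filter is applied to big_table before the join in your Hive version.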