Need help optimizing this SQL query

Time: 2019-06-17 12:14:26

Tags: performance hive query-optimization hiveql

I need some help optimizing this SQL query. It works fine as it is; I just want to reduce its run time.

select distinct
  o.usrp_order_number, t.*
from ms_bvoip_order_extension oe
  inner join ms_order o on oe.ms_order_id = o.ms_order_id
  inner join ms_sub_order so on so.ms_order_id = o.ms_order_id
  inner join ms_job j on j.entity_id = so.ms_sub_order_id
  left join mstask t on t.wf_job_id = j.wf_job_id
where
  o.order_type = 900
  and o.entered_date between date_sub(current_date(),53) and date_sub(current_date(),3)
  and j.entity_type = 5
  and t.name RLIKE 'Error|Correct|Create AOTS Ticket'
  and t.wf_job_id is not null
order by
  o.usrp_order_number

3 answers:

Answer 0: (score: 1)

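In Hive, WHERE conditions are applied after the joins (though CBO and PPD may change this behavior), so it is worth studying the EXPLAIN output of both queries. You can move single-table conditions such as o.order_type = 900 into the join ON clause to reduce the number of rows at join time. Note that older Hive versions allow only equality conditions between columns of the two joined tables in the ON clause; filters on a single table's columns are fine there. Also, table t is left joined, but the conditions on it in the WHERE clause (t.name RLIKE 'Error|Correct|Create AOTS Ticket' and t.wf_job_id is not null and t.ORIGINAL_START_DATE is not null) reject the NULL rows a left join produces for unmatched jobs, which turns the left join into an inner join. Check whether you need an INNER or a LEFT JOIN. With the filters moved into the ON clauses, the query looks like this: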

select distinct 
o.usrp_order_number,t.* 
from ms_bvoip_order_extension oe
 inner join ms_order o 
    on oe.ms_order_id = o.ms_order_id
       and o.order_type = 900
       and o.entered_date between date_sub(current_date(),53) and date_sub(current_date(),3)
 inner join ms_sub_order so on so.ms_order_id = o.ms_order_id
 inner join ms_job j on j.entity_id = so.ms_sub_order_id 
                    and j.entity_type = 5
 left join mstask t on t.wf_job_id = j.wf_job_id 
                    and t.name RLIKE 'Error|Correct|Create AOTS Ticket' 
                    and t.wf_job_id is not null  -- as in the question's WHERE clause
                    and t.ORIGINAL_START_DATE is not null 
order by o.usrp_order_number
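
If it turns out that rows without a matching task are not wanted, the query can be written with an inner join throughout. The following is only a sketch built from the table and column names in the question. It keeps just the filters that appear in the question, and it should return the same rows as the original query, because the original WHERE conditions on t already discard unmatched rows.

```sql
-- Sketch: inner-join variant of the query in the question.
-- For inner joins, a filter in ON behaves the same as a filter in WHERE.
select distinct
    o.usrp_order_number, t.*
from ms_bvoip_order_extension oe
inner join ms_order o
    on oe.ms_order_id = o.ms_order_id
   and o.order_type = 900
   and o.entered_date between date_sub(current_date(), 53)
                          and date_sub(current_date(), 3)
inner join ms_sub_order so
    on so.ms_order_id = o.ms_order_id
inner join ms_job j
    on j.entity_id = so.ms_sub_order_id
   and j.entity_type = 5
-- the equality on wf_job_id already excludes NULL keys,
-- so no separate "is not null" check is needed here
inner join mstask t
    on t.wf_job_id = j.wf_job_id
   and t.name RLIKE 'Error|Correct|Create AOTS Ticket'
order by o.usrp_order_number
```

The LEFT JOIN version above is different: it keeps orders whose jobs have no matching task and returns NULLs in the t.* columns for them.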

Please also read this answer about configuration settings: https://stackoverflow.com/a/48487306/2700344
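
To compare the plans as suggested at the start of this answer, prefix each version of the query with EXPLAIN. A minimal sketch follows, using the original query from the question. The two set lines are the standard Hive properties for the cost-based optimizer (CBO) and predicate pushdown (PPD). They are shown explicitly here only for illustration, since defaults depend on the Hive version and cluster configuration.

```sql
-- Cost-based optimizer
set hive.cbo.enable=true;
-- Predicate pushdown
set hive.optimize.ppd=true;

-- Prints the execution plan instead of running the query
explain
select distinct
    o.usrp_order_number, t.*
from ms_bvoip_order_extension oe
inner join ms_order o on oe.ms_order_id = o.ms_order_id
inner join ms_sub_order so on so.ms_order_id = o.ms_order_id
inner join ms_job j on j.entity_id = so.ms_sub_order_id
left join mstask t on t.wf_job_id = j.wf_job_id
where o.order_type = 900
  and o.entered_date between date_sub(current_date(), 53)
                         and date_sub(current_date(), 3)
  and j.entity_type = 5
  and t.name RLIKE 'Error|Correct|Create AOTS Ticket'
  and t.wf_job_id is not null
order by o.usrp_order_number
```

In the plan, check where the filter operators sit relative to the join operators. If the filters already appear before the joins, moving them into ON by hand is unlikely to change much.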

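Answer 1: (score: 0)

Make sure you have proper indexes on the following:

table ms_order: a composite index on columns entered_date, order_type, ms_order_id

table ms_job: a composite index on columns entity_type, entity_id

table mstask: a composite index on columns ORIGINAL_START_DATE, wf_job_id

table ms_sub_order: an index on column ms_order_id

table ms_bvoip_order_extension: an index on column ms_order_id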
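
As an illustration only, here is how the first of these could be declared in Hive. The index name idx_ms_order_filter is hypothetical. This syntax applies to Hive versions before 3.0; index support was removed in Hive 3.0, so on newer versions this advice does not carry over directly.

```sql
-- Hypothetical index name; Hive < 3.0 syntax
create index idx_ms_order_filter
on table ms_order (entered_date, order_type, ms_order_id)
as 'COMPACT'
with deferred rebuild;

-- With DEFERRED REBUILD the index stays empty until it is built explicitly
alter index idx_ms_order_filter on ms_order rebuild;
```

The other indexes in the list would follow the same pattern with their own table and column names.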

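Answer 2: (score: 0)

You will need to add indexes on the columns you filter by.

We don't know how many records each table holds, but the t.name RLIKE condition should be evaluated last. I would rewrite your query along these lines: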

select ...
from
(
    select ...
    inner join ...
    inner join ...
    inner join ...
    left join ...
    where ...
) temporary
where temporary.somename RLIKE 'Error|Correct|Create AOTS Ticket'
order by temporary.usrp_order_number
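
Filled in with the tables from the question, the outline above could look like the sketch below. This is one possible reading of the outline rather than a tested rewrite. It assumes mstask has no column named usrp_order_number (otherwise the subquery would produce duplicate column names), and it replaces somename with the name column from the question.

```sql
select distinct temporary.*
from
(
    -- cheap equality and range filters run inside the subquery
    select o.usrp_order_number, t.*
    from ms_bvoip_order_extension oe
    inner join ms_order o on oe.ms_order_id = o.ms_order_id
    inner join ms_sub_order so on so.ms_order_id = o.ms_order_id
    inner join ms_job j on j.entity_id = so.ms_sub_order_id
    left join mstask t on t.wf_job_id = j.wf_job_id
    where o.order_type = 900
      and o.entered_date between date_sub(current_date(), 53)
                             and date_sub(current_date(), 3)
      and j.entity_type = 5
) temporary
-- the regular-expression match is written last, on the reduced row set;
-- it also drops rows with no matching task, because their name is NULL
where temporary.name RLIKE 'Error|Correct|Create AOTS Ticket'
order by temporary.usrp_order_number
```

Keep in mind that the optimizer may push the outer filter down into the subquery anyway, so the EXPLAIN output is the place to confirm whether this form actually changes the plan.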

If the query is not very dynamic, you could even cache the results for a while.
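
One simple way to do that in Hive is to materialize the result into a table and read from it. The sketch below uses a hypothetical table name, order_task_errors_cache, and assumes ORC storage is available. Because the date window depends on current_date(), the table would have to be rebuilt at least once a day to stay correct.

```sql
-- Hypothetical cache table; rebuild on whatever schedule suits the data
drop table if exists order_task_errors_cache;

create table order_task_errors_cache
stored as orc
as
select distinct
    o.usrp_order_number, t.*
from ms_bvoip_order_extension oe
inner join ms_order o on oe.ms_order_id = o.ms_order_id
inner join ms_sub_order so on so.ms_order_id = o.ms_order_id
inner join ms_job j on j.entity_id = so.ms_sub_order_id
inner join mstask t on t.wf_job_id = j.wf_job_id
where o.order_type = 900
  and o.entered_date between date_sub(current_date(), 53)
                         and date_sub(current_date(), 3)
  and j.entity_type = 5
  and t.name RLIKE 'Error|Correct|Create AOTS Ticket';

-- Readers then query the small cached table instead of repeating the joins
select *
from order_task_errors_cache
order by usrp_order_number;
```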