TPCH query optimization

Time: 2017-04-29 00:59:18

Tags: mysql performance

The following query currently takes 5 hours to run:

INSERT $LINEITEM_PUBLIC
    SELECT  *
    FROM  LINEITEM
    WHERE  L_PARTKEY  IN ( SELECT P_PARTKEY FROM  $PART_PUBLIC )
      AND  L_SUPPKEY  IN ( SELECT S_SUPPKEY FROM  $SUPPLIER_PUBLIC )
      AND  L_ORDERKEY IN ( SELECT O_ORDERKEY FROM  $ORDERS_PUBLIC );

I added all the necessary indexes, but nothing seems to help. EXPLAIN prints the following plan for the query:

+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+
| id | select_type | table            | partitions | type   | possible_keys                  | key         | key_len | ref                            | rows     | filtered | Extra       |
+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+
|  1 | INSERT      | $LINEITEM_PUBLIC | NULL       | ALL    | NULL                           | NULL        | NULL    | NULL                           |     NULL |     NULL | NULL        |
|  1 | SIMPLE      | $ORDERS_PUBLIC   | NULL       | index  | PRIMARY                        | O_ORDERDATE | 3       | NULL                           | 12826617 |   100.00 | Using index |
|  1 | SIMPLE      | LINEITEM         | NULL       | ref    | PRIMARY,LINEITEM_FK2,L_SUPPKEY | PRIMARY     | 4       | TPCH.$ORDERS_PUBLIC.O_ORDERKEY |        3 |   100.00 | NULL        |
|  1 | SIMPLE      | $SUPPLIER_PUBLIC | NULL       | eq_ref | PRIMARY                        | PRIMARY     | 4       | TPCH.LINEITEM.L_SUPPKEY        |        1 |   100.00 | Using index |
|  1 | SIMPLE      | $PART_PUBLIC     | NULL       | eq_ref | PRIMARY                        | PRIMARY     | 4       | TPCH.LINEITEM.L_PARTKEY        |        1 |   100.00 | Using index |
+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+

Any suggestions on how to optimize this query?

Update: the sizes of the tables in the query above are as follows:

  • LINEITEM: 60M rows
  • $ORDERS_PUBLIC: 13M rows
  • $SUPPLIER_PUBLIC: 92K rows
  • $PART_PUBLIC: 2M rows

1 answer:

Answer 0: (score: 0)

Make sure $ORDERS_PUBLIC has an index that starts with O_ORDERKEY.
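
A minimal sketch of how this could be checked, assuming the table is literally named $ORDERS_PUBLIC as it appears in the EXPLAIN output. The index name idx_o_orderkey is hypothetical, and the ALTER TABLE is only needed if no existing index has O_ORDERKEY as its leading column. The plan lists PRIMARY under possible_keys for this table, so the primary key may already qualify.

```sql
-- List the indexes on the table. Look for a row where
-- Seq_in_index = 1 and Column_name = 'O_ORDERKEY'.
SHOW INDEX FROM `$ORDERS_PUBLIC`;

-- Only if no such index exists (idx_o_orderkey is a hypothetical name):
ALTER TABLE `$ORDERS_PUBLIC` ADD INDEX idx_o_orderkey (O_ORDERKEY);
```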

IN (SELECT ...) may perform poorly (depending on the version); try this:

INSERT $LINEITEM_PUBLIC
    SELECT  l.*
        FROM LINEITEM AS l
        WHERE  EXISTS( SELECT * FROM  $PART_PUBLIC     WHERE P_PARTKEY  = L_PARTKEY )
          AND  EXISTS( SELECT * FROM  $SUPPLIER_PUBLIC WHERE S_SUPPKEY  = L_SUPPKEY )
          AND  EXISTS( SELECT * FROM  $ORDERS_PUBLIC   WHERE O_ORDERKEY = L_ORDERKEY );
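
Before committing to another multi-hour run, it may be worth checking whether the rewrite actually changes the plan. A sketch, assuming the server accepts EXPLAIN on INSERT ... SELECT (the plan in the question includes an INSERT row, which suggests it does):

```sql
-- EXPLAIN only reports the plan; it does not execute the INSERT.
EXPLAIN
INSERT `$LINEITEM_PUBLIC`
    SELECT  l.*
        FROM LINEITEM AS l
        WHERE  EXISTS( SELECT * FROM `$PART_PUBLIC`     WHERE P_PARTKEY  = l.L_PARTKEY )
          AND  EXISTS( SELECT * FROM `$SUPPLIER_PUBLIC` WHERE S_SUPPKEY  = l.L_SUPPKEY )
          AND  EXISTS( SELECT * FROM `$ORDERS_PUBLIC`   WHERE O_ORDERKEY = l.L_ORDERKEY );
```

The type, key, and rows columns for LINEITEM and $ORDERS_PUBLIC are the ones to compare against the original plan, which starts with an index scan over an estimated 12,826,617 rows of $ORDERS_PUBLIC.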