MySQL optimization needs explaining

Time: 2014-09-05 08:59:42

Tags: mysql sql optimization

I need to fetch values from the "newest" record (i.e. the one with the highest id) for each value of a field (in this case server_name).

I have added an index server_name_id on (server_name, id).

My first attempt takes several minutes to complete.

SELECT server_name, state
FROM replication_client as a
WHERE id = (
  SELECT MAX(id) 
  FROM replication_client 
  WHERE server_name = a.server_name)
ORDER BY server_name

My second attempt takes 0.001 seconds.

SELECT rep.server_name, state FROM (
  SELECT server_name, MAX(id) AS max_id
  FROM replication_client
  GROUP BY server_name) AS newest,
replication_client AS rep
WHERE rep.id = newest.max_id
ORDER BY server_name

What is the principle behind this optimization? (I would like to be able to write optimized queries without trial and error.)

P.S. The EXPLAIN output follows:

mysql> EXPLAIN
    ->
    ->   SELECT server_name, state
    ->   FROM replication_client as a
    ->   WHERE id = (SELECT MAX(id) FROM replication_client WHERE server_name = a.server_name)
    ->   ORDER BY server_name
    -> ;
+----+--------------------+--------------------+------+----------------+----------------+---------+-------------------+--------+-----------------------------+
| id | select_type        | table              | type | possible_keys  | key            | key_len | ref               | rows   | Extra                       |
+----+--------------------+--------------------+------+----------------+----------------+---------+-------------------+--------+-----------------------------+
|  1 | PRIMARY            | a                  | ALL  | NULL           | NULL           | NULL    | NULL              | 630711 | Using where; Using filesort |
|  2 | DEPENDENT SUBQUERY | replication_client | ref  | server_name_id | server_name_id | 18      | mrg.a.server_name |  45050 | Using index                 |
+----+--------------------+--------------------+------+----------------+----------------+---------+-------------------+--------+-----------------------------+

mysql> explain  
    ->   SELECT rep.server_name, state FROM (
    ->     SELECT server_name, MAX(id) AS max_id
    ->     FROM replication_client
    ->     GROUP BY server_name) AS newest,
    ->   replication_client AS rep
    ->   WHERE rep.id = newest.max_id
    ->   ORDER BY server_name
    -> ;
+----+-------------+--------------------+--------+---------------+----------------+---------+---------------+------+---------------------------------+
| id | select_type | table              | type   | possible_keys | key            | key_len | ref           | rows | Extra                           |
+----+-------------+--------------------+--------+---------------+----------------+---------+---------------+------+---------------------------------+
|  1 | PRIMARY     | <derived2>         | ALL    | NULL          | NULL           | NULL    | NULL          |    2 | Using temporary; Using filesort |
|  1 | PRIMARY     | rep                | eq_ref | PRIMARY       | PRIMARY        | 4       | newest.max_id |    1 |                                 |
|  2 | DERIVED     | replication_client | range  | NULL          | server_name_id | 18      | NULL          |   15 | Using index for group-by        |
+----+-------------+--------------------+--------+---------------+----------------+---------+---------------+------+---------------------------------+

1 answer:

Answer 0 (score: 2)

Well, the whole thing is pretty much self-explanatory once you spot two words in the first EXPLAIN plan: DEPENDENT SUBQUERY

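This means the subquery is executed again for every row that your WHERE condition checks. In the first plan that is an estimated 630711 rows from the full scan of a, and each of them triggers an index lookup estimated at about 45050 entries. Of course, this can be slow.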
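
To check this on your own server, a minimal sketch follows. The table and index names are the ones from the question, but the column types are assumptions (the question does not give them), so adjust them to your real schema.

```sql
-- Assumed schema: the column types are guesses, not taken from the question.
CREATE TABLE IF NOT EXISTS replication_client (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  server_name VARCHAR(16) NOT NULL,
  state       VARCHAR(32) NOT NULL,
  KEY server_name_id (server_name, id)
);

-- The inner query refers to a.server_name from the outer query, so it is
-- correlated and may be re-evaluated for each outer row.
EXPLAIN
SELECT server_name, state
FROM replication_client AS a
WHERE id = (
  SELECT MAX(id)
  FROM replication_client
  WHERE server_name = a.server_name)
ORDER BY server_name;
```

If select_type for the inner query reads DEPENDENT SUBQUERY, the optimizer plans to run it once per outer row. Other MySQL versions may plan this differently, so it is worth checking the output on the server you actually use.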

Also note that a query is logically processed in a fixed order of operations:

FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
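
As a sketch of why this order matters for filtering: a condition in WHERE removes rows before grouping, while a condition in HAVING removes whole groups after the aggregates have been computed. The threshold of 1000 is a made-up value for illustration only.

```sql
-- WHERE runs before GROUP BY: rows with id <= 1000 never reach the grouping step.
SELECT server_name, MAX(id) AS max_id
FROM replication_client
WHERE id > 1000            -- hypothetical threshold
GROUP BY server_name;

-- HAVING runs after GROUP BY: every row is grouped first, then groups whose
-- MAX(id) is not above the threshold are discarded.
SELECT server_name, MAX(id) AS max_id
FROM replication_client
GROUP BY server_name
HAVING MAX(id) > 1000;     -- hypothetical threshold
```

Both statements return the same rows for this particular condition, but the first one discards rows at an earlier step. How much that saves in practice depends on the indexes and on what the optimizer does with each form.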

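When you can do the filtering in the FROM clause (here, in a derived table), it is generally better than doing it in the WHERE clause: in the second plan the derived table is built once ("Using index for group-by"), and each of its rows is then matched to rep through the primary key (eq_ref).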
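
The second query from the question can also be written with explicit JOIN syntax, which makes the idea easier to see: build the small per-server result once in FROM, then join back to fetch the remaining columns. This is a sketch of the same technique, not a different optimization. The qualified rep.state and rep.server_name are added only for clarity, and the comment about the primary key assumes id is the primary key, as the second plan suggests.

```sql
SELECT rep.server_name, rep.state
FROM (
  -- Evaluated once: one row per server_name with its highest id.
  SELECT server_name, MAX(id) AS max_id
  FROM replication_client
  GROUP BY server_name
) AS newest
INNER JOIN replication_client AS rep
  ON rep.id = newest.max_id      -- primary-key lookup per derived row
ORDER BY rep.server_name;
```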