Question

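I have a basic MySQL performance question related to EXPLAIN. I have two queries that return the same result, and I am trying to understand how to interpret the EXPLAIN output of their execution plans.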

The table has 50000 records, and I am performing a record comparison. My first query takes 18.625 seconds to run. The explain plan is as follows.

id  select_type table   type    possible_keys                   key         key_len ref                                 rows    filtered    Extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
1   SIMPLE      a       ALL     NULL                            NULL        NULL    NULL                                49520   100.00  
1   SIMPLE      b       ref     scoreEvent,eventScore           eventScore  4       olympics.a.eventId                  413     100.00      Using where; Using index; Not exists
1   SIMPLE      c       ref     PRIMARY,scoreEvent,eventScore   scoreEvent  8       olympics.a.score,olympics.a.eventId 4       100.00      Using where; Using index; Not exists

My next query takes 0.106 seconds to run...

id  select_type table       type    possible_keys   key     key_len     ref     rows    filtered    Extra
-----------------------------------------------------------------------------------------------------------------------------------
1   PRIMARY     <derived2>  ALL     NULL            NULL    NULL        NULL    50000   100.00      Using temporary; Using filesort
2   DERIVED     results     ALL     NULL            NULL    NULL        NULL    49520   100.00      Using filesort

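The documentation says that ALL requires a full table scan and that this is very bad. It also says that filesort requires an extra pass to sort the records, and that Not exists means MySQL was able to perform a LEFT JOIN optimization. It is also clear that the first approach uses indexes, while the second does not.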

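I am trying to work out what is going on here and the math involved. I run RESET QUERY CACHE between tests to make sure neither query gets an unfair advantage. 49520 x 413 x 4 is a lot smaller than 50000 x 49520.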
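To make that arithmetic concrete, here is a small Python sketch of the comparison. The variable names are only illustrative, and the numbers are the rows estimates copied from the two plans above, so they are estimates rather than measured row counts.

```python
from math import prod

# Row estimates copied from the "rows" column of the two EXPLAIN outputs.
first_plan_rows = [49520, 413, 4]   # tables a, b, c (all with id = 1)
second_plan_rows = [50000, 49520]   # <derived2> (id = 1) and results (id = 2)

# Naive estimate: multiply every row estimate in the plan together.
first_estimate = prod(first_plan_rows)     # 49520 * 413 * 4
second_estimate = prod(second_plan_rows)   # 50000 * 49520

print(first_estimate)                    # 81807040
print(second_estimate)                   # 2476000000
print(first_estimate < second_estimate)  # True
```

By this naive product the first query should be the cheaper one, which is the opposite of the timings I measured.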

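Does it have something to do with the id in the explain plan?

When I test these and other queries, my observation seems to be that query complexity can be approximated by multiplying the rows of items with the same id and adding the results for each id together... Is this a valid assumption?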

Additional

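As requested in the comments, here are the schema and queries in case they help, but I am not looking for a better query... just an explanation of the EXPLAIN. The table in question...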

CREATE TABLE results (
  resultId INT NOT NULL auto_increment KEY, 
  athleteId INT NOT NULL,
  eventId INT NOT NULL,
  score INT NOT NULL,
  CONSTRAINT FOREIGN KEY (athleteId) REFERENCES athletes(athleteId),
  CONSTRAINT FOREIGN KEY (eventId) REFERENCES events(eventId),
  INDEX eventScore (eventId, score),
  INDEX scoreEvent (score, eventId)
) ENGINE=innodb;

The first query...

SELECT a.resultId, a.eventId, a.athleteId, a.score
FROM results a 

-- Find records with matching eventIds and greater scores
LEFT JOIN results b 
ON b.eventId = a.eventId 
AND b.score > a.score

-- Find records with matching scores and lesser resultIds
LEFT JOIN results c
ON c.eventId = a.eventId
AND c.score = a.score
AND c.resultId < a.resultId

-- Filter out all records where there were joins
WHERE c.resultId IS NULL 
AND b.resultId IS NULL;

The second query...

SELECT resultId, athleteId, eventId, score
FROM (
  SELECT resultId, athleteId, eventId, score
  FROM results
  ORDER BY eventId, score DESC, resultId
) AS a
GROUP BY eventId;

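I have also noticed that if I drop the index eventScore, the query drops to 2.531 seconds and the execution plan does not change that much, but the order of possible_keys changes and table b no longer shows Using index (ignore the slight changes in row counts; I regenerate the data each time I change the schema)...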

id  select_type table   type    possible_keys               key         key_len ref                                 rows    filtered    Extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
1   SIMPLE      a       ALL     NULL                        NULL        NULL    NULL                                47457   100.00  
1   SIMPLE      b       ref     eventId,scoreEvent          eventId     4       olympics.a.eventId                  659     100.00      Using where; Not exists
1   SIMPLE      c       ref     PRIMARY,eventId,scoreEvent  scoreEvent  8       olympics.a.score,olympics.a.eventId 5       100.00      Using where; Using index; Not exists

Answer 1

In fact, in the second plan you should not multiply these numbers but sum them. In your case, compare (49520 x 413 x 4) with (50000 + 49520).

The general rule is simple: sum across all segments (DERIVED, PRIMARY) and multiply the rows within each segment.

id select_type  ... rows
1  PRIMARY           1
1  PRIMARY           2
2  DERIVED           3
2  DERIVED           4
3  DERIVED           5
3  DERIVED           6

The complexity is: 1 * 2 + 3 * 4 + 5 * 6
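The rule can be sketched in a few lines of Python. This is only a rough heuristic, not MySQL's actual cost model. The helper name `estimate_cost` and the `(id, rows)` pair layout are assumptions made for illustration, and the inputs are the rows estimates read off the plans in the question.

```python
from collections import defaultdict
from math import prod


def estimate_cost(plan):
    """Sum, over each id, the product of the row estimates sharing that id.

    `plan` is a list of (id, rows) pairs read off an EXPLAIN output.
    """
    rows_by_id = defaultdict(list)
    for plan_id, rows in plan:
        rows_by_id[plan_id].append(rows)
    return sum(prod(rows) for rows in rows_by_id.values())


# The toy table above: 1 * 2 + 3 * 4 + 5 * 6
toy_plan = [(1, 1), (1, 2), (2, 3), (2, 4), (3, 5), (3, 6)]

# The two plans from the question, as (id, rows) pairs.
first_query = [(1, 49520), (1, 413), (1, 4)]
second_query = [(1, 50000), (2, 49520)]

print(estimate_cost(toy_plan))      # 44
print(estimate_cost(first_query))   # 81807040
print(estimate_cost(second_query))  # 99520
```

Under this heuristic the second query comes out far cheaper, which at least agrees in direction with the measured timings.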

Answer 2

Do not trust the "rows" value of EXPLAIN too much. As the MySQL documentation puts it: "Estimate of rows to be examined" (http://dev.mysql.com/doc/refman/5.1/en/explain-output.html).

Perhaps updating the index statistics will give you a better estimate (OPTIMIZE TABLE, http://dev.mysql.com/doc/refman/5.0/en/optimize-table.html).
