MySQL query is very slow

Time: 2015-12-31 00:18:37

Tags: mysql sql

My table has the following columns:

gamelogs_id (auto_increment primary key)
player_id (int)
player_name (varchar)
game_id (int)
season_id (int)
points (int)

The table has the following indexes:

+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name           | Seq_in_index | Column_name        | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| player_gamelogs |          0 | PRIMARY            |            1 | player_gamelogs_id | A         |      371330 |     NULL | NULL   |      | BTREE      |         |               |
| player_gamelogs |          1 | player_name        |            1 | player_name        | A         |        3375 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | points             |            1 | points             | A         |         506 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | game_id            |            1 | game_id            | A         |       37133 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | season             |            1 | season             | A         |          30 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | team_abbreviation  |            1 | team_abbreviation  | A         |          70 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | player_id          |            1 | game_id            | A         |       41258 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | player_id          |            2 | player_id          | A         |      371330 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | player_id          |            3 | dk_points          | A         |      371330 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | game_player_season |            1 | game_id            | A         |       41258 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | game_player_season |            2 | player_id          | A         |      371330 |     NULL | NULL   | YES  | BTREE      |         |               |
| player_gamelogs |          1 | game_player_season |            3 | season_id          | A         |      371330 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+--------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

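I am trying to compute, for each player and season, the average points over the games played before the current one. So for the third game of a season, avg_points would be the average of games 1 and 2. Game IDs are assigned in order, so earlier games have lower IDs than later ones. I could also use a date field instead, but I assumed a numeric comparison would be faster?

My query is as follows: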

SELECT game_id, 
       player_id, 
       player_name, 
       (SELECT avg(points) 
          FROM player_gamelogs t2
         WHERE t2.game_id < t1.game_id 
           AND t1.player_id = t2.player_id 
           AND t1.season_id = t2.season_id) AS avg_points
  FROM player_gamelogs t1
 ORDER BY player_name, game_id;

EXPLAIN produces the following output:

| id | select_type        | table | type | possible_keys                        | key  | key_len | ref  | rows   | Extra                                           |
+----+--------------------+-------+------+--------------------------------------+------+---------+------+--------+-------------------------------------------------+
|  1 | PRIMARY            | t1    | ALL  | NULL                                 | NULL | NULL    | NULL | 371330 | Using filesort                                  |
|  2 | DEPENDENT SUBQUERY | t2    | ALL  | game_id,player_id,game_player_season | NULL | NULL    | NULL | 371330 | Range checked for each record (index map: 0xC8) |

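I'm not sure whether this is due to the nature of the task or because my query is inefficient. Thanks for any advice!

3 answers:

Answer 0 (score: 7)

Consider this query: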

SELECT t1.season_id, t1.game_id, t1.player_id, t1.player_name, AVG(COALESCE(t2.points, 0)) AS average_player_points
FROM player_gamelogs t1
        LEFT JOIN player_gamelogs t2 ON 
                t1.game_id > t2.game_id 
            AND t1.player_id = t2.player_id
            AND t1.season_id = t2.season_id 
GROUP BY
    t1.season_id, t1.game_id, t1.player_id, t1.player_name
ORDER BY t1.player_name, t1.game_id;

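Notes:

  • For best results you need an additional index on (season_id, game_id, player_id, player_name).
  • A better approach would be a players table from which the name is retrieved by id. Having to fetch the player name from the log table seems redundant to me, even more so if that forces the name into the index.
  • GROUP BY already sorts by the grouping columns. If you can, avoid the ORDER BY afterwards, since it adds useless overhead. As noted in the comments, this is not officially guaranteed behavior, so relying on it staying consistent over time has to be weighed against the risk of suddenly losing the ordering.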
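As a rough sketch of the first two notes, assuming a separate players table existed: the table definition, the VARCHAR(100) length and the index name below are assumptions made for illustration and are not part of the original schema.

```sql
-- Hypothetical lookup table; its name, column types and length are assumptions.
CREATE TABLE players (
    player_id   INT          NOT NULL PRIMARY KEY,
    player_name VARCHAR(100) NOT NULL
);

-- With the name moved out of the log table, the suggested index
-- no longer needs player_name. The index name is made up.
ALTER TABLE player_gamelogs
    ADD INDEX idx_season_game_player (season_id, game_id, player_id);

-- Same self-join as above, with the name looked up by id.
SELECT t1.season_id, t1.game_id, t1.player_id, p.player_name,
       AVG(COALESCE(t2.points, 0)) AS average_player_points
FROM player_gamelogs t1
        JOIN players p ON p.player_id = t1.player_id
        LEFT JOIN player_gamelogs t2 ON
                t1.game_id > t2.game_id
            AND t1.player_id = t2.player_id
            AND t1.season_id = t2.season_id
GROUP BY
    t1.season_id, t1.game_id, t1.player_id, p.player_name
ORDER BY p.player_name, t1.game_id;
```

Whether the optimizer actually uses the new index depends on the data, so it is worth running EXPLAIN on the query before and after adding it.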

Answer 1 (score: 2)

Your query is fine as written:

SELECT game_id, player_id, player_name, 
       (SELECT avg(t2.points) 
        FROM player_gamelogs t2
        WHERE t2.game_id < t1.game_id AND
              t1.player_id = t2.player_id AND
              t1.season_id = t2.season_id
      ) AS avg_points
FROM player_gamelogs t1
ORDER BY player_name, game_id;

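However, for best performance you need two composite indexes: (player_id, season_id, game_id, points) and (player_name, game_id, season_id).

The first index should speed up the subquery. The second is for the outer ORDER BY.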
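The two indexes could be added along these lines. The index names are made up for this sketch; the column lists are the ones given above, and whether MySQL actually picks the indexes is something only EXPLAIN can confirm.

```sql
-- Index names are hypothetical; the column lists come from the answer.
ALTER TABLE player_gamelogs
    ADD INDEX idx_player_season_game_points (player_id, season_id, game_id, points),
    ADD INDEX idx_name_game_season (player_name, game_id, season_id);

-- Re-check the plan: ideally t2 no longer shows type = ALL.
EXPLAIN
SELECT game_id, player_id, player_name,
       (SELECT avg(t2.points)
        FROM player_gamelogs t2
        WHERE t2.game_id < t1.game_id AND
              t1.player_id = t2.player_id AND
              t1.season_id = t2.season_id
       ) AS avg_points
FROM player_gamelogs t1
ORDER BY player_name, game_id;
```

In the first index the two equality columns (player_id, season_id) come first, the range column (game_id) follows, and points is appended so the subquery can be answered from the index alone without touching the table rows.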

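Answer 2 (score: 1)

As the query stands now, for every player you are processing every game plus all the games before it... For example, with 10 games per player, you get the following results per player per season: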

Game 10, Game 10 points, avg of games 1-9
Game 9, Game 9 points, avg of games 1-8...
...
...
Game 2, Game 2 points, avg of game 1 only.

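You stated that you want the most recent game together with the average of everything before it. That is, I assume you don't care about a row for each of the earlier games per player.

You are also querying across all seasons. If a season is over, do you still care about the old seasons, or only the current one? Otherwise you are going through all seasons and all players...

With all that said, I offer the following. First, a WHERE clause restricts the query to the latest season, but I deliberately left the season in the select list and GROUP BY in case you want other seasons. Then I take the MAXIMUM game for a given player/season as the baseline for the single final row (one per player per season) and get the average of everything before it. So in the 10-game sample scenario, I don't pick up base rows for games 9 through 2; only game #10 is returned.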

-- JustOneSeason is a placeholder: substitute the season_id you care about.
select
      pgMax.player_id,
      pgMax.season_id,
      pgMax.mostRecentGameID,
      pgl3.points as mostRecentGamePoints,
      pgl3.player_name,
      coalesce( avg( pgl2.points ), 0 ) as AvgPointsPriorToCurrentGame
   from
      -- latest game per player within the season
      ( select pgl1.player_id,
               pgl1.season_id,
               max( pgl1.game_id ) as mostRecentGameID
           from
              player_gamelogs pgl1
           where
               pgl1.season_id = JustOneSeason
           group by
              pgl1.player_id,
              pgl1.season_id ) pgMax

         -- all earlier games of that player/season; LEFT JOIN keeps players
         -- with only one game, so the coalesce above can return 0 for them
         LEFT JOIN player_gamelogs pgl2
            on pgMax.player_id = pgl2.player_id
           AND pgMax.season_id = pgl2.season_id
           AND pgMax.mostRecentGameID > pgl2.game_id

         -- the row for the latest game itself
         JOIN player_gamelogs pgl3
            on pgMax.player_id = pgl3.player_id
           AND pgMax.season_id = pgl3.season_id
           AND pgMax.mostRecentGameID = pgl3.game_id
   group by
      pgMax.player_id,
      pgMax.season_id,
      pgMax.mostRecentGameID,
      pgl3.points,
      pgl3.player_name
   order by
      pgMax.player_id
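Now, to optimize the query, a composite index on (player_id, season_id, game_id, points) works best. However, if you are only looking at the "current season", then use an index on (season_id, player_id, game_id, points), which puts season_id in the first position so that it pre-qualifies the WHERE clause.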
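A minimal sketch of the season-first variant follows. The index name and the season value 2015 are assumptions for illustration; the user variable simply stands in for the JustOneSeason placeholder.

```sql
-- Hypothetical season value standing in for the JustOneSeason placeholder.
SET @season_id := 2015;

-- Season-first composite index; the index name is made up.
ALTER TABLE player_gamelogs
    ADD INDEX idx_season_player_game_points (season_id, player_id, game_id, points);

-- The inner "latest game per player" step, restricted to one season.
-- EXPLAIN shows whether the new index is chosen for it.
EXPLAIN
SELECT player_id, season_id, MAX(game_id) AS mostRecentGameID
  FROM player_gamelogs
 WHERE season_id = @season_id
 GROUP BY player_id, season_id;
```

With season_id leading, the WHERE filter and the grouping on player_id can both follow the index order, which is the reason for preferring this column order when only one season is queried.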