Optimizing a GROUP BY & ORDER BY query

Time: 2010-05-18 10:39:12

Tags: mysql query-optimization

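I have a web page where users upload and watch videos. Last week I asked what the best way to track video views would be, so that I could display the most viewed videos of the week (counting videos uploaded on any date).

Now I need some help optimizing the query that fetches those videos from the database. The relevant tables look like this: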

video (~239371 rows)
VID(int), UID(int), title(varchar), status(enum), type(varchar), is_duplicate(enum), is_adult(enum), channel_id(tinyint)

signup (~115440 rows)
UID(int), username(varchar)

videos_views (~359202 rows after 6 days of collecting data, so this table will grow rapidly)
videos_id(int), views_date(date), num_of_views(int)

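The video table contains the videos, signup holds the users, and videos_views holds the view data (each video can have at most one row per day in that table).

I have this query, which does the job but takes about 10 seconds to execute, and I expect it to get slower as the videos_views table grows.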

SELECT
 v.VID, 
 v.title, 
 v.vkey, 
 v.duration, 
 v.addtime, 
 v.UID, 
 v.viewnumber, 
 v.com_num, 
 v.rate, 
 v.THB, 
 s.username,
 SUM(vvt.num_of_views) AS tmp_num
FROM
 video v
  LEFT JOIN videos_views vvt ON v.VID = vvt.videos_id
  LEFT JOIN signup s on v.UID = s.UID
WHERE
 v.status = 'Converted'
 AND v.type = 'public'
 AND v.is_duplicate = '0'
 AND v.is_adult = '0'
 AND v.channel_id <> 10
 AND vvt.views_date >= '2001-05-11'
GROUP BY
 vvt.videos_id
ORDER BY
 tmp_num DESC
LIMIT
 8

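All relevant fields are indexed. Here is a screenshot of the EXPLAIN output: http://img685.imageshack.us/img685/9440/explain.png

So, how can I optimize this?

UPDATE: This is my query based on Quassnoi's answer. It returns the correct videos, but it messes up the JOIN on the signup table. For some records the username field is NULL, and for others it contains the wrong username.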

SELECT
    v.VID,
    v.title,
    v.vkey,
    v.duration,
    v.addtime,
    v.UID,
    v.viewnumber,
    v.com_num,
    v.rate,
    v.THB,
    s.username
FROM
    (SELECT
        videos_id,
        SUM(num_of_views) AS tmp_num
    FROM
        videos_views
    WHERE
        views_date >= '2010-05-13'
    GROUP BY
        videos_id
    ) q
        JOIN video v ON v.VID = q.videos_id
        LEFT JOIN signup s ON s.UID = v.VID
WHERE
    v.type = 'public'
    AND v.channel_id <> 10
    AND v.is_adult = '0'
    AND is_duplicate = '0'
ORDER BY
    tmp_num DESC
LIMIT
    8

Here is the result set: http://img714.imageshack.us/img714/2954/resultu.png
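Judging only from the query above, one possible cause of the NULL and wrong usernames is the join condition `s.UID = v.VID`, which compares a user ID with a video ID. The sketch below reproduces that symptom on invented toy rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the helper `usernames` and the sample data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE video  (VID INTEGER PRIMARY KEY, UID INTEGER, title TEXT);
    CREATE TABLE signup (UID INTEGER PRIMARY KEY, username TEXT);
    INSERT INTO signup VALUES (1, 'alice'), (2, 'bob');
    -- video 1 belongs to bob (UID 2), video 7 belongs to alice (UID 1)
    INSERT INTO video VALUES (1, 2, 'first'), (7, 1, 'second');
""")

def usernames(join_condition):
    """Return (VID, username) pairs for the given ON clause (hypothetical helper)."""
    sql = f"""
        SELECT v.VID, s.username
        FROM video v
        LEFT JOIN signup s ON {join_condition}
        ORDER BY v.VID
    """
    return conn.execute(sql).fetchall()

wrong = usernames("s.UID = v.VID")  # user ID compared with video ID
right = usernames("s.UID = v.UID")  # user ID compared with the video's owner

print(wrong)  # video 1 gets alice instead of bob, video 7 gets NULL
print(right)
```

With the mismatched condition, a video gets a username only when its VID happens to equal some user's UID, and then it is usually the wrong user. That matches the mix of NULL and incorrect values described above.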

2 Answers:

Answer 0: (score: 2)

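Yes, an ORDER BY on a computed column can never use an index. Sorry.

If you are going to run this query a lot and want to avoid summing and sorting the views of every video each time, you will have to denormalise. Add a views_in_last_week column, recalculate it from videos_views in the background once a day, and index it (possibly in a compound index together with the other relevant WHERE conditions).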
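A minimal sketch of that denormalisation, assuming an in-memory SQLite database in place of MySQL and invented sample rows. The function `refresh_weekly_views` and the index name `idx_video_weekly` are hypothetical. On the real MySQL server the daily refresh could be triggered by cron or by the event scheduler.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE video (
        VID INTEGER PRIMARY KEY,
        title TEXT,
        views_in_last_week INTEGER NOT NULL DEFAULT 0  -- denormalised counter
    );
    CREATE TABLE videos_views (videos_id INTEGER, views_date TEXT, num_of_views INTEGER);
    -- the ORDER BY column is now a plain stored column, so it can be indexed
    CREATE INDEX idx_video_weekly ON video (views_in_last_week);

    INSERT INTO video (VID, title) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO videos_views VALUES
        (1, '2010-05-12', 5), (1, '2010-05-14', 7),
        (2, '2010-05-13', 20),
        (2, '2010-05-01', 100);  -- outside the window, must be ignored
""")

def refresh_weekly_views(conn, since):
    """Recompute the counter for every video; intended to run once a day."""
    conn.execute("""
        UPDATE video
        SET views_in_last_week = (
            SELECT COALESCE(SUM(vv.num_of_views), 0)
            FROM videos_views vv
            WHERE vv.videos_id = video.VID AND vv.views_date >= ?
        )
    """, (since,))
    conn.commit()

refresh_weekly_views(conn, "2010-05-11")

# The page query no longer aggregates anything at read time.
top = conn.execute("""
    SELECT VID, views_in_last_week
    FROM video
    ORDER BY views_in_last_week DESC
    LIMIT 8
""").fetchall()
print(top)
```

The trade-off is that the counter is only as fresh as the last refresh, which is usually acceptable for a "most viewed this week" list.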

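Answer 1: (score: 1)

Create the following index:

videos_views (views_date, videos_id)

and get rid of the LEFT JOIN between video and videos_views (it does not work with your current query anyway, because the WHERE condition on vvt.views_date discards the NULL rows that the LEFT JOIN would produce):

SELECT  *
FROM    (
        SELECT  videos_id, SUM(num_of_views) AS tmp_num
        FROM    videos_views
        GROUP BY
                videos_id
        ) q
JOIN    video v
ON      v.vid = q.videos_id
LEFT JOIN
        signup s
ON      s.UID = v.UID
ORDER BY
        tmp_num DESC
LIMIT 8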
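The remark that the LEFT JOIN has no effect in the original query can be checked on a toy dataset. The sketch below again assumes SQLite standing in for MySQL and uses invented rows. It runs the same aggregate with LEFT JOIN and with JOIN, then once more with the date filter moved into the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE video (VID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE videos_views (videos_id INTEGER, views_date TEXT, num_of_views INTEGER);
    INSERT INTO video VALUES (1, 'viewed'), (2, 'never viewed');
    INSERT INTO videos_views VALUES (1, '2010-05-12', 3), (1, '2010-05-13', 4);
""")

# Same shape as the question's query: the filter on the joined table is in WHERE.
query = """
    SELECT v.VID, SUM(vvt.num_of_views) AS tmp_num
    FROM video v
    {join} videos_views vvt ON v.VID = vvt.videos_id
    WHERE vvt.views_date >= '2010-05-11'
    GROUP BY v.VID
    ORDER BY v.VID
"""
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="JOIN")).fetchall()

# Moving the filter into ON keeps the unmatched video, with a NULL sum.
filter_in_on = conn.execute("""
    SELECT v.VID, SUM(vvt.num_of_views) AS tmp_num
    FROM video v
    LEFT JOIN videos_views vvt
           ON v.VID = vvt.videos_id AND vvt.views_date >= '2010-05-11'
    GROUP BY v.VID
    ORDER BY v.VID
""").fetchall()

print(left_rows, inner_rows, filter_in_on)
```

For the unmatched video, vvt.views_date is NULL, the comparison evaluates to NULL, and the row is dropped by WHERE. That is why both join types return the same rows here.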

If you want to return zeroes for videos that have never been viewed, change the order of the fields in the index:

videos_views (videos_id, views_date)

and rewrite the query:

SELECT  *,
        (
        SELECT  COALESCE(SUM(num_of_views), 0)
        FROM    videos_views vw
        WHERE   vw.videos_id = v.vid
                AND views_date >= '2001-05-11'
        ) AS tmp_num
FROM    video v
LEFT JOIN
        signup s
ON      s.UID = v.UID
ORDER BY
        tmp_num DESC
LIMIT 8
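As a rough check of the correlated-subquery variant, the sketch below runs it against invented rows in an in-memory SQLite database (an assumption; the thread itself targets MySQL). The index name `ix_views_video_date` and the cutoff date are hypothetical, and the signup join is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE video (VID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE videos_views (videos_id INTEGER, views_date TEXT, num_of_views INTEGER);
    -- index order from the answer: equality column first, range column second
    CREATE INDEX ix_views_video_date ON videos_views (videos_id, views_date);
    INSERT INTO video VALUES (1, 'popular'), (2, 'quiet'), (3, 'never viewed');
    INSERT INTO videos_views VALUES
        (1, '2010-05-12', 30), (1, '2010-05-13', 12), (2, '2010-05-14', 5);
""")

rows = conn.execute("""
    SELECT v.VID,
           (SELECT COALESCE(SUM(vw.num_of_views), 0)
            FROM videos_views vw
            WHERE vw.videos_id = v.VID
              AND vw.views_date >= '2010-05-11') AS tmp_num
    FROM video v
    ORDER BY tmp_num DESC
    LIMIT 8
""").fetchall()
print(rows)
```

The video without any view rows still appears, with a total of 0, because COALESCE turns the NULL from the empty SUM into zero.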