MySQL query optimization help

Time: 2011-03-13 00:01:48

Tags: mysql

I have these two tables:

CREATE TABLE `cpuinfo` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `usagetime` datetime DEFAULT NULL,
  `cpuusage` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_UNIQUE` (`id`),
  KEY `idx_usagetime` (`usagetime`),
  KEY `idx_usage` (`cpuusage`));

CREATE TABLE `jobinfo` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `starttime` datetime NOT NULL,
  `endtime` datetime DEFAULT NULL,
  `jobname` text NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_UNIQUE` (`id`),
  KEY `idx-startime` (`starttime`),
  KEY `idx-endtime` (`endtime`));

And this query:

explain SELECT j.id, j.starttime, j.endtime, j.jobname, c.cpuusage
   FROM (SELECT j.id, j.starttime, j.endtime, j.jobname, MAX(c.usagetime) AS usagetime
           FROM jobinfo AS j
      LEFT JOIN cpuinfo AS c ON c.usagetime <= j.starttime
       GROUP BY j.id) AS j
   JOIN cpuinfo AS c ON j.usagetime = c.usagetime
ORDER BY j.starttime

It takes about 10 minutes to run.

For the EXPLAIN command, I get this output:

id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
---------------------------------------------------------------------------
1,PRIMARY,<derived2>,ALL,NULL,NULL,NULL,NULL,4557,"Using filesort"
1,PRIMARY,c,ref,idx_usagetime,idx_usagetime,9,j.usagetime,1,"Using where"
2,DERIVED,j,ALL,NULL,NULL,NULL,NULL,4557,"Using temporary; Using filesort"
2,DERIVED,c,index,idx_usagetime,idx_usagetime,9,NULL,2880,"Using index"

Can you give me some tips for optimizing this SQL query?

This is my original post:

Mysql join with time matching

3 answers:

Answer 0: (score: 0)

Your join condition is an inequality comparison:

c.usagetime <= j.starttime

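This means every cpu record whose usage time is at or before the job's start time gets joined to the job record. The query will get slower and slower over time, because it also joins information from months ago if it exists. You are only interested in the latest entry before the job started.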

If you are confident that there is a cpuinfo record within some period before the job's start time, change the condition to a range search:

c.usagetime between date_sub(j.starttime, interval 5 minute) and j.starttime

This should speed things up considerably. The smaller you can make the interval, the better.
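
As a sketch, here is how that range condition might slot into the query from the question. The 5-minute window is only an assumption about how often cpuinfo is sampled, so adjust it to your data. Note that BETWEEN expects the lower bound first.

```sql
-- Sketch only: the 5-minute window is an assumed sampling interval.
SELECT j.id, j.starttime, j.endtime, j.jobname, c.cpuusage
  FROM (SELECT j.id, j.starttime, j.endtime, j.jobname,
               MAX(c.usagetime) AS usagetime
          FROM jobinfo AS j
     LEFT JOIN cpuinfo AS c
            -- BETWEEN low AND high: the earlier timestamp goes first
            ON c.usagetime BETWEEN DATE_SUB(j.starttime, INTERVAL 5 MINUTE)
                               AND j.starttime
      GROUP BY j.id) AS j
  JOIN cpuinfo AS c ON c.usagetime = j.usagetime
 ORDER BY j.starttime;
```

A job with no cpuinfo row inside the window gets a NULL usagetime in the subquery and is then dropped by the outer JOIN, so pick a window wide enough to always contain at least one reading.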

Answer 1: (score: 0)

You could try this little trick:

SELECT j.id, j.starttime, j.endtime, j.jobname, c.cpuusage
FROM
(
    SELECT j.id, j.starttime, j.endtime, j.jobname, MAX(c.usagetime) AS usagetime
    FROM jobinfo AS j
    LEFT JOIN cpuinfo AS c
    ON c.usagetime <= j.starttime
    WHERE c.usagetime > DATE_ADD(j.starttime, INTERVAL -1 DAY)
    GROUP BY j.id
) AS j
JOIN cpuinfo AS c
ON j.usagetime = c.usagetime
ORDER BY j.starttime;

This should make the server read only a small part of the cpuinfo table, rather than the whole table or half of it.

PS: Experiment with the interval value; maybe 5 minutes is enough.

Answer 2: (score: 0)

Try:

SELECT ji.starttime, 
       ji.endtime,
       ji.jobname,
       (SELECT ci.cpuusage
          FROM CPUINFO ci
         WHERE ci.usagetime <= ji.endtime
      ORDER BY ci.usagetime DESC
         LIMIT 1) AS cpuusage
  FROM JOBINFO ji

This is my EXPLAIN output on 5.1.49:

id   select_type           table type   possible_keys    key   key_len ref   rows Extra
------------------------------------------------------------------------------------------------
'1', 'PRIMARY',            'ji', 'ALL', NULL,            NULL, NULL,   NULL, '12', ''
'2', 'DEPENDENT SUBQUERY', 'ci', 'ALL', 'idx_usagetime', NULL, NULL,   NULL, '6', 'Using where; Using filesort'
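
The query in the question matches CPU readings against starttime rather than endtime. If that is the column you want, a sketch of the same correlated-subquery pattern would look like this. The extra id column and the ORDER BY are taken from the question's query and are not part of the query above.

```sql
-- Sketch: same pattern as above, keyed on starttime as in the question.
SELECT ji.id,
       ji.starttime,
       ji.endtime,
       ji.jobname,
       -- latest CPU reading taken at or before the job started
       (SELECT ci.cpuusage
          FROM cpuinfo AS ci
         WHERE ci.usagetime <= ji.starttime
      ORDER BY ci.usagetime DESC
         LIMIT 1) AS cpuusage
  FROM jobinfo AS ji
 ORDER BY ji.starttime;
```

Unlike the JOIN version in the question, a job with no earlier cpuinfo row is still returned here, with cpuusage set to NULL.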