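MySQL: Optimizing a count query on a derived table

Asked: 2014-03-19 18:53:35

Tags: mysql performance

I am trying to run a count query on a derived table for pagination. The query looks like this: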

SELECT 
    assignment_completions.id as id,
    assignment_completions.first_name,
    assignment_completions.last_name,
    groups.name

FROM
    assignment_completions
        LEFT JOIN
    groups_users ON assignment_completions.user_id = groups_users.user_id
        LEFT JOIN
    groups ON groups_users.group_id = groups.id
WHERE
    assignment_completions.handler = 'course'
GROUP BY assignment_completions.id

The count query simply wraps the query above:

SELECT COUNT(*) FROM (...) AS assignment_count

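The query without the count runs in 0.005 seconds. The query with the count runs in 1.5 seconds.

I have tried the following without luck:

1) Counting an indexed column (no performance gain here):

SELECT COUNT(id) FROM (...) AS assignment_count

2) I tried SQL_CALC_FOUND_ROWS, but it was actually a bit slower (around 2 seconds).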

Details:

assignment_completions: 200k rows

users: 35k rows

groups_users: 500k rows

groups: 1k rows

Table definitions

CREATE TABLE `assignment_completions` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `assignment_id` int(11) DEFAULT NULL,
  `handler` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `handler_id` int(11) DEFAULT NULL,
  `time_started` datetime DEFAULT NULL,
  `time_end` datetime DEFAULT NULL,
  `status` int(11) DEFAULT NULL,
  `application_instance_id` int(11) DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `first_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `last_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_assignment_completions_on_first_name` (`first_name`) USING BTREE,
  KEY `index_assignment_completions_on_last_name` (`last_name`) USING BTREE,
  KEY `index_assignment_completions_on_user_id` (`user_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=200001 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `email` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `encrypted_password` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `reset_password_token` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `reset_password_sent_at` datetime DEFAULT NULL,
  `remember_created_at` datetime DEFAULT NULL,
  `sign_in_count` int(11) NOT NULL DEFAULT '0',
  `current_sign_in_at` datetime DEFAULT NULL,
  `last_sign_in_at` datetime DEFAULT NULL,
  `current_sign_in_ip` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `last_sign_in_ip` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `application_instance_id` int(11) DEFAULT NULL,
  `username` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `first_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `last_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `status` int(11) DEFAULT NULL,
  `group_list_cache` text COLLATE utf8_unicode_ci,
  PRIMARY KEY (`id`),
  UNIQUE KEY `index_users_on_reset_password_token` (`reset_password_token`) USING BTREE,
  UNIQUE KEY `index_users_on_username_and_application_instance_id` (`username`,`application_instance_id`) USING BTREE,
  KEY `index_users_on_application_instance_id` (`application_instance_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=30006 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `groups_users` (
  `group_id` int(11) DEFAULT NULL,
  `user_id` int(11) DEFAULT NULL,
  UNIQUE KEY `index_groups_users_on_group_id_and_user_id` (`group_id`,`user_id`) USING BTREE,
  KEY `index_groups_users_on_group_id` (`group_id`) USING BTREE,
  KEY `index_groups_users_on_user_id` (`user_id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `groups` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `application_instance_id` int(11) DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `description` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `group_type` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1045 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

EXPLAIN for query:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: assignment_completions
         type: index
possible_keys: PRIMARY,index_assignment_completions_on_first_name,index_assignment_completions_on_last_name,index_assignment_completions_on_user_id
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 199088
     filtered: 100.00
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: lms.assignment_completions.user_id
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: groups_users
         type: ref
possible_keys: index_groups_users_on_user_id
          key: index_groups_users_on_user_id
      key_len: 5
          ref: lms.users.id
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: groups
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: lms.groups_users.group_id
         rows: 1
     filtered: 100.00
        Extra: Using index

EXPLAIN for count query:

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 199088
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 2
  select_type: DERIVED
        table: assignment_completions
         type: index
possible_keys: PRIMARY,index_assignment_completions_on_first_name,index_assignment_completions_on_last_name,index_assignment_completions_on_user_id
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 199088
     filtered: 100.00
        Extra: Using where
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: users
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: lms.assignment_completions.user_id
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 4. row ***************************
           id: 2
  select_type: DERIVED
        table: groups_users
         type: ref
possible_keys: index_groups_users_on_user_id
          key: index_groups_users_on_user_id
      key_len: 5
          ref: lms.users.id
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 5. row ***************************
           id: 2
  select_type: DERIVED
        table: groups
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: lms.groups_users.group_id
         rows: 1
     filtered: 100.00
        Extra: Using index

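I need the total number of results for pagination.

Edit

This query is sometimes modified, which is why the groups join is there. Sometimes a condition on groups is added to the WHERE clause: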

AND groups.name LIKE "%abc%"

So the join to the groups table has to stay.

5 Answers:

Answer 0 (score: 3)

  

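To process a SELECT COUNT(*) FROM t statement, InnoDB scans an index of the table, which takes some time if the index is not entirely in the buffer pool. If your table does not change often, using the MySQL query cache is a good way to get a fast count.

You can also force InnoDB to use a specific index: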

  SELECT COUNT(id) FROM assignment_completions USE INDEX (PRIMARY);

Beyond that, I see you have a lot of indexes, which can slow things down.

Try keeping only the id index that you will actually rely on.

  

InnoDB does not keep a cached row count, so COUNT(*) without a WHERE clause on an InnoDB table still has to scan an index.

this may help you

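Answer 1 (score: 2)

I would put covering indexes on the data so the engine does not have to go back to the raw data pages. Also, your only use of users is to get to groups_users, and groups_users is keyed on the user ID anyway, so join directly from assignment_completions to groups_users.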

table                   index
assignment_completions  ( handler, id, user_id, first_name, last_name )
groups_users            ( user_id, group_id )
groups                  ( id, name )

SELECT STRAIGHT_JOIN
      assignment_completions.id as id,
      assignment_completions.first_name,
      assignment_completions.last_name,
      groups.name
   FROM
      assignment_completions
         LEFT JOIN groups_users 
            ON assignment_completions.user_id = groups_users.user_id
            LEFT JOIN groups 
               ON groups_users.group_id = groups.id
   WHERE
      assignment_completions.handler = 'course'
   GROUP BY 
      assignment_completions.id
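
As a rough sketch, the covering indexes listed above could be created as follows. The index names are made up for illustration and the column order simply follows the list in this answer; it is worth checking with EXPLAIN that the optimizer actually uses them ("Using index" in the Extra column) before keeping them.

```sql
-- Hypothetical index names; the columns follow the list in this answer.
-- `groups` is backticked because GROUPS is a reserved word in MySQL 8.0+.
CREATE INDEX idx_ac_handler_covering
    ON assignment_completions (handler, id, user_id, first_name, last_name);

CREATE INDEX idx_groups_users_user_group
    ON groups_users (user_id, group_id);

CREATE INDEX idx_groups_id_name
    ON `groups` (id, name);
```

The first index carries three utf8 VARCHAR(255) columns, so it is wide; narrower columns would make it cheaper to store and scan.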

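@orourkedd, I took another look at the table structures. Really, would you ever expect a name (first or last) to be longer than 20-25 characters? And the handler field at 255 characters? Really? If that lets the data pages bloat, could it be part of the latency problem? The same goes for the users table. Also, if the first and last name in assignment_completions are those of the user who completed the assignment, it would be easier in the long run to store only the user's ID in that table, which would also cut the disk requirements of assignment_completions. Yes, it means an extra join, but since it would be on the ID column it would be fast, especially if the users table had a covering index on (id, first_name, last_name). Either way, definitely shrink those columns. The same applies to the groups "name" column. I don't know how the index is physically built, but if it allocates the full 255 characters per field, that could hinder index performance.

After re-reading my own comment: it looks like you have a handler lookup table (judging by the handler_id column). I would change the index on assignment_completions from (handler, ...) to (handler_id, ...) and drop the handler column, since repeating the string across 200k rows, compared with a single lookup-table ID pointer, inflates the number of pages that have to be read. Likewise, keep only the user ID in assignment_completions rather than the full first and last name. With 200k records, the table currently carries three columns of up to 255 characters each (and it will obviously grow).

Answer 2 (score: 1)

I believe the COUNT(DISTINCT ...) aggregate function will solve the problem: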

SELECT COUNT(DISTINCT assignment_completions.id) AS assignment_count
FROM assignment_completions
    LEFT JOIN users ON assignment_completions.user_id = users.id
    LEFT JOIN groups_users ON users.id = groups_users.user_id
    LEFT JOIN groups ON groups_users.group_id = groups.id
WHERE assignment_completions.handler = 'course';

To speed it up, you can run the count on its own, separately from the rest of the query, like this:

SELECT COUNT(DISTINCT id) AS assignment_count
FROM assignment_completions
WHERE assignment_completions.handler = 'course';

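Answer 3 (score: 1)

Because you ultimately GROUP BY assignment_completions.id, and because this table is LEFT JOINed to all the others, the query you want is equivalent to:

SELECT COUNT(DISTINCT id) FROM assignment_completions WHERE handler = 'course';

Finally, id is presumably the primary key (and therefore unique), so all you need is:

SELECT COUNT(id) FROM assignment_completions WHERE handler = 'course';

When you do need the groups table to filter on it, there is no need for a LEFT JOIN. Perhaps the optimizer will realize that by itself, but if not, replace every LEFT JOIN with a regular [INNER] JOIN.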
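
A sketch of how the two cases might look, assuming the optional filter from the question's edit (groups.name LIKE '%abc%'). The aliases ac, gu and g are introduced here only for brevity; treat this as an illustration to adapt, not a drop-in replacement.

```sql
-- Case 1: no filter on groups.
-- The LEFT JOINs never remove an assignment_completions row, and
-- GROUP BY id collapses any duplicates, so the joins can be dropped.
SELECT COUNT(*) AS assignment_count
FROM assignment_completions
WHERE handler = 'course';

-- Case 2: filter on the group name.
-- Rows without a matching group are excluded anyway, so INNER JOINs suffice.
-- DISTINCT is still needed: one user can belong to several matching groups.
SELECT COUNT(DISTINCT ac.id) AS assignment_count
FROM assignment_completions AS ac
JOIN groups_users AS gu ON gu.user_id = ac.user_id
JOIN `groups` AS g ON g.id = gu.group_id   -- backticks: GROUPS is reserved in MySQL 8.0+
WHERE ac.handler = 'course'
  AND g.name LIKE '%abc%';
```

Both statements avoid materializing the derived table, which is the step the EXPLAIN output in the question shows as a full scan of <derived2>.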

Answer 4 (score: 1)

You can try this:

    CREATE TEMPORARY TABLE `tmp_stuff` SELECT 
        assignment_completions.id as id,
        assignment_completions.first_name,
        assignment_completions.last_name,
        groups.name

    FROM
        assignment_completions
            LEFT JOIN
        groups_users ON assignment_completions.user_id = groups_users.user_id
            LEFT JOIN
        groups ON groups_users.group_id = groups.id
    WHERE
        assignment_completions.handler = 'course'
    GROUP BY assignment_completions.id

And then this:

    SELECT count(*) FROM `tmp_stuff`

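If your COUNT(*) needs to be more complex, you can always add the necessary indexes to the temporary table.

It will never be extremely fast, but once complex queries start to slow down even with the right indexes, temporary tables can save your life.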
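
A sketch of how the temporary table could serve both the count and the page query, assuming tmp_stuff was created as shown above. The index name, the page size of 20 and the prefix filter are illustrative assumptions, not part of the original answer.

```sql
-- Optional: index the temporary table if the count needs a filter.
-- A prefix pattern such as 'abc%' can use this index; '%abc%' cannot.
ALTER TABLE tmp_stuff ADD INDEX idx_tmp_stuff_name (name);

-- Total for the pager.
SELECT COUNT(*) FROM tmp_stuff;

-- Filtered total, using the hypothetical prefix filter.
SELECT COUNT(*) FROM tmp_stuff WHERE name LIKE 'abc%';

-- One page of results, read from the same temporary table.
SELECT id, first_name, last_name, name
FROM tmp_stuff
ORDER BY id
LIMIT 20 OFFSET 0;

-- Temporary tables disappear when the connection closes;
-- dropping explicitly just frees the space sooner.
DROP TEMPORARY TABLE IF EXISTS tmp_stuff;
```

Because a temporary table is visible only to the connection that created it, the count and the page query have to run on that same connection.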