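Multiple left joins to the same table are very slow

Time: 2017-01-11 14:25:41

Tags: mysql performance left-join crosstab

I have two tables; let's just say a user table and a dates table. They look like this:

User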

ID_User | Title | Firstname | Surname | JobNumber
1       | Mr    | Bob       | Smith   | JOB001
2       | Mrs   | Bobbi     | Smythe  | JOB001
...
13000

Dates

ID_Date | ID_User | DateType | DateAssigned | JobNumber
1       | 1       | Intent   | 21-Jun-2016  | JOB001
2       | 1       | Reg      | 21-Apr-2017  | JOB001
3       | 1       | Flight   | 21-May-2017  | JOB001
4       | 2       | Intent   | 09-Dec-2016  | JOB001
5       | 2       | Flight   | 01-Jan-2017  | JOB001
...
5000

The unique index is ID_User + DateType + JobNumber.

There can be any number of DateTypes.

When I run the query as follows, it takes a very long time.

select
  U.ID_User,
  U.Title,
  U.Firstname,
  U.Surname,
  U.JobNumber,
  DI.DateAssigned as Date_Intent,
  DR.DateAssigned as Date_Reg,
  DF.DateAssigned as Date_Flight
from
  User as U
  left join Dates as DI on U.ID_User = DI.ID_User
    and DI.JobNumber = "JOB001"
    and DI.DateType = "Intent"
  left join Dates as DR on U.ID_User = DR.ID_User
    and DR.JobNumber = "JOB001"
    and DR.DateType = "Reg"
  left join Dates as DF on U.ID_User = DF.ID_User
    and DF.JobNumber = "JOB001"
    and DF.DateType = "Flight"
where
  U.JobNumber = "JOB001"
order by
  U.Surname,
  U.Firstname;

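There are only 300 people in each JobNumber, and at most, say, 5 different date types.

Why does it take so long? We are talking about 2 minutes.

Is there another way to write this?

Dates table: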

CREATE TABLE `ATL_V2_Assigned_Dates` (
  `ID_Date` bigint(7) unsigned NOT NULL AUTO_INCREMENT,
  `JobNumber` varchar(10) NOT NULL DEFAULT '',
  `ID_User` bigint(7) unsigned NOT NULL DEFAULT '0',
  `DateAssigned` datetime NOT NULL,
  `DateType` varchar(100) NOT NULL,
  `Comment` text NOT NULL,
  `Updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `Inserted` datetime NOT NULL,
  PRIMARY KEY (`ID_Date`),
  UNIQUE KEY `ID_Date` (`ID_Date`) USING BTREE,
  UNIQUE KEY `unq_idx` (`JobNumber`,`ID_User`,`DateType`) USING BTREE,
  KEY `JobNumber` (`JobNumber`) USING BTREE,
  KEY `ID_User` (`ID_User`) USING BTREE,
  KEY `DateType` (`DateType`) USING BTREE
) ENGINE=MyISAM AUTO_INCREMENT=3975 DEFAULT CHARSET=utf8;

Update, 12 January 2017

Very strange: the query now runs in 0.06 seconds. Here is the output from:

explain select
  U.ID_User,
  U.Title,
  U.Firstname,
  U.Surname,
  U.JobNumber,
  DI.DateAssigned as Date_Intent,
  DR.DateAssigned as Date_Reg,
  DF.DateAssigned as Date_Flight
from
  ATL_Users as U
  left join ATL_V2_Assigned_Dates as DI on U.ID_User = DI.ID_User
    and DI.JobNumber = "ACI001"
    and DI.DateType = "Deadline - Intention"
  left join ATL_V2_Assigned_Dates as DR on U.ID_User = DR.ID_User
    and DR.JobNumber = "ACI001"
    and DR.DateType = "Event - Registration"
  left join ATL_V2_Assigned_Dates as DF on U.ID_User = DF.ID_User
    and DF.JobNumber = "ACI001"
    and DF.DateType = "Deadline - Flight"
where
  U.JobNumber = "ACI001"
order by
  U.Surname,
  U.Firstname;

+----+-------------+-------+--------+------------------------------------+-----------+---------+------------------------------------+------+----------------------------------------------------+
| id | select_type | table | type   | possible_keys                      | key       | key_len | ref                                | rows | Extra                                              |
+----+-------------+-------+--------+------------------------------------+-----------+---------+------------------------------------+------+----------------------------------------------------+
|  1 | SIMPLE      | U     | ref    | JobNumber                          | JobNumber | 32      | const                              |  506 | Using index condition; Using where; Using filesort |
|  1 | SIMPLE      | DI    | eq_ref | unq_idx,JobNumber,ID_User,DateType | unq_idx   | 342     | const,cclliveo_atl.U.ID_User,const |    1 | Using where                                        |
|  1 | SIMPLE      | DR    | eq_ref | unq_idx,JobNumber,ID_User,DateType | unq_idx   | 342     | const,cclliveo_atl.U.ID_User,const |    1 | Using where                                        |
|  1 | SIMPLE      | DF    | eq_ref | unq_idx,JobNumber,ID_User,DateType | unq_idx   | 342     | const,cclliveo_atl.U.ID_User,const |    1 | Using where                                        |
+----+-------------+-------+--------+------------------------------------+-----------+---------+------------------------------------+------+----------------------------------------------------+

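I don't know what I/we did. Could someone point me to whoever you think provided the answer, and I'll tick it? Thanks, everyone.

4 answers:

Answer 0 (score: 0)

You are probably missing the appropriate indexes. Try: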

create index idx_user on User (jobnumber, id_user);
create index idx_dates on Dates (jobnumber, datetype, id_user, dateassigned);
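
Before adding indexes, it may be worth checking which ones already exist and how the optimizer uses them. The following is a minimal sketch, assuming the real table names shown in the question's update (ATL_Users and ATL_V2_Assigned_Dates). Note that the posted schema already has unq_idx on (JobNumber, ID_User, DateType), so a new index on the dates table would mainly add DateAssigned as a covering column.

```sql
-- List the indexes currently defined on each table.
SHOW INDEX FROM ATL_Users;
SHOW INDEX FROM ATL_V2_Assigned_Dates;

-- Inspect how a single join is resolved. In the output, the "type" and "key"
-- columns show whether an index lookup is used for each table.
EXPLAIN
SELECT U.ID_User, DI.DateAssigned
FROM ATL_Users AS U
LEFT JOIN ATL_V2_Assigned_Dates AS DI
       ON DI.ID_User = U.ID_User
      AND DI.JobNumber = 'ACI001'
      AND DI.DateType = 'Deadline - Intention'
WHERE U.JobNumber = 'ACI001';
```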

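Answer 1 (score: 0)

This is the best way to join the same table; I'm not sure about the time it takes. Even if you were querying through 30,000 records, it should not take 2 minutes. It must be caused by some other problem, such as multiple connections to the database.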
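
One way to look for that kind of interference is to inspect the server's connections while the slow query is running. The table in the question uses MyISAM, which locks whole tables, so a concurrent write could plausibly hold up reads. That is an assumption about the cause, not something the question confirms. A minimal sketch:

```sql
-- Run from a second session while the slow query is executing.
-- The State column shows what each connection is currently doing,
-- including whether it is waiting on a lock.
SHOW FULL PROCESSLIST;

-- Number of currently open connections.
SHOW STATUS LIKE 'Threads_connected';
```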

Answer 2 (score: 0)

You could try conditional aggregation to avoid all of those joins. Given

drop table if exists userjobs;
create table userjobs (ID_User int, Title varchar(10), Firstname varchar(10), Surname varchar(10), JobNumber varchar(10));
insert into userjobs values
(1       , 'Mr'   ,  'Bob'  ,      'Smith'  ,  'JOB001'),
(2       , 'Mrs'  ,  'Bobbi',      'Smythe' ,  'JOB001');


drop table if exists jobdates;
create table jobdates(ID_Date int, ID_User int, DateType varchar(10), DateAssigned date, JobNumber varchar(10));
insert into jobdates values
(1       , 1       , 'Intent'   , '2016-06-21'  , 'JOB001'),
(2       , 1       , 'Reg'      , '2017-04-21'  , 'JOB001'),
(3       , 1       , 'Flight'   , '2017-05-21'  , 'JOB001'),
(4       , 2       , 'Intent'   , '2016-12-09'  , 'JOB001'),
(5       , 2       , 'Flight'   , '2017-01-01'  , 'JOB001');

MariaDB [sandbox]> select
    ->   u.ID_User,
    ->   Title,
    ->   Firstname,
    ->   Surname,
    ->   u.JobNumber,
    ->   max(case when datetype = 'Intent' then dateassigned else null end) as intent,
    ->   max(case when datetype = 'Reg' then dateassigned else null end) as reg,
    ->   max(case when datetype = 'Flight' then dateassigned else null end) as flight
    -> from
    ->   userjobs as u
    -> left join jobdates as jd on u.ID_User = jd.ID_User
    ->     and jd.JobNumber = u.JobNumber
    -> where u.JobNumber = 'JOB001'
    -> group by   u.ID_User,
    ->   Title,
    ->   Firstname,
    ->   Surname,
    ->   u.JobNumber;
+---------+-------+-----------+---------+-----------+------------+------------+------------+
| ID_User | Title | Firstname | Surname | JobNumber | intent     | reg        | flight     |
+---------+-------+-----------+---------+-----------+------------+------------+------------+
|       1 | Mr    | Bob       | Smith   | JOB001    | 2016-06-21 | 2017-04-21 | 2017-05-21 |
|       2 | Mrs   | Bobbi     | Smythe  | JOB001    | 2016-12-09 | NULL       | 2017-01-01 |
+---------+-------+-----------+---------+-----------+------------+------------+------------+
2 rows in set (0.00 sec)
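
Applied to the real tables from the question's update, the same idea might look like the sketch below. The table names, column names, and DateType values are taken from the question; the alias D is only illustrative. Because unq_idx allows at most one row per (JobNumber, ID_User, DateType), each MAX simply picks out that single value, or NULL when the user has no date of that type.

```sql
SELECT
  U.ID_User,
  U.Title,
  U.Firstname,
  U.Surname,
  U.JobNumber,
  -- One pivoted column per date type; CASE without ELSE yields NULL.
  MAX(CASE WHEN D.DateType = 'Deadline - Intention' THEN D.DateAssigned END) AS Date_Intent,
  MAX(CASE WHEN D.DateType = 'Event - Registration' THEN D.DateAssigned END) AS Date_Reg,
  MAX(CASE WHEN D.DateType = 'Deadline - Flight'    THEN D.DateAssigned END) AS Date_Flight
FROM ATL_Users AS U
LEFT JOIN ATL_V2_Assigned_Dates AS D
       ON D.ID_User = U.ID_User
      AND D.JobNumber = U.JobNumber
WHERE U.JobNumber = 'ACI001'
GROUP BY U.ID_User, U.Title, U.Firstname, U.Surname, U.JobNumber
ORDER BY U.Surname, U.Firstname;
```

This reads the dates table once instead of once per date type, at the cost of a GROUP BY over the joined rows.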

Answer 3 (score: 0)

U needs INDEX(JobNumber, Surname, Firstname). This should cover both the WHERE and the ORDER BY, thereby avoiding the 'filesort'.
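
As a rough sketch, that index could be added as follows. The table name ATL_Users comes from the question's update; the index name idx_job_surname_firstname is made up for illustration. If Surname or Firstname are very long columns, the index may hit the engine's key-length limit, which the question does not give enough detail to rule out.

```sql
-- Composite index: equality column (JobNumber) first, then the ORDER BY columns.
ALTER TABLE ATL_Users
  ADD INDEX idx_job_surname_firstname (JobNumber, Surname, Firstname);
```

Re-running the EXPLAIN from the question afterwards would show whether "Using filesort" disappears from the Extra column for U.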

For Dates, you have UNIQUE(ID_User, DateType, JobNumber), right? Let's remove the id from that table and replace the UNIQUE with

PRIMARY KEY(JobNumber, ID_User, DateType)

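This will make the lookups more efficient, because the bottom of the BTree will contain DateAssigned. Because of the "clustering" of the PK, the three rows needed will be adjacent.

Unless you have other queries (reads or modifications) that touch Dates, there should be no other indexes on that table.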
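
A sketch of what that change could look like on the table posted in the question is shown below. Two assumptions go beyond the text: primary-key clustering is an InnoDB behaviour and the posted table is MyISAM, so the sketch also switches the engine; and it assumes nothing else (application code, other tables) depends on ID_Date or on the single-column indexes. Take a backup and try it on a copy first.

```sql
-- Rebuilds the table in one statement.
ALTER TABLE ATL_V2_Assigned_Dates
  DROP COLUMN `ID_Date`,   -- the old PRIMARY KEY and the duplicate UNIQUE KEY on ID_Date go with it
  DROP INDEX `unq_idx`,    -- replaced by the new primary key below
  DROP INDEX `JobNumber`,
  DROP INDEX `ID_User`,
  DROP INDEX `DateType`,
  ADD PRIMARY KEY (`JobNumber`, `ID_User`, `DateType`),
  ENGINE = InnoDB;         -- assumption: needed for the clustered primary key
```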

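How big are these tables? You do realize that you will be reading them in full. But my suggestion will result in each row being read only once, rather than many times.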
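
To answer the size question, something like the following could be run in the database that holds the tables. It is a sketch using the table names from the question's update. TABLE_ROWS is exact for MyISAM but only an estimate for InnoDB.

```sql
-- Row counts and on-disk sizes (in bytes) for the two tables in the current database.
SELECT TABLE_NAME, ENGINE, TABLE_ROWS, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('ATL_Users', 'ATL_V2_Assigned_Dates');
```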