Reducing the row count of a CROSS JOIN + LEFT JOIN when there is no link

Time: 2015-07-01 19:08:40

Tags: mysql join cross-join

I've been struggling with this problem for a while. I have an SQLFiddle with approximately the same contents as this question.

I have three tables: items, profiles, and posts, a junction table keyed on the first two. The schema, with sample data:

create table items ( 
  item_id int unsigned primary key auto_increment,
  title varchar(255)
);

insert into items (item_id, title) VALUES(1, 'Item One');
insert into items (item_id, title) VALUES(2, 'Item Two');
insert into items (item_id, title) VALUES(3, 'Item Three');
insert into items (item_id, title) VALUES(4, 'Item Four');
insert into items (item_id, title) VALUES(5, 'Item Five');

create table profiles (
  profile_id int unsigned primary key auto_increment,
  profile_name varchar(255)
);

insert into profiles (profile_id, profile_name) VALUES(1, 'Bob');
insert into profiles (profile_id, profile_name) VALUES(2, 'Mark');
insert into profiles (profile_id, profile_name) VALUES(3, 'Nancy');

create table posts (
  post_id int unsigned primary key auto_increment,
  item_id int unsigned, -- Relates to items.item_id
  profile_id int unsigned, -- Relates to profiles.profile_id
  post_date DATETIME
);

insert into posts (item_id, profile_id, post_date) values(1, 1, NOW());
insert into posts (item_id, profile_id, post_date) values(2, 1, NOW());
insert into posts (item_id, profile_id, post_date) values(2, 2, NOW());
insert into posts (item_id, profile_id, post_date) values(3, 3, NOW());

I'm using the following query, which produces almost the right result:

SELECT 
  `items`.`item_id`,
  `items`.`title`,
  `profiles`.`profile_id`,
  `profiles`.`profile_name`,
  `posts`.`post_id`, 
  `posts`.`post_date`
  FROM `items` 
  CROSS JOIN `profiles`
  LEFT JOIN  `posts` ON `items`.`item_id` = `posts`.`item_id`
    AND `posts`.`profile_id` = `profiles`.`profile_id`;

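For my particular application this is suboptimal. I get a lot of extra rows that my implementation doesn't need. The final result looks like this: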

+------------+------------+---------+-----------+
| Item Name  | Profile ID | Post ID | Post Date |
+------------+------------+---------+-----------+
| Item One   | 1          | 1       | 2015-...  | -- Bob Posted this
| Item One   | 2          | NULL    | NULL      | -- No one else did
| Item One   | 3          | NULL    | NULL      |
| Item Two   | 1          | 2       | 2015-...  | -- Bob posted this
| Item Two   | 2          | 3       | 2015-...  | -- So did mark
| Item Two   | 3          | NULL    | NULL      | -- Nancy didn't
| Item Three | 1          | NULL    | NULL      | 
| Item Three | 2          | NULL    | NULL      |
| Item Three | 3          | 4       | 2015-...  | -- Only nancy posted #3
| Item Four  | 1          | NULL    | NULL      | -- No one posted #4
| Item Four  | 2          | NULL    | NULL      |
| Item Four  | 3          | NULL    | NULL      | 
| Item Five  | 1          | NULL    | NULL      | -- No one posted #5
| Item Five  | 2          | NULL    | NULL      |
| Item Five  | 3          | NULL    | NULL      | 
+------------+------------+---------+-----------+

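This does exactly what I asked for: each item is returned three times (once per profile). However, when items #4 and #5 have no links, I would like them to be returned only once, with a NULL profile_id, like this: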

+------------+------------+---------+-----------+
| Item Name  | Profile ID | Post ID | Post Date |
+------------+------------+---------+-----------+
| Item One   | 1          | 1       | 2015-...  | -- Bob Posted this
| Item One   | 2          | NULL    | NULL      | -- No one else did
| Item One   | 3          | NULL    | NULL      |
| Item Two   | 1          | 2       | 2015-...  | -- Bob posted this
| Item Two   | 2          | 3       | 2015-...  | -- So did mark
| Item Two   | 3          | NULL    | NULL      | -- Nancy didn't
| Item Three | 1          | NULL    | NULL      | 
| Item Three | 2          | NULL    | NULL      |
| Item Three | 3          | 4       | 2015-...  | -- Nancy posted #3
| Item Four  | NULL       | NULL    | NULL      | -- No one posted #4 or #5,
| Item Five  | NULL       | NULL    | NULL      | -- so #4 and #5 are needed only once
+------------+------------+---------+-----------+

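In this example that only saves 4 rows, but in my actual application there are many items and comparatively few profiles and posts. So this small change could significantly reduce the processing needed in the server-side language.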
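
To make the arithmetic concrete, here is a rough Python sketch of the two row counts. The function names and the 10,000-item scenario are hypothetical (the real table sizes are not given here), and it assumes at most one post per item/profile pair, as in the sample data.

```python
def cross_join_rows(items: int, profiles: int) -> int:
    """Rows from CROSS JOIN + LEFT JOIN: every item paired with every profile."""
    return items * profiles


def reduced_rows(items: int, profiles: int, linked_items: int) -> int:
    """Rows when unlinked items appear once and linked items once per profile."""
    return linked_items * profiles + (items - linked_items)


# Sample data from this question: 5 items, 3 profiles, 3 items with posts.
sample_saved = cross_join_rows(5, 3) - reduced_rows(5, 3, 3)

# Hypothetical larger case: 10,000 items, 3 profiles, 50 items with posts.
hypothetical_saved = cross_join_rows(10_000, 3) - reduced_rows(10_000, 3, 50)

print(sample_saved, hypothetical_saved)
```

The saving grows with the number of unlinked items, since each of them drops from one row per profile to a single row.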

Can someone point me in the right direction for restricting the cross join to only the items that have some kind of link?

1 Answer:

Answer 0: (score: 2)

SELECT  `items`.`item_id`,
        `items`.`title`,
        `profiles`.`profile_id`,
        `profiles`.`profile_name`,
        `posts`.`post_id`, 
        `posts`.`post_date`
FROM    `items` 
LEFT JOIN
        `profiles`
ON      EXISTS
        (
        SELECT  NULL
        FROM    `posts`
        WHERE   `posts`.`item_id` = `items`.`item_id`
        )
LEFT JOIN
        `posts`
ON      `items`.`item_id` = `posts`.`item_id`
        AND `posts`.`profile_id` = `profiles`.`profile_id`
ORDER BY
        `items`.`item_id`, `profiles`.`profile_id`

http://sqlfiddle.com/#!9/c81b1/41
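
For anyone who wants to check the row counts without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite standing in for MySQL is an assumption made only so the example runs locally; the table definitions are simplified, the column list is trimmed, and the sample posts follow the expected-output table in the question (four posts).

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption, see above).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items    (item_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE profiles (profile_id INTEGER PRIMARY KEY, profile_name TEXT);
    CREATE TABLE posts    (post_id INTEGER PRIMARY KEY AUTOINCREMENT,
                           item_id INTEGER, profile_id INTEGER, post_date TEXT);
    INSERT INTO items VALUES (1,'Item One'),(2,'Item Two'),(3,'Item Three'),
                             (4,'Item Four'),(5,'Item Five');
    INSERT INTO profiles VALUES (1,'Bob'),(2,'Mark'),(3,'Nancy');
    INSERT INTO posts (item_id, profile_id, post_date) VALUES
        (1,1,datetime('now')),(2,1,datetime('now')),
        (2,2,datetime('now')),(3,3,datetime('now'));
""")

# The question's query: every item paired with every profile.
CROSS_QUERY = """
    SELECT items.title, profiles.profile_id, posts.post_id
    FROM items
    CROSS JOIN profiles
    LEFT JOIN posts ON items.item_id = posts.item_id
                   AND posts.profile_id = profiles.profile_id
    ORDER BY items.item_id, profiles.profile_id
"""

# The answer's query: profiles are joined only for items with at least one post.
EXISTS_QUERY = """
    SELECT items.title, profiles.profile_id, posts.post_id
    FROM items
    LEFT JOIN profiles
           ON EXISTS (SELECT NULL FROM posts
                      WHERE posts.item_id = items.item_id)
    LEFT JOIN posts ON items.item_id = posts.item_id
                   AND posts.profile_id = profiles.profile_id
    ORDER BY items.item_id, profiles.profile_id
"""

cross_rows = conn.execute(CROSS_QUERY).fetchall()
exists_rows = conn.execute(EXISTS_QUERY).fetchall()

print(len(cross_rows), len(exists_rows))
for row in exists_rows:
    print(row)
```

The ON EXISTS condition does not depend on the profiles row at all, so for a given item it is either true for every profile (the item behaves as in a cross join) or false for all of them (the LEFT JOIN then keeps the item once with a NULL profile).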