MySQL: self-joining a table with millions of rows

Date: 2021-01-26 07:41:01

Tags: mysql join

I have a table (SF_COLLECTIONS) with millions of rows:

ID MEMBERID COLLECTIONID CARDID STATE (D / M)
1  1        1            1      D
2  1        1            2      D
3  2        1            1      M
4  2        1            2      M
5  2        1            3      D
6  1        1            3      M

I need to match the rows that have MEMBERID = 1 and STATE = D against the rows that have MEMBERID = 2 and STATE = M (on the same CARDID), and vice versa.

This is my query:

SELECT 1
    FROM sf_collections AS rac
    INNER JOIN sf_collections AS myrac 
        ON 
        (myrac.cardid = rac.cardid AND 
            (
                (myrac.state = "M" AND rac.state = "D") OR 
                (myrac.state = "D" AND rac.state = "M")
            )
        ) 
    WHERE
    rac.memberid = 1 AND myrac.memberid = 2
    GROUP BY rac.memberid

(Response time is about 4 seconds.)

Is this an efficient approach, or is there a better way to improve performance?

Sample data set:

CREATE TABLE `sf_collections` (
 `id` int(11) NOT NULL auto_increment,
 `memberid` int(11) NOT NULL,
 `collectionid` int(11) NOT NULL,
 `cardid` int(11) NOT NULL,
 `state` varchar(1) NOT NULL,
 PRIMARY KEY  (`id`),
 UNIQUE KEY `sf_collections_pkey` (`memberid`,`collectionid`,`cardid`,`state`),
 KEY `collectionid` (`collectionid`),
 KEY `memberid` (`memberid`),
 KEY `cardid` (`cardid`),
 KEY `state` (`state`)
) ENGINE=MyISAM AUTO_INCREMENT=22627806 DEFAULT CHARSET=latin1

INSERT INTO sf_collections (memberid,collectionid,cardid,state) VALUES
(1,1,1,'D'),
(1,1,2,'D'),
(1,1,3,'M'),
(2,1,1,'M'),
(2,1,2,'M'),
(2,1,3,'D');

SELECT 1
    FROM sf_collections AS rac
    INNER JOIN sf_collections AS myrac 
        ON 
        (myrac.cardid = rac.cardid AND 
            (
                (myrac.state = "M" AND rac.state = "D") OR 
                (myrac.state = "D" AND rac.state = "M")
            )
        ) 
    WHERE
    rac.memberid = 1 AND myrac.memberid = 2
    GROUP BY rac.memberid

and db-fiddle

Thanks

Edit: MySQL is version 5.0 (very old, and it cannot be upgraded).

1 Answer:

Answer 0 (score: 0):

Replace INDEX(cardid) with INDEX(cardid, state).
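A minimal sketch of that index change, using the index name `cardid` from the CREATE TABLE above. The new index name `cardid_state` is an assumption chosen for illustration. On a table this size the ALTER rebuilds the table and may take a while, so it is worth trying on a copy first.

```sql
-- Swap the single-column index on cardid for a composite one, so the
-- join on cardid can also check state from the index.
-- The name cardid_state is hypothetical; any unused name works.
ALTER TABLE sf_collections
    DROP INDEX cardid,
    ADD INDEX cardid_state (cardid, state);
```

Running EXPLAIN on the original query before and after should show whether the optimizer actually picks the new index.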

If feasible, do not check both D->M and M->D; check only one direction. That cuts the work in half.
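As an illustration, here is the question's query restricted to a single direction (member 1 has D, member 2 has M). Which direction to keep is an assumption; it only makes sense if the application does not need the reverse case.

```sql
-- One direction only: cards member 1 holds as 'D' that member 2 holds as 'M'.
-- The state tests move to the WHERE clause as plain equality conditions,
-- which removes the OR from the join condition.
SELECT 1
    FROM sf_collections AS rac
    INNER JOIN sf_collections AS myrac
        ON myrac.cardid = rac.cardid
    WHERE rac.memberid = 1 AND rac.state = 'D'
      AND myrac.memberid = 2 AND myrac.state = 'M'
    GROUP BY rac.memberid;
```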

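Avoid the SELECT ... GROUP BY ... by switching to EXISTS ( SELECT 1 ... ). This speeds things up when more than one row matches, because EXISTS can stop at the first match. If you will eventually want to "list the matches", let's see what you actually want rather than nitpicking the inappropriate use of GROUP BY. Would you use GROUP_CONCAT?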
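One way the EXISTS form could look, keeping both directions from the question. This is a sketch, not a drop-in replacement: the alias `has_match` is made up for illustration, and unlike the original query it always returns exactly one row (1 or 0) instead of returning no rows when nothing matches.

```sql
-- Returns 1 if at least one matching pair of rows exists, otherwise 0.
-- The subquery only needs to find one match, so there is nothing to group.
SELECT EXISTS (
    SELECT 1
        FROM sf_collections AS rac
        INNER JOIN sf_collections AS myrac
            ON myrac.cardid = rac.cardid
        WHERE rac.memberid = 1
          AND myrac.memberid = 2
          AND (
                (rac.state = 'D' AND myrac.state = 'M') OR
                (rac.state = 'M' AND myrac.state = 'D')
              )
) AS has_match;
```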

Migrate from MyISAM to InnoDB. Even in 5.0, this query is likely to be inherently faster.
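The engine change itself is a single statement; a sketch, assuming InnoDB is enabled on the server and there is enough disk space for the table rebuild:

```sql
-- Check that InnoDB is available on this server before converting.
SHOW ENGINES;

-- Rebuild the table as InnoDB. On millions of rows this copies the whole
-- table, so expect it to take some time.
ALTER TABLE sf_collections ENGINE=InnoDB;
```

InnoDB settings such as `innodb_buffer_pool_size` usually need tuning after a switch like this; suitable values depend on the machine and are not covered here.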

Drop INDEX(memberid), because the UNIQUE index already covers it (memberid is its leftmost column).

Do you need id? If not, get rid of it and promote the 4-column UNIQUE index to PRIMARY KEY.
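If id really is unused, the promotion could be sketched as one ALTER, using the key names from the CREATE TABLE above. This assumes no other table or application code refers to id, and it should be tried on a copy of the table first.

```sql
-- Remove the surrogate key and make the natural 4-column key the primary key.
-- All changes are done in one statement so the table is rebuilt only once.
ALTER TABLE sf_collections
    DROP PRIMARY KEY,
    DROP COLUMN id,
    DROP INDEX sf_collections_pkey,
    ADD PRIMARY KEY (memberid, collectionid, cardid, state);
```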

If state is just a flag (0/1), then INDEX(state) will probably never be used; drop it.
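The two index removals suggested above (memberid and state) can be sketched as follows, using the index names from the CREATE TABLE. Checking the cardinality first is an extra step added here, not part of the answer.

```sql
-- Inspect the existing indexes; a very low Cardinality for the state index
-- suggests the optimizer has little reason to use it.
SHOW INDEX FROM sf_collections;

-- Drop the redundant and the low-selectivity index in one table rebuild.
ALTER TABLE sf_collections
    DROP INDEX memberid,
    DROP INDEX state;
```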
