A more efficient alternative to the SQL IN clause

Time: 2014-04-26 15:23:37

Tags: mysql sql performance

I have these SQL queries, and they take a long time to complete. How can I speed them up?

t_intra_species_interaction has 60k rows and t_pathway has 100k. uniprot_id_1, uniprot_id_2, and uniprot_id are all of type varchar.

In this query, I want to select the uniprot_id_1 and uniprot_id_2 pairs where both IDs also appear in t_pathway:

select distinct uniprot_id_1,uniprot_id_2 from t_intra_species_interaction
where uniprot_id_1 in (select uniprot_id from t_pathway) and
  uniprot_id_2 in (select uniprot_id from t_pathway)

In this one, I want to select the uniprot_id values from t_pathway that appear in the set of uniprot_ids returned by the first query above:

select distinct uniprot_id,id from t_pathway as t
where uniprot_id in
(
    select distinct uniprot_id_2 from t_intra_species_interaction
    where uniprot_id_1 in (select uniprot_id from t_pathway) and
      uniprot_id_2 in (select uniprot_id from t_pathway)
    union
    select distinct uniprot_id_1 from t_intra_species_interaction
    where uniprot_id_1 in (select uniprot_id from t_pathway) and
      uniprot_id_2 in (select uniprot_id from t_pathway)
)

Thanks.

6 answers:

Answer 0: (score: 2)

You probably want to use an INNER JOIN:

select distinct i.uniprot_id_1, i.uniprot_id_2 from t_intra_species_interaction i
inner join t_pathway p1
    on p1.uniprot_id = i.uniprot_id_1
inner join t_pathway p2
    on p2.uniprot_id = i.uniprot_id_2

Answer 1: (score: 2)

EXISTS or JOIN will be more efficient.

Answer 2: (score: 2)

Try one of these:

select distinct uniprot_id_1,uniprot_id_2
from t_intra_species_interaction I
  join t_pathway P1
    on I.uniprot_id_1 = P1.uniprot_id
  join t_pathway P2
    on I.uniprot_id_2 = P2.uniprot_id

select distinct uniprot_id_1,uniprot_id_2
from t_intra_species_interaction I
where exists (select 1 from t_pathway where uniprot_id = I.uniprot_id_1)
  and exists (select 1 from t_pathway where uniprot_id = I.uniprot_id_2)

Answer 3: (score: 2)

The subqueries are identical, so they can be merged into one and moved into the join:

SELECT DISTINCT i.uniprot_id_1, i.uniprot_id_2 
FROM   t_intra_species_interaction i
       INNER JOIN t_pathway p ON p.uniprot_id IN (i.uniprot_id_1, i.uniprot_id_2)

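Second query

It would be better to open a new question that links to this one, but looking at my previous query, it should be easy to see that to get the second result you only need to select the columns from t_pathway instead of t_intra_species_interaction:

SELECT DISTINCT p.uniprot_id, p.id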
FROM   t_intra_species_interaction i
       INNER JOIN t_pathway p ON p.uniprot_id IN (i.uniprot_id_1, i.uniprot_id_2)
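
One caveat worth checking before adopting the join above: `p.uniprot_id IN (i.uniprot_id_1, i.uniprot_id_2)` matches when either column is found in t_pathway, while the queries in the question require both. The following is only a sketch of one way to keep the single join and still enforce the "both IDs exist" condition; the aliases p1 and p2 are my own, and I have not measured it against the real tables.

```sql
-- Sketch: t_pathway rows whose uniprot_id appears in an interaction pair
-- where BOTH members of the pair exist in t_pathway.
SELECT DISTINCT p.uniprot_id, p.id
FROM   t_intra_species_interaction i
       INNER JOIN t_pathway p
               ON p.uniprot_id IN (i.uniprot_id_1, i.uniprot_id_2)
WHERE  EXISTS (SELECT 1 FROM t_pathway p1
               WHERE p1.uniprot_id = i.uniprot_id_1)  -- first ID must exist
  AND  EXISTS (SELECT 1 FROM t_pathway p2
               WHERE p2.uniprot_id = i.uniprot_id_2); -- second ID must exist
```

If "either ID" is what you actually want, the shorter join without the WHERE clause is enough.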

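Answer 4: (score: 1)

Start with indexes in general:

Create three indexes: one on t_pathway.uniprot_id, one on t_intra_species_interaction.uniprot_id_1, and one on t_intra_species_interaction.uniprot_id_2.

This way, all the data the query needs is in the indexes, and it should be fast.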

Also convert your IN clauses to joins, as Tomas mentioned in his answer.
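
The three indexes described above could be created roughly as follows. This is a minimal sketch for MySQL: the index names are made up for illustration, and if the varchar columns are very long you may need a prefix length to stay within the index key size limit. The EXPLAIN at the end is just a way to check whether the optimizer actually uses the new indexes for the join version of the query.

```sql
-- Hypothetical index names; the column names come from the question.
CREATE INDEX idx_pathway_uniprot_id
    ON t_pathway (uniprot_id);
CREATE INDEX idx_interaction_uniprot_id_1
    ON t_intra_species_interaction (uniprot_id_1);
CREATE INDEX idx_interaction_uniprot_id_2
    ON t_intra_species_interaction (uniprot_id_2);

-- Inspect the plan: the "key" column shows which index, if any, is used.
EXPLAIN
SELECT DISTINCT i.uniprot_id_1, i.uniprot_id_2
FROM   t_intra_species_interaction i
       JOIN t_pathway p1 ON p1.uniprot_id = i.uniprot_id_1
       JOIN t_pathway p2 ON p2.uniprot_id = i.uniprot_id_2;
```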

Answer 5: (score: 1)

How about this:

select distinct uniprot_id_1, uniprot_id_2
from t_intra_species_interaction
where exists (select uniprot_id from t_pathway
              where uniprot_id_1 = uniprot_id) and
      exists (select uniprot_id from t_pathway
              where uniprot_id_2 = uniprot_id)