Optimizing LEFT JOINs with IN clauses and many IDs in MySQL

Time: 2017-11-28 15:32:12

Tags: mysql join

I have two tables: item and status.

For each item, I need to aggregate data from the status table under different conditions on the fields ingredientId, status, and exemptionIds, so the query needs several LEFT JOINs.

I'm running into a performance problem: processing 500 rows takes about 7.5 seconds on a modern CPU with an SSD.

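Strangely, if I comment out the last JOIN it takes about 1.2 s, and if I comment out the last 2 JOINs it takes about 0.7 s. I expected each additional JOIN to increase the time linearly, but that is not what happens here. I actually need to add more JOINs, so this is going to be a big problem.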

DESCRIBE confirms that the PRIMARY key (a composite index on docId, ingredientId) is used:

 # id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
'1', 'PRIMARY', '<derived2>', NULL, 'ALL', NULL, NULL, NULL, NULL, '500', '100.00', 'Using temporary; Using filesort'
'1', 'PRIMARY', 'aaa_psn', NULL, 'ref', 'PRIMARY', 'PRIMARY', '4', 'i.docId', '139', '100.00', 'Using where'
'1', 'PRIMARY', 'aaa_psu', NULL, 'ref', 'PRIMARY', 'PRIMARY', '4', 'i.docId', '139', '100.00', 'Using where'
'1', 'PRIMARY', 'aaa_psu2', NULL, 'ref', 'PRIMARY', 'PRIMARY', '4', 'i.docId', '139', '100.00', 'Using where; Using index'
'1', 'PRIMARY', 'aaa_pse', NULL, 'ref', 'PRIMARY', 'PRIMARY', '4', 'i.docId', '139', '100.00', 'Using where'
'1', 'PRIMARY', 'bbb_psn', NULL, 'ref', 'PRIMARY', 'PRIMARY', '4', 'i.docId', '139', '100.00', 'Using where'
'1', 'PRIMARY', 'bbb_psu', NULL, 'ref', 'PRIMARY', 'PRIMARY', '4', 'i.docId', '139', '100.00', 'Using where'
'1', 'PRIMARY', 'bbb_psu2', NULL, 'ref', 'PRIMARY', 'PRIMARY', '4', 'i.docId', '139', '100.00', 'Using where; Using index'
'1', 'PRIMARY', 'bbb_pse', NULL, 'ref', 'PRIMARY', 'PRIMARY', '4', 'i.docId', '139', '100.00', 'Using where'
'2', 'DERIVED', 'i', NULL, 'ALL', NULL, NULL, NULL, NULL, '2378132', '100.00', NULL

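Any ideas on how to improve this? Given this data model, what would be a better way to write the query? Or does the data model itself need to change?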

The status table has about 91.2M rows and the item table has about 2.4M rows. Each item has at most 100 entries in the status table.

Here is the query:

select i.*
,coalesce(
    if (count(aaa_sn.docId) > 0, 'no', null),
    if (count(aaa_su.docId) > 0, 'unknown', null),
    if (count(aaa_su2.docId) < 6, 'unknown', null),
    if (count(aaa_se.docId) > 0, 'exempt', null),
    'yes'
) as aaaCheck
,coalesce(
    if (count(bbb_sn.docId) > 0, 'no', null),
    if (count(bbb_su.docId) > 0, 'unknown', null),
    if (count(bbb_su2.docId) < 24, 'unknown', null),
    if (count(bbb_se.docId) > 0, 'exempt', null),
    'yes'
) as bbbCheck

from (
    select i.id, i.docId from item i limit 100
) i

left join status aaa_sn on aaa_sn.docId = i.docId and aaa_sn.ingredientId IN (1,2,3,4,5,6)
and (aaa_sn.status = 'no' OR (aaa_sn.status = 'exempt' and aaa_sn.exemptionIds NOT IN (29,38,46,162,167,179,180,182,190,191,192,194,202,206,216,234,163,215,216,123,124,125,126,127,128,129,130,131,132,133,136,137,138,139,140,141,142,143,144,145,146,147,149,150,179,182,183,205,220,222,229,230,11,12,23,29,33,37,39,40,41,42,45,151,152,153,154,155,158,159,164,166,167,171,172,178,179,180,181,182,184,185,186,187,188,189,192,193,194,195,196,197,199,200,201,203,207,208,209,210,211,212,213,214,216,217,218,219,221,223,224,225,226,227,228)))

left join status aaa_su on aaa_su.docId = i.docId and aaa_su.ingredientId IN (1,2,3,4,5,6) and aaa_su.status = 'unknown'

left join status aaa_su2 on aaa_su2.docId = i.docId and aaa_su2.ingredientId IN (1,2,3,4,5,6)

left join status aaa_se on aaa_se.docId = i.docId and aaa_se.ingredientId IN (1,2,3,4,5,6)
and aaa_se.status = 'exempt' and aaa_se.exemptionIds IN (29,38,46,162,167,179,180,182,190,191,192,194,202,206,216,234,163,215,216,123,124,125,126,127,128,129,130,131,132,133,136,137,138,139,140,141,142,143,144,145,146,147,149,150,179,182,183,205,220,222,229,230,11,12,23,29,33,37,39,40,41,42,45,151,152,153,154,155,158,159,164,166,167,171,172,178,179,180,181,182,184,185,186,187,188,189,192,193,194,195,196,197,199,200,201,203,207,208,209,210,211,212,213,214,216,217,218,219,221,223,224,225,226,227,228)



left join status bbb_sn on bbb_sn.docId = i.docId and bbb_sn.ingredientId IN (19,22,23,25,27,28,29,30,31,33,35,38,43,44,45,60,115,163,164,192,324,325,366,367)
and (bbb_sn.status = 'no' OR (bbb_sn.status = 'exempt' and bbb_sn.exemptionIds NOT IN (48,235,47,239,235,48,48,239,235,235,238,236,237,239)))

left join status bbb_su on bbb_su.docId = i.docId and bbb_su.ingredientId IN (19,22,23,25,27,28,29,30,31,33,35,38,43,44,45,60,115,163,164,192,324,325,366,367) and bbb_su.status = 'unknown'

left join status bbb_su2 on bbb_su2.docId = i.docId and bbb_su2.ingredientId IN (19,22,23,25,27,28,29,30,31,33,35,38,43,44,45,60,115,163,164,192,324,325,366,367)

left join status bbb_se on bbb_se.docId = i.docId and bbb_se.ingredientId IN (19,22,23,25,27,28,29,30,31,33,35,38,43,44,45,60,115,163,164,192,324,325,366,367)
and bbb_se.status = 'exempt' and bbb_se.exemptionIds IN (48,235,47,239,235,48,48,239,235,235,238,236,237,239)
group by i.id

1 answer:

Answer 0 (score: 1)

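It looks like the query in the question generates a semi-Cartesian (partial cross) product: rows from status get matched against other rows from status for the same docId. That multiplies the number of intermediate rows with every extra join and can inflate the counts.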
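To make the row multiplication concrete, here is a minimal sketch on two toy tables. The names demo_item and demo_status and their contents are made up for illustration; they are not the tables from the question.

```sql
-- Hypothetical toy tables, used only to illustrate row multiplication.
CREATE TABLE demo_item   (id INT PRIMARY KEY, docId INT);
CREATE TABLE demo_status (docId INT, ingredientId INT, status VARCHAR(10),
                          PRIMARY KEY (docId, ingredientId));

INSERT INTO demo_item   VALUES (1, 10);
INSERT INTO demo_status VALUES (10, 1, 'unknown'), (10, 2, 'unknown'), (10, 3, 'yes');

-- Alias a matches 2 rows and alias b matches 3 rows for docId 10,
-- so the join produces 2 * 3 = 6 rows before GROUP BY.
SELECT i.id
     , COUNT(a.docId) AS cnt_a   -- 6, although only 2 rows have status 'unknown'
     , COUNT(b.docId) AS cnt_b   -- 6, although only 3 status rows exist
  FROM demo_item i
  LEFT JOIN demo_status a ON a.docId = i.docId AND a.status = 'unknown'
  LEFT JOIN demo_status b ON b.docId = i.docId
 GROUP BY i.id;
```

With eight joins to status, the number of intermediate rows per item is the product of the eight match counts, which would be consistent with the non-linear slowdown described in the question. It also means a test such as count(...) < 6 is applied to an inflated number.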

I suspect we only need to join to the status table once, matching on docId, and then run each row through conditional tests in expressions in the SELECT list.

As a simplified example of this approach (with no aggregation introduced yet), consider:

SELECT i.id
     , i.docid

     , s.ingredientId
     , s.status
     , s.exemptionIds

     , IF( s.ingredientId IN (1,2,3,4,5,6) AND s.status = 'unknown' ,1,0) AS aaa_su

     , IF( s.ingredientId IN (1,2,3,4,5,6)                          ,1,0) AS aaa_su2

  FROM ( SELECT j.id
              , j.docid
           FROM item j
          ORDER BY j.docid, j.id
          LIMIT 100
       ) i
  LEFT
  JOIN status s
    ON s.docid = i.docid
 ORDER BY i.id, i.docid

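For each "matching" row from status, the IF() functions are evaluated. The first argument is evaluated as a boolean; if it is TRUE, the function returns the second argument, otherwise it returns the third.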
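As a quick standalone illustration of that behavior, including the NULL case that comes up when the LEFT JOIN finds no matching status row, something like this can be run on its own:

```sql
-- IF(expr1, expr2, expr3) returns expr2 only when expr1 is TRUE
-- (non-zero and not NULL); otherwise it returns expr3.
SELECT IF(1 < 2, 1, 0) AS when_true    -- condition is TRUE  -> 1
     , IF(1 > 2, 1, 0) AS when_false   -- condition is FALSE -> 0
     , IF(NULL,  1, 0) AS when_null;   -- NULL is not TRUE   -> 0
```

So an item with no rows in status simply contributes 0 to each check.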

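I included only the two simpler checks in this query; I left out the more complicated expressions just to demonstrate how this works. (We can extend the pattern by adding more IF() expressions to the SELECT list for the other checks.)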

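I also included some of the columns from s that are tested in the conditions, so we can verify that we get 1 and 0 where we expect them. (This is especially helpful with more complicated conditions that combine AND and OR, to verify that the checks behave the way we intend.)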

The next step is to add a GROUP BY clause and wrap those IF() expressions in an aggregate function such as SUM().

Having the expressions return 1 and 0 makes SUM() convenient if what we want is a "count" of rows.

SELECT i.id
     , i.docid

     , SUM(IF( s.ingredientId IN (1,2,3,4,5,6) AND s.status = 'unknown' ,1,0)) AS cnt_aaa_su

     , SUM(IF( s.ingredientId IN (1,2,3,4,5,6)                          ,1,0)) AS cnt_aaa_su2

  FROM ( SELECT j.id
              , j.docid
           FROM item j
          ORDER BY j.docid, j.id
          LIMIT 100
       ) i
  LEFT
  JOIN status s
    ON s.docid = i.docid
 GROUP BY i.id, i.docid

If we want to use COUNT() instead of SUM(), we can return any non-NULL value as the second argument, and we need to return NULL as the third argument, for example:

     , COUNT(IF( s.ingredientId IN (1,2,3,4,5,6) ,'x',NULL)) AS aaa_su2
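
Putting the pieces together, the aaaCheck column from the question might be written with a single join along the following lines. This is only a sketch under some assumptions: the exemption list (29,38,46) is an abbreviated placeholder for the full list in the question, and CASE is used here in place of the original COALESCE(IF(...)) chain, testing the conditions in the same order.

```sql
-- Sketch only: (29,38,46) is an abbreviated placeholder;
-- substitute the full exemption list from the question.
SELECT i.id
     , i.docId
     , CASE
         WHEN SUM(IF( s.ingredientId IN (1,2,3,4,5,6)
                      AND ( s.status = 'no'
                            OR ( s.status = 'exempt'
                                 AND s.exemptionIds NOT IN (29,38,46) ) )
                    ,1,0)) > 0 THEN 'no'
         WHEN SUM(IF( s.ingredientId IN (1,2,3,4,5,6)
                      AND s.status = 'unknown' ,1,0)) > 0 THEN 'unknown'
         WHEN SUM(IF( s.ingredientId IN (1,2,3,4,5,6) ,1,0)) < 6 THEN 'unknown'
         WHEN SUM(IF( s.ingredientId IN (1,2,3,4,5,6)
                      AND s.status = 'exempt'
                      AND s.exemptionIds IN (29,38,46) ,1,0)) > 0 THEN 'exempt'
         ELSE 'yes'
       END AS aaaCheck
  FROM ( SELECT j.id
              , j.docId
           FROM item j
          ORDER BY j.docId, j.id
          LIMIT 100
       ) i
  LEFT
  JOIN status s
    ON s.docId = i.docId
 GROUP BY i.id, i.docId
```

The bbbCheck column would follow the same pattern with its own ingredient and exemption lists and a threshold of 24. Because each status row is now counted once, the result can differ from the original query wherever the original counts were inflated by the repeated joins, so it is worth comparing both on a handful of items.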