JOIN vs UNION vs IN() - big table and many WHERE conditions

Time: 2014-04-30 22:58:49

Tags: mysql sql

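I am using MySQL 5.5 and have 3 tables for testing:

  1. attributes (entity_id, cid, aid, value) - indexes: ALL
  2. items (entity_id, price, currency) - index: entity_id
  3. rates (currency_from, currency_to, rate) - indexes: none

I need to count the results for specified criteria (a search by attributes) and select X rows ordered by some columns. The query should support searching in item attributes (the attributes table).

I initially had a query like this: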

    SELECT i.entity_id, i.price * COALESCE(r.rate, 1) AS final_price 
    FROM items i
    JOIN attributes a ON a.entity_id = i.entity_id
    LEFT JOIN rates r ON i.currency = r.currency_from AND r.currency_to = 'EUR'
    WHERE a.cid = 4 AND ( (a.aid >= 10 AND a.value > 2000) OR (a.aid <= 10 AND a.value > 5) )
    HAVING final_price BETWEEN 0 AND 9000
    ORDER BY final_price DESC
    LIMIT 20
    

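But it is slow on big tables. The conditions can get larger (even up to 30 parameters), and CAST(a.value AS SIGNED) is sometimes used so that BETWEEN works (for range values).

For example: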

    SELECT 
          i.entity_id, 
          i.price * COALESCE(r.rate, 1) AS final_price 
       FROM 
          attributes a 
             JOIN items i
                ON a.entity_id = i.entity_id 
             LEFT JOIN rates r 
                ON i.currency = r.currency_from 
                AND r.currency_to = 'EUR'
       WHERE 
          a.cid = 4 AND ( 
    (a.aid = 10 AND CAST(a.value AS SIGNED) BETWEEN 2000 AND 2014) 
    OR (a.aid = 121 AND CAST(a.value AS SIGNED) BETWEEN 40 AND 60) 
    OR (a.aid = 45 AND CAST(a.value AS SIGNED) BETWEEN 770 AND 1500) 
    OR (a.aid = 95 AND CAST(a.value AS SIGNED) BETWEEN 12770 AND 15500) 
    OR (a.aid = 98 AND a.value = 'some value') 
    OR (a.aid = 199 AND a.value = 'some another value') 
    OR (a.aid = 102 AND a.value = 1)
    OR (a.aid = 112 AND a.value = 42) ) 
       GROUP BY
          i.entity_id
       HAVING 
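          COUNT(i.entity_id) = 8 
             AND final_price BETWEEN 0 AND 9000
       ORDER BY 
          final_price DESC
       LIMIT 20
    

I group by entity and require COUNT() to equal 8 (the number of attributes being searched, one per OR branch above), because I need to find items that have all of these attributes.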

EXPLAIN for the basic (first) query:

    id  select_type  table  type    possible_keys    key      key_len  ref                  rows    Extra
    1   SIMPLE       a      ALL     entity_id,value  NULL     NULL     NULL                 379999  Using where; Using temporary; Using filesort
    1   SIMPLE       i      eq_ref  PRIMARY          PRIMARY  4        testowa.a.entity_id  1       Using where
    1   SIMPLE       r      ALL     NULL             NULL     NULL     NULL                 2
    

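I have read many topics comparing UNION, JOIN and IN(). The second option (JOIN) gave the best results, but it is still too slow.

Is there any way to get better performance here? Why is it so slow? Should I consider moving some of the logic (splitting this query into 3) into backend (PHP / RoR) code?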

1 answer:

Answer 0: (score: 1)

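I would restructure your query slightly, putting the attributes table first and then joining to the items. I would also have a covering index on the items table ON (entity_id, price), an index on the attributes table ON (cid, aid, value, entity_id), and an index on your rates table ON (currency_from, currency_to, rate). This way all of them are covering indexes, and the engine does not need to go to the raw data pages to get the data; it can pull it from the indexes already being used for the joins / criteria.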
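As a rough sketch, the three indexes described above could be created as follows. The index names are made up for illustration, and the column types are not shown in the question, so a prefix length on value may be needed if it is a long string column. One deviation from the text above, labeled as an assumption: currency is appended to the items index, because the query also reads i.currency and the index would not cover the query without it.

```sql
-- Hypothetical index names; column lists follow the suggestion above.

-- Filter columns first (cid, aid, value), then entity_id for the join.
ALTER TABLE attributes
  ADD INDEX idx_attr_cid_aid_value_entity (cid, aid, value, entity_id);

-- Assumption: currency is added so that every items column the query
-- touches (entity_id, price, currency) is available from the index.
ALTER TABLE items
  ADD INDEX idx_items_entity_price_currency (entity_id, price, currency);

-- Join columns first, then rate so it can be read from the index.
ALTER TABLE rates
  ADD INDEX idx_rates_from_to_rate (currency_from, currency_to, rate);
```

Re-running EXPLAIN afterwards shows whether the optimizer picks these up; "Using index" in the Extra column for a table means its columns are being read from the index alone.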

SELECT 
      i.entity_id, 
      i.price * COALESCE(r.rate, 1) AS final_price 
   FROM 
      attributes a 
         JOIN items i
            ON a.entity_id = i.entity_id 
         LEFT JOIN rates r 
            ON i.currency = r.currency_from 
            AND r.currency_to = 'EUR'
   WHERE 
      a.cid = 4 AND ( (a.aid >= 10 AND a.value > 2000) OR (a.aid <= 10 AND a.value > 5) )
   HAVING 
      final_price BETWEEN 0 AND 9000
   ORDER BY 
      final_price DESC
   LIMIT 20

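So, while this should help the query you provided, could you show some of the other ones where you would have more criteria conditions... you mentioned there could be as many as 30 (or more). Seeing more might alter the query slightly.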

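For the updated query with multiple criteria, I would add an IN() clause listing all the "aid" values right after "a.cid = 4". That way, before it has to work through all the "OR" conditions, a row whose "aid" is not one of those you are interested in is rejected up front and never has to hit them... for example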

      a.cid = 4 
   AND a.aid IN ( 10, 121, 45, 95, 98, 199, 102, 112 )
   AND  ( rest of the complex aid, casting and between criteria )
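Put together with the multi-attribute query from the question, the IN() pre-filter might look like the sketch below. Two things here are assumptions rather than part of the original: COUNT(DISTINCT a.aid) is swapped in for COUNT(i.entity_id) so that duplicate rows (for example from more than one matching rates row) cannot inflate the count, and the count is compared against 8 because the OR list below has eight attribute conditions. Like the original, it relies on MySQL 5.5 allowing the final_price alias in HAVING, with ONLY_FULL_GROUP_BY not enabled.

```sql
SELECT
      i.entity_id,
      i.price * COALESCE(r.rate, 1) AS final_price
   FROM
      attributes a
         JOIN items i
            ON a.entity_id = i.entity_id
         LEFT JOIN rates r
            ON i.currency = r.currency_from
            AND r.currency_to = 'EUR'
   WHERE
      a.cid = 4
      -- Cheap pre-filter: discard rows for attributes we never test.
      AND a.aid IN (10, 121, 45, 95, 98, 199, 102, 112)
      AND (
            (a.aid = 10  AND CAST(a.value AS SIGNED) BETWEEN 2000 AND 2014)
         OR (a.aid = 121 AND CAST(a.value AS SIGNED) BETWEEN 40 AND 60)
         OR (a.aid = 45  AND CAST(a.value AS SIGNED) BETWEEN 770 AND 1500)
         OR (a.aid = 95  AND CAST(a.value AS SIGNED) BETWEEN 12770 AND 15500)
         OR (a.aid = 98  AND a.value = 'some value')
         OR (a.aid = 199 AND a.value = 'some another value')
         OR (a.aid = 102 AND a.value = 1)
         OR (a.aid = 112 AND a.value = 42)
      )
   GROUP BY
      i.entity_id
   HAVING
      -- Assumption: one distinct aid per matched condition, 8 in total.
      COUNT(DISTINCT a.aid) = 8
      AND final_price BETWEEN 0 AND 9000
   ORDER BY
      final_price DESC
   LIMIT 20;
```

The number in the HAVING clause has to track the number of attribute conditions, so if the application builds the OR list dynamically it should pass the count along with it.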