Optimize a sub-select so that two queries become one

Asked: 2013-07-12 12:41:00

Tags: mysql query-optimization

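The following query performs a member search; in this example only the last name is used. When searching for a full, exactly matching name the query returns within a few seconds, but if :LastName = 'S' the query takes 12 seconds to return.

How can I speed up this query? If I can do it with two queries in under a second, shouldn't I be able to do it with a single query that is just as fast? Because of plugins and other methods, it is easiest for me to have this as one query, hence my question.

The Member table contains every member we have ever had. It includes some members who were never registered with us, so they exist only in this table and not in Registration or Registration_History. Registration_History has extra information on most of the members that I want to display. Registration has mostly the same information as RH (RH has a few fields that Reg does not), but sometimes it has members that RH does not, which is why it is joined here. Edit: A member can have multiple rows in Registration. I want to fill the columns from Registration_History; however, some legacy members exist only in Registration. Unlike other members, these legacy members have only 1 row in Registration, so I do not need to worry about how Registration is sorted, only that a single row is taken from it.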

SQL Fiddle with sample database design

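MemberID is indexed in all 3 tables. Before I put in the SELECT RHSubSelect.rehiId sub-select, this query took almost a full minute to return.

If I split the query into 2 queries, first running this: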

SELECT
    MemberID
FROM
    Member
WHERE 
    Member.LastName LIKE CONCAT('%', :LastName, '%')

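and then put those MemberIDs into an array and pass that array to RHSubSelect.MemberID IN ($theArray) (instead of the Member sub-select), the results come back very quickly (in about a second).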
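As a rough sketch of that second step, the inner sub-select ends up looking something like the following. The ID values are made-up placeholders standing in for whatever the first query returned; they are not taken from the real data.

```sql
-- Sketch only: the Member sub-select replaced by a literal ID list
-- built by the application. The IDs 101, 205, 317 are hypothetical.
SELECT
    RHSubSelect.rehiId
FROM
    Registration_History AS RHSubSelect
WHERE
    RHSubSelect.MemberID IN (101, 205, 317)
ORDER BY
    RHSubSelect.EffectiveDate DESC
LIMIT 0, 1
```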

Full query (the complete SELECT statement is in the fiddle; SELECT * is used here for brevity):

SELECT
    *
FROM
 Member
    LEFT JOIN
        Registration_History FORCE INDEX (PRIMARY)
            ON
                Registration_History.rehiId = (
                                                SELECT
                                                    RHSubSelect.rehiId
                                                FROM
                                                    Registration_History AS RHSubSelect
                                                WHERE
                                                    RHSubSelect.MemberID IN (
                                                                                SELECT
                                                                                    Member.MemberID
                                                                                FROM
                                                                                    Member
                                                                                WHERE 
                                                                                    Member.LastName LIKE CONCAT('%', :LastName, '%')
                                                                            )                                                                   
                                                ORDER BY 
                                                    RHSubSelect.EffectiveDate DESC
                                                LIMIT 0, 1
                                            )                                   
    LEFT JOIN
        Registration FORCE INDEX(MemberID)
            ON
                Registration.MemberID = Member.MemberID
WHERE 
    Member.LastName LIKE CONCAT('%', :LastName, '%') 
GROUP BY
    Member.MemberID
ORDER BY 
    Relevance ASC,LastName ASC,FirstName asc 
LIMIT 0, 1000

MySQL EXPLAIN output, with FORCE INDEX() included in the query: "Mysql Explain"

(If the EXPLAIN image is not showing, it is also here: http://oi41.tinypic.com/2iw4t8l.jpg)

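5 Answers:

Answer 0: (score: 1)

The main thing you seem to be checking is the last name with a leading %. That prevents the index on that column from being used, and your SQL searches on it twice.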
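To see the effect for yourself, a comparison along these lines may help. It is only a sketch: it assumes an ordinary B-tree index on Member.LastName, and the index name is a hypothetical one chosen for illustration.

```sql
-- Assumption: no index on LastName exists yet; the name is made up.
CREATE INDEX idx_member_lastname ON Member (LastName);

-- Prefix match: the optimizer can use the index for a range scan
-- (it may still prefer a full scan if the prefix is not selective).
EXPLAIN SELECT MemberID FROM Member WHERE LastName LIKE 'S%';

-- Leading wildcard: the index cannot narrow the search, so every row
-- (or every index entry) has to be examined.
EXPLAIN SELECT MemberID FROM Member WHERE LastName LIKE '%S%';
```

Comparing the type and rows columns of the two EXPLAIN results should show the difference.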

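I am not 100% sure what you are trying to do. Your SQL appears to get all members whose name matches, and then fetches the latest registration_history record. The record you get could belong to any one of the matching members, which seems strange unless you only expect to get a single member.

If that is the case, the following minor tidy-up (removing the IN and changing it to a JOIN) might improve things slightly.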

SELECT
    COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    CASE
        WHEN Member.LastNameTrimmed = :LastName
        THEN 1
        WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%')
        THEN 2
        ELSE 3
    END AS Relevance 
    FROM Member
    LEFT JOIN Registration_History FORCE INDEX (PRIMARY)
    ON Registration_History.rehiId = 
    (
        SELECT RHSubSelect.rehiId
        FROM Registration_History AS RHSubSelect
        INNER JOIN Member 
        ON RHSubSelect.MemberID = Member.MemberID
        WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
        ORDER BY RHSubSelect.EffectiveDate DESC
        LIMIT 0, 1
    )                                   
    LEFT JOIN Registration FORCE INDEX(MemberID)
    ON  Registration.MemberID = Member.MemberID
    WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
    GROUP BY Member.MemberID
    ORDER BY Relevance ASC,LastName ASC,FirstName asc 
    LIMIT 0, 1000

However, if that is not what you want, then further changes are possible.

More of a clean-up, eliminating one of the LIKEs with a leading wildcard:-

SELECT
    COALESCE(NULLIF(Sub2.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Sub2.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    CASE
        WHEN Member.LastNameTrimmed = :LastName
        THEN 1
        WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%')
        THEN 2
        ELSE 3
    END AS Relevance 
FROM Member
LEFT OUTER JOIN Registration 
ON  Registration.MemberID = Member.MemberID
LEFT OUTER JOIN
(
    SELECT Registration_History.MemberID, Registration_History.rehiID, Registration_History.RegYear, Registration_History.RegNumber
    FROM Registration_History
    INNER JOIN
    (
        SELECT RHSubSelect.MemberID, MAX(RHSubSelect.EffectiveDate) AS EffectiveDate
        FROM Registration_History AS RHSubSelect
        GROUP BY RHSubSelect.MemberID
    ) Sub1
    ON Registration_History.MemberID = Sub1.MemberID AND Registration_History.EffectiveDate = Sub1.EffectiveDate
) Sub2
ON  Sub2.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc 
LIMIT 0, 1000

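This gets every member with a matching name, their matching registration record, and their registration_history record with the latest EffectiveDate.

I don't think the final GROUP BY is necessary (assuming there is a 1 to 1 relationship between Member and Registration; if not, you probably want to use something other than GROUP BY), but I have left it in for now.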
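Whether that 1 to 1 assumption holds can be checked directly against the data. A minimal sketch, using only the table and column names from the question:

```sql
-- Lists members that have more than one Registration row.
-- An empty result means the join to Registration cannot multiply rows.
SELECT MemberID, COUNT(*) AS RegistrationRows
FROM Registration
GROUP BY MemberID
HAVING COUNT(*) > 1;
```

Note that duplicates can also come from Registration_History if a member has two rows sharing the same latest EffectiveDate, so an empty result here does not by itself prove the GROUP BY is redundant.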

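I am afraid that without table declarations and some sample data I can't really test it.

EDIT - a bit of a play, trying to reduce the amount it has to process early in the select:-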

SELECT
    COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Sub1.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Sub1.RegNumber, ''), NULLIF(Sub1.MemberID, '')) AS RegNumber,
    Sub1.MemberID,
    Sub1.LastName,
    Sub1.FirstName,
    CASE
        WHEN Sub1.LastName = :LastName
        THEN 1
        WHEN Sub1.LastName LIKE CONCAT(:LastName, '%')
        THEN 2
        ELSE 3
    END AS Relevance 
FROM
(
    SELECT 
        Member.MemberID,
        Member.LastName,
        Member.FirstName,
        Registration.Year,
        Registration.RegNumber,
        MAX(Registration_History.EffectiveDate) AS EffectiveDate
    FROM Member
    LEFT OUTER JOIN Registration 
    ON  Registration.MemberID = Member.MemberID
    LEFT OUTER JOIN Registration_History 
    ON Registration_History.MemberID = Member.MemberID
    WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
    GROUP BY Member.MemberID,
        Member.LastName,
        Member.FirstName,
        Registration.Year,
        Registration.RegNumber
) Sub1
LEFT OUTER JOIN Registration_History
ON Registration_History.MemberID = Sub1.MemberID AND Registration_History.EffectiveDate = Sub1.EffectiveDate
ORDER BY Relevance ASC,LastName ASC,FirstName asc 
LIMIT 0, 1000

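Edit again.

Give this a try. The items you are sorting on all come from the Member table, so it may make sense to exclude as many rows as possible early on, in a sub-select.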

SELECT
    COALESCE(NULLIF(Registration_History2.EffectiveDate, ''), NULLIF(Registration2.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History2.RegNumber, ''), NULLIF(Registration2.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    Member.Relevance 
    FROM
    (
        SELECT Member.MemberID,
                Member.LastName,
                Member.FirstName,
                CASE
                    WHEN Member.LastName = :LastName
                    THEN 1
                    WHEN Member.LastName LIKE CONCAT(:LastName, '%')
                    THEN 2
                    ELSE 3
                END AS Relevance 
        FROM Member
        WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
        ORDER BY Relevance ASC,LastName ASC,FirstName asc 
        LIMIT 0, 1000
    ) Member
    LEFT OUTER JOIN 
    (
        SELECT MemberID, MAX(EffectiveDate) AS EffectiveDate
        FROM Registration_History 
        GROUP BY MemberID
    ) Registration_History
    ON Registration_History.MemberID = Member.MemberID
    LEFT OUTER JOIN Registration_History Registration_History2
    ON Registration_History2.MemberID = Registration_History.MemberID
    AND Registration_History2.EffectiveDate = Registration_History.EffectiveDate
    LEFT OUTER JOIN 
    (
        SELECT MemberID, MAX(Year) AS Year
        FROM Registration 
        GROUP BY MemberID
    ) Registration
    ON Registration.MemberID = Member.MemberID
    LEFT OUTER JOIN 
    (
        SELECT MemberID, Year, MAX(RegNumber) AS RegNumber
        FROM Registration 
        GROUP BY MemberID, Year
    ) Registration2
    ON Registration2.MemberID = Member.MemberID
    AND Registration2.Year = Registration.Year

Edit again

The following has not been tested, so it is more of an idea for another way to approach the problem, using a small trick with GROUP_CONCAT:-

SELECT
    COALESCE(NULLIF(Registration_History.EffectiveDate, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    Member.Relevance 
    FROM
    (
        SELECT Member.MemberID,
                Member.LastName,
                Member.FirstName,
                CASE
                    WHEN Member.LastName = :LastName
                    THEN 1
                    WHEN Member.LastName LIKE CONCAT(:LastName, '%')
                    THEN 2
                    ELSE 3
                END AS Relevance 
        FROM Member
        WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
        ORDER BY Relevance ASC,LastName ASC,FirstName asc 
        LIMIT 0, 1000
    ) Member
    LEFT OUTER JOIN 
    (
        SELECT MemberID, 
                SUBSTRING_INDEX(GROUP_CONCAT(EffectiveDate ORDER BY EffectiveDate DESC), ",", 1) AS EffectiveDate,
                SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY EffectiveDate DESC), ",", 1) AS RegNumber
        FROM Registration_History 
        GROUP BY MemberID
    ) Registration_History
    ON Registration_History.MemberID = Member.MemberID
    LEFT OUTER JOIN 
    (
        SELECT MemberID, 
                SUBSTRING_INDEX(GROUP_CONCAT(Year ORDER BY Year DESC), ",", 1) AS Year,
                SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY Year DESC), ",", 1) AS RegNumber
        FROM Registration 
        GROUP BY MemberID
    ) Registration
    ON Registration.MemberID = Member.MemberID
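For anyone unfamiliar with the trick used above, here is a small stand-alone sketch of how the two functions combine. The date literals are made up for illustration.

```sql
-- GROUP_CONCAT(... ORDER BY EffectiveDate DESC) builds a comma-separated
-- list with the newest value first; SUBSTRING_INDEX(str, ',', 1) then
-- keeps only the text before the first comma.
SELECT SUBSTRING_INDEX('2013-05-01,2012-05-01,2011-05-01', ',', 1) AS LatestDate;

-- The same idea applied per member:
SELECT
    MemberID,
    SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY EffectiveDate DESC), ',', 1) AS LatestRegNumber
FROM Registration_History
GROUP BY MemberID;
```

Two caveats worth keeping in mind: GROUP_CONCAT skips NULL values, so if the newest row has a NULL RegNumber the value returned comes from an older row; and a value that itself contains a comma would be cut short.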

Answer 1: (score: 1)

My suggestion would be a query like this:

SELECT *
FROM Member
LEFT JOIN Registration USING (MemberID)
LEFT JOIN Registration_History ON rehiID = (
  SELECT rehiID
  FROM Registration_History AS RHSubSelect
  WHERE RHSubSelect.MemberID = Member.MemberID
  ORDER BY EffectiveDate DESC
  LIMIT 1
)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')

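The way it works is that you first select the rows from the Member table that match LastName. You can then use a simple LEFT JOIN to the Registration table, since a particular member can have at most 1 entry in that table. Finally, you LEFT JOIN the Registration_History table, with a sub-select.

The sub-select finds the latest EffectiveDate matching the current MemberID and returns the rehiID of that record. The LEFT JOIN then has to match that rehiID exactly. If there is no entry in Registration_History for that member, nothing is joined.

In theory this should be relatively fast, since you only perform the LIKE comparison in the main query. The Registration join should be fast because that table is indexed on MemberID. However, I suspect you will need an additional index on Registration_History to get the best performance.

You already have the primary key, rehiID, indexed, which is what the LEFT JOIN on rehiID needs. However, the sub-query needs to match MemberID in the WHERE clause as well as sort by EffectiveDate. For best performance I think you need an additional index that combines the MemberID and EffectiveDate columns.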
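A sketch of what such an index could look like follows. The index name is a hypothetical choice, and the MemberID value 42 in the check is a placeholder.

```sql
-- Composite index: MemberID first (equality test in the WHERE clause),
-- EffectiveDate second (used by ORDER BY ... DESC LIMIT 1).
CREATE INDEX idx_rh_member_effdate
    ON Registration_History (MemberID, EffectiveDate);

-- Check whether the sub-select picks the new index up.
EXPLAIN
SELECT rehiID
FROM Registration_History
WHERE MemberID = 42
ORDER BY EffectiveDate DESC
LIMIT 1;
```

With both columns in one index, the sub-select should be able to read the newest row for a member straight from the end of that member's index range rather than sorting all of the member's history rows.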

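Note that my sample query is the bare minimum, to keep things simple. You will obviously need to replace * with all the fields you want returned (the same as in your original query). You will also need to add the ORDER BY and LIMIT clauses. However, GROUP BY should not be required.

SQL Fiddle link: http://sqlfiddle.com/#!2/4a947a/1

The fiddle above shows the full query, but with the last name hard-coded. I have modified the original sample data to include more records and changed some values. I have also added the extra index to the Registration_History table.

Optimizing for LIMIT

If you are going to do timing runs again, I would be curious to know how my query performs with the modification suggested by Kickstart: doing a sub-select on the Member table first, before joining the Registration and Registration_History tables.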

SELECT
    COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    Member.Relevance
FROM (
  SELECT MemberID, LastName, FirstName,
    CASE
      WHEN Member.LastNameTrimmed = :LastName THEN 1
      WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%') THEN 2
      ELSE 3
    END AS Relevance 
  FROM Member
  WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
  ORDER BY Relevance ASC,LastName ASC,FirstName ASC
  LIMIT 0, 1000
) Member
LEFT JOIN Registration USING (MemberID)
LEFT JOIN Registration_History ON rehiID = (
  SELECT rehiID
  FROM Registration_History AS RHSubSelect
  WHERE RHSubSelect.MemberID = Member.MemberID
  ORDER BY EffectiveDate DESC
  LIMIT 1
)

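When LIMIT is used, this should perform better than my original query, since it does not have to perform a bunch of unnecessary joins for records that the LIMIT excludes.

Answer 2: (score: 0)

If I understand your question correctly, you just need to select particular users and their latest history record, is that right? If so, your problem turns quite easily into a greatest record per group problem. No subqueries are needed:

Query #1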

SELECT Member.*, rh1.*
FROM Member
LEFT JOIN Registration_History AS rh1 USING (MemberID)
LEFT JOIN Registration_History AS rh2
    ON rh1.MemberId = rh2.MemberId AND rh1.EffectiveDate < rh2.EffectiveDate
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
    AND rh2.MemberId IS NULL
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000

Query #3

(#2 was removed; this one stays #3 to avoid confusion with the comments)

SELECT Member.*, max(rh1.EffectiveDate), rh1.*
FROM Member
LEFT JOIN Registration_History AS rh1 USING (MemberID)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000

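Query #4

This one was inspired by James's query, but with the limit and order by removed (note that you should define an index on EffectiveDate, not just for this query but for all of them to be efficient!)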

select *
from Member
left join Registration_History AS rh1 on rh1.MemberID = Member.MemberID
    and rh1.EffectiveDate = (select max(rh2.EffectiveDate)
                             from Registration_History as rh2
                             where rh2.MemberID = Member.MemberID)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%') 
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000

Please post the actual durations on your database!

Answer 3: (score: 0)

Try this query:

set @lastname = 'Smith1';

-- explain extended
SELECT  
    COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
    COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
    Member.MemberID,
    Member.LastName,
    Member.FirstName,
    CASE
      WHEN Member.LastNameTrimmed = @lastname THEN 1
      WHEN Member.LastNameTrimmed LIKE CONCAT(@lastname, '%') THEN 2
      ELSE 3
    END AS Relevance 
FROM (
    SELECT  Member.*,
        ( SELECT RHSubSelect.rehiId
            FROM  Registration_History AS RHSubSelect
            WHERE RHSubSelect.MemberID = Member.MemberID                                         
            ORDER BY RHSubSelect.EffectiveDate DESC
            LIMIT 0,1
         ) rh_MemberId
    FROM Member
    WHERE Member.LastName LIKE CONCAT('%', @lastname, '%')
) Member
LEFT JOIN  Registration_History 
    ON Registration_History.rehiId = Member.rh_MemberId
LEFT JOIN Registration -- FORCE INDEX(MemberID)
    ON Registration.MemberID = Member.MemberID
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc 
LIMIT 0, 1000
;

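Answer 4: (score: 0)

OK, here is my shot at it, and I have used a variety of pieces. First, I had to work from the "Relevance" field as given, since you did not explain how it is meant to work. Next, since you want the most recent entry in the registration history for a given member (if they exist in R/H), it looks like the effective date is correlated with the ReHiID, so I used that, as it looks like a good key to work with for the subsequent left join.

So the inner query does just a preliminary pass on the criteria for the name you are looking for, applies the relevance, and limits the result to 1000 entries. That way there is no need to go through 20,000 entries at the outer level and join them... only the 1000 that qualify.

That result is then left-joined to the other tables as indicated... to Registration on only the single entry (if one exists), and to R/H on the member and the max ReHiID.

To apply the name you are looking for, just change the ( select @LookForMe := 'S' ) sqlvars line in the query...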

select *
   from
      ( select
              M.*,
              max( RH.EffectiveDate ) as MaxEffectiveDate,
              max( R.RegNumber ) as MaxRegNumber,
              CASE WHEN M.LastNameTrimmed = @LookForMe THEN 1
              WHEN M.LastNameTrimmed LIKE CONCAT(@LookForMe, '%') THEN 2
              ELSE 3 END AS Relevance 
           from
              ( select @LookForMe := 'S' ) sqlvars,
              Member M
                 LEFT JOIN Registration_History RH
                    on M.MemberID = RH.MemberID
                 LEFT JOIN Registration R
                    on M.MemberID = R.MemberID
           where 
              M.LastName LIKE CONCAT('%', @LookForMe, '%')
           group by
              M.MemberID
           order by
              Relevance, 
              M.LastName,
              M.FirstName
           limit
              0,1000 ) PreQuery
      LEFT JOIN Registration R2
         on PreQuery.MemberID = R2.MemberID
         AND PreQuery.MaxRegNumber = R2.RegNumber
      LEFT JOIN Registration_History RH2
         ON PreQuery.MemberID = RH2.MemberID
        AND PreQuery.MaxEffectiveDate = RH2.EffectiveDate

Let's see how fast this runs against your production data, and how close we are.