Huge JOIN result set crashes the server

Asked: 2013-10-03 15:43:10

Tags: php mysql

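I have a very large result set: nearly 2 GB of product data spread across several tables, each holding roughly 500,000 records. I need to process every record in order to export it to a set of files.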

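The code below crashes the server while it is trying to hold the result set, so I had to switch to a query that fetches only the primary ID of each matching record, followed by a second query per record that fetches that single product by its primary ID. Because of all those secondary queries, this is very inefficient and database-intensive.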

Here is the query and the code that crash it. How can I get this to run successfully?

$query =
    "SELECT SQL_NO_CACHE SQL_BIG_RESULT
        products.*,
        inventory.*,
        pricing.*,
        markets.*
    FROM
        products,
        categories,
        markets,
        pricing,
        inventory
    WHERE
        products.catid = categories.id AND
        markets.id = products.marketid AND
        pricing.productid = products.id AND
        inventory.productid = products.id AND
        inventory.all_stock > 0 AND
        products.sale = 'Y' AND
        categories.active = 'Y' AND
        inventory.last_update > UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
    GROUP BY
        products.id";

$Db = new DbConnector();

$r = $Db->query($query); // !Never gets past this point!

while ($product = $r->fetch(PDO::FETCH_ASSOC)) {
    // Stuff gets done here.
}

2 Answers:

Answer 0: (score: 0)

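Does the query run when it is executed on the database server alone? If so, the bottleneck is most likely your web server and its communication with the database server. If you are pulling a large amount of data, or you are forced to run a large number of queries (as you are when you run an additional query for every retrieved ID), I would suggest using stored procedures (MySQL calls them "routines"). You can get started here: http://net.tutsplus.com/tutorials/an-introduction-to-stored-procedures/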

Answer 1: (score: 0)

Couldn't you just put the id fields into a temporary table, then "hydrate" and process the full rows in batches?

First, the temporary table that holds only the ids:

CREATE TEMPORARY TABLE tempy
SELECT SQL_NO_CACHE SQL_BIG_RESULT
    products.id  AS product_id,
    inventory.id AS inventory_id,
    pricing.id   AS pricing_id,
    markets.id   AS markets_id
FROM
    products,
    categories,
    markets,
    pricing,
    inventory
WHERE
    products.catid = categories.id AND
    markets.id = products.marketid AND
    pricing.productid = products.id AND
    inventory.productid = products.id AND
    inventory.all_stock > 0 AND
    products.sale = 'Y' AND
    categories.active = 'Y' AND
    inventory.last_update > UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
GROUP BY
    products.id

Repeat this query until everything has been processed, increasing the OFFSET value at each step:

SELECT SQL_NO_CACHE SQL_BIG_RESULT
    products.*,
    inventory.*,
    pricing.*,
    markets.*
FROM
    ( SELECT *
      FROM tempy
      ORDER BY product_id -- any stable ordering; must come before LIMIT
      LIMIT  1000         -- slice size
      OFFSET 123000       -- slice size * slice number (MySQL needs a literal here, not an expression)
    ) AS t,
    products,
    inventory,
    pricing,
    markets
WHERE
    products.id  = t.product_id AND
    inventory.id = t.inventory_id AND
    pricing.id   = t.pricing_id AND
    markets.id   = t.markets_id
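
The "repeat until everything is processed" step could be driven from PHP roughly as follows. This is only a sketch under some assumptions: the function name processInSlices and the $handle callback are hypothetical, $pdo is assumed to be a PDO handle in exception error mode, and it must be the same connection that created tempy, because MySQL temporary tables are visible only to the connection that created them. The slice size and offset are bound as integers so that the LIMIT clause receives plain numbers.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: walks the id-only temporary table in fixed-size
 * slices and passes each fully joined row to $handle.
 * Returns the number of rows handled.
 */
function processInSlices(PDO $pdo, int $sliceSize, callable $handle): int
{
    // Same idea as the query above, written with explicit JOINs.
    $sql = "SELECT products.*, inventory.*, pricing.*, markets.*
            FROM ( SELECT *
                   FROM tempy
                   ORDER BY product_id
                   LIMIT :lim OFFSET :off
                 ) AS t
            JOIN products  ON products.id  = t.product_id
            JOIN inventory ON inventory.id = t.inventory_id
            JOIN pricing   ON pricing.id   = t.pricing_id
            JOIN markets   ON markets.id   = t.markets_id
            ORDER BY t.product_id";
    $stmt = $pdo->prepare($sql);

    $total = 0;
    for ($offset = 0; ; $offset += $sliceSize) {
        // Bind as integers; a quoted string would be rejected in LIMIT.
        $stmt->bindValue(':lim', $sliceSize, PDO::PARAM_INT);
        $stmt->bindValue(':off', $offset, PDO::PARAM_INT);
        $stmt->execute();

        // Only one slice is held in PHP memory at a time.
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
        if (!$rows) {
            break; // an empty slice means the id table is exhausted
        }
        foreach ($rows as $product) {
            $handle($product);
            $total++;
        }
    }
    return $total;
}
```

A call might look like processInSlices($pdo, 1000, function (array $product) { /* write to the export file */ });. Note that with PDO::FETCH_ASSOC, columns sharing a name across the four tables (such as id) collapse into a single array key, so aliasing the columns you actually need may be safer than selecting `*`.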