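Vast difference in time taken to delete from Postgres temp tables

Time: 2017-08-29 07:52:45

Tags: postgresql

I had a query that took a very long time to run, so I rewrote it, and now it takes almost no time at all, but I don't understand why.

I could understand a small difference, but can someone help explain the difference in run time between these two (seemingly very similar) statements?

First: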

DELETE FROM     t_old where company_id not in (select company_id from t_prop);

Second:

DELETE FROM     t_old a
using           t_prop b 
where           a.company_id=b.company_id
and             b.company_id is null;

Execution plan for the first:

'[
  {
    "Plan": {
      "Startup Cost": 0,
      "Plans": [
        {
          "Filter": "(NOT (SubPlan 1))",
          "Startup Cost": 0,
          "Plans": [
            {
              "Startup Cost": 0,
              "Plans": [
                {
                  "Startup Cost": 0,
                  "Node Type": "Seq Scan",
                  "Plan Rows": 158704,
                  "Relation Name": "t_prop",
                  "Alias": "t_prop",
                  "Parallel Aware": false,
                  "Parent Relationship": "Outer",
                  "Plan Width": 4,
                  "Total Cost": 2598.04
                }
              ],
              "Node Type": "Materialize",
              "Plan Rows": 158704,
              "Parallel Aware": false,
              "Parent Relationship": "SubPlan",
              "Plan Width": 4,
              "Subplan Name": "SubPlan 1",
              "Total Cost": 4011.56
            }
          ],
          "Node Type": "Seq Scan",
          "Plan Rows": 21760,
          "Relation Name": "t_old",
          "Alias": "t_old",
          "Parallel Aware": false,
          "Parent Relationship": "Member",
          "Plan Width": 6,
          "Total Cost": 95923746.03
        }
      ],
      "Node Type": "ModifyTable",
      "Plan Rows": 21760,
      "Relation Name": "t_old",
      "Alias": "t_old",
      "Parallel Aware": false,
      "Operation": "Delete",
      "Plan Width": 6,
      "Total Cost": 95923746.03
    }
  }
]'

Execution plan for the second:

'[
  {
    "Plan": {
      "Startup Cost": 0.71,
      "Plans": [
        {
          "Startup Cost": 0.71,
          "Plans": [
            {
              "Startup Cost": 0.42,
              "Scan Direction": "Forward",
              "Plan Width": 10,
              "Node Type": "Index Scan",
              "Index Cond": "(company_id IS NULL)",
              "Plan Rows": 1,
              "Relation Name": "t_prop",
              "Alias": "b",
              "Parallel Aware": false,
              "Parent Relationship": "Outer",
              "Total Cost": 8.44,
              "Index Name": "t_prop_idx2"
            },
            {
              "Startup Cost": 0.29,
              "Scan Direction": "Forward",
              "Plan Width": 10,
              "Node Type": "Index Scan",
              "Index Cond": "(company_id = b.company_id)",
              "Plan Rows": 5,
              "Relation Name": "t_old",
              "Alias": "a",
              "Parallel Aware": false,
              "Parent Relationship": "Inner",
              "Total Cost": 8.38,
              "Index Name": "t_old_idx"
            }
          ],
          "Node Type": "Nested Loop",
          "Plan Rows": 5,
          "Join Type": "Inner",
          "Parallel Aware": false,
          "Parent Relationship": "Member",
          "Plan Width": 12,
          "Total Cost": 16.86
        }
      ],
      "Node Type": "ModifyTable",
      "Plan Rows": 5,
      "Relation Name": "t_old",
      "Alias": "a",
      "Parallel Aware": false,
      "Operation": "Delete",
      "Plan Width": 12,
      "Total Cost": 16.86
    }
  }
]'

1 answer:

Answer 0: (score: 1)

Your second query will not delete anything, which is why it is faster.
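One way to check this without risking any data is to run the statement inside a transaction and roll it back. This is only a sketch; it assumes you are in an interactive client such as psql, which prints the command tag after each statement.

```sql
BEGIN;

-- The second statement from the question, unchanged.
DELETE FROM t_old a
USING  t_prop b
WHERE  a.company_id = b.company_id
AND    b.company_id IS NULL;
-- The command tag reports "DELETE 0": no row can satisfy both conditions.

ROLLBACK;  -- discard whatever the statement did, in case it did anything
```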

Edit: I suppose I should explain why it will not delete anything. So...

What you actually want to do is:

DELETE FROM     t_old a
using           t_old a2
LEFT JOIN       t_prop b ON b.company_id = a2.company_id
where           a.company_id=a2.company_id
and             b.company_id is null;

It may be faster than your first query, slower, or about as fast, but it will do the same thing.
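Another common way to write this kind of "delete rows with no match" statement is NOT EXISTS, which PostgreSQL can usually plan as an anti join. The sketch below is not from the question; it assumes company_id is never NULL in either table. If NULLs are present, NOT IN, the LEFT JOIN form above and NOT EXISTS do not all behave the same. In particular, NOT IN deletes nothing at all when the subquery returns a NULL.

```sql
-- Delete every t_old row that has no matching company_id in t_prop.
-- Assumption: company_id contains no NULLs in t_old or t_prop.
DELETE FROM t_old a
WHERE  NOT EXISTS (
           SELECT 1
           FROM   t_prop b
           WHERE  b.company_id = a.company_id
       );
```

As with any rewrite, it is worth comparing the plans with EXPLAIN on your own data rather than assuming one form is faster.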

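Your second query, however, only deletes a row from t_old if it has a match on company_id in t_prop, because you made an INNER JOIN there. But there is also the additional condition b.company_id is null, which restricts the rows of t_prop to only those whose company_id column is NULL. The = operator does not work on NULL values and never evaluates to true for them, so whenever the second condition is satisfied, the first condition fails. Since there is an AND between them, both would have to be satisfied, which is impossible.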
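The NULL behaviour can be seen directly with a couple of SELECTs. This is a small illustration, assuming default settings (in particular, transform_null_equals left off).

```sql
-- Comparing NULL with = yields NULL (unknown), never true.
SELECT NULL = NULL                     AS equals_result,        -- NULL
       NULL IS NULL                    AS is_null_result,       -- true
       NULL IS NOT DISTINCT FROM NULL  AS not_distinct_result;  -- true

-- Consequently the join used by the second DELETE can never produce a row,
-- so this count is 0 whatever the tables contain.
SELECT count(*)
FROM   t_old a
JOIN   t_prop b ON a.company_id = b.company_id
WHERE  b.company_id IS NULL;
```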

What would work, and would delete the rows in t_old that satisfy the same condition as the rows in t_prop:

t_old WHERE company_id IS NULL

But it still would not do what the first query does.