Question

I have a table:

CREATE TABLE debts (
    name       text,
    inv_no     integer,
    inv_type   text,
    status     text
);

I have the following SELECT:

SELECT COUNT(*) FROM debts WHERE name = '...' AND inv_no = 100 AND inv_type = '...';

To optimize something else, I added:

CREATE INDEX ON debts (status);

The SELECT does not reference status at all, yet running...

EXPLAIN SELECT COUNT(*)... (as above)

...before and after creating the index shows the cost dropping from 16.65..16.66 to 1.25..1.26. Why?

Full EXPLAIN (ANALYZE, VERBOSE) output before and after:

Before:

QUERY PLAN
----------
 Aggregate  (cost=16.65..16.66 rows=1 width=0) (actual time=0.126..0.128 rows=1 loops=1)
   Output: count(*)
   ->  Seq Scan on ab123456.debts  (cost=0.00..16.65 rows=1 width=0) (actual time=0.106..0.106 rows=0 loops=1)
         Output: name, inv_no, inv_type, status
         Filter: ((debts.name = '...'::text) AND (debts.inv_type = '...'::text) AND (debts.inv_no = 100))
 Total runtime: 0.387 ms

After:

QUERY PLAN
----------
 Aggregate  (cost=1.25..1.26 rows=1 width=0) (actual time=0.031..0.033 rows=1 loops=1)
   Output: count(*)
   ->  Seq Scan on ab123456.debts  (cost=0.00..1.25 rows=1 width=0) (actual time=0.024..0.024 rows=0 loops=1)
         Output: name, inv_no, inv_type, status
         Filter: ((debts.name = '...'::text) AND (debts.inv_type = '...'::text) AND (debts.inv_no = 100))
 Total runtime: 0.118 ms

Answer 1

Some utility statements (including CREATE INDEX!) update table statistics. The manual:

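For efficiency reasons, reltuples and relpages are not updated on-the-fly, and so they usually contain somewhat out-of-date values. They are updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX.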

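Emphasis on CREATE INDEX is mine. So even though your index looks completely unrelated, the updated table statistics can still change the plan estimates, especially for count(), whose estimated cost depends mainly on the two statistics mentioned (reltuples and relpages).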
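To check this yourself, a minimal sketch along the following lines could be run once before and once after the CREATE INDEX. It assumes that debts resolves through your search_path (the plans show it in schema ab123456), that the filter applies three operators as in the query above, and that the planner cost settings are at or near their defaults. The second query is only a rough approximation of the sequential scan estimate, not the planner's exact code path.

```sql
-- Stored planner statistics for the table; compare before and after CREATE INDEX.
SELECT relpages, reltuples
FROM   pg_class
WHERE  oid = 'debts'::regclass;  -- assumption: debts is visible via search_path

-- Rough approximation of the Seq Scan total cost from those two numbers:
--   pages * seq_page_cost + rows * (cpu_tuple_cost + 3 * cpu_operator_cost)
-- where 3 is the number of conditions in the WHERE clause of the query above.
SELECT relpages  * current_setting('seq_page_cost')::float8
     + reltuples * ( current_setting('cpu_tuple_cost')::float8
                   + 3 * current_setting('cpu_operator_cost')::float8 ) AS approx_seq_scan_cost
FROM   pg_class
WHERE  oid = 'debts'::regclass;
```

If relpages and reltuples still hold their initial placeholder values before the index exists, the planner falls back on default guesses about the table's size, so the second query is not expected to match the "before" plan. Running ANALYZE debts; should refresh the same two statistics without creating an index.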

Why does creating an unrelated index make my query faster?
