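Is there a way to query for integer values or NULL without using OR?

Asked: 2013-05-09 12:36:00

Tags: postgresql

I want to query for (a list of) values or NULL, but without using OR. The reason for avoiding OR is that I need the query to use an index on that column to speed it up.

A simple example to illustrate my problem: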

CREATE TABLE fruits
(
  name text,
  quantity integer
);

(The real table has many additional integer columns.)

The query I am not happy with is:

SELECT * FROM fruits WHERE quantity IN (1,2,3,4) OR quantity IS NULL;

What I am hoping for is something like:

SELECT * FROM fruits WHERE quantity MAGIC (1,2,3,4,NULL);

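I am using PostgreSQL 9.1.

As far as I can tell from the documentation (e.g. http://www.postgresql.org/docs/9.1/static/functions-comparisons.html) and from my own testing, this cannot be done. But I am hoping one of you has some magical insight.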
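
For context, simply adding NULL to the IN list does not work because of SQL's three-valued logic: a comparison with NULL yields NULL rather than true, so a NULL in the list never matches a NULL quantity. A minimal sketch of that behavior, using literal values first and then the fruits table from above:

```sql
-- IN behaves like a chain of equality tests joined by OR.
SELECT 1 IN (1, 2, 3, 4, NULL);              -- true: 1 = 1 matches
SELECT 5 IN (1, 2, 3, 4, NULL);              -- NULL: no match, and 5 = NULL is unknown
SELECT NULL::integer IN (1, 2, 3, 4, NULL);  -- NULL: NULL = NULL is unknown, not true

-- In a WHERE clause a NULL result is treated like false,
-- so rows where quantity IS NULL are not returned by this query:
SELECT * FROM fruits WHERE quantity IN (1, 2, 3, 4, NULL);
```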

3 answers:

Answer 0 (score: 1)

An ugly hack using COALESCE:

SELECT * 
FROM fruits
 WHERE COALESCE(quantity,1) IN (1,2,3,4)
   ;

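Please check the generated plan. IIRC, the optimizer knows about COALESCE() in cases like this.

Update: an alternative is the EXISTS(NOT EXISTS(NOT IN)) trick, which generates a different plan here. The query joins on an id column that the fruits table in the question does not define, so it assumes the real table has a unique id: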

-- EXPLAIN ANALYZE
SELECT *
FROM fruits fr
WHERE EXISTS (
        SELECT * FROM fruits ex
        WHERE ex.id = fr.id
        AND NOT EXISTS (
                SELECT * FROM fruits nx
                WHERE nx.id = ex.id
                AND nx.quantity NOT IN (1,2,3,4)
                )
        )
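   ;

BTW: in my tests (up to 1 million rows, with only 4 plus a few qualifying rows), the first query, which does not use the index, was always faster than the second, which uses the index and a hash anti join. YMMV.

Update 2: the original IS NULL OR IN() query is clearly the winner here: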

-- EXPLAIN ANALYZE
SELECT *
FROM fruits
 WHERE quantity IS NULL
    OR quantity IN (1,2,3,4)
   ;

Answer 1 (score: 1)

A test table with 10k rows:

create table fruits (name text, quantity integer);
insert into fruits (name, quantity)
select left(md5(i::text), 6), i
from generate_series(1, 10000) s(i);

With a plain index on quantity:

create index fruits_index on fruits(quantity);
analyze fruits;
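
Note that every row generated above has a non-null quantity, so the IS NULL branch in the plans that follow matches nothing. A sketch for adding some NULL rows to exercise that branch; the count of 100 is an arbitrary assumption, and the plans shown in this answer were taken without these rows, so costs and row counts will differ if you run it:

```sql
-- Hypothetical extra rows: 100 fruits with an unknown quantity.
insert into fruits (name, quantity)
select left(md5(i::text), 6), NULL::integer
from generate_series(10001, 10100) s(i);

-- Refresh planner statistics so the fraction of NULLs is known.
analyze fruits;

-- Sanity check: count the rows that have no quantity.
select count(*) from fruits where quantity is null;
```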

The query with or:

explain analyze
SELECT * FROM fruits WHERE quantity IN (1,2,3,4) OR quantity IS NULL;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on fruits  (cost=21.29..34.12 rows=4 width=11) (actual time=0.032..0.032 rows=4 loops=1)
   Recheck Cond: ((quantity = ANY ('{1,2,3,4}'::integer[])) OR (quantity IS NULL))
   ->  BitmapOr  (cost=21.29..21.29 rows=4 width=0) (actual time=0.025..0.025 rows=0 loops=1)
         ->  Bitmap Index Scan on fruits_index  (cost=0.00..17.03 rows=4 width=0) (actual time=0.019..0.019 rows=4 loops=1)
               Index Cond: (quantity = ANY ('{1,2,3,4}'::integer[]))
         ->  Bitmap Index Scan on fruits_index  (cost=0.00..4.26 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=1)
               Index Cond: (quantity IS NULL)
 Total runtime: 0.089 ms

Without or (this drops the IS NULL condition, so it serves only as a baseline):

explain analyze
SELECT * FROM fruits WHERE quantity IN (1,2,3,4);
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using fruits_index on fruits  (cost=0.00..21.07 rows=4 width=11) (actual time=0.026..0.038 rows=4 loops=1)
   Index Cond: (quantity = ANY ('{1,2,3,4}'::integer[]))
 Total runtime: 0.085 ms

The coalesce version suggested by wildplasser results in a sequential scan:

explain analyze
SELECT * 
FROM fruits
WHERE COALESCE(quantity, -1) IN (-1,1,2,3,4);
                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Seq Scan on fruits  (cost=0.00..217.50 rows=250 width=11) (actual time=0.023..4.358 rows=4 loops=1)
   Filter: (COALESCE(quantity, (-1)) = ANY ('{-1,1,2,3,4}'::integer[]))
   Rows Removed by Filter: 9996
 Total runtime: 4.395 ms

Unless a coalesce expression index is created:

create index fruits_coalesce_index on fruits(coalesce(quantity, -1));
analyze fruits;

explain analyze
SELECT * 
FROM fruits
WHERE COALESCE(quantity, -1) IN (-1,1,2,3,4);
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Index Scan using fruits_coalesce_index on fruits  (cost=0.00..25.34 rows=5 width=11) (actual time=0.112..0.124 rows=4 loops=1)
   Index Cond: (COALESCE(quantity, (-1)) = ANY ('{-1,1,2,3,4}'::integer[]))
 Total runtime: 0.172 ms

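But it still performs worse than the plain or query with a plain index on quantity.

Answer 2 (score: 0)

This is not an answer to your exact question, but you could build a partial index for your query: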

CREATE INDEX idx_partial ON fruits (quantity)
WHERE quantity IN (1,2,3,4) OR quantity IS NULL;

From the documentation: http://www.postgresql.org/docs/current/interactive/indexes-partial.html

Your query should then be able to use this index and run faster.
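
One way to check whether the planner actually picks the partial index is to look at the plan. A sketch, assuming the idx_partial index above has been created on the fruits table; whether the index is chosen depends on table size and statistics, so treat the outcome as something to verify rather than a given:

```sql
-- Refresh statistics after creating the index.
ANALYZE fruits;

-- The WHERE clause repeats the index predicate exactly, which is the
-- simplest case for the planner to recognize as covered by the index.
EXPLAIN
SELECT * FROM fruits
WHERE quantity IN (1,2,3,4) OR quantity IS NULL;

-- This query also asks for quantity = 5, which the index predicate
-- excludes, so idx_partial cannot be used to answer it.
EXPLAIN
SELECT * FROM fruits
WHERE quantity IN (1,2,3,4,5) OR quantity IS NULL;
```

A partial index only helps queries whose conditions imply the index predicate, so a different value list needs either a different index or a plain index on quantity.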