PostgreSQL full-text search on partial words

Time: 2016-06-27 02:44:47

Tags: postgresql optimization indexing full-text-search query-optimization

When I run the query below, the search takes about 60-100 ms:

SELECT *
FROM cmis.membership_masterfile 
WHERE textsearchable_index_col @@ to_tsquery ('ALO:*');

But when I run this query instead, it goes up to 2000 ms (with the same results as above):

SELECT *
FROM cmis.membership_masterfile 
WHERE textsearchable_index_col @@ utility.ts_query ('ALO');

utility.ts_query is defined as:

-- Function: utility.ts_query(character varying)

-- DROP FUNCTION utility.ts_query(character varying);

CREATE OR REPLACE FUNCTION utility.ts_query(this_search_value character varying)
  RETURNS tsquery AS
$BODY$
    BEGIN
        RETURN to_tsquery(
            'simple', 
            utility.format_search_value(this_search_value)
            );
        END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
ALTER FUNCTION utility.ts_query(character varying)
  OWNER TO postgres;

utility.format_search_value(text) is:

-- Function: utility.format_search_value(text)

-- DROP FUNCTION utility.format_search_value(text);

CREATE OR REPLACE FUNCTION utility.format_search_value(this_search_value text)
   RETURNS text AS
$BODY$
    DECLARE 
        words text[];
        word text;
        formatted_text text = '';
        count int = 0;
        length int;
    BEGIN
        SELECT regexp_split_to_array(TRIM(this_search_value), E'\\s+' ) INTO words ;
        length = array_length(words, 1);

        FOREACH word IN ARRAY words
         LOOP
            count = count + 1;
            IF count = length THEN 
                formatted_text = formatted_text || word || ':*';
            ELSE
                formatted_text = formatted_text || word || ':*&';
            END IF;
         END LOOP; 

        RETURN formatted_text;  
    END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
ALTER FUNCTION utility.format_search_value(text)
  OWNER TO postgres;
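
One possible cause worth checking, offered as a sketch rather than a confirmed fix: both functions above are declared VOLATILE. PostgreSQL re-evaluates a volatile function for every row, so a call to one cannot serve as the comparison value of an index condition. Neither function as shown modifies data or reads tables, so relabeling them may be safe, assuming the deployed versions match the definitions above:

```sql
-- Assumption: neither function has side effects and both depend only on
-- their argument, as in the definitions shown above.
-- STABLE promises a constant result within one statement, which lets the
-- planner use the call in an index condition. IMMUTABLE may also apply,
-- since to_tsquery(regconfig, text) is itself immutable.
ALTER FUNCTION utility.format_search_value(text) STABLE;
ALTER FUNCTION utility.ts_query(character varying) STABLE;
```

After a change like this, running EXPLAIN on the slow query again would show whether the planner switches from the sequential scan to the index.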

Edit: textsearchable_index_col is defined as

ALTER TABLE cmis.membership_masterfile ADD COLUMN textsearchable_index_col tsvector;
UPDATE cmis.membership_masterfile SET textsearchable_index_col = to_tsvector('simple', coalesce(membership_name) || ' ' || coalesce(membership_no) || ' ' || coalesce(membership_id));
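
The index definition is not shown in the question. The query plan further down refers to an index named textsearch_idx, so it presumably looks something like the sketch below. The GIN index type is an assumption; a GiST index on the same column would also support the @@ operator.

```sql
-- Hypothetical reconstruction: only the index name (textsearch_idx) comes
-- from the query plan; the index type is assumed, not stated in the question.
CREATE INDEX textsearch_idx
    ON cmis.membership_masterfile
    USING gin (textsearchable_index_col);
```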

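I think utility.ts_query does not use the index, unlike the direct to_tsquery call, which does. How can I make utility.ts_query use the index, or otherwise speed the query up without calling to_tsquery directly?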

Edit: the plan for the first query is

"Bitmap Heap Scan on membership_masterfile  (cost=24.23..134.64 rows=29 width=351)"
"  Recheck Cond: (textsearchable_index_col @@ to_tsquery('ALO:*'::text))"
"  ->  Bitmap Index Scan on textsearch_idx  (cost=0.00..24.22 rows=29 width=0)"
"        Index Cond: (textsearchable_index_col @@ to_tsquery('ALO:*'::text))"

The plan for the second query is

"Seq Scan on membership_masterfile  (cost=0.00..27597.94 rows=410 width=351)"
"  Filter: (textsearchable_index_col @@ utility.ts_query('ALO'::character varying))"

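The Seq Scan in the second plan is what one would expect if the planner cannot treat the function call as a constant, which is the case for functions marked VOLATILE. As a hedged diagnostic sketch, the volatility the server has recorded for the two functions can be read from the pg_proc catalog, and the plan can be re-checked afterwards:

```sql
-- provolatile is 'v' (volatile), 's' (stable) or 'i' (immutable).
SELECT n.nspname AS schema_name,
       p.proname AS function_name,
       p.provolatile
FROM pg_proc AS p
JOIN pg_namespace AS n ON n.oid = p.pronamespace
WHERE n.nspname = 'utility'
  AND p.proname IN ('ts_query', 'format_search_value');

-- Re-check which scan type the planner chooses for the slow query.
EXPLAIN
SELECT *
FROM cmis.membership_masterfile
WHERE textsearchable_index_col @@ utility.ts_query('ALO');
```

If both rows report 'v', that would be consistent with the sequential scan shown above.
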
Edit:

The result of SELECT * FROM ts_debug('simple', utility.format_search_value('ALO')); is

"asciiword";"Word, all ASCII";"ALO";"{simple}";"simple";"{alo}"
"blank";"Space symbols";":*";"{}";"";""

The result of SELECT * FROM ts_debug('ALO:*'); is

"asciiword";"Word, all ASCII";"ALO";"{english_stem}";"english_stem";"{alo}"
"blank";"Space symbols";":*";"{}";"";""

0 answers:

No answers