Sphinx configuration for handling German umlauts

Time: 2015-04-10 13:27:48

Tags: php mysql sphinx

I am using this index configuration:

index humans
{
    source                = src_humans
    path                  = /usr/local/sphinx/var/data/humans
    charset_table         = 0..9, A..Z->a..z, _, a..z, U+C4->U+E4, U+D6->U+F6, U+DC->U+FC, U+DF, U+E4, U+F6, U+FC
    html_strip            = 1
    html_index_attrs      = img=src,alt; a=href,title
    morphology            = libstemmer_de
    min_infix_len         = 3
    stopwords             = /tmp/stopwords_de.txt
}

The indexer runs through with this output:

Sphinx 2.3.1-id64-beta (r4926)
Copyright (c) 2001-2015, Andrew Aksyonoff
Copyright (c) 2008-2015, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file '/usr/local/sphinx/etc/sphinx.conf'...
indexing index 'humans'...
WARNING: index 'humans': dict=keywords and prefixes and morphology enabled, forcing index_exact_words=1
WARNING: Attribute count is 0: switching to none docinfo
collected 2 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 2 docs, 989 bytes
total 0.043 sec, 22888 bytes/sec, 46.28 docs/sec
total 3 reads, 0.000 sec, 2.0 kb/call avg, 0.0 msec/call avg
total 9 writes, 0.000 sec, 1.9 kb/call avg, 0.0 msec/call avg
rotating indices: successfully sent SIGHUP to searchd (pid=8908).

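When I search with $sc->Query('*gef*'), I find a document that contains "Gefährlich" in its description, but when I search with $sc->Query('*gefä*') I get no match.

What am I doing wrong? My entire MySQL database and every file belonging to the project are UTF-8 encoded.

Thanks in advance!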

2 answers:

Answer 0: (score: 1)

I fixed this behaviour by adding sql_query_pre = SET NAMES utf8 to the source configuration.
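
A minimal sketch of where that line would go, assuming src_humans is a MySQL source. The connection values and the main query are placeholders that do not come from the question:

```
source src_humans
{
    type          = mysql

    # Placeholder connection settings (assumptions, not from the question)
    sql_host      = localhost
    sql_user      = sphinx_user
    sql_pass      = secret
    sql_db        = humans_db

    # Runs before the main query: asks MySQL to return results as UTF-8
    sql_query_pre = SET NAMES utf8

    # Hypothetical main query; the first column must be the document ID
    sql_query     = SELECT id, name, description FROM humans
}
```

A likely explanation for the symptom: without SET NAMES utf8 the MySQL connection may use a different character set (often latin1), so "ä" reaches the indexer as a byte sequence that is not valid UTF-8 and never maps to U+E4 in charset_table. The word then gets cut at the umlaut, which would explain why *gef* matches while *gefä* does not. The index has to be rebuilt after the change, for example with indexer --rotate humans.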

Answer 1: (score: 0)

When I used Sphinx, I had something like this in my searchd configuration:

collation_server = utf8_general_ci

And in my index configuration:

charset_type = utf-8

I hope this helps.