Lucene search: case sensitivity and diacritics

Time: 2015-07-05 12:04:24

Tags: java search lucene hibernate-search

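I am new to Lucene, and I want to search a single field with different options. Suppose I have the following words, which differ from one another only in case and diacritics:

  1. test
  2. test
  3. test
  4. test
  5. test
  6. test

I want to be able to search for the value 'test' with the search being case sensitive and diacritic aware, which should give me result 4.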

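But I also want to be able to search for the value 'test' with the search being case insensitive and diacritic unaware, which should give me results 1-6.

It should also be possible to search case insensitively but diacritic aware, and likewise for any other combination.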
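
To make the intended behaviour concrete, here is a minimal sketch of the four combinations in plain Java. It uses `java.text.Normalizer` from the standard library rather than Lucene, so it only approximates what an analyzer chain would do (it removes combining marks, while a Lucene folding filter covers more characters). `TermNormalizer` and its methods are hypothetical names introduced for illustration.

```java
import java.text.Normalizer;
import java.util.Locale;

/** Hypothetical helper, not part of Lucene or Hibernate Search. */
class TermNormalizer {

    /** Decomposes the string (NFD) and removes combining marks, so an accented "e" becomes "e". */
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    /** Normalizes a term according to the two search options. */
    static String normalize(String term, boolean caseSensitive, boolean diacriticAware) {
        String result = term;
        if (!diacriticAware) {
            result = stripDiacritics(result);
        }
        if (!caseSensitive) {
            result = result.toLowerCase(Locale.ROOT);
        }
        return result;
    }

    /** True if the indexed word matches the query under the given options. */
    static boolean matches(String indexed, String query,
                           boolean caseSensitive, boolean diacriticAware) {
        return normalize(indexed, caseSensitive, diacriticAware)
                .equals(normalize(query, caseSensitive, diacriticAware));
    }
}
```

Both the indexed word and the query go through the same normalization, which mirrors how an analyzer is normally applied at index time and at query time.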

What is the best way to handle this?

I have already tried annotating my field like this:

    @Fields({
          @Field(analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(impl = WhitespaceAnalyzer.class)),
          @Field(name = "DiacriticUnaware", analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(definition = "diacritics"))
      })
    private String fieldToSearchIn;
    

With this AnalyzerDef:

    @AnalyzerDefs({
        @AnalyzerDef(name = "diacritics",
            tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
            filters = {
                @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class)
            })
    })
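
The definition above folds diacritics but leaves case untouched. A case insensitive variant would presumably need a lowercase filter in the chain. The following is only a sketch, assuming Hibernate Search 5.x annotations and the Lucene 4.x/5.x factory packages; the definition names and the `SearchableEntity` class are hypothetical placeholders.

```java
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.annotations.AnalyzerDef;
import org.hibernate.search.annotations.AnalyzerDefs;
import org.hibernate.search.annotations.TokenFilterDef;
import org.hibernate.search.annotations.TokenizerDef;

@AnalyzerDefs({
    // Diacritic aware, case insensitive: only lowercases the tokens.
    @AnalyzerDef(name = "diacriticAwareCaseInsensitive",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class)
        }),
    // Diacritic unaware, case insensitive: folds to ASCII, then lowercases.
    @AnalyzerDef(name = "diacriticUnawareCaseInsensitive",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
            @TokenFilterDef(factory = LowerCaseFilterFactory.class)
        })
})
class SearchableEntity {
    // Hypothetical placeholder; the real class would carry the entity
    // mapping and the annotated fieldToSearchIn property.
}
```

Note that these definitions use the standard tokenizer while the default field uses `WhitespaceAnalyzer`, so the fields may split text differently.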
    

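But this only allows me to search case sensitively, or case sensitively combined with diacritic unawareness. I could create multiple index fields for this property, but I don't know whether that is the right approach, because I would end up indexing the property and storing it in 10 or more different index fields in Lucene.

Like this: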

    @Fields({
          @Field(analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(impl = WhitespaceAnalyzer.class)),
          @Field(name = "DiacriticUnaware", analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(definition = "diacritics")),
          @Field(name = "DiacriticUnawareCaseSenstitive", analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(definition = "diacriticsCaseSensitive")),
          @Field(name = "DiacriticUnawareCaseInsenstitive", analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(definition = "diacriticsCaseInsensitive")),
          @Field(name = "DiacriticAwareCaseSenstitive", analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(definition = "diacriticAwareCaseSenstitive")),
          @Field(name = "DiacriticAwareCaseInsenstitive", analyze = Analyze.YES, store = Store.YES, analyzer = @Analyzer(definition = "diacriticAwareCaseInsenstitive")),
          ...
      })
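
With one index field per combination, the query side would have to pick the field that matches the requested options. A minimal sketch in plain Java, assuming a naming scheme along the lines of the annotations above; `SearchFieldSelector`, `fieldFor` and the exact field names are hypothetical. Only the two options discussed here are covered, which gives four fields.

```java
/** Hypothetical helper that maps search options to an index field name. */
class SearchFieldSelector {

    /** The unnamed @Field is indexed under the property name. */
    static final String BASE_FIELD = "fieldToSearchIn";

    /** Returns the name of the index field to query for the given options. */
    static String fieldFor(boolean caseSensitive, boolean diacriticAware) {
        if (caseSensitive && diacriticAware) {
            // Exact matching: the default field with the whitespace analyzer.
            return BASE_FIELD;
        }
        String diacritics = diacriticAware ? "DiacriticAware" : "DiacriticUnaware";
        String casing = caseSensitive ? "CaseSensitive" : "CaseInsensitive";
        return diacritics + casing;
    }
}
```

The returned name could then be passed to the Hibernate Search query DSL, for example `keyword().onField(name).matching(value)`.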
    

0 answers:

No answers