Searching in Solr

Date: 2016-03-30 12:57:16

Tags: solr

I have a multivalued field in Solr that contains user names:

{
    "counsel_for_department": [
      "mr  a g  srivastava with mr xyz doe",
      " mr  johh david and mr john deo",
      " mr  n p  smith and mr  ng smith"
    ]
}

When I query with fq=counsel_for_department:a g srivastava, it returns no results. I am using the standard tokenizer on this field.

The field type of this field is text_general.

Please let me know whether a multivalued field needs different configuration.

I am getting the following JSON response:

  {
  "responseHeader": {
    "status": 0,
    "QTime": 20,
    "params": {
      "q": "*:*",
      "indent": "true",
      "fl": "counsel_for_department",
      "fq": [
        "doc_type:source_analysis",
        "counsel_for_department:*g*c*Srivastava*"
      ],
      "rows": "100",
      "wt": "json",
      "debugQuery": "true",
      "_": "1459351342391"
    }
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "docs": []
  },
  "debug": {
    "rawquerystring": "*:*",
    "querystring": "*:*",
    "parsedquery": "MatchAllDocsQuery(*:*)",
    "parsedquery_toString": "*:*",
    "explain": {},
    "QParser": "LuceneQParser",
    "filter_queries": [
      "doc_type:source_analysis",
      "counsel_for_department:*g*c*Srivastava*"
    ],
    "parsed_filter_queries": [
      "doc_type:source_analysis",
      "counsel_for_department:*g*c*srivastava*"
    ],
    "timing": {
      "time": 20,
      "prepare": {
        "time": 16,
        "query": {
          "time": 16
        },
        "facet": {
          "time": 0
        },
        "facet_module": {
          "time": 0
        },
        "mlt": {
          "time": 0
        },
        "highlight": {
          "time": 0
        },
        "stats": {
          "time": 0
        },
        "expand": {
          "time": 0
        },
        "debug": {
          "time": 0
        }
      },
      "process": {
        "time": 3,
        "query": {
          "time": 3
        },
        "facet": {
          "time": 0
        },
        "facet_module": {
          "time": 0
        },
        "mlt": {
          "time": 0
        },
        "highlight": {
          "time": 0
        },
        "stats": {
          "time": 0
        },
        "expand": {
          "time": 0
        },
        "debug": {
          "time": 0
        }
      }
    }
  }
}

Thanks in advance.

2 answers:

Answer 0 (score: 1)

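Wildcard queries are not analyzed, so in most cases it is better to stay away from them and use term matching instead. That way you match documents regardless of the order of the terms, so "john oliver" also matches "oliver john", and "john oliver" can be boosted based on phrase matching.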

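To expand: a wildcard match can only happen if an actual token in the underlying index matches the pattern. With a tokenizer and filter chain in place, each value is split into separate tokens, so a pattern generally stops matching as soon as whitespace is involved.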
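A rough illustration of this point, written in Python rather than against Solr itself. It assumes the analysis chain can be approximated as "split on non-alphanumeric characters and lowercase"; the real StandardTokenizer follows Unicode segmentation rules, and text_general may apply further filters. The helper names are hypothetical. Each wildcard pattern is tested against one token at a time, which is how a wildcard term query sees the index.

```python
import re
from fnmatch import fnmatchcase


def simple_tokens(value):
    """Rough stand-in for StandardTokenizer + LowerCaseFilter (an assumption)."""
    return re.findall(r"[a-z0-9]+", value.lower())


def wildcard_matches(pattern, values):
    """Return True if any single indexed token matches the wildcard pattern."""
    return any(
        fnmatchcase(token, pattern)
        for value in values
        for token in simple_tokens(value)
    )


values = [
    "mr  a g  srivastava with mr xyz doe",
    " mr  johh david and mr john deo",
]

# No single token contains "a", then "g", then "srivastava".
print(wildcard_matches("*a*g*srivastava*", values))
# A pattern that stays inside one token can match.
print(wildcard_matches("sriv*", values))
```

Under these assumptions the first pattern cannot match, because "a", "g" and "srivastava" are indexed as three separate tokens.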

Drop the wildcards and use proper term matching (which is what Solr does really well).
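A minimal sketch of what wildcard-free filter queries could look like, assuming the standard Lucene query parser. With that parser, an unquoted fq=counsel_for_department:a g srivastava scopes only the first term to the field, and the remaining terms go to the default field. Quoting the phrase or grouping the terms in parentheses keeps every term on counsel_for_department. The helper names are hypothetical. The second helper assumes the words contain no query syntax characters, and whether a single letter such as "a" survives analysis depends on the stop-word configuration.

```python
from urllib.parse import urlencode


def phrase_fq(field, text):
    """Build a field-scoped phrase query, e.g. field:"a g srivastava"."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def all_terms_fq(field, text):
    """Build a field-scoped query requiring every word, in any order."""
    return f"{field}:({' AND '.join(text.split())})"


# Repeated "fq" keys are passed as a list of pairs so both filters are kept.
params = [
    ("q", "*:*"),
    ("fq", "doc_type:source_analysis"),
    ("fq", phrase_fq("counsel_for_department", "a g srivastava")),
    ("wt", "json"),
]

# Query string to append to the collection's /select URL.
print(urlencode(params))
print(all_terms_fq("counsel_for_department", "a g srivastava"))
```

The phrase form requires the terms to be adjacent and in order. The grouped AND form matches them in any order within the field.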

Answer 1 (score: 0)

For plain text search, you should use:

fq=counsel_for_department:*a g  srivastava* 

//OR you can also use : 

fq=counsel_for_department:*a*g*srivastava*

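Start with something like this. Note, however, that it is a relatively expensive/slow query in Solr. As an improvement, if this query turns out to be too expensive (takes too much time), you should convert the multivalued field into one unified field and query that field instead of the multivalued one.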
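One possible reading of the "unified field" suggestion is to join the values into a single string on the client side before indexing. The sketch below assumes that reading. The target field name counsel_for_department_all, the separator and the helper name are all hypothetical, and the target field would still need to be defined in the schema.

```python
def add_unified_field(doc, source, target, sep=" | "):
    """Return a copy of doc with the source values joined into one string."""
    values = doc.get(source, [])
    if isinstance(values, str):
        values = [values]
    # Collapse runs of whitespace inside each value before joining.
    unified = sep.join(" ".join(value.split()) for value in values)
    return {**doc, target: unified}


doc = {
    "counsel_for_department": [
        "mr  a g  srivastava with mr xyz doe",
        " mr  johh david and mr john deo",
    ]
}

indexed = add_unified_field(
    doc, "counsel_for_department", "counsel_for_department_all"
)
print(indexed["counsel_for_department_all"])
```

Whether this actually makes the wildcard query cheaper depends on how the unified field is analyzed, so it is worth measuring before adopting it.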