ElasticSearch: Labelling documents with the matching search term

Time: 2016-03-07 04:51:18

Tags: elasticsearch

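I'm using elasticsearch 1.7 and need a way to label documents with the part of a query_string query that they matched.

I've been experimenting with highlighting, but found that it gets a bit messy in some cases. I'd love to have the documents labelled with the matching search term.

Here is the query I'm using (note that this is a Ruby hash that later gets encoded to JSON):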

{
  query: {
    query_string: {
      fields: ["title^10", "keywords^4", "content"],
      query: query_string,
      use_dis_max: false
    }
  },
  size: 20,
  from: 0,
  sort: [
    { pub_date: { order: :desc }},
    { _score:   { order: :desc }}
  ]
}

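The query_string variable is built from the topics a user follows, and may look like this: "(the AND walking AND dead) OR (iphone) OR (video AND games)"

Is there any option I can use so that the returned documents carry a property naming the search term they matched, such as the walking dead (the AND walking AND dead)?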

1 Answer:

Answer 0: (score: 2)

If you are ready to switch to bool/should queries, you can split the match by field and use named queries; the results will then tell you the names of the queries that matched.

It basically goes like this: in a bool/should query, you add one query_string query per field and name each query so that it identifies the field (e.g. title_query for the title field, and so on).

{
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "fields": [ "title^10" ],
            "query": "query_string",
            "use_dis_max": false,
            "_name": "title_query"
          }
        },
        {
          "query_string": {
            "fields": [ "keywords^4" ],
            "query": "query_string",
            "use_dis_max": false,
            "_name": "keywords_query"
          }
        },
        {
          "query_string": {
            "fields": [ "content" ],
            "query": "query_string",
            "use_dis_max": false,
            "_name": "content_query"
          }
        }
      ]
    }
  }
}

In the results, below _source, you will get another array called matched_queries, which contains the names of the queries that matched the returned document.
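For reference, the named-query approach from this answer could be sketched in Ruby, the language of the hash in the question. This is only an illustration under stated assumptions: the helper names build_named_query and matched_names are made up here, and the sample response is hand-written rather than real Elasticsearch output. The sketch builds the request body and reads matched_queries from a response hash; sending the request is left out.

```ruby
require "json"

# Fields and boosts come from the question; the "<field>_query" naming
# follows the answer. Both helper methods below are hypothetical.
FIELDS = { "title" => 10, "keywords" => 4, "content" => nil }.freeze

# Build a bool/should query with one named query_string clause per field.
def build_named_query(query_string)
  clauses = FIELDS.map do |field, boost|
    {
      query_string: {
        fields: [boost ? "#{field}^#{boost}" : field],
        query: query_string,
        use_dis_max: false,
        _name: "#{field}_query"
      }
    }
  end
  { query: { bool: { should: clauses } } }
end

# Map each hit's _id to the names of the clauses that matched it.
# Hits without a matched_queries array map to an empty list.
def matched_names(response)
  response.fetch("hits").fetch("hits").each_with_object({}) do |hit, acc|
    acc[hit["_id"]] = hit.fetch("matched_queries", [])
  end
end

body = build_named_query("(the AND walking AND dead) OR (iphone)")
puts JSON.pretty_generate(body)

# Hand-written response fragment for illustration only (an assumption,
# not captured from a real cluster).
sample = {
  "hits" => {
    "hits" => [
      { "_id" => "1", "_source" => { "title" => "The Walking Dead" },
        "matched_queries" => ["title_query", "content_query"] },
      { "_id" => "2", "_source" => { "title" => "Other" } }
    ]
  }
}
p matched_names(sample)
```

Since the question asks which topic matched rather than which field, the same pattern could presumably be applied with one named clause per topic (for example a clause named after "the walking dead"), so that matched_queries lists topics instead of fields.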