Question

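Let's say I am indexing a bunch of Products into Elasticsearch, along with the Stores where each product is available. For example, a document looks something like: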

{
  name: "iPhone 6s",
  price: 600.0,
  stores: [
    {
      name: "Apple Store Union Square",
      location: "San Francisco, CA"
    },
    {
      name: "Target Cupertino",
      location: "Cupertino, CA"
    },
    {
      name: "Apple Store 5th Avenue",
      location: "New York, NY"
    }
    ...
  ]
}

Using the nested type, the mapping would be:

"mappings" : {
  "product" : {
    "properties" : {
      "name" : {
        "type" : "string"
      },
      "price" : {
        "type" : "float"
      },
      "stores" : {
        "type" : "nested",
        "properties" : {
          "name" : {
            "type" : "string"
          },
          "location" : {
            "type" : "string"
          }
        }
      }
    }
  }
}

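I want to create a query that finds all products available in a certain location, say "CA", and then sorts them by the number of matching stores. I know Elasticsearch has an inner hits feature that lets me find the matching nested Store documents, but is it possible to sort Products by the doc_count of their inner hits? And to extend the question further, is it possible to sort parent documents based on some inner aggregation? Thanks in advance.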

Answer 1

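What you are trying to achieve is possible. You are currently not getting the expected results because the default score_mode parameter of the nested query is avg. As a result, a product with 5 matching stores may score lower than a product with only 2 matching stores, because _score is computed as the average of the nested scores.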
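To make the difference between avg and sum concrete, here is a minimal Python sketch. It does not call Elasticsearch; the per-store scores are made-up values and the `parent_score` and `ranking` helpers are hypothetical names used only to mimic how the two modes combine nested scores.

```python
from statistics import mean

# Hypothetical per-store relevance scores for two products (illustrative values only).
nested_scores = {
    "product with 5 matching stores": [1.0, 1.0, 1.0, 1.0, 1.0],
    "product with 2 matching stores": [1.2, 1.2],
}


def parent_score(scores, score_mode):
    """Combine nested-hit scores into one parent score, mimicking avg or sum."""
    if score_mode == "avg":
        return mean(scores)
    if score_mode == "sum":
        return sum(scores)
    raise ValueError(f"unsupported score_mode: {score_mode}")


def ranking(score_mode):
    """Return product labels ordered by descending parent score."""
    return sorted(
        nested_scores,
        key=lambda label: parent_score(nested_scores[label], score_mode),
        reverse=True,
    )


for mode in ("avg", "sum"):
    print(mode, ranking(mode))
```

With these assumed scores, avg puts the 2-store product first, while sum puts the 5-store product first, which is the ordering the question asks for.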

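You can solve this by summing the scores of all inner hits, which is done by setting score_mode to sum. One minor problem may be the field length norm: a match in a shorter field gets a higher score than a match in a longer field. So in your example, Cupertino, CA will score slightly higher than San Francisco, CA. You can check this behavior with inner hits. To fix it, disable field norms by changing the location mapping to: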

"location": {
    "type": "string",
    "norms": {
        "enabled": false
    }
}
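To see why norms matter here, the following Python sketch computes the length norm used by Lucene's classic TF/IDF similarity, 1 / sqrt(number of terms). It assumes that similarity is in effect and uses a crude whitespace-and-comma split in place of the real analyzer; `length_norm` and `term_count` are hypothetical helper names. Lucene also stores norms in a lossy compressed form, so real values will differ somewhat.

```python
import math


def length_norm(num_terms):
    """Classic TF/IDF length norm: 1 / sqrt(number of terms in the field)."""
    return 1.0 / math.sqrt(num_terms)


def term_count(text):
    # Crude stand-in for an analyzer: split on whitespace and commas.
    return len(text.replace(",", " ").split())


locations = ["Cupertino, CA", "San Francisco, CA", "New York, NY"]
for loc in locations:
    n = term_count(loc)
    print(f"{loc!r}: {n} terms, norm ~ {length_norm(n):.3f}")
```

The shorter location gets the larger norm, so its match on "CA" would score higher. With norms disabled, each matching store contributes the same score, and the summed score grows with the number of matching stores.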

After that, this query will give you the desired results. I added inner hits to demonstrate that every matched nested document gets an equal score.

{
  "query": {
    "nested": {
      "path": "stores",
      "query": {
        "match": {
          "stores.location": "CA"
        }
      },
      "score_mode": "sum",
      "inner_hits": {}
    }
  }
}

This will return the products sorted by the number of matching stores.
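If you want to verify the ordering on the client side, a small Python sketch like the one below can pull the matched store count out of each hit. It assumes the response has already been parsed into a dict and that the inner hits are keyed by the nested path, stores. The `matched_store_counts` helper and the sample response are hypothetical and heavily trimmed.

```python
def matched_store_counts(response, inner_hits_name="stores"):
    """Return (product name, _score, matched store count) per hit, in response order."""
    rows = []
    for hit in response["hits"]["hits"]:
        total = hit["inner_hits"][inner_hits_name]["hits"]["total"]
        # Newer Elasticsearch versions report total as {"value": n, "relation": ...}.
        if isinstance(total, dict):
            total = total["value"]
        rows.append((hit["_source"]["name"], hit["_score"], total))
    return rows


# Hypothetical, heavily trimmed search response, for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {
                "_score": 2.0,
                "_source": {"name": "iPhone 6s"},
                "inner_hits": {"stores": {"hits": {"total": 2, "hits": []}}},
            },
            {
                "_score": 1.0,
                "_source": {"name": "Hypothetical product"},
                "inner_hits": {"stores": {"hits": {"total": 1, "hits": []}}},
            },
        ]
    }
}

for name, score, count in matched_store_counts(sample_response):
    print(f"{name}: score={score}, matching stores={count}")
```

If norms are disabled and score_mode is sum, the score column and the count column should rise and fall together.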

Hope this helps!

Elasticsearch: sort parent documents by inner hits doc count
