Question

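Context

Basically, what I have is a large data set of names, split into pty_firstname and pty_surname. I am indexing this data from an Informix database into ElasticSearch, and that all works fine. What I have failed to achieve, however, is a logical mapping of this table structure and a query that benefits from that mapping. Where I seem to be running into trouble is that the name is actually split across two columns, which makes it tricky for me to write a query that returns a sound result set.

I would appreciate some help getting back a result set in which the closest (but not necessarily exact) matches are at the top, with results becoming less similar the further down the result set we go.

Mapping

I have tried to take some inspiration for my mapping and query from here, with some alterations, but I cannot seem to get the results I need/want - http://goo.gl/hm9ISL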

{
   "mappings":{
      "user":{
         "properties":{
            "pty_forename":{
               "type":"multi_field",
               "fields":{
                  "name":{
                     "type":"string",
                     "index":"analyzed"
                  },
                  "exact":{
                     "type":"string",
                     "index":"not_analyzed"
                  }
               }
            },
            "pty_surname":{
               "type":"multi_field",
               "fields":{
                  "name":{
                     "type":"string",
                     "index":"analyzed"
                  },
                  "exact":{
                     "type":"string",
                     "index":"not_analyzed"
                  }
               }
            },
            "pty_minute_ref":{
               "type":"integer",
               "index":"not_analyzed"
            },
            "pty_deed_code":{
               "type":"string",
               "index":"not_analyzed"
            },
            "pty_name_prefix":{
               "type":"string",
               "index":"not_analyzed"
            },
            "pty_name_suffix":{
               "type":"string",
               "index":"not_analyzed"
            },
            "pty_address":{
               "type":"string",
               "index":"not_analyzed"
            },
            "pty_desig_suffix":{
               "type":"string",
               "index":"not_analyzed"
            },
            "pty_mc_ind":{
               "type":"string",
               "index":"not_analyzed"
            },
            "pty_of_ind":{
               "type":"string",
               "index":"not_analyzed"
            },
            "pty_or_ind":{
               "type":"integer",
               "index":"not_analyzed"
            },
            "pty_date_entered":{
               "type":"basic_date",
               "index":"not_analyzed"
            },
            "pty_data":{
               "type":"string",
               "index":"not_analyzed"
            },
            "pty_type":{
               "type":"string",
               "index":"not_analyzed"
            }
         }
      }
   }
}

Query

{
   "query":{
      "bool":{
         "must":[
            {
               "multi_match":{
                  "query":"Nathan Smith",
                  "fields":[
                     "pty_forename",
                     "pty_surname"
                  ]
               }
            }
         ],
         "should":[
            {
               "term":{
                  "pty_forename.exact":{
                     "value":"Nathan Smith",
                     "boost":15
                  }
               }
            },
            {
               "prefix":{
                  "pty_forename.exact":{
                     "value":"Nathan Smith",
                     "boost":10
                  }
               }
            },
            {
               "match_phrase":{
                  "pty_forename":{
                     "query":"Nathan Smith",
                     "slop":0,
                     "boost":5
                  }
               }
            }
         ]
      }
   }
}

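Conclusion

The result sets I am getting back are not the product of querying across the two fields, i.e. pty_forename and pty_surname; instead they contain people with the surname Nathan and so on. Any help is greatly appreciated.

Update - Link to Gist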

Link to Gist

Answer 1

Is something like this what you are after?

"bool" : {
    "should" : [
        { "match" : { "pty_forename" : "nathan" } },
        { "match" : { "pty_surname" : "smith" } }
    ]
}

That is, pty_forename = "nathan" OR pty_surname = "smith" (with documents that match both scoring higher).
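
To make the structure concrete, here is a minimal Python sketch that builds the same request body as a plain dictionary. The helper name `build_name_query` is hypothetical, and the sketch assumes the caller has already split the input into a forename and a surname.

```python
import json


def build_name_query(forename, surname):
    """Build a bool query with one should clause per name field.

    Hypothetical helper: a document matching either clause is returned,
    and a document matching both clauses scores higher.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"pty_forename": forename}},
                    {"match": {"pty_surname": surname}},
                ]
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the _search endpoint.
    print(json.dumps(build_name_query("nathan", "smith"), indent=2))
```

Keeping the two clauses in a single "should" array matters: a JSON object cannot reliably carry two keys with the same name, so repeating "should" risks one clause being silently dropped.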

Answer 2

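In the StackOverflow example you linked to, and in the elasticsearch Multifield documentation under the "Accessing Fields" heading, the first field listed under "fields" in the multi-field type should have the same name as the field itself. So in both the SO and documentation examples, "name" is the first name given under "fields" because "name" is the name of the multi_field field. In your example, your mapping should be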
```
    "pty_forename":{
       "type":"multi_field",
       "fields":{
          "pty_forename":{
             "type":"string",
             "index":"analyzed"
          },
          "exact":{
             "type":"string",
             "index":"not_analyzed"
          }
       }
    },
    "pty_surname":{
       "type":"multi_field",
       "fields":{
          "pty_surname":{
             "type":"string",
             "index":"analyzed"
          },
          "exact":{
             "type":"string",
             "index":"not_analyzed"
          }
       }
    },
```
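As your mapping stands now, your "bool" "must" query may not be doing anything, because none of your multi-field "fields" are currently named "pty_forename" or "pty_surname". I say "may" because I do not know whether Elasticsearch still stores the multi_field under its own name even when you do not use that name in the "fields" section.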
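
For what it is worth, from Elasticsearch 1.0 onward the `multi_field` type was superseded by a `fields` parameter on the core types, which sidesteps the naming question: the main field keeps its own name, and sub-fields are addressed as, for example, `pty_forename.exact`. Below is a rough sketch of such a mapping built as a Python dictionary. It assumes a 1.x cluster where `string` and `not_analyzed` are still valid, and the helper `name_field_mapping` is made up for illustration.

```python
import json


def name_field_mapping():
    """Hypothetical helper: an analyzed string field with a not_analyzed sub-field."""
    return {
        "type": "string",
        "index": "analyzed",
        "fields": {
            "exact": {"type": "string", "index": "not_analyzed"}
        },
    }


# Mapping fragment covering only the two name fields of the "user" type.
mapping = {
    "mappings": {
        "user": {
            "properties": {
                "pty_forename": name_field_mapping(),
                "pty_surname": name_field_mapping(),
            }
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(mapping, indent=2))
```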

Your "bool" "should" queries need to search both "pty_forename" and "pty_surname", as femtoRgon suggested, perhaps something like this (multi_match takes "query" rather than "value", and its prefix and phrase variants are selected with "type"):


      { "multi_match" : {
            "fields" : ["pty_forename.exact", "pty_surname.exact"],
            "query" : "Nathan Smith"
      } },
      { "multi_match" : {
            "fields" : ["pty_forename.exact", "pty_surname.exact"],
            "query" : "Nathan Smith",
            "type" : "phrase_prefix"
      } },
      { "multi_match" : {
            "fields" : ["pty_forename.exact", "pty_surname.exact"],
            "query" : "Nathan Smith",
            "slop" : 0,
            "boost" : 5,
            "type" : "phrase"
      } }

I looked at javanna's answer to "Elasticsearch phrase prefix query on multiple fields" for this.
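
Because a full name such as "Nathan Smith" is split across two fields, a query for the whole string against a single `.exact` field is unlikely to match anything. One option that may be worth trying is `multi_match` with `"type": "cross_fields"`, which treats the listed fields as if they were one combined field. The sketch below builds such a request body in Python. It assumes the corrected mapping above (so that `pty_forename` and `pty_surname` resolve to the analyzed sub-fields) and an Elasticsearch version that supports `cross_fields`; the helper name `build_full_name_query` and its `require_all_terms` flag are illustrative only.

```python
import json


def build_full_name_query(full_name, require_all_terms=False):
    """Hypothetical helper: search a full name across both name fields.

    With operator "or", a document matching any term is returned and
    documents matching more terms tend to rank higher. With "and",
    every term must appear in at least one of the two fields.
    """
    return {
        "query": {
            "multi_match": {
                "query": full_name,
                "fields": ["pty_forename", "pty_surname"],
                "type": "cross_fields",
                "operator": "and" if require_all_terms else "or",
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the _search endpoint.
    print(json.dumps(build_full_name_query("Nathan Smith"), indent=2))
```

The default "or" operator is closer to what the question asks for, since partial matches still appear further down the result set instead of being excluded.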
