【Question title】: Search for an array of values
【Posted】: 2018-02-09 21:53:58
【Description】:

I have an index in Elasticsearch whose documents contain a field holding an array of values. For example:

{
"took": 0,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
    {
        "_index": "families",
        "_type": "family",
        "_id": "o8qxd2EB9CizMt-k15mv",
        "_score": 1,
        "_source": {
        "names": [
            "Jefferson Erickson",
            "Bailee Miller",
            "Ahmed Bray"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "osqxd2EB9CizMt-kfZlJ",
        "_score": 1,
        "_source": {
        "names": [
            "Nia Walsh",
            "Jefferson Erickson",
            "Darryl Stark"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "pMrEd2EB9CizMt-kq5m-",
        "_score": 1,
        "_source": {
        "names": [
            "lia shelton",
            "joanna shaffer",
            "mathias little"
        ]
        }
    }
    ]
}
}

Now I need a search query that finds documents matching any value from a given array of values, like this:

GET /families/_search
{
"query" : {
    "bool" : {
    "filter" : {
        "bool" : {
        "should" : [
            {"match_phrase" : {"names" : ["ahmed bray", "nia walsh"]}}
        ]
        }
    }
    }
}
}

It should return the 2 documents that contain these names, like this:

{
"took": 0,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
    {
        "_index": "families",
        "_type": "family",
        "_id": "o8qxd2EB9CizMt-k15mv",
        "_score": 0,
        "_source": {
        "names": [
            "Jefferson Erickson",
            "Bailee Miller",
            "Ahmed Bray"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "osqxd2EB9CizMt-kfZlJ",
        "_score": 0,
        "_source": {
        "names": [
            "Nia Walsh",
            "Jefferson Erickson",
            "Darryl Stark"
        ]
        }
    }
    ]
}
}

How can I write such a query? I tried the "terms" query, but "terms" only lets me search for single words from the array, like this: {"terms" : {"names" : ["bray", "nia"]}}

But I need to use full names, like this: {"terms" : {"names" : ["ahmed bray", "nia walsh"]}}

【Comments】:

Tags: elasticsearch elasticsearch-5


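【Answer 1】:

The "problem" you are running into comes from how Elasticsearch handles text fields. By default, every text field is analyzed with the standard analyzer, which uses the Standard Tokenizer to split the text into words (as described in the documentation) and then lowercases them. A name such as "Ahmed Bray" is therefore indexed as the two separate terms ahmed and bray. The terms query does not analyze its input, so it compares each value you pass against those single-word terms, and a full name like "ahmed bray" never matches.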
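
To see the tokenization for yourself, the _analyze API can be used. This is a minimal sketch; the sample text is just one of the names from the question:

```json
GET /_analyze
{
  "analyzer": "standard",
  "text": "Ahmed Bray"
}
```

The response lists two tokens, ahmed and bray, which is why a terms query on names only matches single lowercase words.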

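One option to achieve this is to refine the default settings and mappings. All you need to do is add a multi-field (entire-phrase in our case) that is analyzed differently, and search against it.

First create the index with the following settings/mappings: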

PUT /families
{
  "settings": {
    "analysis": {
      "normalizer": {
        "case_and_accent_insensitive": {
          "type": "custom",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "family": {
      "properties": {
        "names": {
          "type": "text",
          "fields": {
            "entire-phrase": {
              "type": "keyword",
              "normalizer": "case_and_accent_insensitive"
            }
          }
        }
      }
    }
  }
}
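
The names.entire-phrase sub-field is only populated for documents indexed into an index that has this mapping, so the families from the question would have to be indexed again. A minimal sketch, assuming the index is called families and the type family as in the question, and letting Elasticsearch generate the document ID:

```json
POST /families/family
{
  "names": [
    "Jefferson Erickson",
    "Bailee Miller",
    "Ahmed Bray"
  ]
}
```

Each array element is stored as its own keyword value, lowercased and ASCII-folded by the normalizer, so "Ahmed Bray" becomes the single term ahmed bray.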

Then you can search for what you expect with:

GET /families/_search
{
  "query": {
    "terms": {
      "names.entire-phrase": [
        "ahmed bray",
        "nia walsh"
      ]
    }
  }
}

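Note that this search will not find any results by first name or last name alone; only the entire phrase is matched. If you want both behaviors, you have to search both the names and names.entire-phrase fields.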
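
One way to search both fields at once is a bool query with two should clauses. This is only a sketch: it assumes the index was created with the names.entire-phrase multi-field, and the words bray and nia in the match clause are placeholder search terms, not part of the original answer:

```json
GET /families/_search
{
  "query": {
    "bool": {
      "should": [
        { "terms": { "names.entire-phrase": ["ahmed bray", "nia walsh"] } },
        { "match": { "names": "bray nia" } }
      ]
    }
  }
}
```

Because this bool query has no must or filter clause, a document is returned when at least one of the should clauses matches: either a whole name via names.entire-phrase or a single word via names.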

【Comments】:
