Elasticsearch highlighting using multi_match

【Question title】: Elasticsearch highlighting using multi_match
【Posted】: 2017-05-22 10:29:54
【Question】:

I already have highlighting working in ES with a simple match query:

GET /_search
{
   "query": {
      "match": {
         "Text": "key words here"
      }
   },
   "highlight": {
      "pre_tags" : ["<span class='highlighter'>"],
      "post_tags" : ["</span>"],
      "fields": {
         "Text": {
            "fragment_size": 400,
            "number_of_fragments": 1,
            "no_match_size" : 20
         }
      }
   }
}

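This works well and returns the highlighted text, wrapped in the specified tags, in the results.

I would like to use highlighting with a multi_match query like this: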

GET /_search
{
   "query": {
      "multi_match": {
         "query": "GB RAM",
         "operator": "AND",
         "fields": "_all"
      }
   },
   "highlight": {
         "pre_tags": [
            "<span class='highlighter'>"
         ],
         "post_tags": [
            "</span>"
         ],
         "fields": {
            "Text": {
               "fragment_size": 400,
               "number_of_fragments": 1,
               "no_match_size": 20
            }
         }
      }
}

This doesn't quite work: the highlight that comes back is just 20 characters long (the no_match_size fallback), like this:

 "highlight": {
    "Text": ["        DVD-RAM"]
  }

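What am I doing wrong here?

【Question discussion】:

Tags: elasticsearch

【Solution 1】:

You first have to change the mapping and set store: true on the _all field. The highlighter needs the exact string value, and the _all field is not part of _source.

Change the mapping to set store: true for _all: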

PUT highlight_index_2
{
  "mappings": {
    "type_1" : {
     "_all": {
        "enabled": true,
        "store": true
      }
    }
  }
}

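Next, you need to adjust your query slightly. Your highlight section lists only the Text field, so the Lucene highlighter highlights only Text. You can modify your query as follows: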

{
    "query": {
        "query_string": {
            "fields": [
                "_all"
            ],
            "query": "this is the"
        }
    },
    "highlight": {
        "pre_tags": [
            "<span class='highlighter'>"
        ],
        "post_tags": [
            "</span>"
        ],
        "fields": {
            "_all": {
                "fragment_size": 400,
                "number_of_fragments": 2,
                "no_match_size": 20
            }
        }
    }
}

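Make sure to adjust the number of fragments so that matches from multiple fields are highlighted.

【Discussion】:

Thank you very much for your help. I think I'll go with highlighting the original field, as described in the link you left. My cluster is over 420 GB (90M documents), so storing the _all field isn't really an option because of the storage cost.
Yes, but you can also configure the _all field so that it only stores a few relevant fields.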
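
One way to read the suggestion above, offered as a sketch rather than a tested recipe: on versions that still have the _all field (5.x and earlier), a field can be kept out of _all with the include_in_all mapping parameter, which keeps the stored _all smaller. The body below would be sent with PUT highlight_index_2. The SomeOtherField field is hypothetical, and "type": "text" assumes 5.x (2.x would use "string").

```json
{
  "mappings": {
    "type_1": {
      "_all": {
        "enabled": true,
        "store": true
      },
      "properties": {
        "Text": {
          "type": "text"
        },
        "SomeOtherField": {
          "type": "text",
          "include_in_all": false
        }
      }
    }
  }
}
```

With a mapping like this, only Text would feed the stored _all field. Whether that saves enough space on a 420 GB cluster would still need to be measured.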

【Solution 2】:

Thanks to user3775217 for the answer. This is the query that finally worked for me:

GET /_search
{
   "_source": {
        "exclude": [ "Text" ]
   },
   "query": {
        "multi_match": {
            "query": "DICTIONARY",
            "operator": "AND",
            "fields": "_all"
        }
    },
   "highlight": {
      "pre_tags": [
         "<span class='highlighter'>"
      ],
      "post_tags": [
         "</span>"
      ],
      "fields": {
         "Text": {
            "require_field_match": false,
            "fragment_size": 400,
            "number_of_fragments": 1,
            "no_match_size": 20
         }
      }
   },
   "size": 100
}

http://kempe.net/blog/2015/02/25/elasticsearch-query-full-text-search.html

【Discussion】:

If you are only searching in "fields": "_all", you don't need a multi_match query. You can use a match query instead.
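
As a sketch of what that comment suggests (an assumption, not something either poster confirmed running), here is the same search written as a match query against _all, to be sent as the body of GET /_search. By default, highlighting only covers fields the query actually matched, so require_field_match stays false to keep Text highlighted.

```json
{
  "_source": {
    "exclude": ["Text"]
  },
  "query": {
    "match": {
      "_all": {
        "query": "DICTIONARY",
        "operator": "and"
      }
    }
  },
  "highlight": {
    "pre_tags": ["<span class='highlighter'>"],
    "post_tags": ["</span>"],
    "fields": {
      "Text": {
        "require_field_match": false,
        "fragment_size": 400,
        "number_of_fragments": 1,
        "no_match_size": 20
      }
    }
  },
  "size": 100
}
```

Only the query clause differs from the multi_match version above. The highlight settings are copied unchanged.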