How to get fields in top_hits aggregation

【Question title】: How to get fields in top_hits aggregation
【Posted】: 2015-12-21 15:19:24
【Question】:

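I have an Elasticsearch cluster, version 1.3.0. The documents indexed in this cluster do not have _source enabled, so when retrieving hits I normally get field values through the "fields" parameter in the request.

I am now implementing a top_hits aggregation for a duplicate-grouping feature. I would like to get fields in the top_hits results, but because _source is not enabled in my mapping, I currently cannot. Could you suggest an option or workaround that achieves this without changing the existing mapping?

I did not find anything about this in the top-hits aggregation doc. Any help with this is greatly appreciated.

Thanks!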

【Question comments】:

Tags: elasticsearch aggregation duplicate-removal

【Answer 1】:

Use script fields:

  "aggs": {
    "sample": {
      "top_hits": {
        "size": 1,
        "script_fields": {
          "field1": {
            "script": "doc['field1']"
          },
          "field2": {
            "script": "doc['field2']"
          }
          ...
        }
      }
    }
  }
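
When many fields are needed, the script_fields block gets repetitive. Below is a minimal sketch that builds the same request fragment from a mapping of output names to doc paths. The helper name build_top_hits_agg and its parameters are hypothetical, chosen only for illustration; the script strings simply mirror the ones in the answer and have not been checked against a live 1.3.0 cluster.

```python
import json


def build_top_hits_agg(field_paths, agg_name="sample", size=1):
    """Build an "aggs" fragment whose top_hits returns script fields.

    field_paths maps the name to show in the response to the path that
    is read through doc[...]. Hypothetical helper, not an Elasticsearch API.
    """
    script_fields = {
        name: {"script": "doc['%s']" % path}
        for name, path in field_paths.items()
    }
    return {
        "aggs": {
            agg_name: {
                "top_hits": {
                    "size": size,
                    "script_fields": script_fields,
                }
            }
        }
    }


# Reproduces the fragment from the answer for field1 and field2.
body = build_top_hits_agg({"field1": "field1", "field2": "field2"})
print(json.dumps(body, indent=2))
```

The resulting dictionary can be merged into a larger search body, for example under the bucket aggregation that does the duplicate grouping.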

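However, if field1 or field2 is analyzed, you will need a sub-field that keeps a not_analyzed version of that field. Why? Because if a regular field is analyzed in any way, a doc['field'] call returns the analyzed terms rather than the original content that was indexed.

Something like this: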

  "mappings": {
    "test": {
      "_source": {
        "enabled": false
      }, 
      "properties": {
        "field1": {
          "type": "string",
          "fields": {
            "notAnalyzed": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }

And the query:

  "aggs": {
    "sample": {
      "top_hits": {
        "size": 1,
        "script_fields": {
          "field1": {
            "script": "doc['field1.notAnalyzed']"
          }
        }
      }
    }
  }
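
To make the reason for the not_analyzed sub-field concrete, here is a rough simulation of the difference between the two views of a value. The function toy_analyze is a hypothetical, much simplified stand-in for a real analyzer (it only lowercases and splits on non-alphanumeric characters), and the sample value is made up; actual output depends on the analyzer configured for the field.

```python
import re


def toy_analyze(text):
    """Toy stand-in for an analyzer, NOT Elasticsearch's implementation.

    Lowercases the text, splits it on anything that is not a-z or 0-9,
    and returns the unique terms in sorted order.
    """
    tokens = [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]
    return sorted(set(tokens))


original = "Quick Brown-Fox"  # made-up sample value

# Roughly what a script sees on the analyzed field: individual terms.
analyzed_view = toy_analyze(original)

# What it sees on the not_analyzed sub-field: the value as it was indexed.
not_analyzed_view = [original]

print(analyzed_view)      # ['brown', 'fox', 'quick']
print(not_analyzed_view)  # ['Quick Brown-Fox']
```

Only the second view is usable when the goal is to show the original field content for each group, which is why the query reads doc['field1.notAnalyzed'].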

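【Comments】:

Perfect answer. Thanks a lot. Will this put more load on memory, though, since the Elasticsearch nodes cache the document fields?
Well, I think this will affect the query's response time, because it uses more CPU time. Scripts are not that fast.
@AndreiStefan How can I get a nested field? That is, in script_fields, what if I want to get a field that is nested inside another field?
@AndreiStefan I used params._source and that solved it.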