【Question title】: Aggregation date using Elasticsearch RestHighLevelClient
【Posted】: 2020-06-28 07:32:03
【Question】:

I am using the Elasticsearch RestHighLevelClient and trying to implement the equivalent of the following SQL:

select format(date,'yyyy-MM-dd'), count(*) from order group by format(date,'yyyy-MM-dd')

    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery());
    sourceBuilder.aggregation(AggregationBuilders.terms("date_count").field("time").format("yyyy-MM-dd"));
    SearchRequest searchRequest = new SearchRequest("order_*");
    searchRequest.source(sourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    Aggregations aggregations = searchResponse.getAggregations();

I get a result, but it contains many buckets for the same date, for example:

buckets:
  bucket1:
    key: 1583390009000
    keyAsString: 2020-03-05
    docCount: 4
  bucket2:  
    key: 1583391748000
    keyAsString: 2020-03-05
    docCount: 5

How can I avoid the duplicate dates? I want all documents from the same date (for example 2020-03-05) to be counted in a single bucket.

Here are some sample documents:

  {
    "_index": "order_202003",
    "_type": "_doc",
    "_id": "1568274527051",
    "_score": 1,
    "_source": {
      "auto": 0,
      "collection_currency": "RMB",
      "collection_value": 0,
      "customerid": "",
      "freight": 0,
      "gs": 201101,
      "status": "accept",
      "time": "2020-03-05T10:01:23.000+0800"
    }
  },
  {
    "_index": "order_202003",
    "_type": "_doc",
    "_id": "1568274527078",
    "_score": 1,
    "_source": {
      "auto": 0,
      "collection_currency": "RMB",
      "collection_value": 0,
      "customerid": "",
      "freight": 0,
      "gs": 413572,
      "status": "accept",
      "time": "2020-03-05T17:29:53.000+0800"
    }
  }

【Comments】:

  • Can you provide your mapping and some sample documents?
  • Maybe you should use a date histogram aggregation.

Tags: java elasticsearch aggregation


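【Solution 1】:

You should use a date histogram aggregation instead. A terms aggregation groups by the exact field value (the full timestamp), and format only changes how each key is displayed, so timestamps that fall on the same day still end up in separate buckets.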

    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery());
    // The terms aggregation from the question, replaced by the date histogram below:
    //sourceBuilder.aggregation(AggregationBuilders.terms("date_count").field("time").format("yyyy-MM-dd"));
    // One bucket per calendar day of the "time" field.
    sourceBuilder.aggregation(
        AggregationBuilders.dateHistogram("date_count")
            .calendarInterval(DateHistogramInterval.DAY)
            .field("time"));
    SearchRequest searchRequest = new SearchRequest("order_*");
    searchRequest.source(sourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    Aggregations aggregations = searchResponse.getAggregations();
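
To make the difference between the two aggregations concrete, here is a minimal plain-Java sketch that does not touch Elasticsearch at all. It only mimics the grouping logic: one method counts by exact timestamp (roughly what the terms aggregation did), the other reduces each timestamp to its calendar day first (roughly what a one-day date histogram does). The class name `DayBucketSketch` and both method names are made up for illustration; the two timestamps are the bucket keys from the question's output.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Map;
import java.util.TreeMap;

public class DayBucketSketch {

    // Mimics a terms aggregation on a date field: one bucket per distinct
    // epoch-millisecond value, regardless of how the key is later formatted.
    static Map<Long, Long> bucketByExactValue(long[] epochMillis) {
        Map<Long, Long> buckets = new TreeMap<>();
        for (long ts : epochMillis) {
            buckets.merge(ts, 1L, Long::sum);
        }
        return buckets;
    }

    // Mimics a date histogram with a one-day interval: each timestamp is
    // reduced to its calendar day in the given zone before it is counted.
    static Map<LocalDate, Long> bucketByDay(long[] epochMillis, ZoneId zone) {
        Map<LocalDate, Long> buckets = new TreeMap<>();
        for (long ts : epochMillis) {
            LocalDate day = Instant.ofEpochMilli(ts).atZone(zone).toLocalDate();
            buckets.merge(day, 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // The two bucket keys shown in the question's result.
        long[] timestamps = {1583390009000L, 1583391748000L};

        // prints {1583390009000=1, 1583391748000=1}
        System.out.println(bucketByExactValue(timestamps));

        // prints {2020-03-05=2}
        System.out.println(bucketByDay(timestamps, ZoneId.of("UTC")));
    }
}
```

Note that the zone passed to `bucketByDay` decides where a day starts. Elasticsearch's date histogram computes day boundaries in UTC unless the aggregation's `time_zone` parameter is set, which may matter here because the sample documents carry a +0800 offset.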

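【Comments】:

  • Why are there many records (buckets) with 'docCount=0'?
  • That is because there are no documents on those days. To skip empty buckets, set .minDocCount(1) on the dateHistogram aggregation.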
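
The empty buckets can also be illustrated without Elasticsearch. The sketch below is a plain-Java simulation under stated assumptions: the class `MinDocCountSketch`, its method names, and the per-day counts are all hypothetical. It first fills every day between the earliest and latest populated day with a zero count (a date histogram's default, since its `min_doc_count` defaults to 0), then drops buckets below a threshold the way `.minDocCount(1)` would.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

public class MinDocCountSketch {

    // Mimics the default histogram behaviour (min_doc_count = 0): every day
    // between the first and last populated day gets a bucket, even if empty.
    static Map<LocalDate, Long> fillGaps(Map<LocalDate, Long> counts) {
        TreeMap<LocalDate, Long> sorted = new TreeMap<>(counts);
        Map<LocalDate, Long> filled = new TreeMap<>();
        if (sorted.isEmpty()) {
            return filled;
        }
        LocalDate last = sorted.lastKey();
        for (LocalDate day = sorted.firstKey(); !day.isAfter(last); day = day.plusDays(1)) {
            filled.put(day, sorted.getOrDefault(day, 0L));
        }
        return filled;
    }

    // Mimics min_doc_count: keep only buckets holding at least that many documents.
    static Map<LocalDate, Long> applyMinDocCount(Map<LocalDate, Long> buckets, long minDocCount) {
        Map<LocalDate, Long> kept = new TreeMap<>();
        buckets.forEach((day, count) -> {
            if (count >= minDocCount) {
                kept.put(day, count);
            }
        });
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical per-day counts with a two-day gap in between.
        Map<LocalDate, Long> counts = new TreeMap<>();
        counts.put(LocalDate.of(2020, 3, 5), 9L);
        counts.put(LocalDate.of(2020, 3, 8), 3L);

        Map<LocalDate, Long> filled = fillGaps(counts);
        // prints {2020-03-05=9, 2020-03-06=0, 2020-03-07=0, 2020-03-08=3}
        System.out.println(filled);

        // prints {2020-03-05=9, 2020-03-08=3}
        System.out.println(applyMinDocCount(filled, 1));
    }
}
```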