【Question title】: Elasticsearch - getting aggregated data based on unique values from field
【Posted】: 2021-06-12 03:00:34
【Question】:

In my Elasticsearch (7.13) index, I have the following dataset:

maid      site_id    date         hour
m1        1300       2021-06-03   1
m1        1300       2021-06-03   2
m1        1300       2021-06-03   1
m2        1300       2021-06-03   1

I am trying to get the number of unique records per date and site_id from the table above. The desired result is:

maid      site_id   date        count        
m1        1300      2021-06-03  1
m2        1300      2021-06-03  1

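For each site_id I have millions of maid values, and the dates span two years. I am using the query below with a cardinality aggregation on maid, on the assumption that it will return the unique maid values.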

GET /r_2332/_search
{
  "size": 0,
  "aggs": {
    "site_id": {
      "terms": {
        "field": "site_id",
        "size": 100,
        "include": [1171, 1048]
      },
      "aggs": {
        "bydate": {
          "range": {
            "field": "date",
            "ranges": [
              {
                "from": "2021-04-08",
                "to": "2021-04-22"
              }
            ]
          },
          "aggs": {
            "rdate": {
              "terms": {
                "field": "date"
              },
              "aggs": {
                "maids": {
                  "cardinality": {
                    "field": "maid"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

This still returns data with all the duplicate values. How can I include the maid field in my query so that the data is filtered by unique maid values?
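
One way to see why the query in the question does not yield one row per maid: a cardinality aggregation reports a single distinct count per bucket (an approximate one in Elasticsearch), not the distinct values themselves. The Python sketch below imitates both behaviours on the four sample rows from the question. It is plain in-memory Python, not an Elasticsearch client call, and the names `rows`, `distinct_maids_per_bucket`, and `unique_rows` are illustrative only.

```python
from collections import defaultdict

# The four sample documents from the question.
rows = [
    {"maid": "m1", "site_id": 1300, "date": "2021-06-03", "hour": 1},
    {"maid": "m1", "site_id": 1300, "date": "2021-06-03", "hour": 2},
    {"maid": "m1", "site_id": 1300, "date": "2021-06-03", "hour": 1},
    {"maid": "m2", "site_id": 1300, "date": "2021-06-03", "hour": 1},
]


def distinct_maids_per_bucket(docs):
    """Imitate terms(site_id) > terms(date) > cardinality(maid): one number per bucket."""
    seen = defaultdict(set)
    for d in docs:
        seen[(d["site_id"], d["date"])].add(d["maid"])
    return {key: len(maids) for key, maids in seen.items()}


def unique_rows(docs):
    """Group on (maid, site_id, date) so each combination appears exactly once."""
    return sorted({(d["maid"], d["site_id"], d["date"]) for d in docs})


print(distinct_maids_per_bucket(rows))  # {(1300, '2021-06-03'): 2}
print(unique_rows(rows))
```

The first function returns only the count 2 for the single (site_id, date) bucket, while the second returns the two de-duplicated rows that the desired result table lists.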

【Discussion】:

    Tags: elasticsearch elasticsearch-aggregation


    【Solution 1】:

    If you want unique documents based on site_id and maid, you can use a multi terms aggregation together with a cardinality aggregation:

    {
      "size": 0,
      "query": {
        "bool": {
          "filter": [
            {
              "terms": {
                "site_id": [
                  "1300",
                  "1301"
                ]
              }
            },
            {
              "range": {
                "date": {
                  "gte": "2021-06-02",
                  "lte": "2021-06-03"
                }
              }
            }
          ]
        }
      },
      "aggs": {
        "group_by": {
          "multi_terms": {
            "terms": [
              {
                "field": "site_id"
              },
              {
                "field": "maid.keyword"
              }
            ]
          },
          "aggs": {
            "type_count": {
              "cardinality": {
                "field": "site_id"
              }
            }
          }
        }
      }
    }
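
If the request body has to be generated for different sites or date windows, it can be assembled as a plain dictionary. This is a minimal sketch: `build_group_by_query` and its parameters are hypothetical names, the field names (`site_id`, `date`, `maid.keyword`) are taken from the query above, and the optional `size` entry is an assumption about how many buckets the caller wants, since multi_terms returns only a limited number of buckets by default.

```python
import json


def build_group_by_query(site_ids, date_from, date_to, size=None):
    """Return a request body that filters by site and date, then buckets by (site_id, maid)."""
    multi_terms = {
        "terms": [
            {"field": "site_id"},
            {"field": "maid.keyword"},
        ]
    }
    if size is not None:
        # Assumption: the caller wants a bucket limit other than the default.
        multi_terms["size"] = size
    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"site_id": list(site_ids)}},
                    {"range": {"date": {"gte": date_from, "lte": date_to}}},
                ]
            }
        },
        "aggs": {
            "group_by": {
                "multi_terms": multi_terms,
                "aggs": {"type_count": {"cardinality": {"field": "site_id"}}},
            }
        },
    }


body = build_group_by_query([1300, 1301], "2021-06-02", "2021-06-03")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into a `_search` request; how it is sent (Kibana console, HTTP client, official client library) is left open here.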
    

    The search result will be:

    "aggregations": {
        "group_by": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": [
                1300,
                "m1"
              ],
              "key_as_string": "1300|m1",
              "doc_count": 3,
              "type_count": {
                "value": 1           // note this
              }
            },
            {
              "key": [
                1300,
                "m2"
              ],
              "key_as_string": "1300|m2",
              "doc_count": 1,
              "type_count": {
                "value": 1            // note this
              }
            }
          ]
        }
    }
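
To make the bucket values in this response easier to interpret, the Python sketch below recomputes them from the question's four sample rows. It is an in-memory imitation, not Elasticsearch code, and `bucket_by_site_and_maid` is a hypothetical helper name. It also shows why `type_count` is 1 in every bucket: each bucket is already keyed by a single `site_id`, so the number of distinct `site_id` values inside it cannot be anything else. The de-duplication itself comes from the bucket keys.

```python
from collections import defaultdict

# Sample rows from the question: (maid, site_id, date, hour).
rows = [
    ("m1", 1300, "2021-06-03", 1),
    ("m1", 1300, "2021-06-03", 2),
    ("m1", 1300, "2021-06-03", 1),
    ("m2", 1300, "2021-06-03", 1),
]


def bucket_by_site_and_maid(docs):
    """Imitate multi_terms(site_id, maid) with a cardinality(site_id) sub-aggregation."""
    buckets = defaultdict(lambda: {"doc_count": 0, "site_ids": set()})
    for maid, site_id, _date, _hour in docs:
        bucket = buckets[(site_id, maid)]
        bucket["doc_count"] += 1
        bucket["site_ids"].add(site_id)
    # Buckets are ordered by key here for determinism; this is a simplification.
    return [
        {
            "key": [site_id, maid],
            "doc_count": bucket["doc_count"],
            "type_count": len(bucket["site_ids"]),
        }
        for (site_id, maid), bucket in sorted(buckets.items())
    ]


for bucket in bucket_by_site_and_maid(rows):
    print(bucket)
```

The m1 bucket has `doc_count` 3 because three documents share that key, yet it appears only once in the bucket list, which is the one-row-per-maid shape the question asks for.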
    
