【Question Title】: Number of documents per day bucket, with some filters applied
【Posted】: 2020-07-06 04:35:30
【Question】:

I have documents in Elasticsearch, where each document looks like this:

{
  "id": "T12890ADSA12",
  "status": "CREATED",
  "type": "ABC",
  "updatedAt": "2020-05-29T18:18:08.483Z",
  "createdAt": "2020-04-30T13:41:25.862Z"
}

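Given this document structure, I want to fetch all documents whose status is CREATED or SCHEDULED and whose type is ABC. Within those filtered documents, I want to aggregate the number of documents into day buckets based on currentDate - createdAt. For example:

  • createdAt is today's date -> count of documents created today
  • createdAt is yesterday's date -> count of documents created yesterday

And so on for the last 7 days.

Is there a simple way to do this in a single query?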

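【Question Comments】:

  • Have you tried writing a query? Where are you stuck?
  • @Opster ES Ninja Nishant, thanks for asking. I am a bit confused about the aggregation here. I know I could write a script for it, but I am not sure how to build the per-date count map easily, or how to return it. I am still very new to ES. I have tried some basic filter queries and terms aggregation queries, but I think this one is a bit more complex. The same goes for this question: stackoverflow.com/questions/62749495/…

Tags: elasticsearch elasticsearch-aggregation elasticsearch-dsl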


【Solution 1】:

Below are the mapping, sample documents, aggregation query, and response:

Mapping:

PUT my_date_index
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "status": {
        "type": "keyword"
      },
      "type": {
        "type": "keyword"
      },
      "updatedAt": {
        "type": "date"
      },
      "createdAt": {
        "type": "date"
      }
    }
  }
}

Sample documents:

POST my_date_index/_doc/1
{
  "id": "T12890ADSA12",
  "status": "CREATED",
  "type": "ABC",
  "updatedAt": "2020-05-29T18:18:08.483Z",
  "createdAt": "2020-07-06T05:00:00.000Z"
}

POST my_date_index/_doc/2
{
  "id": "T12890ADSA13",
  "status": "SCHEDULED",
  "type": "ABC",
  "updatedAt": "2020-05-29T18:18:08.483Z",
  "createdAt": "2020-07-05T13:41:25.862Z"
}

POST my_date_index/_doc/3
{
  "id": "T12890ADSA14",
  "status": "SCHEDULED",
  "type": "ABC",
  "updatedAt": "2020-05-29T18:18:08.483Z",
  "createdAt": "2020-07-04T06:00:00.000Z"
}

POST my_date_index/_doc/4
{
  "id": "T12890ADSA15",
  "status": "SCHEDULED",
  "type": "ABC",
  "updatedAt": "2020-05-29T18:18:08.483Z",
  "createdAt": "2020-07-03T07:00:00.000Z"
}

Query request ("size": 0 suppresses the hits; remove it to return the matching documents too):

POST my_date_index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "type": "ABC"
          }
        },
        {
          "range": {
            "createdAt": {
              "gte": "now-7d",
              "lte": "now"
            }
          }
        }
      ],
      "should": [
        {
          "term": {
            "status": "SCHEDULED"
          }
        },
        {
          "term": {
            "status": "CREATED"
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "aggs": {
    "my_date": {
      "date_histogram": {
        "field": "createdAt",
        "calendar_interval": "day",
        "order": {
          "_key": "desc"
        }
      }
    }
  }
}

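Note that the query first filters the documents by the date range and the conditions you provided. On its own, that part would return all matching documents.

On top of that, I have applied a date histogram aggregation to get the number of documents for each day in that date range.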
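As a variation on the request above, the sketch below builds a similar search body in Python. It is only an illustration under a few assumptions: `build_daily_count_query`, `today`, and `days` are hypothetical names, dates are treated as UTC, and "last 7 days" is taken to mean today plus the six previous days. It moves the conditions into filter context (no scoring is needed for counting), replaces the two `should` clauses with one `terms` clause, and uses `min_doc_count` with `extended_bounds` so that days without documents still appear as buckets with a count of 0.

```python
from datetime import date, timedelta


def build_daily_count_query(today: date, days: int = 7) -> dict:
    """Build a search body that counts matching documents per day.

    Hypothetical helper for illustration; field names follow the mapping above.
    """
    # First day of the window, e.g. today minus 6 days for a 7-day window.
    first_day = today - timedelta(days=days - 1)
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {
            "bool": {
                # Filter context: clauses only include/exclude, no scoring.
                "filter": [
                    {"term": {"type": "ABC"}},
                    {"terms": {"status": ["CREATED", "SCHEDULED"]}},
                    {"range": {"createdAt": {"gte": first_day.isoformat(), "lte": "now"}}},
                ]
            }
        },
        "aggs": {
            "my_date": {
                "date_histogram": {
                    "field": "createdAt",
                    "calendar_interval": "day",
                    "order": {"_key": "desc"},
                    # Emit empty days too, across the whole window.
                    "min_doc_count": 0,
                    "extended_bounds": {
                        "min": first_day.isoformat(),
                        "max": today.isoformat(),
                    },
                }
            }
        },
    }


body = build_daily_count_query(date(2020, 7, 6))
```

The resulting dictionary can be serialized to JSON and sent as the body of `POST my_date_index/_search`, in place of the request shown earlier.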

Response:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "my_date" : {
      "buckets" : [
        {
          "key_as_string" : "2020-07-06T00:00:00.000Z",
          "key" : 1593993600000,
          "doc_count" : 1
        },
        {
          "key_as_string" : "2020-07-05T00:00:00.000Z",
          "key" : 1593907200000,
          "doc_count" : 1
        },
        {
          "key_as_string" : "2020-07-04T00:00:00.000Z",
          "key" : 1593820800000,
          "doc_count" : 1
        },
        {
          "key_as_string" : "2020-07-03T00:00:00.000Z",
          "key" : 1593734400000,
          "doc_count" : 1
        }
      ]
    }
  }
}
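To get from this response to the "today / yesterday / ..." counts asked for in the question, the buckets can be re-keyed on the client side. The following is a minimal sketch, assuming the response has already been parsed into a Python dictionary and that bucket keys are UTC day starts as shown above; `counts_by_days_ago` and its parameters are hypothetical names, not part of any Elasticsearch client.

```python
from datetime import date, datetime, timezone


def counts_by_days_ago(response: dict, today: date, agg_name: str = "my_date") -> dict:
    """Map 'days before today' (0 = today, 1 = yesterday, ...) to doc_count."""
    result = {}
    for bucket in response["aggregations"][agg_name]["buckets"]:
        # "key" is the start of the day bucket in epoch milliseconds (UTC).
        day = datetime.fromtimestamp(bucket["key"] / 1000, tz=timezone.utc).date()
        result[(today - day).days] = bucket["doc_count"]
    return result


# Trimmed copy of the response above: only the fields the helper reads.
sample = {
    "aggregations": {
        "my_date": {
            "buckets": [
                {"key": 1593993600000, "doc_count": 1},
                {"key": 1593907200000, "doc_count": 1},
                {"key": 1593820800000, "doc_count": 1},
                {"key": 1593734400000, "doc_count": 1},
            ]
        }
    }
}

print(counts_by_days_ago(sample, date(2020, 7, 6)))  # {0: 1, 1: 1, 2: 1, 3: 1}
```

Days that have no bucket in the response are simply absent from the map, so a caller may want to default missing keys to 0.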

Hope this helps!
