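【Posted】: 2017-09-06 14:38:39
【Problem description】:
I am collecting data from devices and want to find out when new devices come online. Documents have this format:
{
  "device_id": "ue-0000"
}
I can query the devices active over a time period with a date histogram aggregation that contains a nested terms aggregation, but I don't know how to express the logic "filter out matches from buckets where the device_id already appears earlier in the index".
Here is my current query: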
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "2015/12/08",
            "lte": "2016/01/08"
          }
        }
      }
    }
  },
  "aggregations": {
    "over_time": {
      "aggregations": {
        "app_count": {
          "terms": {
            "field": "app"
          }
        }
      },
      "date_histogram": {
        "field": "timestamp",
        "interval": "day",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "2015/12/08",
          "max": "2016/01/08"
        }
      }
    }
  }
}
I have documents like these:
{
  "timestamp": "2015/12/15",
  "device_id": "1"
}
{
  "timestamp": "2015/12/16",
  "device_id": "2"
}
{
  "timestamp": "2015/12/20",
  "device_id": "1"
}
I would like to get back something like this:
{
  "aggregations": {
    "over_time": {
      "buckets": [
        {
          "key_as_string": "2015/12/15 00:00:00",
          "key": 1449532800000,
          "doc_count": 1,
          "new_devices": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [{"device_id": "1"}]
          }
        },
        {
          "key_as_string": "2015/12/16 00:00:00",
          "key": 1449532800000,
          "doc_count": 1,
          "new_devices": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [{"device_id": "2"}]
          }
        },
        // [[ SNIP ]]
        {
          "key_as_string": "2015/12/20 00:00:00",
          "key": 1449532800000,
          "doc_count": 0, // there are no new device_ids on this date
          "new_devices": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": []
          }
        }
      ]
    }
  }
}
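To make the intended "first appearance" logic concrete, here is a minimal client-side sketch in Python. It is not an Elasticsearch query and not a proposed solution, only an illustration of the semantics being asked for, run against the three sample documents above. The function name `new_devices_per_day` and the variable names are hypothetical, and the sketch assumes it is given every document for a device (including any older than the query range), since otherwise a returning device would wrongly look new.

```python
from collections import OrderedDict
from datetime import datetime, timedelta

DATE_FORMAT = "%Y/%m/%d"  # same date format as the sample documents


def new_devices_per_day(docs, start, end):
    """Map each day in [start, end] to the device_ids first seen on that day."""
    # Step 1: find the earliest day on which each device_id appears.
    first_seen = {}
    for doc in docs:
        day = datetime.strptime(doc["timestamp"], DATE_FORMAT).date()
        device = doc["device_id"]
        if device not in first_seen or day < first_seen[device]:
            first_seen[device] = day

    # Step 2: create one bucket per day, keeping empty days
    # (comparable to min_doc_count: 0 with extended_bounds).
    current = datetime.strptime(start, DATE_FORMAT).date()
    last = datetime.strptime(end, DATE_FORMAT).date()
    buckets = OrderedDict()
    while current <= last:
        buckets[current.strftime(DATE_FORMAT)] = []
        current += timedelta(days=1)

    # Step 3: place each device only in the bucket of its first appearance.
    # Devices first seen outside the range are skipped.
    for device, day in sorted(first_seen.items()):
        key = day.strftime(DATE_FORMAT)
        if key in buckets:
            buckets[key].append(device)
    return buckets


docs = [
    {"timestamp": "2015/12/15", "device_id": "1"},
    {"timestamp": "2015/12/16", "device_id": "2"},
    {"timestamp": "2015/12/20", "device_id": "1"},
]

result = new_devices_per_day(docs, "2015/12/08", "2016/01/08")
print(result["2015/12/15"])  # ['1']
print(result["2015/12/16"])  # ['2']
print(result["2015/12/20"])  # [] because device 1 was already seen on 2015/12/15
```

The open question is how to get Elasticsearch to produce the same grouping on the server side inside the date histogram buckets.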
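【Discussion】:
-
What do you mean by a device_id appearing earlier in the index? Could you give an example, such as some sample documents, along with the kind of output you expect?
-
@ChintanShah25 Good idea, I have added sample index documents and the desired output.
-
Did the solution help resolve the problem?
Tags: elasticsearch histogram rollup nosql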