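【Question title】: Aggregate to get totals of key-value pairs in an array, grouping on the field values for names
【Posted】: 2015-10-17 04:24:16
【Question】:

I have documents with the structure shown below, where each array element holds "n" and "v" as the name and value of a different piece of data. I need to combine the "v" values of the elements named "facility", "ip" and "num", and count how many times each distinct combination occurs in the collection.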

{ 
    "_id" : 1, 
    "logs" : [
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 },
        { "n" : "ip", "v" : "137.68.151.104" },
        { "n" : "protocol", "v" : "55902/udp" },
        { "n" : "port", "v" : "53" } 
    ]
},
{ 
    "_id" : 2, 
    "logs" : [ 
        { "n" : "facility", "v" : 26 }, 
        { "n" : "num", "v" : 6 },
        { "n" : "ip", "v" : "137.68.160.51" }, 
        { "n" : "protocol", "v" : "13438/tcp" }, 
        { "n" : "port", "v" : "13438" } 
    ]
},
{ 
    "_id" : 3,
    "logs" : [
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 }, 
        { "n" : "ip", "v" : "137.68.160.51" }, 
        { "n" : "protocol", "v" : "13434/tcp" },
        { "n" : "port", "v" : "53" } 
    ]
},
{ 
    "_id" : 4,
    "logs" : [
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 }, 
        { "n" : "ip", "v" : "137.68.160.184" },
        { "n" : "protocol", "v" : "61662/udp" },
        { "n" : "port", "v" : "53" } 
    ]
},
{ 
    "_id" : 5, 
    "logs" : [ 
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 }, 
        { "n" : "ip", "v" : "137.68.160.51" }, 
        { "n" : "protocol", "v" : "13435/tcp" }, 
        { "n" : "port", "v" : "13435" } 
    ]
},
{ 
    "_id" : 6,
    "logs" : [ 
        { "n" : "facility", "v" : 26 },
        { "n" : "num", "v" : 6 },
        { "n" : "ip", "v" : "137.68.160.51" },
        { "n" : "protocol", "v" : "61662/udp" },
        { "n" : "port", "v" : "53" }
    ]
}

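The query selection criteria I want are:

  1. port is "53"
  2. protocol is "udp" or "tcp"
  3. group by [facility, num, ip]

That should select four of the six documents above. That part is working.

I would like results like this: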

{facility : 26, num : 6, ip : 137.68.151.104 , count : 1}
{facility : 26, num : 6, ip : 137.68.160.51 , count : 2}
{facility : 26, num : 6, ip : 137.68.160.184 , count : 1}

This is what I have so far:

db.agg.aggregate ([
{
'$match' : { 'logs' : {'$all' : [{'$elemMatch' : {'n' : "port", "v" : "53"}}, {'$elemMatch' : {'n' : "protocol", "v" : {"$in" :[/udp/,/tcp/]}}}   ]}}     },
{ '$unwind' : '$logs' },
{ '$match' : {"logs.n" : "ip"}},
{ '$group' : { _id : { 'ip' : '$logs.v'}, count : {$sum : 1}}}
])

But I can't work out how to get all of the fields in there; at the moment I only get results for "ip".

【Comments】:

    Tags: mongodb mongodb-query aggregation-framework


    【Solution 1】:

    Please check the following:

    db.exp.aggregate([
        // Keep documents with port "53" and a udp or tcp protocol
        { $match: { logs: { $all: [
            { $elemMatch: { n: "port", v: "53" } },
            { $elemMatch: { n: "protocol", v: { $in: [/udp/, /tcp/] } } }
        ] } } },
        // Emit one document per array element
        { $unwind: "$logs" },
        // Copy the value into the field named by logs.n; null otherwise
        { $project: {
            facility: { $cond: { if: { $eq: ["$logs.n", "facility"] }, then: "$logs.v", else: null } },
            num: { $cond: { if: { $eq: ["$logs.n", "num"] }, then: "$logs.v", else: null } },
            ip: { $cond: { if: { $eq: ["$logs.n", "ip"] }, then: "$logs.v", else: null } }
        } },
        // Collapse back to one document per _id; $max keeps the non-null value
        { $group: {
            _id: "$_id",
            facility: { $max: "$facility" },
            num: { $max: "$num" },
            ip: { $max: "$ip" }
        } },
        // Count each distinct (facility, num, ip) combination
        { $group: {
            _id: { facility: "$facility", num: "$num", ip: "$ip" },
            count: { $sum: 1 }
        } }
    ]);
    

    The query above produces the result you want:

    { "_id" : { "facility" :26, "num" : 6,
        "ip" : "137.68.151.104" }, "count" : 1 
    }
    { "_id" : { "facility" : 26, "num" : 6,
        "ip" : "137.68.160.51" }, "count" : 2 
    }
    { "_id" : { "facility" : 26, "num" : 6,
        "ip" : "137.68.160.184" }, "count" : 1 
    }
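
To see why two grouping steps are involved, the same logic can be sketched in plain JavaScript running over the six sample documents in memory. This is only an illustration of the idea, not how MongoDB executes the pipeline; `makeDoc`, `matches`, `pivot` and `countCombinations` are hypothetical helper names that do not appear in the thread.

```javascript
// Rebuild the six sample documents from the question (hypothetical helper).
const makeDoc = (_id, ip, protocol, port) => ({
  _id,
  logs: [
    { n: "facility", v: 26 },
    { n: "num", v: 6 },
    { n: "ip", v: ip },
    { n: "protocol", v: protocol },
    { n: "port", v: port },
  ],
});

const docs = [
  makeDoc(1, "137.68.151.104", "55902/udp", "53"),
  makeDoc(2, "137.68.160.51", "13438/tcp", "13438"),
  makeDoc(3, "137.68.160.51", "13434/tcp", "53"),
  makeDoc(4, "137.68.160.184", "61662/udp", "53"),
  makeDoc(5, "137.68.160.51", "13435/tcp", "13435"),
  makeDoc(6, "137.68.160.51", "61662/udp", "53"),
];

// Plays the role of the initial $match with $all / $elemMatch.
function matches(doc) {
  const hasPort = doc.logs.some((e) => e.n === "port" && e.v === "53");
  const hasProtocol = doc.logs.some(
    (e) => e.n === "protocol" && /udp|tcp/.test(e.v)
  );
  return hasPort && hasProtocol;
}

// Plays the role of $unwind plus the first $group: the name/value pairs of
// one document are turned back into a single flat record.
function pivot(doc) {
  const row = {};
  for (const { n, v } of doc.logs) {
    if (n === "facility" || n === "num" || n === "ip") row[n] = v;
  }
  return row;
}

// Plays the role of the second $group: count each distinct combination.
function countCombinations(allDocs) {
  const counts = new Map();
  for (const doc of allDocs.filter(matches)) {
    const { facility, num, ip } = pivot(doc);
    const key = JSON.stringify([facility, num, ip]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, count]) => {
    const [facility, num, ip] = JSON.parse(key);
    return { facility, num, ip, count };
  });
}

const result = countCombinations(docs);
console.log(result);
```

Documents 1, 3, 4 and 6 pass the filter, so the sketch reports three combinations with counts 1, 2 and 1, matching the output above.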
    

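    【Comments】:

      【Solution 2】:

      Your logic goes wrong where you try to match after the $unwind. Because the items are no longer in an array at that point, you need to match every key whose value you want to end up as a field.

      Then turn them into fields with the $cond operator and some creative grouping: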

      db.agg.aggregate([
          { "$match": {
             "logs" : {
                 "$all": [
                     { "$elemMatch": { "n": "port", "v": "53" } },
                     { "$elemMatch": { "n": "protocol", "v": { "$in" :[/udp/,/tcp/] } } }
                 ]
             }
          }},
          { "$unwind": "$logs" },
          { "$match": { "logs.n": { "$in": ["ip","facility","num"] } } },
          { "$group": {
              "_id": "$_id",
              "facility": {
                  "$min": {
                      "$cond": [
                          { "$eq": [ "$logs.n", "facility" ] },
                          "$logs.v",
                          false
                      ]
                  }
              },
              "ip": {
                  "$min": {
                      "$cond": [
                          { "$eq": [ "$logs.n", "ip" ] },
                          "$logs.v",
                          false
                      ]
                  }
              },
              "num": {
                  "$min": {
                      "$cond": [
                          { "$eq": [ "$logs.n", "num" ] },
                          "$logs.v",
                          false
                      ]
                  }
              }
          }},
          { "$group": {
             "_id": {
                 "facility": "$facility",
                 "ip": "$ip",
                 "num": "$num"
             },
             "count": { "$sum": 1 }
          }}
       ])
      

      The $min accumulator is used to discard the false values, leaving only the wanted value for each "field".
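
The reason this works is MongoDB's cross-type comparison order, in which booleans sort after numbers and strings, so `false` can never be the minimum when a real value is present. A minimal sketch of that ordering in plain JavaScript follows; it covers only the four types used in this thread, and `typeRank`, `compareBson`, `bsonMin` and `bsonMax` are hypothetical names, not MongoDB APIs.

```javascript
// Simplified relative ranks taken from the BSON comparison order:
// null < numbers < strings < booleans. Other types are not covered here.
function typeRank(v) {
  if (v === null) return 0;
  if (typeof v === "number") return 1;
  if (typeof v === "string") return 2;
  if (typeof v === "boolean") return 3;
  throw new TypeError("type not covered by this sketch");
}

// Compare by type rank first, then by value within the same type.
function compareBson(a, b) {
  const ra = typeRank(a);
  const rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  return a < b ? -1 : a > b ? 1 : 0;
}

const bsonMin = (values) =>
  values.reduce((m, v) => (compareBson(v, m) < 0 ? v : m));
const bsonMax = (values) =>
  values.reduce((m, v) => (compareBson(v, m) > 0 ? v : m));

// After $unwind and the $match on n, each document contributes three rows.
// Only one row per "field" carries the real value; the others carry false.
const minIp = bsonMin([false, false, "137.68.160.51"]);
const minFacility = bsonMin([26, false, false]);

// With false as the placeholder, a maximum would pick the placeholder.
const maxFacility = bsonMax([26, false, false]);

console.log(minIp, minFacility, maxFacility);
// prints: 137.68.160.51 26 false
```

This is also why the placeholder and the accumulator have to be chosen together: `false` pairs with $min here, whereas the first answer pairs `null` with $max.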

      The result is as follows:

      { "_id" : { "facility" : 26, "ip" : "137.68.151.104", "num" : 6 }, "count" : 1 }
      { "_id" : { "facility" : 26, "ip" : "137.68.160.184", "num" : 6 }, "count" : 1 }
      { "_id" : { "facility" : 26, "ip" : "137.68.160.51", "num" : 6 }, "count" : 2 }
      

      【Comments】:
