Krloypower

Counting Occurrences of the Same Object Property Across MongoDB Arrays

Requirements

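We need to track user behavior in the app and store it in MongoDB, with one collection per day. Each time the user performs an action, the data is saved locally in the app; the next time the user launches the app, the data is submitted to MongoDB and the local cache is deleted. As a result, a user-behavior document ends up fairly complex. For example, the documents stored in MongoDB look like this: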

/* 1 */
{
    "_id" : ObjectId("5b8c996e5f814eb3c37eb49b"),
    "userId" : 12323.0,
    "appPlatform" : "ios",
    "shortcutEntrance" : [ 
        {
            "entranceName" : "入口1",
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "entranceName" : "入口2",
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ],
    "classData" : [ 
        {
            "classId" : 123.0,
            "position" : "left",
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "classId" : 12356.0,
            "position" : "right",
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ],
    "channelVisit" : [ 
        {
            "channelType" : 2.0,
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "channelType" : 3.0,
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ]
}

/* 2 */
{
    "_id" : ObjectId("5b8e163df0fad825a708bc75"),
    "userId" : 12323.0,
    "appPlatform" : "ios",
    "shortcutEntrance" : [ 
        {
            "entranceName" : "入口1",
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "entranceName" : "入口2",
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ],
    "classData" : [ 
        {
            "classId" : 123.0,
            "position" : "left",
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "classId" : 12356.0,
            "position" : "right",
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ],
    "channelVisit" : [ 
        {
            "channelType" : 2.0,
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "channelType" : 3.0,
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ]
}

/* 3 */
{
    "_id" : ObjectId("5b8e1642f0fad825a708bc76"),
    "userId" : 12323.0,
    "appPlatform" : "ios",
    "shortcutEntrance" : [ 
        {
            "entranceName" : "入口1",
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "entranceName" : "入口2",
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ],
    "classData" : [ 
        {
            "classId" : 1234.0,
            "position" : "left",
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "classId" : 12356.0,
            "position" : "right",
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ],
    "channelVisit" : [ 
        {
            "channelType" : 2.0,
            "gmtCreate" : "2018-09-03 10:00:00"
        }, 
        {
            "channelType" : 3.0,
            "gmtCreate" : "2018-09-03 10:00:00"
        }
    ]
}

Now I need to count how many times each classData.classId value occurs across these documents.

Solution

  1. First, project only the fields we need
   {"$project":{"classData.classId":1,"classData.position":1}}

This yields the following data:

/* 1 */
{
    "_id" : ObjectId("5b8c996e5f814eb3c37eb49b"),
    "classData" : [ 
        {
            "classId" : 123.0,
            "position" : "left"
        }, 
        {
            "classId" : 12356.0,
            "position" : "right"
        }
    ]
}

/* 2 */
{
    "_id" : ObjectId("5b8e163df0fad825a708bc75"),
    "classData" : [ 
        {
            "classId" : 123.0,
            "position" : "left"
        }, 
        {
            "classId" : 12356.0,
            "position" : "right"
        }
    ]
}

/* 3 */
{
    "_id" : ObjectId("5b8e1642f0fad825a708bc76"),
    "classData" : [ 
        {
            "classId" : 1234.0,
            "position" : "left"
        }, 
        {
            "classId" : 12356.0,
            "position" : "right"
        }
    ]
}

_id is included by default (as if it were set to 1), so the stage above is equivalent to:

   {"$project":{"_id":1,"classData.classId":1,"classData.position":1}}
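
To make the effect of this stage concrete, here is a minimal plain-JavaScript sketch that mimics it in memory, without MongoDB. The helper name projectClassData is hypothetical, and it only covers this one projection, not $project in general.

```javascript
// Hypothetical helper: mimics
//   {"$project":{"classData.classId":1,"classData.position":1}}
// for in-memory documents. Simplification: a document without classData
// gets an empty array here.
function projectClassData(docs) {
  return docs.map((doc) => ({
    _id: doc._id, // _id is kept by default
    classData: (doc.classData || []).map((item) => ({
      classId: item.classId,
      position: item.position,
    })),
  }));
}

// One document shaped like the samples above (ObjectId shown as a string).
const sample = [
  {
    _id: "5b8c996e5f814eb3c37eb49b",
    userId: 12323,
    appPlatform: "ios",
    classData: [
      { classId: 123, position: "left", gmtCreate: "2018-09-03 10:00:00" },
      { classId: 12356, position: "right", gmtCreate: "2018-09-03 10:00:00" },
    ],
  },
];

const projected = projectClassData(sample);
console.log(JSON.stringify(projected, null, 2));
```

Only _id and the two selected classData fields survive; userId, appPlatform and gmtCreate are dropped.
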
  2. Unwind the classData array so that each element becomes its own document
 {"$unwind":"$classData"}

This yields the following data:

/* 1 */
{
    "_id" : ObjectId("5b8c996e5f814eb3c37eb49b"),
    "classData" : {
        "classId" : 123.0,
        "position" : "left"
    }
}

/* 2 */
{
    "_id" : ObjectId("5b8c996e5f814eb3c37eb49b"),
    "classData" : {
        "classId" : 12356.0,
        "position" : "right"
    }
}

/* 3 */
{
    "_id" : ObjectId("5b8e163df0fad825a708bc75"),
    "classData" : {
        "classId" : 123.0,
        "position" : "left"
    }
}

/* 4 */
{
    "_id" : ObjectId("5b8e163df0fad825a708bc75"),
    "classData" : {
        "classId" : 12356.0,
        "position" : "right"
    }
}

/* 5 */
{
    "_id" : ObjectId("5b8e1642f0fad825a708bc76"),
    "classData" : {
        "classId" : 1234.0,
        "position" : "left"
    }
}

/* 6 */
{
    "_id" : ObjectId("5b8e1642f0fad825a708bc76"),
    "classData" : {
        "classId" : 12356.0,
        "position" : "right"
    }
}
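
The unwinding step can be pictured with a small in-memory sketch in plain JavaScript. The helper unwindClassData is hypothetical and mirrors only the default behavior of $unwind, where documents whose array is missing or empty produce no output.

```javascript
// Hypothetical helper: mimics {"$unwind":"$classData"} on in-memory documents.
// Each array element yields one copy of the parent document in which
// classData is that single element.
function unwindClassData(docs) {
  return docs.flatMap((doc) =>
    (doc.classData || []).map((item) => ({ ...doc, classData: item }))
  );
}

// The three projected documents from the previous step (ObjectIds as strings).
const projectedDocs = [
  {
    _id: "5b8c996e5f814eb3c37eb49b",
    classData: [
      { classId: 123, position: "left" },
      { classId: 12356, position: "right" },
    ],
  },
  {
    _id: "5b8e163df0fad825a708bc75",
    classData: [
      { classId: 123, position: "left" },
      { classId: 12356, position: "right" },
    ],
  },
  {
    _id: "5b8e1642f0fad825a708bc76",
    classData: [
      { classId: 1234, position: "left" },
      { classId: 12356, position: "right" },
    ],
  },
];

const unwound = unwindClassData(projectedDocs);
console.log(unwound.length); // 6
```

Three documents with two array elements each become six documents, which matches the output listed above.
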

With the data in this shape, counting the occurrences of each classId is straightforward:

{"$group":{_id:"$classData.classId","classIdSum":{"$sum":1}}}

This yields the following data:

/* 1 */
{
    "_id" : 1234.0,
    "classIdSum" : 1.0
}

/* 2 */
{
    "_id" : 12356.0,
    "classIdSum" : 3.0
}

/* 3 */
{
    "_id" : 123.0,
    "classIdSum" : 2.0
}
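
The counting logic itself is just a tally per key. A minimal plain-JavaScript sketch, assuming the six unwound documents shown earlier (the helper countByClassId is hypothetical):

```javascript
// Hypothetical helper: mimics
//   {"$group":{_id:"$classData.classId","classIdSum":{"$sum":1}}}
// by adding 1 to a per-classId counter for every unwound document.
function countByClassId(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc.classData.classId;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return Array.from(counts, ([classId, classIdSum]) => ({
    _id: classId,
    classIdSum,
  }));
}

// The six unwound documents, reduced to the field that matters here.
const unwoundDocs = [
  { classData: { classId: 123 } },
  { classData: { classId: 12356 } },
  { classData: { classId: 123 } },
  { classData: { classId: 12356 } },
  { classData: { classId: 1234 } },
  { classData: { classId: 12356 } },
];

const grouped = countByClassId(unwoundDocs);
console.log(grouped);
```

The sketch returns groups in first-seen order, whereas MongoDB does not guarantee any particular order for $group output, so add a $sort stage if the order matters.
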

The complete command is:

db.getCollection('SHORTCUT_ENTRANCE_20180903').aggregate(
    [
        {"$project":{"classData.classId":1,"classData.position":1}},
        {"$unwind":"$classData"},
        {"$group":{_id:"$classData.classId","classIdSum":{"$sum":1}}}
    ])
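
Because the data is split into one collection per day, the collection name changes daily. The following sketch builds the name and the pipeline in JavaScript. Both helpers (dailyCollectionName, classIdCountPipeline) are hypothetical, and the SHORTCUT_ENTRANCE_ prefix with a yyyyMMdd suffix is an assumption inferred from the single collection name shown above.

```javascript
// Hypothetical helper: builds the per-day collection name.
// Assumes the pattern SHORTCUT_ENTRANCE_<yyyyMMdd>, using the local date.
function dailyCollectionName(date) {
  const yyyy = date.getFullYear();
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `SHORTCUT_ENTRANCE_${yyyy}${mm}${dd}`;
}

// Hypothetical helper: returns the same three stages as the command above.
function classIdCountPipeline() {
  return [
    { $project: { "classData.classId": 1, "classData.position": 1 } },
    { $unwind: "$classData" },
    { $group: { _id: "$classData.classId", classIdSum: { $sum: 1 } } },
  ];
}

// JavaScript months are 0-based, so 8 means September.
const collectionName = dailyCollectionName(new Date(2018, 8, 3));
console.log(collectionName); // SHORTCUT_ENTRANCE_20180903

// In the mongo shell this could then be run as:
//   db.getCollection(collectionName).aggregate(classIdCountPipeline())
```
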

Storing the statistics in a business table

With Spring, use mongoTemplate:

  • Configure the data source
 <!-- Data source: the URI format is mongodb://username:password@host:port/database -->
    <mongo:db-factory id="mongoDbFactory" uri="mongodb://${mongo.username}:${mongo.password}@${mongo.host}:${mongo.port}/${mongo.database}?maxPoolSize=2000"/>
    <bean id="mongoTemplate" class="org.springframework.data.mongodb.core.MongoTemplate">
        <constructor-arg name="mongoDbFactory" ref="mongoDbFactory"/>
    </bean>
  • Fetch the statistics from MongoDB and store them in the business table

The core code is as follows:

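import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import java.util.Iterator;

// collection: name of the daily collection, e.g. "SHORTCUT_ENTRANCE_20180903"
Aggregation agg = Aggregation.newAggregation(
    // The project(...) stage is left out here: it is optional for the count, and
    // Spring Data exposes nested fields projected this way under their short names
    // (classId, position), so a later unwind on classData may be rejected.
    Aggregation.unwind("classData"),
    Aggregation.group("classData.classId").count().as("classIdSum")
);

AggregationResults<BasicDBObject> objects = mongoTemplate.aggregate(agg, collection, BasicDBObject.class); // fetch the statistics
Iterator<BasicDBObject> iterator = objects.iterator();
while (iterator.hasNext()) {
    DBObject object = iterator.next(); // one group: _id is the classId, classIdSum is its count
    // ........ business logic: store into the business table
}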
