如何查询 mongo 集合以返回包含子文档计算值的虚拟字段的完整文档？答案

【问题标题】：How to query a mongo collection to return the full document with virtual fields containing calculated values from the sub-document?如何查询 mongo 集合以返回包含子文档计算值的虚拟字段的完整文档？
【发布时间】：2014-07-24 01:39:01
【问题描述】：

我正在尝试查询包含子文档的特定文档的集合。子文档包含我想要获取的值该子文档的最高和最低分数，并将结果作为虚拟字段返回到原始文档。

我有以下数据集：

{
"_id" : "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e",
"name" : "Addison Hunt",
"tests" : [
    {
        "name"  : "lorem",
        "score" : 79
    },
    {
        "name"  : "vallum",
        "score" : 100
    },
    {
        "name"  : "ipsum",
        "score" : 65
    }
],
"created_at" : 1401488865684,
"class" : "dolor sit amit",
"user_id" : "005G5635231325O4VIAU"

}

在 mongo 2.4 中，如何查询一次 mongo 以返回以下结果：

{
    "_id" : "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e",
    "name" : "Addison Hunt",
    "tests" : [
        {
            "name"  : "lorem",
            "score" : 79
        },
        {
            "name"  : "vallum",
            "score" : 100
        },
        {
            "name"  : "ipsum",
            "score" : 65
        }
    ],
    "created_at" : 1401488865684,
    "class" : "dolor sit amit",
    "user_id" : "005G5635231325O4VIAU",
    "worst_test": {
        "name"  : "ipsum",
        "score" : 65
    },
    "best_test": {
        "name"  : "vallum",
        "score" : 100
    }
}

其中“best_test”和“worst_test”是虚拟字段，分别代表得分最高和最低的测试。我尝试了许多不同的方法，我得到的最接近的是这个查询：

db.students.aggregate([
    { $match: { 
        '_id': 'd0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e'
    }},
    { $unwind: '$tests' },
    { $sort: {'tests.score': 1} },
    { $group: { 
        _id: '$_id',
        student_tests: {$push: "$$ROOT"},
        worst_test: {$first: '$tests'},
        best_test: { $last: '$tests' } 
    }}
]);

产生这个结果：

{
"_id" : "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e",
"student_tests" : [
    {
        "name" : "Addison Hunt",
        "tests" : [
            {
                "name"  : "ipsum",
                "score" : 65
            }
        ],
        "created_at" : 1401488865684,
        "class" : "dolor sit amit",
        "user_id" : "005G5635231325O4VIAU",
    },
    {
        "name" : "Addison Hunt",
        "tests" : [
            {
                "name"  : "lorem",
                "score" : 79
            }
        ],
        "created_at" : 1401488865684,
        "class" : "dolor sit amit",
        "user_id" : "005G5635231325O4VIAU",
    },
    {
        "name" : "Addison Hunt",
        "tests" : [
            {
                "name"  : "vallum",
                "score" : 100
            }
        ],
        "created_at" : 1401488865684,
        "class" : "dolor sit amit",
        "user_id" : "005G5635231325O4VIAU",
    },
],
"worst_test": {
    "name"  : "ipsum",
    "score" : 65
},
"best_test": {
    "name"  : "vallum",
    "score" : 100
}

}

【问题讨论】：

标签： mongodb mongodb-query aggregation-framework

【解决方案1】：

如果您使用的是$$ROOT，那么实际上您使用的是 MongoDB 2.6，因为这是仅在该版本中引入的聚合变量。

虽然对各种事情都很方便，但它所做的只是在使用的管道的当前阶段代表整个文档。要执行您想要的操作并返回未修改但带有其他字段的原始文档，您可以在$unwind 之前的$project 阶段使用它来分配给_id 字段，但实际上您没有与您在末尾仍需要 $project 的相同文档，以便从这些元素中获得正确的文档形状。

您最好只投影字段，但在应用任何 $sort 之前保留数组的未更改副本：

db.students.aggregate([
    { "$match": { 
        "_id": "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e"
    }},
    { "$project": {
         "name": 1,
         "tests": 1,
         "created_at": 1,
         "class": 1,
         "user_id": 1,
         "testCopy": "$tests" 
    }},
    { "$unwind": "$testCopy" },
    { "$sort": { "testCopy.score": 1 } },
    { "$group": { 
        "_id: "$_id",
        "tests": { "$first": "$tests" },
        "created_at": { "$first": "$created_at" },
        "class": { "$first": "$class" },
        "user_id": { "$first": "$user_id" },
        "worst_test": { "$first": "$testCopy" },
        "best_test": { "$last": "$testCopy" } 
    }}
]);

或者使用前面提到的$$ROOT，或者只是将_id下的字段单独放在$project中：

db.students.aggregate([
    { "$match": { 
        "_id": "d0e78492342f9f-f843ec7-4bd14g3h-bh34j3a9-02d6ah32k8e6b79e"
    }},
    { "$project": {
        "_id": "$$ROOT",
        "tests": 1
    }},
    { "$unwind": "$tests" },
    { "$sort": { "tests.score": 1 } },
    { "$group": { 
        "_id": "$_id",
        "aworst_test": { "$first": "$tests" },
        "abest_test": { "$last": "$tests" } 
    }},
    { "$project": {
        "_id": "$_id._id",
        "tests": "$_id.tests",
        "created_at": "$_id.created_at",
        "class": "$_id.class",
        "user_id": "$_id.user_id",
        "worst_test": "$aworst_test",
        "best_test": "$abest_test"
    }}
]);

但是如您所见，您仍在某处进行$project 工作以获得您想要的结构，以及“重命名字段”以维护您想要的字段顺序，否则$project 将“优化”和“保留”任何尚未重命名的字段，并在现有字段之后“追加”新字段。

确实没有简单的方法可以像您最初找到它们时那样“获取所有字段”。像 $project 和 $group 这样的操作是“全有或全无”的事情，它们只明确地产生你告诉他们的东西。

【讨论】：

关于$project 和$group 的“全有或全无”，如果您的架构在应用层已知，您可以将每个对象“渲染”到聚合管道中，而不是硬对属性进行编码。
@zamnuts 说得好，我很高兴你做到了。我已经无数次重复了这个概念。 MongoDB 查询和聚合查询只是数据结构。所以生成它们，这也比生成诸如 SQL 之类的声明性语言更容易。在任何答案中提及，您会立即获得我的个人 +1
@zamnuts 你能举例说明“将每个对象渲染到聚合管道而不是硬编码属性”是什么意思吗？
@user3871070 这意味着您的应用程序有一些“模式”概念，您实际上定义了某种形式的结构，可以自省以形成这样的查询。但这总是有“规则”的。这实际上回答了你的问题。如果您想知道如何自省某种“模式”，那么您真的需要问另一个问题。 StackOverflow 有一个一个仅问题模型，这是有充分理由的。
@user3871070 Neil Lunn 是正确的，形成一个新问题。但是，如果您在术语和想法方面需要一些帮助，Mongoose 是一个出色的对象建模器 (mongoosejs.com/docs/guide.html)。