Document schema - Performance vs. modification anomaly - tradeoff

【Question title】: Document schema - Performance vs Modification anomaly - tradeoff
【Posted】: 2018-01-28 10:36:08
【Question】:

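I am designing a document schema for the application below.

One approach is the MongoDB document design below, driven mainly by matching the application's data access patterns.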

> db.posts.find().pretty()
{
    "_id": ObjectId("5099f5eabcf1bf2d90ea41ad"), // post 1
    "author": "xyz",
    "body" : "This is a test body",
    "comments": [
            {
                "body": "this is a comment",
                "email": "alan@tech.com",
                "author": "Alan Donald"
            },
            {
                "body": "this is another comment\r\n",
                "email": "alan@tech.com",
                "author": "Alan Donald"

            }
            ],
    "date" : ISODate("2012-11-07T05:47:22.941Z"),
    "permalink": "This_is_a_test_Post",
    "tags":[
        "cycling",
        "mongodb",
        "swimming"  
     ],
    "title": "This is a test post"
}

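The schema above supports the following data access patterns:

1) Collect the most recent blog entries for the blog home page

2) Collect all the information needed to display a single post

3) Collect all comments by a single author

but not:

Provide a table of contents by tag

The other approach is a document schema that leans toward a relational design, which looks like: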

> db.posts.find().pretty()
{
    "_id": "Post1", // use ObjectId BSON type
    "title": "This is a test post",
    "body": "This is a test body",
    "date": ISODate("2012-11-07T05:47:22.941Z")
}

> db.comments.find().pretty()
{
    "_id": 3, // use ObjectId BSON type
    "post_id": "Post1",
    "author": "Alan Donald",
    "author_email": "alan@tech.com",
    "nth": 0,
    "body": "this is a comment"
}
{
    "_id": 4, // use ObjectId BSON type
    "post_id": "Post1",
    "author": "Alan Donald",
    "author_email": "alan@tech.com",
    "nth": 1,
    "body": "this is another comment\r\n"
}

> db.tags.find().pretty()
{
    "_id": 5, // use ObjectId BSON type
    "tag": "cycling",
    "post_id": "Post1"
}
{
    "_id": 6, // use ObjectId BSON type
    "tag": "mongodb",
    "post_id": "Post1"
}
{
    "_id": 7, // use ObjectId BSON type
    "tag": "swimming",
    "post_id": "Post1"
}

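Comparison:

1) MongoDB does not natively support joins between collections. So approach 1 looks better, because approach 2 needs multiple queries and the application has to join their results.

2) Approach 1 embeds (pre-joins) the comments in the post document, so it looks better: the data can stay consistent even though MongoDB lacks foreign key constraints.

3) MongoDB does not support transactions, but it does support atomic operations at the single-document level. So approach 1 looks better.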
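
To make point 1 concrete, here is a minimal sketch of the application-side join that approach 2 would need when no server-side join is used. The three arrays are in-memory stand-ins for the results of separate queries against posts, comments and tags, and `assemblePost` is a hypothetical helper name, not part of any MongoDB API.

```javascript
// In-memory stand-ins for the results of three separate queries,
// e.g. db.posts.find(...), db.comments.find(...), db.tags.find(...).
const posts = [
  { _id: "Post1", title: "This is a test post", body: "This is a test body" },
];
const comments = [
  { _id: 3, post_id: "Post1", author: "Alan Donald", nth: 0, body: "this is a comment" },
  { _id: 4, post_id: "Post1", author: "Alan Donald", nth: 1, body: "this is another comment" },
];
const tags = [
  { _id: 5, tag: "cycling", post_id: "Post1" },
  { _id: 6, tag: "mongodb", post_id: "Post1" },
  { _id: 7, tag: "swimming", post_id: "Post1" },
];

// Hypothetical helper: stitch the three result sets into one post view.
function assemblePost(postId) {
  const post = posts.find((p) => p._id === postId);
  if (!post) return null;
  return {
    ...post,
    // Keep comments in their original order using the "nth" field.
    comments: comments
      .filter((c) => c.post_id === postId)
      .sort((a, b) => a.nth - b.nth),
    tags: tags.filter((t) => t.post_id === postId).map((t) => t.tag),
  };
}

const view = assemblePost("Post1");
console.log(view.comments.length, view.tags);
```

With approach 1 the same view comes back from a single read of one document, which is the performance argument made above.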

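Posts and comments have a one-to-many relationship. The "many" side could be large or could be small.

Question:

With approach 1, each document (post) in the db.posts collection contains multiple comments that carry redundant data. This improves performance but is prone to modification anomalies. Is there a better approach to schema design?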
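
The modification anomaly can be illustrated with a small sketch. It uses a plain in-memory object shaped like the approach 1 document; `changeEmail` and the new address are hypothetical and only there for illustration.

```javascript
// One embedded post document, shaped like approach 1.
const post = {
  _id: "Post1",
  comments: [
    { body: "this is a comment", email: "alan@tech.com", author: "Alan Donald" },
    { body: "this is another comment", email: "alan@tech.com", author: "Alan Donald" },
  ],
};

// Hypothetical helper: rewrite every embedded copy of an email address.
// Returns how many copies had to be touched.
function changeEmail(doc, oldEmail, newEmail) {
  let touched = 0;
  for (const c of doc.comments) {
    if (c.email === oldEmail) {
      c.email = newEmail;
      touched += 1;
    }
  }
  return touched;
}

const touched = changeEmail(post, "alan@tech.com", "alan@example.com");
console.log(touched); // 2: the same fact was stored twice in this one post
```

If any copy is missed, in this post or in any other post the same author commented on, the data becomes inconsistent. In approach 2 the email still appears once per comment document, so removing the redundancy entirely would need a further step such as a separate authors collection, which the question does not include.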

【Question discussion】:

Tags: mongodb database-design database-schema normalization rdbms

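【Solution 1】:

In approach 1, the comments are stored in an array, and arrays are subject to limits in MongoDB. https://docs.mongodb.com/manual/reference/limits

Starting with MongoDB 3.2, you can use $lookup in the aggregation pipeline to perform joins.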
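
As a sketch of what such a join could look like for approach 2, the pipeline below matches one post and pulls in its comments. The collection and field names come from the question's second schema; the `"Post1"` id is just the sample value used there. The snippet only builds the pipeline object, and running it assumes a mongo shell connected to MongoDB 3.2 or later.

```javascript
// Aggregation pipeline: pick one post, then join its comments.
const pipeline = [
  { $match: { _id: "Post1" } },
  {
    $lookup: {
      from: "comments",        // collection to join with
      localField: "_id",       // field in posts
      foreignField: "post_id", // matching field in comments
      as: "comments",          // output array field on each post
    },
  },
];

// In the mongo shell this would be run as:
//   db.posts.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Each resulting post document carries a `comments` array, much like the embedded shape of approach 1, but it is assembled at query time instead of being stored that way.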

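【Discussion】:

When you point to $lookup for joins, are you recommending the second approach?
Yes, because in the first approach there is a problem if your array size exceeds the limit.
Do you mean the 16 MB document size limit?
Yes. If you are sure your application cannot exceed the limit, then approach 1 is fine. In approach 2, there is no need to put the tags in a separate collection.
But I learned here that, when designing a database schema for an application with MongoDB, the single most important factor is matching the application's data access patterns. Aren't we breaking that with approach 2?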