【问题标题】:mongodb service crashes after saying serverStatus was very slowmongodb 服务在说 serverStatus 非常慢后崩溃
【发布时间】:2017-02-20 12:09:02
【问题描述】:

我在 EC2 上有一个 mongodb 服务。它会在一段时间后自动崩溃。 当我执行systemctl status mongodb 时,它给了我以下输出:

● mongodb.service - 高性能、无模式、面向文档 数据库加载:加载(/etc/systemd/system/mongodb.service; 启用;供应商预设:启用)活动:失败(结果:信号) 自世界标准时间 2017 年 2 月 17 日星期五 02:00:00 起; 3 小时 37 分钟前 主 PID:1152 (代码=杀死,信号=杀死)

2 月 17 日 02:00:00 ip-172-31-29-240 systemd[1]: mongodb.service: Main 进程退出,code=killed,status=9/KILL Feb 17 02:00:00 ip-172-31-29-240 systemd[1]: mongodb.service: 单位输入失败 状态。 2 月 17 日 02:00:00 ip-172-31-29-240 systemd[1]: mongodb.service: 结果“信号”失败。警告:日志已被轮换 单位已启动。日志输出不完整或不可用。

当我检查/var/log/mongodb/mongod.log 文件时,我得到了这个:

2017-02-20T11:11:05.624+0000 I CONTROL  [main] ***** SERVER RESTARTED *****
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten] MongoDB starting : pid=29177 port=27017 dbpath=/var/lib/mongodb 64-bit host=ip-172-31-29-240
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten] db version v3.2.11
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten] git version: 009580ad490190ba33d1c6253ebd8d91808923e4
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten] modules: none
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten] build environment:
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten]     distmod: ubuntu1604
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten]     distarch: x86_64
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2017-02-20T11:11:05.638+0000 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "127.0.0.1", port: 27017 }, storage: { dbPath: "/var/lib/mongodb", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log", quiet: true } }
2017-02-20T11:11:05.665+0000 I -        [initandlisten] Detected data files in /var/lib/mongodb created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-02-20T11:11:05.665+0000 W -        [initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty.
2017-02-20T11:11:05.665+0000 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2017-02-20T11:11:05.665+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-02-20T11:11:06.246+0000 I CONTROL  [initandlisten] 
2017-02-20T11:11:06.246+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2017-02-20T11:11:06.246+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-02-20T11:11:06.246+0000 I CONTROL  [initandlisten] 
2017-02-20T11:11:06.246+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2017-02-20T11:11:06.246+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-02-20T11:11:06.246+0000 I CONTROL  [initandlisten] 
2017-02-20T11:11:06.280+0000 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/var/lib/mongodb/diagnostic.data'
2017-02-20T11:11:06.280+0000 I NETWORK  [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
2017-02-20T11:11:06.283+0000 I NETWORK  [initandlisten] waiting for connections on port 27017
2017-02-20T11:11:28.189+0000 I COMMAND  [conn3] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { myFlag: true, timestamp: { $lt: new Date(1485907200000) } } }, { $group: { _id: null, startDate: { $min: "$timestamp" }, previousCount: { $sum: 1 } } }, { $project: { _id: 0, startDate: 1, previousCount: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:236 reslen:114 locks:{ Global: { acquireCount: { r: 478 } }, Database: { acquireCount: { r: 239 } }, Collection: { acquireCount: { r: 239 } } } protocol:op_query 2605ms
2017-02-20T11:11:28.192+0000 I COMMAND  [conn8] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { myFlag: true, timestamp: { $lt: new Date(1485907200000) } } }, { $group: { _id: null, startDate: { $min: "$timestamp" }, previousCount: { $sum: 1 } } }, { $project: { _id: 0, startDate: 1, previousCount: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:244 reslen:114 locks:{ Global: { acquireCount: { r: 494 } }, Database: { acquireCount: { r: 247 } }, Collection: { acquireCount: { r: 247 } } } protocol:op_query 2764ms
2017-02-20T11:11:28.235+0000 I COMMAND  [conn5] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { timestamp: { $gte: new Date(1484910685417) } } }, { $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } }, count: { $sum: 1 } } }, { $sort: { _id: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:244 reslen:538 locks:{ Global: { acquireCount: { r: 494 } }, Database: { acquireCount: { r: 247 } }, Collection: { acquireCount: { r: 247 } } } protocol:op_query 2817ms
2017-02-20T11:11:28.303+0000 I COMMAND  [conn8] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { timestamp: { $gte: new Date(1484910685581) } } }, { $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } }, count: { $sum: 1 } } }, { $sort: { _id: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:144 reslen:538 locks:{ Global: { acquireCount: { r: 294 } }, Database: { acquireCount: { r: 147 } }, Collection: { acquireCount: { r: 147 } } } protocol:op_query 110ms
2017-02-20T11:11:28.438+0000 I COMMAND  [conn3] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { timestamp: { $gte: new Date(1484910688306) } } }, { $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } }, count: { $sum: 1 } } }, { $sort: { _id: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:144 reslen:538 locks:{ Global: { acquireCount: { r: 294 } }, Database: { acquireCount: { r: 147 } }, Collection: { acquireCount: { r: 147 } } } protocol:op_query 129ms
2017-02-20T11:11:28.444+0000 I COMMAND  [conn5] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { timestamp: { $gte: new Date(1484910688272) } } }, { $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } }, count: { $sum: 1 } } }, { $sort: { _id: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:144 reslen:538 locks:{ Global: { acquireCount: { r: 294 } }, Database: { acquireCount: { r: 147 } }, Collection: { acquireCount: { r: 147 } } } protocol:op_query 154ms
2017-02-20T11:12:15.504+0000 I COMMAND  [conn1] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { timestamp: { $gte: new Date(1484910735374) } } }, { $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } }, count: { $sum: 1 } } }, { $sort: { _id: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:143 reslen:538 locks:{ Global: { acquireCount: { r: 292 } }, Database: { acquireCount: { r: 146 } }, Collection: { acquireCount: { r: 146 } } } protocol:op_query 128ms
2017-02-20T11:12:15.582+0000 I COMMAND  [conn5] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { timestamp: { $gte: new Date(1484910735439) } } }, { $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } }, count: { $sum: 1 } } }, { $sort: { _id: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:145 reslen:538 locks:{ Global: { acquireCount: { r: 296 } }, Database: { acquireCount: { r: 148 } }, Collection: { acquireCount: { r: 148 } } } protocol:op_query 140ms
2017-02-20T11:12:15.588+0000 I COMMAND  [conn8] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { timestamp: { $gte: new Date(1484910735428) } } }, { $group: { _id: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } }, count: { $sum: 1 } } }, { $sort: { _id: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:144 reslen:538 locks:{ Global: { acquireCount: { r: 294 } }, Database: { acquireCount: { r: 147 } }, Collection: { acquireCount: { r: 147 } } } protocol:op_query 155ms
2017-02-20T11:12:24.765+0000 I COMMAND  [conn3] query myDataBase.myCollection query: { $query: {}, orderby: { timestamp: -1 } } planSummary: COLLSCAN, COLLSCAN cursorid:47222766812 ntoreturn:200 ntoskip:0 keysExamined:0 docsExamined:18423 hasSortStage:1 keyUpdates:0 writeConflicts:0 numYields:145 nreturned:37 reslen:4578802 locks:{ Global: { acquireCount: { r: 292 } }, Database: { acquireCount: { r: 146 } }, Collection: { acquireCount: { r: 146 } } } 177ms
2017-02-20T11:12:25.268+0000 I COMMAND  [conn3] getmore myDataBase.myCollection planSummary: COLLSCAN, COLLSCAN cursorid:47222766812 ntoreturn:200 keysExamined:0 docsExamined:18423 keyUpdates:0 writeConflicts:0 numYields:151 nreturned:200 reslen:2664936 locks:{ Global: { acquireCount: { r: 304 } }, Database: { acquireCount: { r: 152 } }, Collection: { acquireCount: { r: 152 } } } 321ms
2017-02-20T11:12:25.518+0000 I COMMAND  [conn3] command myDataBase.myCollection command: aggregate { aggregate: "myCollection", pipeline: [ { $match: { myFlag: true, timestamp: { $lt: new Date(1485907200000) } } }, { $group: { _id: null, startDate: { $min: "$timestamp" }, previousCount: { $sum: 1 } } }, { $project: { _id: 0, startDate: 1, previousCount: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:144 reslen:114 locks:{ Global: { acquireCount: { r: 294 } }, Database: { acquireCount: { r: 147 } }, Collection: { acquireCount: { r: 147 } } } protocol:op_query 173ms
2017-02-20T11:12:25.691+0000 I COMMAND  [conn3] command myDataBase.compressionstats command: aggregate { aggregate: "compressionstats", pipeline: [ { $group: { _id: null, originalDataSum: { $sum: "$originalDataSize" }, compressedDataSum: { $sum: "$compressedDataSize" } } } ] } keyUpdates:0 writeConflicts:0 numYields:2 reslen:125 locks:{ Global: { acquireCount: { r: 10 } }, Database: { acquireCount: { r: 5 } }, Collection: { acquireCount: { r: 5 } } } protocol:op_query 127ms
2017-02-20T11:12:26.525+0000 I WRITE    [conn3] update myDataBase.sessions query: { _id: "doOtZATLM5R61__wpN-PmFH_80BMIpFH" } update: { _id: "doOtZATLM5R61__wpN-PmFH_80BMIpFH", session: "{"cookie":{"originalMaxAge":null,"expires":null,"httpOnly":true,"path":"/"},"passport":{"user":"587cc6630bd9ff5c39e54e94"}}", expires: new Date(1488798745083) } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keyUpdates:1 writeConflicts:0 numYields:1 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 2 } } } 601ms
2017-02-20T11:12:26.532+0000 I COMMAND  [conn2] command secondDB.secondCollection command: getMore { getMore: 16340341641, collection: "secondCollection", batchSize: 1000 } planSummary: COLLSCAN cursorid:16340341641 keysExamined:0 docsExamined:0 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:0 reslen:103 locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 } }, Collection: { acquireCount: { r: 2 } } } protocol:op_query 1372ms
2017-02-20T11:12:26.570+0000 I COMMAND  [conn3] command myDataBase.$cmd command: update { update: "sessions", writeConcern: { w: 1 }, ordered: true, updates: [ { q: { _id: "doOtZATLM5R61__wpN-PmFH_80BMIpFH" }, u: { _id: "doOtZATLM5R61__wpN-PmFH_80BMIpFH", session: "{"cookie":{"originalMaxAge":null,"expires":null,"httpOnly":true,"path":"/"},"passport":{"user":"587cc6630bd9ff5c39e54e94"}}", expires: new Date(1488798745083) }, multi: false, upsert: true } ] } keyUpdates:0 writeConflicts:0 numYields:0 reslen:55 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 2 } } } protocol:op_query 702ms
2017-02-20T11:12:26.905+0000 I COMMAND  [conn3] query myDataBase.users query: { _id: ObjectId('587cc6630bd9ff5c39e54e94') } planSummary: IDHACK ntoskip:0 keysExamined:1 docsExamined:1 idhack:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:1 nreturned:1 reslen:524 locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 } }, Collection: { acquireCount: { r: 2 } } } 204ms
2017-02-20T11:12:27.126+0000 I COMMAND  [conn3] killcursors myDataBase.myCollection keyUpdates:0 writeConflicts:0 numYields:0 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 86ms
2017-02-20T11:12:33.002+0000 I COMMAND  [ftdc] serverStatus was very slow: { after basic: 120, after asserts: 170, after connections: 370, after extra_info: 580, after globalLock: 690, after locks: 990, after network: 1070, after opcounters: 1170, after opcountersRepl: 1240, after storageEngine: 1680, after tcmalloc: 1910, after wiredTiger: 2390, at end: 2820 }

注意:服务状态输出和日志来自不同的日子,但是问题是一样的,我只是这次没有复制服务状态输出。

【问题讨论】:

  • 我也有同样的问题
  • 尝试增加内存,这就是我发布问题后立即做的事情。没有再次遇到问题。
  • 这是您的存储空间。我的意思是内存
  • 这是一个很大的数字。您是否也在上面运行应用程序或只是 mongodb?因为您的应用程序也会消耗内存来处理请求。插入不会占用太多内存,但其他查询会。如果您只是将它用于 mongodb 而没有其他用途,我建议您使用 mongodb atlas
  • 您可以将应用程序和 mongodb 保留在同一台服务器上,这样可以节省一些时间,但是您肯定需要更好的配置。多少,这不能仅用数据插入来说明,如果有其他 API 可以读取大量具有差异条件的数据,它将需要 RAM。你将不得不与试验和错误有关。发布一个详细的单独问题会有所帮助。

标签: node.js linux mongodb ubuntu mongoose


【解决方案1】:

我遇到了同样的问题。增加内存后错误消失。

【讨论】:

  • 我也有同样的问题
  • 这就是我发布问题后一天所做的,但忘记在这里更新。感谢您发布此内容,我会将其标记为答案:)
  • 我完全看不出为什么这个答案被否决了。我会赞成它以消除反对票。好博伊格德。
  • @Gerd我想知道你有没有用vmstat之类的工具监控内存使用情况,是不是明显显示内存快用完了?
  • 什么意思? RAM 大小或 WiredTigerCache 大小?
【解决方案2】:

问题的根本原因可能是在 mongo 上运行了一些缓慢的查询。

所以要找出慢查询,请按照以下步骤操作:

  1. 登录 mongo 控制台。

  2. 使用你的数据库。

  3. 使用以下命令查找速度较慢的查询或操作:

    db.currentOp({"secs_running": {$gte: 3}})

    因此,这将给出执行时间超过 3 秒的查询结果。输出可能是这样的:

    rs:SECONDARY> db.currentOp({"secs_running": {$gte: 3}});
    {
    "inprog" : [
        {
            "desc" : "tttt",
            "threadId" : "140558301292288",
            "connectionId" : 307551,
            "client" : "101.01.01.11:49128",
            "appName" : "MongoDB Shell",
            "clientMetadata" : {
                "application" : {
                    "name" : "MongoDB Shell"
                },
                "driver" : {
                    "name" : "MongoDB Internal Client",
                    "version" : "3.0.5-18-g7e327a9"
                },
                "os" : {
                    "type" : "Darwin",
                    "name" : "Mac OS X",
                    "architecture" : "x86_64",
                    "version" : "19.3.0"
                }
            },
            "active" : true,
            "opid" : 912509900,
            "secs_running" : 7711,
            "op" : "query",
            "ns" : "mydb.mytable",
            "query" : {
                "find" : "mytable",
                "filter" : {
                    "name" : {
                        "$type" : 3
                    }
                }
                ]
            },
            "planSummary" : "COLLSCAN",
            "numYields" : 333308,
            "locks" : {
                "Global" : "r",
                "Database" : "r",
                "Collection" : "r"
            },
            "waitingForLock" : false,
            "lockStats" : {
                "Global" : {
                    "acquireCount" : {
                        "r" : NumberLong(666618)
                    },
                    "acquireWaitCount" : {
                        "r" : NumberLong(254219)
                    },
                    "timeAcquiringMicros" : {
                        "r" : NumberLong("3454835364")
                    }
                },
                "Database" : {
                    "acquireCount" : {
                        "r" : NumberLong(333309)
                    }
                },
                "Collection" : {
                    "acquireCount" : {
                        "r" : NumberLong(333309)
                    }
                }
            }
        }
    ],
    "ok" : 1
    

    }

  4. 所以现在你可以通过从上面的列表中选择“opid”来终止这些需要很长时间的操作,如下所示:

    db.killOp(912509900)
    
  5. 完全希望整个系统负载能够下降。而且您不会再遇到“serverStatus was very slow”错误。

  6. 你的系统也不应该崩溃

【讨论】:

    【解决方案3】:

    我们面临的问题是,我们的 storage.dbpath 值被设置为存在于已安装的驱动器上,而该驱动器没有在 production operations checklist 中推荐的名为 noatime 的标志安装。

    【讨论】:

      猜你喜欢
      • 2023-01-18
      • 2012-11-21
      • 2014-12-05
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2018-01-09
      相关资源
      最近更新 更多