【发布时间】:2019-01-16 07:35:16
【问题描述】:
我在树莓派 (Linux) 上运行 python 程序,它将数据记录到 mongodb(使用 pymongo 模块)。我无法理解 mongodb 服务何时停止运行或它为什么会停止。
现在,我已经设置了我的程序功能,如果它们无法访问 mongodb(获取 pymongo 连接异常),它们会尝试重新启动服务,等待十秒钟然后重新尝试操作。这些函数是这样递归的:
def get_database_collection():
try:
# code to get document
return document
except Exception:
# code to log exception in my log files
start_mongo_service()
get_database_collection()
这就是 start_mongo_service() 函数的样子:
def start_mongo_service():
try:
subprocess.call(["sudo", "service", "mongodb", "start"])
time.sleep(10)
return True
except Exception:
# code to log exception in my log files (Could not start_mongo_service)
database_logger = logging.getLogger('database_thread')
database_logger.exception("Could not start_mongo_service")
time.sleep(10)
return False
现在,我知道捕获所有异常是不好的做法,但我这样做是因为我不希望我的代码崩溃,并且我会记录发生的任何异常,以便检查代码的行为。
所以昨天我的程序崩溃了,控制台上给出的错误是:MaximumRecursionDepth 超出,我假设这意味着它循环了 1000 次,但仍然无法逃脱它的异常。这是程序日志的样子:
2019-01-15 18:12:50,000 - ERROR - database_thread - Could not start_mongo_service
Traceback (most recent call last):
File "gateway-embedded-code/database.py", line 89, in update_status_collection
AttributeError: 'NoneType' object has no attribute 'update'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pymongo/mongo_client.py", line 266, in __init__
File "pymongo/mongo_client.py", line 641, in __find_node
pymongo.errors.AutoReconnect: could not connect to localhost:27017: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "gateway-embedded-code/database.py", line 35, in get_database_collection
File "pymongo/mongo_client.py", line 269, in __init__
pymongo.errors.ConnectionFailure: could not connect to localhost:27017: [Errno 110] Connection timed out
上面的pymongo异常发生了很多次,我把它们缩短了在这里发布。然后我尝试在 /var/log/mongodb 中查看 mongodb 自己的日志,最后一个条目是在 1 月 15 日 17:51:54!在那之后,那天的日志中没有任何内容......我想服务停止了,我的程序也无法重新启动它,所以它在 18:12:50 崩溃了......
Tue Jan 15 17:51:54.391 [conn11125] end connection 127.0.0.1:53052 (1 connection now open)
Tue Jan 15 17:51:54.393 [initandlisten] connection accepted from 127.0.0.1:53054 #11126 (2 connections now open)
Tue Jan 15 17:51:54.408 [conn11126] end connection 127.0.0.1:53054 (1 connection now open)
Tue Jan 15 17:51:54.410 [initandlisten] connection accepted from 127.0.0.1:53056 #11127 (2 connections now open)
Wed Jan 16 04:33:04.994 [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
Wed Jan 16 04:33:04.994 [signalProcessingThread] now exiting
Wed Jan 16 04:33:04.994 dbexit:
Wed Jan 16 04:33:04.994 [signalProcessingThread] shutdown: going to close listening sockets...
Wed Jan 16 04:33:04.994 [signalProcessingThread] closing listening socket: 9
Wed Jan 16 04:33:04.994 [signalProcessingThread] closing listening socket: 10
Wed Jan 16 04:33:04.994 [signalProcessingThread] closing listening socket: 11
Wed Jan 16 04:33:04.994 [signalProcessingThread] removing socket file: /tmp/mongodb-27017.sock
Wed Jan 16 04:33:04.994 [signalProcessingThread] shutdown: going to flush diaglog...
Wed Jan 16 04:33:04.994 [signalProcessingThread] shutdown: going to close sockets...
Wed Jan 16 04:33:04.994 [signalProcessingThread] shutdown: waiting for fs preallocator...
Wed Jan 16 04:33:04.994 [signalProcessingThread] shutdown: lock for final commit...
Wed Jan 16 04:33:04.994 [signalProcessingThread] shutdown: final commit...
Wed Jan 16 04:33:04.995 [signalProcessingThread] shutdown: closing all files...
Wed Jan 16 04:33:04.995 [conn11127] end connection 127.0.0.1:53056 (1 connection now open)
Wed Jan 16 04:33:04.997 [signalProcessingThread] closeAllFiles() finished
Wed Jan 16 04:33:04.997 [signalProcessingThread] journalCleanup...
Wed Jan 16 04:33:04.997 [signalProcessingThread] removeJournalFiles
Wed Jan 16 04:33:05.053 [conn4] end connection 127.0.0.1:56470 (0 connections now open)
Wed Jan 16 04:33:05.223 [signalProcessingThread] shutdown: removing fs lock...
Wed Jan 16 04:33:05.224 dbexit: really exiting now
***** SERVER RESTARTED *****
# Everything works fine from this point onwards
今天是 1 月 16 日,并且在我重新启动树莓派时记录了 1 月 16 日的所有消息,程序现在完全正常运行...但是当我让它继续运行并在第二天检查时,这个问题仍然存在。
我的问题是,为什么会发生这种情况? mongo服务什么时候停止?为什么我不能用我的功能重新启动它?谁能解释从日志中可以推断出什么?意外的电源断开是否会导致 mongodb 服务在启动时无法运行?请帮助我解决可能发生的问题以及如何处理它,我不希望我的程序崩溃!
抱歉,帖子太长了,我可以提供您需要的更多详细信息。
感谢您的阅读。
编辑:只是想澄清 AttributeError: Nonetype 的来源。请记住,我所说的集合只是指数据库中的一个文档。 我有一个名为 update_status_collection() 的函数:
def update_status_collection(the_update):
try:
document = get_database_collection(collection_name='status_collection')
document.update(the_update)
except Exception:
database_logger = logging.getLogger('database_thread')
database_logger.exception('Could not update_status_collection')
start_mongo_service()
update_status_collection(the_update)
现在由于某种原因,get_database_collection 函数将 Nonetype 返回给文档变量,这就是引发 AttributeError 异常的地方,因为您无法更新 Nonetype。虽然我很好奇当 get_database_collection() 也是递归时如何返回 Nonetype ......它必须在达到 MaximumRecursionDepth 后返回 None 对吧?这是我还没有研究的东西。
更新:好的,我正在搜索系统日志以查找可疑内容,我想我找到了 linux 停止 mongo 的点,以下是 1 月 15 日的系统日志 (/var/log/syslog):
Jan 15 12:12:07 raspberrypi systemd[1]: Stopping An object/document-oriented database...
Jan 15 12:12:08 raspberrypi systemd[1]: Stopped An object/document-oriented database.
Jan 15 12:12:08 raspberrypi rc.local[463]: [967] Failed to execute script __main__
Jan 15 12:12:11 raspberrypi systemd[1]: Started An object/document-oriented database.
Jan 15 12:12:11 raspberrypi mongod[2336]: all output going to: /var/log/mongodb/mongodb.log
Jan 15 12:14:22 raspberrypi systemd[1]: Stopped target Timers.
Jan 15 12:14:22 raspberrypi systemd[1]: Stopped Daily apt upgrade and clean activities.
Jan 15 12:14:22 raspberrypi systemd[1]: Stopped target Bluetooth.
Jan 15 12:14:22 raspberrypi systemd[1]: Stopped Daily apt download activities.
Jan 15 12:14:22 raspberrypi systemd[1]: Stopping User Manager for UID 1000...
Jan 15 12:14:22 raspberrypi systemd[1]: Stopped target System Time Synchronized.
Jan 15 12:14:22 raspberrypi vncserver-x11-serviced[453]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Jan 15 12:14:22 raspberrypi vncserver-x11-serviced[453]: after 14426 requests (14426 known processed) with 0 events remaining.
Jan 15 12:14:22 raspberrypi bluetoothd[524]: Terminating
Jan 15 12:14:22 raspberrypi systemd[1]: Stopping Bluetooth service...
Jan 15 12:14:22 raspberrypi watchdog[562]: stopping daemon (5.15)
Jan 15 12:14:22 raspberrypi systemd[1]: Stopping Disk Manager...
Jan 15 12:14:22 raspberrypi udisksd[883]: udisks daemon version 2.1.8 exiting
Jan 15 12:14:22 raspberrypi systemd[1]: Stopped target Sound Card.
Jan 15 12:14:22 raspberrypi bluetoothd[524]: Stopping SDP server
Jan 15 12:14:22 raspberrypi systemd[1]: Closed Load/Save RF Kill Switch Status /dev/rfkill Watch.
Jan 15 12:14:22 raspberrypi bluetoothd[524]: Exit
Jan 15 12:14:22 raspberrypi systemd[1]: Stopping watchdog daemon...
Jan 15 12:14:22 raspberrypi systemd[1]: Stopping Save/Restore Sound Card State...
Jan 15 12:14:22 raspberrypi systemd[1]: Unmounting RPC Pipe File System...
Jan 15 12:14:22 raspberrypi systemd[1]: Stopping Authorization Manager...
Jan 15 12:14:22 raspberrypi systemd[1]: Stopped Daily Cleanup of Temporary Directories.
Jan 15 12:14:22 raspberrypi systemd[1]: Stopping Session c1 of user pi.
Jan 15 12:14:22 raspberrypi systemd[1]: Stopped Getty on tty1.
Jan 15 12:14:22 raspberrypi vncserver-x11-serviced[453]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Jan 15 12:14:28 raspberrypi systemd-modules-load[111]: Inserted module 'i2c_dev'
Jan 15 12:14:28 raspberrypi systemd[1]: Started Apply Kernel Variables.
Jan 15 12:14:28 raspberrypi fake-hwclock[112]: Tue 15 Jan 12:14:24 UTC 2019
Jan 15 12:14:28 raspberrypi systemd[1]: Time has been changed
Jan 15 12:14:28 raspberrypi systemd[1]: Started Restore / save the current clock.
Jan 15 12:14:28 raspberrypi systemd-fsck[113]: e2fsck 1.43.4 (31-Jan-2017)
Jan 15 12:14:28 raspberrypi systemd[1]: Started Create Static Device Nodes in /dev.
Jan 15 12:14:28 raspberrypi systemd[1]: Starting udev Kernel Device Manager...
Jan 15 12:14:28 raspberrypi systemd-fsck[113]: /dev/mmcblk0p2: clean, 137720/939744 files, 1384356/3809792 blocks
Jan 15 12:14:28 raspberrypi systemd[1]: Started File System Check on Root Device.
Jan 15 12:14:28 raspberrypi systemd[1]: Starting Remount Root and Kernel File Systems...
Jan 15 12:14:28 raspberrypi systemd[1]: Started Remount Root and Kernel File Systems.
Jan 15 12:14:28 raspberrypi systemd[1]: Starting Load/Save Random Seed...
Jan 15 12:14:28 raspberrypi systemd[1]: Starting udev Coldplug all Devices...
Jan 15 12:14:28 raspberrypi systemd[1]: Starting Flush Journal to Persistent Storage...
Jan 15 12:14:28 raspberrypi systemd[1]: Started Load/Save Random Seed.
Jan 15 12:14:28 raspberrypi systemd[1]: Started Flush Journal to Persistent Storage.
Jan 15 12:14:28 raspberrypi systemd[1]: Started Set the console keyboard layout.
Jan 15 12:14:28 raspberrypi systemd[1]: Reached target Local File Systems (Pre).
Jan 15 12:14:28 raspberrypi systemd[1]: Started udev Kernel Device Manager.
Jan 15 12:14:28 raspberrypi systemd[1]: Started udev Coldplug all Devices.
Jan 15 12:14:28 raspberrypi systemd[1]: Starting Show Plymouth Boot Screen...
Jan 15 12:14:28 raspberrypi systemd[1]: Found device /dev/serial1.
Jan 15 12:14:28 raspberrypi systemd[1]: Started Show Plymouth Boot Screen.
Jan 15 12:14:28 raspberrypi systemd[1]: Reached target Encrypted Volumes.
Jan 15 12:14:28 raspberrypi systemd[1]: Reached target Paths.
Jan 15 12:14:28 raspberrypi systemd[1]: Started Forward Password Requests to Plymouth Directory Watch.
Jan 15 12:14:28 raspberrypi mtp-probe: checking bus 1, device 3: "/sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.1"
Jan 15 12:14:28 raspberrypi mtp-probe: checking bus 1, device 4: "/sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.2"
Jan 15 12:14:28 raspberrypi mtp-probe: checking bus 1, device 5: "/sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.4"
Jan 15 12:14:28 raspberrypi mtp-probe: bus: 1, device: 4 was not an MTP device
Jan 15 12:14:28 raspberrypi mtp-probe: bus: 1, device: 5 was not an MTP device
Jan 15 12:14:28 raspberrypi mtp-probe: bus: 1, device: 3 was not an MTP device
Jan 15 12:14:28 raspberrypi systemd[1]: Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
Jan 15 12:14:28 raspberrypi systemd[1]: Reached target Sound Card.
Jan 15 12:14:28 raspberrypi systemd[1]: Found device /dev/disk/by-partuuid/f143b93d-01.
Jan 15 12:14:28 raspberrypi systemd[1]: Starting File System Check on /dev/disk/by-partuuid/f143b93d-01...
前两行停止数据库,第三行“Failed to execute script main”来自我的程序!但随后它立即重新启动数据库......有人能理解发生了什么吗?好像有很多服务被停止然后重新启动...
【问题讨论】:
-
subprocess.call(["sudo", "service", "mongodb", "start"]) -> 这是正确的命令吗?在我的机器上,我执行“sudo service mongod start”(不是 mongodb)
-
其实,你可以在终端中运行启动命令并检查输出吗?
-
@Neil 嘿,我已经在终端上使用了该命令并验证了它是否有效,我也在 stackoverflow 上找到了它,我会尝试找到该线程。
-
好的。你知道这个文件是什么 ""gateway-embedded-code/database.py" 吗?你能更新你的帖子来显示你是如何记录这个异常的吗?
-
你检查过/var/log/messages吗?我的第一个想法可能是 OOM 杀手,但它看起来像 SIGTERM 而不是 SIGKILL ......但服务不会任意获取 SIGTERM,所以我希望系统日志能有一些解释。
标签: python database mongodb raspberry-pi pymongo