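【Posted】: 2016-08-12 09:47:27
【Question】:
Stack: Python 3.4, PostgreSQL 9.4.7, peewee 2.8.0, psycopg2 2.6.1 (dt dec pq3 ext lo64)
I need to be able to talk to a PostgreSQL database (select, insert, update) from within each worker. I am using Python's multiprocessing Pool to create 10 workers; each one makes a curl call and then talks to the database depending on what it finds.
After reading a few posts online, I thought connection pooling was the way to go, so I put the code below in my models.py file. I have my doubts about connection pooling, though, because my understanding is that reusing database connections across threads is a no-go.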
from playhouse.pool import PooledPostgresqlExtDatabase

# cfg is the application's config mapping, defined elsewhere.
db = PooledPostgresqlExtDatabase(
    'uc',
    max_connections=32,
    stale_timeout=300,  # 5 minutes.
    register_hstore=False,
    host=cfg['psql']['host'],
    port=cfg['psql']['port'],
    user=cfg['psql']['user'],
    password=cfg['psql']['pass'])
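Now to the question. I am getting random SQL errors when some of the workers talk to the database. Before I introduced peewee into the mix, I was using the bare psycopg2 library with no wrapper, and I created a new database connection for each worker. There were no errors.
A sample error I get is: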
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3468, in commit
    self.get_conn().commit()
psycopg2.DatabaseError: error with no message from the libpq

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/dan/dev/link-checker/crawler/manager.py", line 17, in startWorker
    wrk.perform()
  File "/home/dan/dev/link-checker/crawler/worker.py", line 49, in perform
    self.pullUrls()
  File "/home/dan/dev/link-checker/crawler/worker.py", line 63, in pullUrls
    newUrlDict = UrlManager.createUrlWithInProgress(self._url['crawl'], source_url, self._url['base'])
  File "/home/dan/dev/link-checker/crawler/models.py", line 152, in createUrlWithInProgress
    newUrl = Url.create(**newUrlDict)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 4494, in create
    inst.save(force_insert=True)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 4680, in save
    pk_from_cursor = self.insert(**field_dict).execute()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3213, in execute
    cursor = self._execute()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 2628, in _execute
    return self.database.execute_sql(sql, params, self.require_commit)
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3285, in __exit__
    reraise(new_type, new_type(*exc_args), traceback)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 127, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3468, in commit
    self.get_conn().commit()
peewee.DatabaseError: error with no message from the libpq
I also tailed the PostgreSQL log file, and this is what I saw:
2016-04-19 20:34:23 EDT [26824-3] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-4] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-5] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:34:23 EDT [26824-6] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-7] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:34:23 EDT [26824-8] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-9] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-1] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-2] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-3] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-4] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-5] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-6] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-7] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-8] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-9] uc_user@uc WARNING: there is no transaction in progress
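One way to read the log above is to group the warnings by server backend. The sketch below assumes the bracketed field is `[backend PID-session line number]`, which is what a `log_line_prefix` such as `'%t [%p-%l] %q%u@%d '` would produce; that setting is not shown in the post, so treat it as an assumption. The helper name `warnings_by_backend` and the `sample` list (a few lines copied from the log above) are for illustration only. If every warning in a burst maps to one PID, a single server-side session may be receiving transaction commands from more than one client process, which would fit a connection being shared across workers.

```python
import re
from collections import Counter

# Assumption: the prefix looks like "<timestamp> [<pid>-<line>] <user>@<db> WARNING: <msg>".
LINE_RE = re.compile(
    r"\[(?P<pid>\d+)-(?P<seq>\d+)\]\s+\S+\s+WARNING:\s+(?P<msg>.*)$"
)


def warnings_by_backend(lines):
    """Count each warning message per backend PID; skip lines that do not match."""
    counts = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue
        counts.setdefault(m["pid"], Counter())[m["msg"].strip()] += 1
    return counts


# A few lines copied from the log above.
sample = [
    "2016-04-19 20:34:23 EDT [26824-3] uc_user@uc WARNING: there is already a transaction in progress",
    "2016-04-19 20:34:23 EDT [26824-4] uc_user@uc WARNING: there is already a transaction in progress",
    "2016-04-19 20:34:23 EDT [26824-5] uc_user@uc WARNING: there is no transaction in progress",
    "2016-04-19 20:35:14 EDT [26976-1] uc_user@uc WARNING: there is already a transaction in progress",
    "2016-04-19 20:35:14 EDT [26976-2] uc_user@uc WARNING: there is no transaction in progress",
]

if __name__ == "__main__":
    for pid, counter in warnings_by_backend(sample).items():
        print(pid, dict(counter))
```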
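My hunch is that connection pooling and multiprocessing do not play well together. Has anyone done this successfully without errors? If so, could you show me an example or give me a piece of advice that works?
Do I need to explicitly create a new peewee connection inside each worker, or is there an easier way to use peewee with the multiprocessing Pool library?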
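For reference, the "new connection inside each worker" option could look roughly like the sketch below. It is not peewee-specific: the stdlib `sqlite3` module stands in for `psycopg2.connect(...)` (or a peewee `connect()` call) purely so the example runs without a server, and `init_worker`, `handle_url`, `run_demo`, and the `url` table layout are hypothetical names. The point is only the shape: the parent holds no open connection when the pool starts, and each worker opens its own in the pool's `initializer`. The explicit "fork" start method is an assumption that matches the Linux paths in the traceback.

```python
import multiprocessing as mp
import os
import sqlite3
import tempfile

_conn = None  # one connection per worker process, set by init_worker()


def init_worker(db_path):
    """Runs once in each worker after it starts; opens a connection owned by that process."""
    global _conn
    # Stand-in for psycopg2.connect(...) or a peewee connect() call.
    _conn = sqlite3.connect(db_path, timeout=30)


def handle_url(url):
    """Hypothetical task: record a URL using this worker's own connection."""
    with _conn:  # commits on success, rolls back on error
        _conn.execute(
            "INSERT INTO url (crawl, worker_pid) VALUES (?, ?)",
            (url, os.getpid()),
        )
    return os.getpid()


def run_demo(urls, workers=4):
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    setup = sqlite3.connect(db_path)
    setup.execute("CREATE TABLE url (crawl TEXT, worker_pid INTEGER)")
    setup.commit()
    setup.close()  # the parent holds no open connection when the pool starts

    ctx = mp.get_context("fork")  # assumption: POSIX host
    with ctx.Pool(workers, initializer=init_worker, initargs=(db_path,)) as pool:
        pids = pool.map(handle_url, urls)

    check = sqlite3.connect(db_path)
    rows = check.execute("SELECT COUNT(*) FROM url").fetchone()[0]
    check.close()
    return rows, set(pids)


if __name__ == "__main__":
    print(run_demo(["http://example.com/%d" % i for i in range(20)]))
```

Whether peewee's pooled database classes behave the same way when the pool object itself is created before the workers start is exactly the open question here; the sketch only shows the per-process pattern that worked with bare psycopg2.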
Thanks for reading, and for any answers.
【Comments】:
Tags: python postgresql multiprocessing threadpool peewee