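【Question Title】: python peewee multiprocessing pool error
【Posted】: 2016-08-12 09:47:27
【Question】:

Stack: Python 3.4, PostgreSQL 9.4.7, peewee 2.8.0, psycopg2 2.6.1 (dt dec pq3 ext lo64)

I need to be able to talk to a PostgreSQL database (select, insert, update) from every worker. I am using Python's multiprocessing Pool to create 10 workers; each one makes a curl call and then talks to the database depending on what it finds.

After reading a few posts on the internet, I decided connection pooling was the way to go, so I put the code below in my models.py file. I have doubts about connection pooling, though, because my understanding is that reusing a database connection across threads is a no-no.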

from playhouse.pool import PooledPostgresqlExtDatabase

# cfg is the application's own config mapping (not shown in the question).
db = PooledPostgresqlExtDatabase(
    'uc',
    max_connections=32,
    stale_timeout=300,  # 5 minutes.
    password=cfg['psql']['pass'],
    port=cfg['psql']['port'],
    register_hstore=False,
    host=cfg['psql']['host'],
    user=cfg['psql']['user'])

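Now to the question. I get random SQL errors when some of the workers talk to the database. Before I introduced peewee into the mix, I was using the psycopg2 library directly, with no wrapper, and I created a new database connection for each worker. There were no errors.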

A sample of the errors I get:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3468, in commit
    self.get_conn().commit()
psycopg2.DatabaseError: error with no message from the libpq

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/dan/dev/link-checker/crawler/manager.py", line 17, in startWorker
    wrk.perform()
  File "/home/dan/dev/link-checker/crawler/worker.py", line 49, in perform
    self.pullUrls()
  File "/home/dan/dev/link-checker/crawler/worker.py", line 63, in pullUrls
    newUrlDict = UrlManager.createUrlWithInProgress(self._url['crawl'], source_url, self._url['base'])
  File "/home/dan/dev/link-checker/crawler/models.py", line 152, in createUrlWithInProgress
    newUrl = Url.create(**newUrlDict)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 4494, in create
    inst.save(force_insert=True)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 4680, in save
    pk_from_cursor = self.insert(**field_dict).execute()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3213, in execute
    cursor = self._execute()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 2628, in _execute
    return self.database.execute_sql(sql, params, self.require_commit)
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3285, in __exit__
    reraise(new_type, new_type(*exc_args), traceback)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 127, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3468, in commit
    self.get_conn().commit()
peewee.DatabaseError: error with no message from the libpq

I also tailed the PostgreSQL log file, and this is what I see:

2016-04-19 20:34:23 EDT [26824-3] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-4] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-5] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:34:23 EDT [26824-6] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-7] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:34:23 EDT [26824-8] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-9] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-1] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-2] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-3] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-4] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-5] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-6] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-7] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-8] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-9] uc_user@uc WARNING:  there is no transaction in progress
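
One plausible reading of the log above, which the question does not confirm, is that several workers are sending BEGIN and COMMIT over the same server session: the bracketed backend process id is identical within each burst of warnings. The toy model below is only an illustration of that hypothesis. `Session`, `run_shared`, and `run_separate` are made-up names, and no real database is involved.

```python
# Toy model of a server session's transaction state. Not PostgreSQL code;
# it only mimics the two warnings seen in the log.
ALREADY = "WARNING:  there is already a transaction in progress"
NO_TXN = "WARNING:  there is no transaction in progress"


class Session:
    """Tracks whether a single (simulated) server session is in a transaction."""

    def __init__(self):
        self.in_transaction = False
        self.warnings = []

    def execute(self, statement):
        if statement == "BEGIN":
            if self.in_transaction:
                self.warnings.append(ALREADY)
            self.in_transaction = True
        elif statement == "COMMIT":
            if not self.in_transaction:
                self.warnings.append(NO_TXN)
            self.in_transaction = False
        else:
            raise ValueError("unsupported statement: %r" % (statement,))


# Two hypothetical workers, A and B, whose statements arrive interleaved.
INTERLEAVED = [("A", "BEGIN"), ("B", "BEGIN"), ("A", "COMMIT"), ("B", "COMMIT")]


def run_shared(schedule):
    """Every worker writes to the same session."""
    session = Session()
    for _worker, statement in schedule:
        session.execute(statement)
    return session.warnings


def run_separate(schedule):
    """Each worker gets a session of its own."""
    sessions = {}
    for worker, statement in schedule:
        sessions.setdefault(worker, Session()).execute(statement)
    return [w for s in sessions.values() for w in s.warnings]
```

With the shared session the interleaved schedule yields one "already a transaction" warning followed by one "no transaction" warning; with one session per worker it yields none.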

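My hunch is that connection pooling and multiprocessing don't play well together. Has anyone done this successfully without errors, and if so, can you point me to an example or give me a piece of advice that works?

Do I need to explicitly create a new peewee connection inside each worker, or is there an easier way to use peewee with the multiprocessing Pool library?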
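
For reference, the earlier setup that ran without errors (one fresh connection per worker) can be sketched with a Pool initializer. This is a minimal sketch, not the original code: the standard library's sqlite3 stands in for psycopg2 so that it runs without a server, and the `url` table, `init_worker`, `record_url`, and the other helper names are all assumptions made for illustration.

```python
import multiprocessing
import os
import sqlite3
import tempfile

# One connection per worker process, created by init_worker().
_conn = None


def init_worker(db_path):
    """Pool initializer: runs once inside each worker process."""
    global _conn
    _conn = sqlite3.connect(db_path, timeout=30)


def record_url(address):
    """Insert one row through this process's own connection."""
    with _conn:  # commits on success, rolls back on an exception
        _conn.execute(
            "INSERT INTO url (address, pid) VALUES (?, ?)",
            (address, os.getpid()),
        )
    return address


def create_schema(db_path):
    """Create the hypothetical url table used by this sketch."""
    conn = sqlite3.connect(db_path)
    with conn:
        conn.execute("CREATE TABLE IF NOT EXISTS url (address TEXT, pid INTEGER)")
    conn.close()


def count_rows(db_path):
    """Return the number of rows stored so far."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM url").fetchone()[0]
    finally:
        conn.close()


def run_pool(db_path, addresses, workers=4):
    """Fan the addresses out to workers that each hold their own connection."""
    with multiprocessing.Pool(
        workers, initializer=init_worker, initargs=(db_path,)
    ) as pool:
        return pool.map(record_url, addresses)
```

Call `run_pool(...)` from under an `if __name__ == "__main__":` guard, after `create_schema(...)` has been run on a path such as `os.path.join(tempfile.mkdtemp(), "crawl.db")`. The point of the design is that no connection object exists before the workers start, so nothing is inherited from the parent process.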

Thanks for your answers and for reading.

【Comments】:

    Tags: python postgresql multiprocessing threadpool peewee


    【Solution 1】:

    I got it working. I wrapped all the code in the models.py file that the workers use in "with db.execution_context() as ctx:", as explained on this page:

    http://docs.peewee-orm.com/en/latest/peewee/database.html#advanced-connection-management
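
A minimal sketch of that wrapping, under stated assumptions: the answer does not show its code, so `pg_connect_kwargs`, `build_database`, and `run_in_context` are hypothetical helpers, and only `PooledPostgresqlExtDatabase` and `execution_context()` come from the question and answer. The peewee import is deferred so the module loads even where peewee is not installed.

```python
def pg_connect_kwargs(cfg):
    """Map the config layout used in the question onto connection keywords."""
    psql = cfg["psql"]
    return {
        "host": psql["host"],
        "port": psql["port"],
        "user": psql["user"],
        "password": psql["pass"],
        "register_hstore": False,
    }


def build_database(cfg):
    """Build the pooled database object with the settings from the question."""
    # Third-party import (peewee's playhouse), deferred until actually needed.
    from playhouse.pool import PooledPostgresqlExtDatabase

    return PooledPostgresqlExtDatabase(
        "uc",
        max_connections=32,
        stale_timeout=300,  # 5 minutes
        **pg_connect_kwargs(cfg)
    )


def run_in_context(db, work, *args, **kwargs):
    """Run work(*args, **kwargs) inside db.execution_context().

    Per the peewee 2.x documentation linked above, the context manager
    acquires a connection for the block and releases it on exit.
    """
    with db.execution_context():
        return work(*args, **kwargs)
```

A worker would then call something like `run_in_context(db, Url.create, **newUrlDict)` instead of calling `Url.create` directly. This targets the peewee 2.8 API named in the question; as far as I know, later peewee releases dropped `execution_context` in favour of `connection_context()`, so check the documentation for the installed version.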

    【Comments】:
