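【Question title】: DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded
【Posted】: 2018-01-17 12:58:20
【Question】:

I have a cron job that calls a vendor API to fetch a list of companies. Once the data is fetched, we store it in Cloud Datastore, as shown in the code below. For some reason, over the past two days I have started seeing an error whenever the cron job is triggered. When I debug the code locally, I do not see this error.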

    company_list = cron.rest_client.load(config, "companies", '')

    if not company_list:
        logging.info("Company list is empty")
        return "Ok"

    for row in company_list:
        company_repository.save(row,original_data_source, 
                                 actual_data_source)

Repository code

    import logging

    # CompanyInfo is the ndb model for a company; it is defined elsewhere in the app.
    def save(dto, org_ds, act_dp):
        try:
            key = 'FIN/%s' % dto['ticker']
            company = CompanyInfo(id=key)
            company.stock_code = key
            company.ticker = dto['ticker']
            company.name = dto['name']
            company.original_data_source = org_ds
            company.actual_data_provider = act_dp
            # One blocking datastore write per company
            company.put()
            return company
        except Exception:
            logging.exception("company_repository: error occurred saving the company record")
            raise

Error

  DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.

Exception details

  Traceback (most recent call last):
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 267, in Handle
    result = handler(dict(self._environ), self._StartResponse)
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/lib/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/lib/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/lib/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/lib/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/internal/cron/company_list.py", line 21, in run
    company_repository.save(row,original_data_source, actual_data_source)
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/internal/repository/company_repository.py", line 13, in save
    company.put()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 3458, in _put
    return self._put_async(**ctx_options).get_result()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 383, in get_result
    self.check_success()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 378, in check_success
    self.wait()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 362, in wait
    if not ev.run1():
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/eventloop.py", line 268, in run1
    delay = self.run0()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/eventloop.py", line 248, in run0
    _logging_debug('rpc: %s.%s', rpc.service, rpc.method)
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 453, in service
    @property
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.

【Question discussion】:

    Tags: python google-app-engine google-cloud-datastore google-cloud-platform


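    【Solution 1】:

    Has your company list grown?

    How many entities are you putting?

    Try saving them in batches instead of one at a time in a loop. Remove company.put() from def save( dto, org_ds , act_dp): and use ndb.put_multi() instead.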

    company_list = cron.rest_client.load(config, "companies", '')
    
    if not company_list:
        logging.info("Company list is empty")
        return "Ok"
    
    company_objs = []
    for row in company_list:
        # save() now only builds the entity; it no longer calls put()
        company_objs.append(company_repository.save(row, original_data_source,
                                                    actual_data_source))
        # put 500 at a time
        if len(company_objs) >= 500:
            ndb.put_multi(company_objs)
            company_objs = []
    # put any remainders
    if company_objs:
        ndb.put_multi(company_objs)
    

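    【Discussion】:

    • There are about 6,000 stocks.
    • That's a fairly large number; it's at the point where you may want to mapreduce this. I've revised my answer to put 500 at a time. In my experience, ndb sometimes hangs when put_multi / get_multi is used on more than 1,000 objects.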
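
The "put 500 at a time" idea from the comment above can be pulled out of the loop into a small helper. This is a minimal sketch in plain Python; the name chunked is hypothetical and not part of ndb, and the batch size of 500 is simply the figure suggested in the answer.

```python
def chunked(items, size):
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) >= size:
            yield batch
            batch = []
    # Emit whatever is left over after the last full batch
    if batch:
        yield batch


# Assumed usage on App Engine (not executed here):
#     for batch in chunked(company_objs, 500):
#         ndb.put_multi(batch)
```

Keeping the batching separate from the datastore call makes the batch size easy to tune if 500 still turns out to be too large for the request deadline.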
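    【Solution 2】:

    My answer builds on the one given by Alex, but runs asynchronously.

    I replaced put_multi() with put_multi_async().

    By replacing the call to put_multi() with a call to its asynchronous equivalent, put_multi_async(), the application can do other work right away instead of blocking on put_multi().

    I also added the @ndb.toplevel decorator.

    This decorator tells the handler not to exit until its asynchronous requests have finished.

    If your data keeps growing, you may want to take a closer look at the deferred library. It can be used to respawn a task every X batches, passing along the rest of the unprocessed data.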

    @ndb.toplevel
    def fetch_companies_list():
        company_list = cron.rest_client.load(config, "companies", '')
    
        if not company_list:
            logging.info("Company list is empty")
            return "Ok"
    
        company_objs=[]
        for row in company_list:
            company_objs.append(company_repository.save(row,original_data_source, 
                                 actual_data_source))
            # put 500 at a time
            if len(company_objs) >= 500:
                ndb.put_multi_async(company_objs)
                company_objs=[]
        # put any remainders
        if len(company_objs) > 0:
            ndb.put_multi_async(company_objs)
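
The "respawn a task every X batches" suggestion above could look roughly like the following. This is only a sketch of the control flow, written in plain Python so it can run anywhere: save_slice, put_batch and schedule_rest are hypothetical names. On App Engine, put_batch would presumably wrap ndb.put_multi(), and schedule_rest would presumably call deferred.defer() with the next offset; both of those wirings are assumptions, not something the answer spells out.

```python
def save_slice(rows, start, batch_size, batches_per_task, put_batch, schedule_rest):
    """Save up to `batches_per_task` batches of rows[start:], then hand off.

    put_batch(batch)     -- persists one list of rows (assumed: ndb.put_multi)
    schedule_rest(start) -- schedules a follow-up task that resumes at `start`
                            (assumed: a thin wrapper around deferred.defer)
    Returns the index of the first row that was not processed by this call.
    """
    end = min(len(rows), start + batch_size * batches_per_task)
    for i in range(start, end, batch_size):
        put_batch(rows[i:min(i + batch_size, end)])
    # Anything left over is passed to a fresh task with its own deadline
    if end < len(rows):
        schedule_rest(end)
    return end


# Local demonstration with in-memory stand-ins for the datastore and the queue
rows = list(range(6000))
saved, pending = [], [0]
while pending:
    save_slice(rows, pending.pop(), 500, 4, saved.append, pending.append)
print(len(saved))  # number of batches written
```

Passing only an offset to the follow-up task keeps the task payload small; the follow-up task would then have to reload or re-fetch the company list itself.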
    

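    【Discussion】:

    • @DanCornilescu What about spawning a task queue task for each put? Since I'll be loading 6,000 records, that would be 6,000 tasks.
    • That works too. Using task queues isn't trivial, but IMHO it is more flexible/scalable. However, if the amount of work per company is very small, it may be overkill, in which case handling one batch per task may be better. Either way, one thing to be careful about here is to use a progressively longer delay when creating the tasks, so that you don't suddenly throw a pile of immediately executable tasks at your app, which could cause many dynamic instances to start up, which could get expensive.
    • @Pythonist Isn't this equivalent to doing ndb.put_multi() on the whole batch of 6,000? I thought ndb.put_multi() was just a convenience function that calls .put_async() on multiple entities.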
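
The "progressively longer delay" advice in the comments above can be illustrated with a tiny helper that spaces tasks out evenly. This is a sketch under assumptions: staggered_countdowns is a hypothetical name and the 5-second step is an arbitrary illustrative value. The resulting delays would be passed as the countdown argument when each task is enqueued.

```python
def staggered_countdowns(num_tasks, step_seconds=5):
    """Return one delay in seconds per task: 0, step, 2 * step, ..."""
    return [i * step_seconds for i in range(num_tasks)]


# 6000 companies in batches of 500 gives 12 tasks
delays = staggered_countdowns(12)
print(delays)

# Assumed usage on App Engine (not executed here), one task per batch:
#     for batch, delay in zip(batches, delays):
#         deferred.defer(save_batch, batch, _countdown=delay)
# where save_batch is a hypothetical function that calls ndb.put_multi(batch).
```

Spreading the tasks out this way trades total completion time for a flatter load, which is what keeps the scheduler from starting many dynamic instances at once.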