【Question title】: Set Request Timeout in Elastic Search for bulk loads [duplicate]
【Posted】: 2017-04-14 13:22:45
【Question】:

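I want to set the request timeout to 20 seconds or more for Elasticsearch bulk uploads. The default is 10 seconds, and my warning message shows the request taking 10.006 seconds. After the warning is displayed, execution raises an error.

What I want now is to set the request timeout on every request, using either a value supplied by the user or a default value.

Error message: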

    WARNING:elasticsearch:HEAD /opportunityci/predictionsci [status:404 request:0.080s]
    validated the index and mapping...!
    WARNING:elasticsearch:POST http://192.168.204.154:9200/_bulk [status:N/A request:10.003s]
    Traceback (most recent call last):
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 94, in perform_request
        response = self.pool.urlopen(method, url, body, retries=False, headers=self.headers, **kw)
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/connectionpool.py", line 640, in urlopen
        _stacktrace=sys.exc_info()[2])
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/util/retry.py", line 238, in increment
        raise six.reraise(type(error), error, _stacktrace)
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/connectionpool.py", line 595, in urlopen
        chunked=chunked)
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/connectionpool.py", line 395, in _make_request
        self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/urllib3/connectionpool.py", line 315, in _raise_timeout
        raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
    ReadTimeoutError: HTTPConnectionPool(host='192.168.204.154', port='9200'): Read timed out. (read timeout=10)
    ERROR:DataScience:init exception : Traceback (most recent call last):
      File "/Users/adaggula/Documents/workspace/LatestDemo/demo/com/ci/dataScience/engine/Driver.py", line 194, in <module>
        sample.persist(finalResults)
      File "/Users/adaggula/Documents/workspace/LatestDemo/demo/com/ci/dataScience/ES/sample.py", line 68, in persist
        res = helpers.bulk(client,data,stats_only=True)
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 188, in bulk
        for ok, item in streaming_bulk(client, actions, **kwargs):
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
        for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
      File "/Users/adaggula/anaconda/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 89, in _process_bulk_chunk
        raise e
    ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='192.168.204.154', port='9200'): Read timed out. (read timeout=10))

【Comments】:

    Tags: python elasticsearch request-timed-out elasticsearch-bulk-api


    【Solution 1】:

    Use the 'request_timeout' parameter.

    For example:

    bulk(es, records, chunk_size=500, request_timeout=20)
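
A slightly fuller sketch of the same idea, covering the "user input or a default" part of the question. The names `DEFAULT_TIMEOUT_S`, `resolve_timeout`, and the `bulk_fn` hook are hypothetical and exist only for illustration; `persist` mirrors the `helpers.bulk(client, data, stats_only=True)` call in the traceback. It assumes an elasticsearch-py version in which `helpers.bulk` forwards `request_timeout` to the underlying bulk request, as the discussion below suggests.

```python
DEFAULT_TIMEOUT_S = 20.0  # hypothetical default, in seconds


def resolve_timeout(user_value=None, default=DEFAULT_TIMEOUT_S):
    """Return the timeout in seconds: the user's value if given, else the default."""
    if user_value is None or str(user_value).strip() == "":
        return float(default)
    timeout = float(user_value)  # raises ValueError on non-numeric input
    if timeout <= 0:
        raise ValueError("request timeout must be positive, got %r" % (user_value,))
    return timeout


def persist(client, actions, user_timeout=None, bulk_fn=None):
    """Bulk-index `actions`, passing a per-request timeout to the bulk helper.

    `bulk_fn` is a hypothetical hook so the function can be exercised without
    a running cluster; by default it is elasticsearch.helpers.bulk.
    """
    if bulk_fn is None:
        # Assumption: the elasticsearch-py package is installed.
        from elasticsearch import helpers
        bulk_fn = helpers.bulk
    return bulk_fn(
        client,
        actions,
        stats_only=True,
        request_timeout=resolve_timeout(user_timeout),
    )
```

Calling `persist(client, data, user_timeout="30")` would then request a 30 second timeout, and `persist(client, data)` would fall back to 20 seconds. In the client releases from around the time of this question, the `Elasticsearch` constructor also accepted a `timeout` argument to change the default for every request; whether that still applies depends on the installed version.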
    

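    【Discussion】:

    • The bulk method doesn't use request_timeout. Only scan does.
    • @MarkM It worked for me, and I looked at the source code, which does have a request_timeout parameter. I posted this answer in 2017; the library may have been updated since then, and there may be other options now.
    • Sorry, I was wrong. request_timeout is passed along through **kwargs several times until client.utils.query_params consumes it and adds it to params.
    • chunk_size is not a valid parameter in the current version of ES.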
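
The pass-through described in the third comment can be illustrated with a simplified stand-in. This is not the library's code: `consume_params`, `fake_client_bulk`, and `fake_helper_bulk` are hypothetical names, and the sketch only shows the general pattern of a helper forwarding unknown keyword arguments until a decorator collects them into `params`.

```python
from functools import wraps


def consume_params(*names):
    """Hypothetical decorator: move selected keyword arguments into `params`."""
    def decorator(func):
        @wraps(func)
        def wrapped(*args, **kwargs):
            # Start from any params the caller passed explicitly.
            params = dict(kwargs.pop("params", None) or {})
            for name in names:
                if name in kwargs:
                    params[name] = kwargs.pop(name)
            return func(*args, params=params, **kwargs)
        return wrapped
    return decorator


@consume_params("request_timeout", "refresh")
def fake_client_bulk(body, params=None):
    """Stand-in for the client's bulk call; reports what it received."""
    return {"items": len(body), "params": params}


def fake_helper_bulk(actions, chunk_size=500, **kwargs):
    """Stand-in for a bulk helper: it uses chunk_size itself and forwards the rest."""
    results = []
    for start in range(0, len(actions), chunk_size):
        chunk = actions[start:start + chunk_size]
        results.append(fake_client_bulk(chunk, **kwargs))
    return results


result = fake_helper_bulk(list(range(5)), chunk_size=2, request_timeout=20)
```

Here the helper never names `request_timeout` in its own signature, which is why it can look as if the parameter is ignored, yet every chunk's request still ends up with `{"request_timeout": 20}` in its `params`.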