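【Posted】: 2014-12-27 07:46:51
【Question】:
Can anyone explain how Scrapy calls a Request's callback function and processes its result?
I know Scrapy accepts a callback result that is either a single object (Request, BaseItem, or None) or an iterable of such objects. For example:
1. Return a single object (Request, BaseItem, or None)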
def parse(self, response):
    ...
    return scrapy.Request(...)
2. Return an iterable of objects
def parse(self, response):
    ...
    for url in self.urls:
        yield scrapy.Request(...)
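The two styles differ in what the caller receives: `return` hands back one object, while `yield` turns `parse` into a generator function whose call returns an iterable. A minimal sketch in plain Python, using a hypothetical `FakeRequest` stand-in instead of `scrapy.Request` so it runs without Scrapy installed:

```python
import inspect


class FakeRequest:
    """Hypothetical stand-in for scrapy.Request (not the real class)."""

    def __init__(self, url):
        self.url = url


def parse_single(response):
    # Style 1: return exactly one object.
    return FakeRequest("http://example.com/next")


def parse_many(response):
    # Style 2: yield several objects; calling this returns a generator.
    for url in ["http://example.com/a", "http://example.com/b"]:
        yield FakeRequest(url)


single = parse_single(None)
many = parse_many(None)

is_single_request = isinstance(single, FakeRequest)
is_generator = inspect.isgenerator(many)
yielded_urls = [req.url for req in many]
```

So whatever consumes the callback's result has to cope with both a bare object and an iterable of objects.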
I assume they are handled somewhere in Scrapy's code along these lines:
# Assume process_callback_result is a function that is called after
# a Request's callback function has been executed.
# The "result" parameter is the callback's return value.
def process_callback_result(self, result):
    if isinstance(result, scrapy.Request):
        self.process_request(result)
    elif isinstance(result, scrapy.BaseItem):
        self.process_item(result)
    elif result is None:
        pass
    elif isinstance(result, collections.Iterable):
        for obj in result:
            self.process_callback_result(obj)
    else:
        # show error message
        # ...
        pass
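The guess above can be made runnable for experimenting. This is only a sketch of the hypothesized logic, not Scrapy's code: `FakeRequest`, `FakeItem`, and `Collector` are made-up stand-ins, and `collections.abc.Iterable` is used because the `collections.Iterable` alias was removed in Python 3.10. Strings are excluded from the iterable branch, since iterating a string yields more strings and would recurse forever.

```python
from collections.abc import Iterable


class FakeRequest:
    """Hypothetical stand-in for scrapy.Request."""


class FakeItem:
    """Hypothetical stand-in for scrapy's BaseItem."""


class Collector:
    """Records what the dispatcher would schedule or process."""

    def __init__(self):
        self.requests = []
        self.items = []

    def process_callback_result(self, result):
        if isinstance(result, FakeRequest):
            self.requests.append(result)
        elif isinstance(result, FakeItem):
            self.items.append(result)
        elif result is None:
            pass
        elif isinstance(result, Iterable) and not isinstance(result, (str, bytes)):
            # Recurse into lists, tuples, generators, and so on.
            for obj in result:
                self.process_callback_result(obj)
        else:
            raise TypeError(
                "expected Request, item, None or an iterable, got %s"
                % type(result).__name__
            )


def callback():
    # A generator that mixes single objects with a nested list.
    yield FakeRequest()
    yield FakeItem()
    yield [FakeRequest(), None]


collector = Collector()
collector.process_callback_result(callback())
```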
I found the corresponding code in the _process_spidermw_output function in <PYTHON_HOME>/Lib/site-packages/scrapy/core/scraper.py:
def _process_spidermw_output(self, output, request, response, spider):
    """Process each Request/Item (given in the output parameter) returned
    from the given spider
    """
    if isinstance(output, Request):
        self.crawler.engine.crawl(request=output, spider=spider)
    elif isinstance(output, BaseItem):
        self.slot.itemproc_size += 1
        dfd = self.itemproc.process_item(output, spider)
        dfd.addBoth(self._itemproc_finished, output, response, spider)
        return dfd
    elif output is None:
        pass
    else:
        typename = type(output).__name__
        log.msg(format='Spider must return Request, BaseItem or None, '
                       'got %(typename)r in %(request)s',
                level=log.ERROR, spider=spider, request=request, typename=typename)
But I can't find the part that implements the elif isinstance(result, collections.Iterable): logic.
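One plausible explanation, stated as an assumption rather than a reading of the exact Scrapy release in question: the callback's return value is normalized into an iterable before it reaches this function (Scrapy ships a helper named `arg_to_iter` in `scrapy.utils.misc` for this kind of wrapping), and the function is then called once per element, so it never needs an iterable branch of its own. The sketch below imitates that two-step flow in plain Python; `to_iterable`, `handle_output`, `run_callback`, and the stand-in classes are hypothetical names, not Scrapy APIs.

```python
class FakeRequest:
    """Hypothetical stand-in for scrapy.Request."""


class FakeItem:
    """Hypothetical stand-in for scrapy's BaseItem."""


def to_iterable(result):
    """Wrap a callback result so the caller can always loop over it."""
    if result is None:
        return []
    # Treat single objects (and strings/dicts) as one-element results.
    if isinstance(result, (FakeRequest, FakeItem, str, bytes, dict)):
        return [result]
    if hasattr(result, "__iter__"):
        return result
    return [result]


def handle_output(output, scheduled, processed):
    """Per-object dispatch: no iterable branch is needed at this level."""
    if isinstance(output, FakeRequest):
        scheduled.append(output)
    elif isinstance(output, FakeItem):
        processed.append(output)
    elif output is None:
        pass
    else:
        raise TypeError("unexpected output type: %s" % type(output).__name__)


def run_callback(callback):
    """Call the callback, normalize its result, then dispatch each element."""
    scheduled, processed = [], []
    for output in to_iterable(callback()):
        handle_output(output, scheduled, processed)
    return scheduled, processed


def returns_one():
    return FakeRequest()


def yields_many():
    yield FakeRequest()
    yield FakeItem()


one_scheduled, one_processed = run_callback(returns_one)
many_scheduled, many_processed = run_callback(yields_many)
```

In this sketch nested iterables are not flattened: a callback that yields a list would fall into the error branch of `handle_output`, which is the same kind of complaint the `else` clause of the function quoted above produces.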
【Discussion】:
Tags: python callback scrapy iterable