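[Posted]: 2017-04-27 10:57:05
[Question]:
I have defined a spider with name='myspider' whose behavior varies depending on the settings. I want to run different instances of this spider in different processes. Is that possible?
I checked the source code, and it seems SpiderLoader simply walks the spiders module, so I can only run one spider with a given name at a time.
The code I run looks like this: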
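from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# items is defined elsewhere in main.py
for item in items:
    settings = get_project_settings()
    settings.set('item', item)
    settings.set('DEFAULT_REQUEST_HEADERS', item.get('request_header'))
    process = CrawlerProcess(settings)
    process.crawl("myspider")
    # blocks until the crawl finishes; fails when called again on the next iteration
    process.start()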
Unsurprisingly, this fails with the following error:
Traceback (most recent call last):
  File "/home/xuanqi/workspace/github/foolcage/fospider/fospider/main.py", line 44, in <module>
    process.start() # the script will block here until the crawling is finished
  File "/usr/local/lib/python3.5/dist-packages/scrapy/crawler.py", line 280, in start
    reactor.run(installSignalHandlers=False) # blocking call
  File "/usr/local/lib/python3.5/dist-packages/twisted/internet/base.py", line 1194, in run
    self.startRunning(installSignalHandlers=installSignalHandlers)
  File "/usr/local/lib/python3.5/dist-packages/twisted/internet/base.py", line 1174, in startRunning
    ReactorBase.startRunning(self)
  File "/usr/local/lib/python3.5/dist-packages/twisted/internet/base.py", line 684, in startRunning
    raise error.ReactorNotRestartable()
twisted.internet.error.ReactorNotRestartable
Thanks in advance for your help!
[Discussion]:
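The traceback comes from calling process.start() more than once in the same Python process: the Twisted reactor cannot be started again after it has stopped. One possible workaround, sketched here under assumptions, is to give every item its own OS process so that each crawl gets a fresh reactor. The helper names build_overrides, run_spider and run_all are hypothetical and not part of Scrapy; the settings keys are copied from the question's code, and the spider name "myspider" is assumed to be discoverable from the project settings.

```python
import multiprocessing


def build_overrides(item):
    """Per-run settings derived from one item (keys copied from the question)."""
    return {
        "item": item,
        "DEFAULT_REQUEST_HEADERS": item.get("request_header"),
    }


def run_spider(item):
    """Run a single crawl; intended to be the target of a child process."""
    # Imported inside the function so the Twisted reactor is only
    # created in the child process, never in the parent.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    settings = get_project_settings()
    for name, value in build_overrides(item).items():
        settings.set(name, value)

    process = CrawlerProcess(settings)
    process.crawl("myspider")
    process.start()  # blocks until this crawl finishes, then the child exits


def run_all(items):
    """Start one child process per item and wait for all of them."""
    workers = [
        multiprocessing.Process(target=run_spider, args=(item,))
        for item in items
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

To use it, call run_all(items) from the script's entry point, under an `if __name__ == "__main__":` guard so that it also works on platforms that spawn rather than fork child processes. Since all crawls run in parallel, a long items list may need to be processed in batches. If separate processes are not actually required, creating one CrawlerProcess, calling crawl() once per run, and calling start() a single time after the loop avoids the reactor restart as well, although all runs then share that one process's settings.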
Tags: scrapy