【发布时间】:2018-12-17 20:52:25
【问题描述】:
我无法抓取此页面https://www.adidas.pe/,scrapy crawl my_spider 返回:
2018-12-17 15:36:39 [scrapy.core.engine] INFO: Spider opened
2018-12-17 15:36:39 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-12-17 15:36:39 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6024
2018-12-17 15:36:39 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET http://www.adidas.pe/> from <GET http://adidas.pe/>
2018-12-17 15:37:39 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-12-17 15:38:39 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
我试图改变settings.py:
COOKIES_ENABLED = True
ROBOTSTXT_OBEY = False
不工作
【问题讨论】:
标签: web-scraping scrapy web-crawler scrapy-spider