Scrapy request, shell Fetch() in spider

【Question title】: Scrapy request, shell Fetch() in spider
【Posted】: 2018-07-17 17:04:55
【Question description】:

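I am trying to reach a specific page, let's call it http://example.com/puppers. This page cannot be reached when connecting directly, either with scrapy shell or with a standard scrapy.Request (the result is a <405> HTTP response).

However, when I first run scrapy shell 'http://example.com/kittens' and then call fetch('http://example.com/puppers'), it works and I get a <200> OK HTTP code. I can then extract data using scrapy shell.

I tried to implement this in my script by changing the referer (to url #1), the user-agent and a few other headers while connecting to the puppers page (url #2). I still get the <405> code.

All help is appreciated. Thank you.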

【Question comments】:

It would be better if you shared the URL or your code. How to create a Minimal, Complete, and Verifiable example: stackoverflow.com/help/mcve

Tags: python web-scraping scrapy scrapy-spider

【Solution 1】:

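import scrapy


class PuppersSpider(scrapy.Spider):
    # The class name and spider name are placeholders, not from the original answer.
    name = "puppers"
    # Start on the page that can be reached directly.
    start_urls = ['http://example.com/kittens']

    def parse(self, response):
        # Once the first page has loaded, request the page that rejects direct access.
        yield scrapy.Request(
            url="http://example.com/puppers",
            callback=self.parse_puppers,
        )

    def parse_puppers(self, response):
        # process your puppers
        ...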
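
The answer does not say why visiting the kittens page first helps. One plausible explanation, which is only an assumption and is not confirmed anywhere in this thread, is that the first response sets a cookie that the second page requires. Scrapy's cookie middleware is enabled by default and would carry such a cookie from one request to the next. The sketch below simulates that situation with the standard library only: the local server, the `session=abc123` cookie and the helper names `DemoHandler`, `fetch_status` and `run_demo` are all hypothetical and stand in for the real site.

```python
import http.cookiejar
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical site: /kittens sets a cookie, /puppers demands it."""

    def do_GET(self):
        if self.path == "/kittens":
            self.send_response(200)
            self.send_header("Set-Cookie", "session=abc123; Path=/")
            self.end_headers()
            self.wfile.write(b"kittens")
        elif self.path == "/puppers":
            if "session=abc123" in self.headers.get("Cookie", ""):
                self.send_response(200)
                self.end_headers()
                self.wfile.write(b"puppers")
            else:
                # Assumed behaviour: reject visitors without the cookie.
                self.send_response(405)
                self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_status(opener, url):
    """Return the HTTP status code for url, including error statuses."""
    try:
        with opener.open(url) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    no_proxy = urllib.request.ProxyHandler({})  # ignore proxy settings from the environment
    try:
        # Fresh client without cookies: direct access to /puppers is rejected.
        direct = fetch_status(urllib.request.build_opener(no_proxy), base + "/puppers")

        # Client with a cookie jar: visit /kittens first, then /puppers.
        jar = http.cookiejar.CookieJar()
        session = urllib.request.build_opener(
            no_proxy, urllib.request.HTTPCookieProcessor(jar)
        )
        first = fetch_status(session, base + "/kittens")
        second = fetch_status(session, base + "/puppers")
    finally:
        server.shutdown()
        server.server_close()
    return direct, first, second


if __name__ == "__main__":
    print(run_demo())  # (405, 200, 200)
```

To check whether cookies are really what matters for the actual site, setting `COOKIES_DEBUG = True` in the Scrapy project settings makes Scrapy log the cookies it sends and receives. If no cookie shows up, the cause is something else and this sketch does not apply.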

【Comments】: