10049: The requested address is not valid in its context. Scrapy-Splash not reading URL correctly

【Question title】: 10049: The requested address is not valid in its context. Scrapy-Splash not reading URL correctly
【Posted】: 2019-01-09 22:01:31
【Question】:

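I am trying to use Splash to scrape more complex websites, but I cannot even get the code to run against this simple one. I run Splash in Docker, and in my settings.py file I pointed the 8050 port at 0.0.0.0. Any help would be appreciated. Please also list the versions of any packages you use, because I suspect versions may be part of the problem.

I have tried many fixes along the way, including changing the versions of Splash, Scrapy, and Twisted. Scrapy only works on Python 3.x with newer versions of Twisted, but Splash says it does not work with Twisted > 16.2. Switching versions did not fix anything.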
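Before swapping package versions, it can help to confirm that the Splash container is reachable from the machine running Scrapy. The following is a minimal sketch, assuming Splash is published on port 8050 of the local machine; `splash_reachable` is a hypothetical helper written for illustration and is not part of Scrapy or scrapy-splash.

```python
import socket


def splash_reachable(host="127.0.0.1", port=8050, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    The default host and port are assumptions that match a Splash
    container published on the local machine.
    """
    try:
        # create_connection raises OSError (refused, timed out,
        # invalid address, ...) when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `splash_reachable()` with the container running should return `True`. If it returns `False`, the problem lies with the container or the port mapping rather than with the installed versions of Scrapy, Splash, or Twisted.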

import scrapy
from scrapy_splash import SplashRequest


class ExampleSpider(scrapy.Spider):
    name = "test"
    # allowed_domains = ["Monster.com"]
    start_urls = [
        'http://quotes.toscrape.com/page/1/',
    ]

    def start_requests(self):
        # Route each start URL through Splash's render.html endpoint.
        for url in self.start_urls:
            yield SplashRequest(
                url,
                self.parse,
                endpoint='render.html',
                args={'wait': 0.5},  # seconds to wait for the page to render
            )

    def parse(self, response):
        # Print the text of every quote on the rendered page.
        for quote in response.css('div.quote'):
            print(quote.css('span.text::text').extract())

I should get back only the quote text. This is the same URL used in the documentation.

【Comments】:

Can you clarify your question? What result did you get, and what did you expect? Please also post your settings.py configuration.

Tags: python-3.x scrapy splash-screen scrapy-splash

【Solution 1】:

There is nothing wrong with your code. Your problem is this:

I pointed the 8050 port at 0.0.0.0 in my settings.py file

The correct setting in settings.py should be:

SPLASH_URL = 'http://localhost:8050'

or

SPLASH_URL = 'http://127.0.0.1:8050'
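Some background on why this matters: 0.0.0.0 is the wildcard address a server binds to in order to listen on every interface. It is not a destination a client can connect to on Windows, where the attempt fails with error 10049 (WSAEADDRNOTAVAIL). For context, here is a sketch of how the scrapy-splash portion of settings.py might look. The middleware names and priorities follow the scrapy-splash README; treat them as assumptions to check against the version you have installed.

```python
# settings.py (scrapy-splash section only; a sketch, not a complete file)

# Address Scrapy uses to reach Splash. Use a connectable host such as
# localhost or 127.0.0.1, not the 0.0.0.0 wildcard.
SPLASH_URL = 'http://localhost:8050'

# Downloader middlewares that send SplashRequest objects through Splash.
DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

# Spider middleware that avoids storing duplicate Splash arguments.
SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

# Duplicate filter and cache storage that take Splash arguments into account.
DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'
```

With Splash running in Docker and port 8050 published to the host, the spider from the question should then reach Splash through `SPLASH_URL`.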

【Comments】: