Scrapy TCP Connection Timed Out: Answers

【Question Title】: Scrapy TCP Connection Timed Out
【Posted】: 2013-11-27 12:20:07
【Question】:

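This is my first time trying Scrapy. (Yes, I saw the other post about this, but it never got an answer.) So I kept things as simple as possible, just to get something running.

Here is my spider code: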

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector

class Spider(BaseSpider):
    name = "craigs"
    # Scrapy reads `allowed_domains` (plural); the singular name is ignored.
    allowed_domains = ["craigslist.org"]
    start_urls = ["http://sfbay.craigslist.org/sfc/npo/"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        titles = hxs.select("//p")
        # Use a distinct loop variable so the list is not shadowed.
        for entry in titles:
            title = entry.select("a/text()").extract()
            link = entry.select("a/@href").extract()
            print title, link

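I get this error: "TCP connection timed out: 10060: A connection attempt failed because the connected party did not properly respond after a period of time......"

I tried a different site URL and got the same result.

If a blocked port might be the cause, which ports should I open (without leaving my computer vulnerable)? Thanks.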

【Comments】:

Tags: python tcp scrapy

【Answer 1】:

Are you behind a proxy? If so, set the http_proxy environment variable or use Scrapy's proxy middleware.
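
As a rough illustration of the two options in this answer, the sketch below uses only the standard library. Scrapy's HttpProxyMiddleware honors the http_proxy environment variable and also a per-request `proxy` meta key. The helper names `detect_proxies` and `proxy_meta`, and the address `proxy.example.com:3128`, are assumptions made for this example and are not from the original post.

```python
import os
import urllib.request


def detect_proxies():
    """Return the proxy settings Python discovers, keyed by scheme."""
    # When *_proxy environment variables are set, getproxies() reports them.
    return urllib.request.getproxies()


def proxy_meta(proxy_url):
    """Build a per-request meta dict using the `proxy` key that
    Scrapy's HttpProxyMiddleware looks for (hypothetical helper)."""
    return {"proxy": proxy_url}


if __name__ == "__main__":
    # Hypothetical proxy address; replace it with your network's real proxy.
    os.environ["http_proxy"] = "http://proxy.example.com:3128"
    print(detect_proxies().get("http"))
    print(proxy_meta(os.environ["http_proxy"]))
```

Inside a spider, the second option would look roughly like `scrapy.Request(url, meta=proxy_meta("http://proxy.example.com:3128"))`. If `detect_proxies()` finds nothing while a browser on the same machine can reach the site, the browser may be using a proxy that Scrapy does not know about.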

【Comments】: