【Question title】: Scrapy requests don't get parsed
【Posted】: 2019-08-01 00:34:50
【Question】:

None of my requests get parsed, even though the requests themselves complete successfully. Here is my code:

# -*- coding: utf-8 -*-
import scrapy

from boardgamegeek.items import BoardgamegeekItem

class TwoPlayersSpider(scrapy.Spider):
    name = 'two_players'
    start_urls = [
        'https://www.boardgamegeek.com/xmlapi/geeklist/48970',
        'https://www.boardgamegeek.com/xmlapi/geeklist/48986'
    ]

    def parse(self, response):
        bg_ids = ",".join(response.xpath("//item/@objectid").extract())
        yield scrapy.Request("https://www.boardgamegeek.com/xmlapi/boardgame/{}".format(bg_ids), self.parse_bg)

    def parse(self, response):
        for bg in response.xpath("//boardgame").extract():
            minplaytime = int(bg.xpath(".//minplaytime/text()").extract_first())
            maxplaytime = int(bg.xpath(".//maxplaytime/text()").extract_first())
            maxplayers = int(bg.xpath(".//maxplayers/text()").extract_first())

            if (minplaytime <= 40 or maxplaytime <= 60) and maxplayers >= 3:
                i = BoardgamegeekItem()
                i["link"] = "http://www.boardgamegeek.com/boardgame/{}".format(bg.xpath(".//objectid").extract_first())
                i["title"] = bg.xpath(".//name/text()").extract_first()
                i["minplayers"] = int(bg.xpath(".//minplayers/text()").extract_first())
                i["maxplayers"] = maxplayers
                i["minplaytime"] = minplaytime
                i["maxplaytime"] = maxplaytime

                yield i

【Comments on the question】:

    Tags: python python-2.7 web-scraping scrapy


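    【Solution 1】:

    Found it! This happened because I had two callbacks with exactly the same name, parse. I forgot to rename the second one to parse_bg. In Python, a later definition in a class body replaces an earlier one with the same name, so only the second parse ever ran, and the request that was supposed to call self.parse_bg was never issued.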
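
To illustrate why the duplicate name matters, here is a minimal sketch that runs without Scrapy. The class names DuplicateNames and DistinctNames and the returned strings are hypothetical stand-ins for the spider's two callbacks, not part of the original code.

```python
# Hypothetical stand-ins for the spider's callbacks; Scrapy is not needed here.

class DuplicateNames:
    def parse(self, response):
        # Intended first callback. It is silently replaced by the next definition.
        return "geeklist"

    def parse(self, response):
        # Same name: this rebinds `parse`, so the method above is lost.
        return "boardgame"


class DistinctNames:
    def parse(self, response):
        # Default callback for the start URLs.
        return "geeklist"

    def parse_bg(self, response):
        # Second callback, now reachable under its own name.
        return "boardgame"


# Only the second `parse` survives in the class with duplicate names.
dup_result = DuplicateNames().parse(None)
# No method called `parse_bg` was ever defined there.
has_parse_bg = hasattr(DuplicateNames, "parse_bg")
# With distinct names, both callbacks exist side by side.
fixed_results = (DistinctNames().parse(None), DistinctNames().parse_bg(None))

print(dup_result, has_parse_bg, fixed_results)
```

Applied to the spider in the question, this suggests renaming the second method to parse_bg so that the Request yielded from parse has a callback to reach.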

    【Comments】:
