【Posted】: 2019-08-01 00:34:50
【Question】:
None of my requests are being parsed, even though they are fetched successfully. Here is my code:
# -*- coding: utf-8 -*-
import scrapy

from boardgamegeek.items import BoardgamegeekItem


class TwoPlayersSpider(scrapy.Spider):
    name = 'two_players'
    start_urls = [
        'https://www.boardgamegeek.com/xmlapi/geeklist/48970',
        'https://www.boardgamegeek.com/xmlapi/geeklist/48986'
    ]

    def parse(self, response):
        bg_ids = ",".join(response.xpath("//item/@objectid").extract())
        yield scrapy.Request("https://www.boardgamegeek.com/xmlapi/boardgame/{}".format(bg_ids), self.parse_bg)

    def parse(self, response):
        for bg in response.xpath("//boardgame").extract():
            minplaytime = int(bg.xpath(".//minplaytime/text()").extract_first())
            maxplaytime = int(bg.xpath(".//maxplaytime/text()").extract_first())
            maxplayers = int(bg.xpath(".//maxplayers/text()").extract_first())
            if (minplaytime <= 40 or maxplaytime <= 60) and maxplayers >= 3:
                i = BoardgamegeekItem()
                i["link"] = "http://www.boardgamegeek.com/boardgame/{}".format(bg.xpath(".//objectid").extract_first())
                i["title"] = bg.xpath(".//name/text()").extract_first()
                i["minplayers"] = int(bg.xpath(".//minplayers/text()").extract_first())
                i["maxplayers"] = maxplayers
                i["minplaytime"] = minplaytime
                i["maxplaytime"] = maxplaytime
                yield i
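
One thing that stands out in the code as posted: the spider defines `parse` twice, while the request callback refers to `self.parse_bg`, which is never defined. In Python, a second `def` with the same name inside a class body replaces the first. The sketch below shows only that language behavior. It does not use Scrapy, and the class name `DemoSpider` and the string passed in are hypothetical stand-ins, so treat it as a possible explanation rather than a confirmed diagnosis.

```python
# Hypothetical stand-in for a spider class; no Scrapy required.
class DemoSpider:
    def parse(self, response):
        # First definition: intended to handle the start URLs.
        return "first parse: {}".format(response)

    def parse(self, response):
        # Same name again: this definition replaces the one above.
        return "second parse: {}".format(response)


spider = DemoSpider()
result = spider.parse("geeklist response")
has_parse_bg = hasattr(spider, "parse_bg")

print(result)        # second parse: geeklist response
print(has_parse_bg)  # False
```

If the same thing happens in the spider, the geeklist responses would be handled by the second method, whose `//boardgame` XPath presumably matches nothing in a geeklist document, so no items are yielded. Renaming the second method to `parse_bg` would match the callback passed to `scrapy.Request`. Separately, `.extract()` returns a list of strings, so calling `bg.xpath(...)` on its elements would likely fail once that loop actually runs.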
【Comments】:
Tags: python python-2.7 web-scraping scrapy