【Posted】: 2018-08-14 01:09:52
【Question】:
I am trying to scrape a website that has pagination links, so I did this:
import scrapy
class DummymartSpider(scrapy.Spider):
    name = 'dummymart'
    allowed_domains = ['www.dummymart.com/product']
    start_urls = ['https://www.dummymart.net/product/auto-parts--118?page%s' % page for page in range(1, 20)]
It worked! With a single URL template this is fine, but when I try this:
import scrapy
class DummymartSpider(scrapy.Spider):
    name = 'dummymart'
    allowed_domains = ['www.dummymart.com/product']
    start_urls = ['https://www.dummymart.net/product/auto-parts--118?page%s',
                  'https://www.dummymart.net/product/accessories-tools--112?id=1316264860?page%s' % page for page in range(1, 20)]
it does not work. How can I implement the same logic for multiple URLs? Thanks.
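One likely cause of the failure: inside a list literal, a comprehension has to be the only element, so putting a plain string and a comprehension side by side separated by a comma is a syntax error in Python. A minimal sketch of one way around it is to loop over the templates and the page numbers together. The names `URL_TEMPLATES` and `build_start_urls` are hypothetical helpers introduced here for illustration; the URL strings are copied unchanged from the question, including their query-string format.

```python
from itertools import product

# URL templates copied unchanged from the question; each has one %s slot
# that receives the page number.
URL_TEMPLATES = [
    'https://www.dummymart.net/product/auto-parts--118?page%s',
    'https://www.dummymart.net/product/accessories-tools--112?id=1316264860?page%s',
]


def build_start_urls(templates, pages):
    """Fill every template with every page number (hypothetical helper)."""
    return [template % page for template, page in product(templates, pages)]


# range(1, 20) covers pages 1 through 19, as in the question.
start_urls = build_start_urls(URL_TEMPLATES, range(1, 20))
```

The resulting list could then be assigned to the spider's `start_urls` attribute in place of the single comprehension. All pages of the first template come first, followed by all pages of the second.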
【Comments】:
Tags: python web-scraping scrapy