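[Posted]: 2020-02-27 15:06:06
[Question]:
I'm using this code to extract the text inside links that have a specific class. I can select a single element of that class with .extract_first(), but I can't get all the elements of the same class. I'd like to select all of them and store them in a list. Here is my code: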
# -*- coding: utf-8 -*-
import scrapy
from scrapy_splash import SplashRequest


class MySpider(scrapy.Spider):
    name = "quotes4"
    start_urls = ["https://www.woolworths.com.au/shop/browse/drinks/cordials-juices-iced-teas/iced-teas"]

    def start_requests(self):
        for url in self.start_urls:
            yield SplashRequest(url=url, callback=self.parse)

    def parse(self, response):
        # I can select the first element of the class
        '''yield{
            'name': response.css(".shelfProductTile-descriptionLink::text").extract_first()
        }'''
        # But not all the elements of the same class
        a = response.css(".shelfProductTile-descriptionLink::text").extract()
        print('list length is : ' + str(len(a)))  # OUTPUT : 0
Am I doing something wrong? Thanks.
[Discussion]:
Tags: web-scraping scrapy scrapy-splash