Can't select elements with scrapy-splash: answers

[Question title]: Can't select elements with scrapy-splash
[Posted]: 2020-02-27 15:06:06
[Question]:

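I am using this code to extract the text from links with a specific class. I can select a single element of that class with .extract_first(), but I can't get all the elements of the same class. I'd like to select all of them and store them in a list. Here is my code: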

# -*- coding: utf-8 -*-
import scrapy
from scrapy_splash import SplashRequest

class MySpider(scrapy.Spider):
    name = "quotes4"

    start_urls = ["https://www.woolworths.com.au/shop/browse/drinks/cordials-juices-iced-teas/iced-teas"]

    def start_requests(self):
        for url in self.start_urls:
            yield SplashRequest(url=url, callback=self.parse)


    def parse(self, response):
        # I can select first element of class
        '''yield{ 
            'name': response.css(".shelfProductTile-descriptionLink::text").extract_first()
            }'''

        # But not all the elements of the same class
        a = response.css(".shelfProductTile-descriptionLink::text").extract()
        print('list length is: ' + str(len(a)))   # OUTPUT: 0

Am I doing something wrong? Thanks.

[Comments]:

docs.scrapy.org/en/latest/topics/dynamic-content.html

Tags: web-scraping scrapy scrapy-splash

[Solution 1]:

Do you need scrapy_splash for this? Your yield statement looks like regular Scrapy code, not scrapy_splash. If all you are scraping is HTML (not content rendered by JavaScript), you don't need scrapy_splash.
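
One way to tell the two cases apart is to check whether the class appears at all in the raw, unrendered page source. The following is only a sketch using the standard library: `ClassCounter`, `count_class`, and the two HTML snippets are hypothetical stand-ins, not the real Woolworths markup.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in `html` carry `class_name`."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical snippets standing in for the raw page source.
static_html = '<a class="shelfProductTile-descriptionLink" href="#">Iced Tea</a>'
js_only_html = '<div id="app"></div><script src="bundle.js"></script>'

print(count_class(static_html, "shelfProductTile-descriptionLink"))   # 1
print(count_class(js_only_html, "shelfProductTile-descriptionLink"))  # 0
```

If the count on the real raw source is zero, the elements are most likely inserted by JavaScript and a rendering step such as Splash is needed.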

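[Comments]:

The class I want to print is loaded dynamically by JS.
Are you waiting long enough?
Usually scrapy-splash needs the 'endpoint' and 'args' parameters of the SplashRequest function, which is why I said it looks like regular Scrapy code.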