What is the output difference between the selectors response.xpath and response.css?

【Question title】: What is the output difference between the selectors response.xpath and response.css?
【Posted】: 2018-01-08 05:13:00
【Question description】:

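I am working in Python and building a crawler with the Scrapy library. When I fetch data with the selectors response.xpath and response.css, I get different results: the XPath query returns nothing, but if I replace it with the CSS query, it returns results. Please help me understand why.
XPath query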

img = response.xpath('//div[@class="product-images"]//img/@src').extract()

CSS query

img = response.css('div.product-images img::attr(src)').extract()

Thanks.

【Question comments】:

Does the div.product-images element have more than one class? What does the HTML look like?
Yes, it has more than one class: <div class="product-images relative mb-half has-hover woocommerce-product-gallery woocommerce-product-gallery--with-images woocommerce-product-gallery--columns-4 images">

Tags: python xpath css-selectors scrapy

【Solution 1】:

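The XPath predicate [@class="product-images"] performs an exact match on the value of the class attribute, so it matches only elements whose attribute is exactly class="product-images". An element with multiple classes is not matched by this predicate. A CSS class selector, on the other hand, matches any element that has the specified class name, even if the element also has other classes.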
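The exact-match behaviour can be reproduced without Scrapy. The following is a minimal sketch using the standard library's xml.etree.ElementTree, whose limited XPath support also compares the whole attribute value. The markup is a made-up, shortened version of the div from the question, and the image file names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened markup: one div with several classes,
# one div whose class attribute is exactly "product-images".
doc = ET.fromstring(
    '<body>'
    '<div class="product-images relative images"><img src="a.jpg"/></div>'
    '<div class="product-images"><img src="b.jpg"/></div>'
    '</body>'
)

# [@class='product-images'] compares the entire attribute string,
# so the div with several classes is skipped.
exact = [
    img.get("src")
    for div in doc.findall(".//div[@class='product-images']")
    for img in div.iter("img")
]

# A class-style match treats the attribute as a space-separated token list,
# which is how a CSS class selector behaves.
by_token = [
    img.get("src")
    for div in doc.iter("div")
    if "product-images" in div.get("class", "").split()
    for img in div.iter("img")
]

print(exact)
print(by_token)
```

Only the second list includes the image inside the multi-class div, which corresponds to the situation in the question where the CSS query returned results and the XPath query did not.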

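The XPath equivalent of a class selector, one that accounts for multiple classes, is rather cumbersome, because XPath has no function designed for this specific purpose: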

img = response.xpath('//div[contains(concat(" ", @class, " "), " product-images ")]//img/@src').extract()
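Why the class value is padded with spaces can be seen from the string logic alone. The sketch below mirrors three predicates in plain Python; the function names and the "product-images-thumb" class are hypothetical, chosen only for illustration, and none of this is part of Scrapy.

```python
def exact_match(class_attr, name):
    # Mirrors [@class="name"]: the whole attribute value must equal the name.
    return class_attr == name


def naive_contains(class_attr, name):
    # Mirrors [contains(@class, "name")]: a plain substring test, which also
    # matches unrelated classes that merely start with or include the name.
    return name in class_attr


def token_match(class_attr, name):
    # Mirrors [contains(concat(" ", @class, " "), " name ")]: padding both
    # strings with spaces means the name only matches as a whole token.
    return f" {name} " in f" {class_attr} "


multi = "product-images relative mb-half images"
lookalike = "product-images-thumb relative"  # hypothetical class name

print(exact_match(multi, "product-images"))
print(naive_contains(lookalike, "product-images"))
print(token_match(multi, "product-images"))
print(token_match(lookalike, "product-images"))
```

The padded form accepts the multi-class value and rejects the lookalike, whereas the exact match rejects both and the plain substring test accepts both.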

【Comments】: