【Posted】: 2018-07-21 11:23:30
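【Question】:
Is it possible to use a for loop to search for the text of a tag that corresponds to a certain phrase? I have been trying to write this loop, but it has not been working. Any help is appreciated! Here is my code: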
def parse_page(self, response):
    titles2 = response.xpath('//div[@id = "mainColumn"]/h1/text()').extract_first()
    year = response.xpath('//div[@id = "mainColumn"]/h1/span/text()').extract()[0].strip()
    aud = response.xpath('//div[@id="scorePanel"]/div[2]')
    a_score = aud.xpath('./div[1]/a/div/div[2]/div[1]/span/text()').extract()
    a_count = aud.xpath('./div[2]/div[2]/text()').extract()
    c_score = response.xpath('//a[@id = "tomato_meter_link"]/span/span[1]/text()').extract()[0].strip()
    c_count = response.xpath('//div[@id = "scoreStats"]/div[3]/span[2]/text()').extract()[0].strip()
    info = response.xpath('//div[@class="panel-body content_body"]/ul')
    mp_rating = info.xpath('./li[1]/div[2]/text()').extract()[0].strip()
    genre = info.xpath('./li[2]/div[2]/a/text()').extract_first()
    date = info.xpath('./li[5]/div[2]/time/text()').extract_first()
    box = response.xpath('//section[@class = "panel panel-rt panel-box "]/div')
    actor1 = box.xpath('./div/div[1]/div/a/span/text()').extract()
    actor2 = box.xpath('./div/div[2]/div/a/span/text()').extract()
    actor3 = box.xpath('./div/div[3]/div/a/span/text()').extract_first()
    for x in info.xpath('//li'):
        if info.xpath("./li[x]/div[1][contains(text(), 'Box Office: ')]/text()"):
            box_office = info.xpath('./li[x]/div[2]/text()')
        elif info.xpath('./li[x]/div[1]/text()').extract()[0] == "Runtime: ":
            runtime = info.xpath('./li[x]/div[2]/time/text()')
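For reference, here is a minimal sketch of the kind of loop being attempted. Two things in the loop above likely get in the way: `x` inside the quoted XPath string is just the literal letter x (Python does not substitute the loop variable into a string), and `//li` searches the whole document rather than only the `li` children of `info`. The sketch sidesteps both by iterating over the `li` elements themselves and comparing each label's text. It uses the standard library's `xml.etree.ElementTree` as a stand-in so it runs without Scrapy; the sample markup, the `parse_info` helper, and the `wanted` mapping are all hypothetical and not taken from the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for the movie-info list.
# The structure of the real page is an assumption.
SAMPLE = """
<ul>
  <li><div>Rating: </div><div>PG-13</div></li>
  <li><div>Box Office: </div><div>$123,456,789</div></li>
  <li><div>Runtime: </div><div><time>120 minutes</time></div></li>
</ul>
"""


def parse_info(ul):
    """Map each wanted label to the text of the cell next to it."""
    # Label text (after stripping whitespace) -> name of the field to fill.
    wanted = {"Box Office:": "box_office", "Runtime:": "runtime"}
    result = {}
    # Iterate over the <li> elements directly instead of indexing by position.
    for li in ul.findall("./li"):
        cells = li.findall("./div")
        if len(cells) < 2:
            continue  # skip rows that lack a label/value pair
        label = (cells[0].text or "").strip()
        if label in wanted:
            # itertext() also picks up text nested in child tags such as <time>.
            result[wanted[label]] = "".join(cells[1].itertext()).strip()
    return result


info = parse_info(ET.fromstring(SAMPLE.strip()))
print(info)
```

With a Scrapy selector the same shape should carry over: loop with `for li in info.xpath('./li'):` and read the label with `li.xpath('./div[1]/text()').extract_first()`, so each relative XPath is evaluated against the current `li` rather than against `info`.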
【Comments】:
-
Yes. But what is your actual question? What have you tried? What is your input, and what is the expected result?
Tags: python html xpath scrapy tags