xpath issue in extracting data [duplicate]: answers

【Question title】: xpath issue in extracting data [duplicate]
【Posted】: 2019-11-07 23:00:06
【Question description】:

I have to write an XPath to extract the Published value from this markup:

<div class="lpadding20" style="font-weight: normal;">
<strong>Published: </strong>6/11/2019 at 8:02 AM.
This list includes 414 eligible players.
</div>

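【Question comments】:

The question is unclear. Can you provide a code snippet, the input, and the output you expect?
@asaika Please provide actual code rather than pseudocode, otherwise it is hard to understand what you want. You can edit it into your question.

Tags: python regex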

【Solution 1】:

You can also use the split() function to accomplish your task:

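text = 'published = 6/11/2019 at 8:02 AM'  # avoid naming a variable "str"; it shadows the built-in
# Drop the "published" label, then split the value into date and time.
value = text.split('=')[1]
date, time = value.split(' at ')
print('published date =', date.strip(), '\npublished time =', time.strip())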

You will get the same result.

【Comments】:

@asaika stackoverflow.com/help/someone-answers
I get this error: AttributeError: 'list' object has no attribute 'split'
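
This AttributeError usually means that .split() was called on a list rather than on a string. An XPath text() query typically returns a list of text nodes, so each element has to be split on its own. A minimal sketch, assuming the input is a list of strings in the "published = ..." form used above (the texts list and the split_published helper are hypothetical names, not from the original post):

```python
def split_published(text):
    """Split 'published = <date> at <time>' into (date, time)."""
    # partition() splits at the first separator only and never raises
    # if the separator is missing.
    _, _, value = text.partition('=')
    date, _, time = value.partition(' at ')
    return date.strip(), time.strip()


# Hypothetical stand-in for the list an XPath query would return.
texts = ['published = 6/11/2019 at 8:02 AM', 'published = 7/1/2019 at 11:45 PM']

for text in texts:
    date, time = split_published(text)
    print('published date =', date)
    print('published time =', time)
```

Splitting on ' at ' with surrounding spaces, rather than on 'at', avoids cutting a value in the middle of a word that happens to contain those two letters.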

【Solution 2】:

This simple expression might work here:

published\s*=\s*(.+?)\s*at\s*(.+)\s*

The expression is explained in this demo, if you are interested.

Test

import re

regex = r"published\s*=\s*(.+?)\s*at\s*(.+)\s*"

test_str = "published = 6/11/2019 at 8:02 AM"

subst = "published date = \\1\\npublished time = \\2"

# count=0 replaces every match; change it to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

【Comments】:

Works great, now I need to run it in a loop:
str = player.xpath('//div[@class="lpadding20"]/text()')
#str = 'raw_pubished'
for str in player:
    regex = r"published\s*=\s*(.+?)\s*at\s*(.+)\s*"
    #test_str = "published = 6/11/2019 at 8:02 AM"
    subst = "published date = \\1\\npublished time = \\2"
    # You can manually specify the number of replacements by changing the 4th argument
    result = re.sub(regex, subst, 0, re.MULTILINE)
    if result:
        print (result)
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
I am facing this error. The regex should be applied to every item returned by the xpath.
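
The TypeError most likely comes from the call re.sub(regex, subst, 0, re.MULTILINE): the input string is missing, so 0 lands in the position where the string belongs. Inside the loop, the current item has to be passed as the third argument. Note also that the div in the question holds "6/11/2019 at 8:02 AM." after the <strong> tag rather than "published = ...", so a pattern that matches the date and time directly may be more robust. A minimal sketch, assuming the XPath query yields a list of plain strings like the hypothetical nodes list below (the parse_published helper and the date/time pattern are illustrative, not from the original answer; with some libraries the results may first need converting to strings):

```python
import re

# Assumed shape: a date like 6/11/2019, the word "at", then a 12-hour time.
PUBLISHED_RE = re.compile(r"(\d{1,2}/\d{1,2}/\d{4})\s+at\s+(\d{1,2}:\d{2}\s*[AP]M)")


def parse_published(text):
    """Return (date, time) if text contains a published stamp, else None."""
    match = PUBLISHED_RE.search(text)
    return match.groups() if match else None


# Hypothetical stand-in for player.xpath('//div[@class="lpadding20"]/text()').
nodes = ["\n", "6/11/2019 at 8:02 AM.\nThis list includes 414 eligible players.\n"]

for node in nodes:
    parsed = parse_published(node)
    if parsed:  # skip whitespace-only nodes and anything without a stamp
        date, time = parsed
        print("published date =", date)
        print("published time =", time)
```

Using re.search with capture groups instead of re.sub returns the two values separately, which is usually easier to work with inside a loop than a reformatted string.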