Access text of next sibling

【Question title】: Access text of next sibling
【Posted】: 2016-02-03 10:32:48
【Question】:

This is part of a Jenkins XML file.

I want to extract the defaultValue of project_name with XPath.

In this case, the value is *****.

<?xml version='1.0' encoding='UTF-8'?>
<project>
    <properties>
        <hudson.model.ParametersDefinitionProperty>
            <parameterDefinitions>
                <hudson.model.StringParameterDefinition>
                    <name>customer_name</name>
                    <description></description>
                    <defaultValue>my_customer</defaultValue>
                </hudson.model.StringParameterDefinition>
                <hudson.model.StringParameterDefinition>
                    <name>project_name</name>
                    <description></description>
                    <defaultValue>*****</defaultValue>
                </hudson.model.StringParameterDefinition>
            </parameterDefinitions>
        </hudson.model.ParametersDefinitionProperty>
    </properties>
 </project>

I use Python's etree, but AFAIK that does not matter, since this is an XPath question.

My XPath knowledge is limited at the moment. My current approach:

for name_tag in config.findall('.//name'):
    if name_tag.text=='project_host':
        default=name_tag.getparent().findall('defaultValue')[0].text

Here I get AttributeError: 'Element' object has no attribute 'getparent'

Thinking about it again, I believe looping in Python is the wrong approach. This should be selectable with XPath.
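
For reference, the AttributeError arises because elements from the standard library's xml.etree.ElementTree keep no reference to their parent; getparent() is an lxml method. A minimal sketch of a loop that works with the standard library, assuming the parameter should be looked up by the name project_name, iterates over the parameter elements instead of climbing up from name. The names CONFIG_XML and param are illustrative only.

```python
import xml.etree.ElementTree as ET

# Same structure as the Jenkins snippet above (assumed to have no namespaces).
CONFIG_XML = """<project>
    <properties>
        <hudson.model.ParametersDefinitionProperty>
            <parameterDefinitions>
                <hudson.model.StringParameterDefinition>
                    <name>customer_name</name>
                    <description></description>
                    <defaultValue>my_customer</defaultValue>
                </hudson.model.StringParameterDefinition>
                <hudson.model.StringParameterDefinition>
                    <name>project_name</name>
                    <description></description>
                    <defaultValue>*****</defaultValue>
                </hudson.model.StringParameterDefinition>
            </parameterDefinitions>
        </hudson.model.ParametersDefinitionProperty>
    </properties>
</project>"""

config = ET.fromstring(CONFIG_XML)

default = None
# Visit each parameter definition; both <name> and <defaultValue> are its
# direct children, so no parent lookup is needed.
for param in config.iter("hudson.model.StringParameterDefinition"):
    if param.findtext("name") == "project_name":
        default = param.findtext("defaultValue")
        break

print(default)
```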

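【Comments】:

Please also show your Python code and explain why your approach does not work as expected. Thanks. More help: stackoverflow.com/help/mcve.
etree supports only a limited subset of XPath 1.0: 19.7.2.2. Supported XPath syntax. If you want to use XPath extensively, use lxml.
@MathiasMüller I added my current solution.
@har07 I can install lxml, no problem. But so far I don't know the XPath magic itself.
In XPath, you can use a predicate expression (an expression enclosed in []) to filter the context elements, i.e. the elements before the [], by a specific condition, as the answer below shows.

Tags: python xml xpath jenkins elementtree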

【Solution 1】:

The XPath answer to your question is

/project/properties/hudson.model.ParametersDefinitionProperty/parameterDefinitions/hudson.model.StringParameterDefinition[name = 'project_name']/defaultValue/text()

which will select, as its only result,

*****

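given that your actual document has no namespaces. You do not need to access the parent element or a sibling axis.

Even etree should support this kind of XPath expression, but it might not; see the comment by har07.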
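
As a rough check of that point, here is a minimal sketch using only the standard library's xml.etree.ElementTree. It relies on the documented [tag='text'] predicate and uses a relative path starting with .// because ElementTree does not accept absolute paths on an element. The XML is abbreviated to the relevant structure, and the names XML_TEXT and elem are illustrative.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<project>
    <properties>
        <hudson.model.ParametersDefinitionProperty>
            <parameterDefinitions>
                <hudson.model.StringParameterDefinition>
                    <name>customer_name</name>
                    <defaultValue>my_customer</defaultValue>
                </hudson.model.StringParameterDefinition>
                <hudson.model.StringParameterDefinition>
                    <name>project_name</name>
                    <defaultValue>*****</defaultValue>
                </hudson.model.StringParameterDefinition>
            </parameterDefinitions>
        </hudson.model.ParametersDefinitionProperty>
    </properties>
</project>"""

root = ET.fromstring(XML_TEXT)

# [name='project_name'] keeps only the parameter whose <name> child has
# exactly that text; the trailing step selects its <defaultValue> child.
elem = root.find(
    ".//hudson.model.StringParameterDefinition[name='project_name']/defaultValue"
)

# ElementTree has no text() step, so the text is read from the element.
print(elem.text)  # *****
```

The trailing /text() step of the full expression is not part of ElementTree's supported subset, which is why the sketch reads elem.text instead.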

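"Thinking about it again, I believe looping in Python is the wrong approach. This should be selectable with XPath."

Yes, I agree. If you want to select a single value from a document, select it with an XPath expression and store it directly as a Python string, without looping over elements.

Complete example using lxml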

from io import StringIO

from lxml import etree

document_string = """<project>
    <properties>
        <hudson.model.ParametersDefinitionProperty>
            <parameterDefinitions>
                <hudson.model.StringParameterDefinition>
                    <name>customer_name</name>
                    <description></description>
                    <defaultValue>my_customer</defaultValue>
                </hudson.model.StringParameterDefinition>
                <hudson.model.StringParameterDefinition>
                    <name>project_name</name>
                    <description></description>
                    <defaultValue>*****</defaultValue>
                </hudson.model.StringParameterDefinition>
            </parameterDefinitions>
        </hudson.model.ParametersDefinitionProperty>
    </properties>
 </project>"""

# Parse the XML text into an lxml element tree
tree = etree.parse(StringIO(document_string))

# The predicate [name = 'project_name'] keeps only the parameter whose <name>
# child has that text; text() then selects the text node of its defaultValue
result_list = tree.xpath("/project/properties/hudson.model.ParametersDefinitionProperty/parameterDefinitions/hudson.model.StringParameterDefinition[name = 'project_name']/defaultValue/text()")

# lxml returns the selected text nodes as a list; here it has one entry
print(result_list[0])

Output:

*****

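【Comments】:

@ Mathias Muller If your XPath returns a Python list, how would you construct text from it? Could you show that? Correct me if I am wrong, since I use Selenium etc. for web testing.
@SIslam In this case, the return value of an XPath expression depends on the Python function you are calling. For example, an lxml function might return a list. But it is not accurate to say that my XPath expression returns a Python list. If the result is indeed a list, extracting a string from a list of strings is trivial in Python. Finally, what do Selenium tests have to do with this question?
Ah! I practice Selenium (Python), so I need the right path to follow. I mean that lxml returns a Python literal (i.e. a list). By the way, I see that you used text() in your XPath and said not to use a loop, so can you tell me how to get the text from the lxml objects that your XPath returns for multiple nodes? Finally, I am eager to see a working example.
I just saw your edit. If your XPath selects multiple similar nodes (which can be extracted with the same XPath), can you show how to handle that case without looping, e.g. multiple defaultValue elements after multiple project_name elements?
@SIslam As far as I can tell, you are not the person who asked the question above. If you have a new question, please ask it separately (i.e. open a new post). If there are multiple results, the list is simply extended with more values. (I might really not understand what you are asking, though.)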
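
To make the exchange above concrete, here is a minimal sketch, assuming lxml is installed, of what lxml's xpath() hands back in each case. An expression ending in text() yields a list with one string per matching node, while wrapping the path in the XPath string() function yields a single string. The XML is abbreviated, and the names XML_TEXT, texts and single are illustrative.

```python
from lxml import etree

XML_TEXT = """<project>
    <properties>
        <hudson.model.ParametersDefinitionProperty>
            <parameterDefinitions>
                <hudson.model.StringParameterDefinition>
                    <name>customer_name</name>
                    <defaultValue>my_customer</defaultValue>
                </hudson.model.StringParameterDefinition>
                <hudson.model.StringParameterDefinition>
                    <name>project_name</name>
                    <defaultValue>*****</defaultValue>
                </hudson.model.StringParameterDefinition>
            </parameterDefinitions>
        </hudson.model.ParametersDefinitionProperty>
    </properties>
</project>"""

root = etree.fromstring(XML_TEXT)

# Several nodes match, so the list simply has more entries; no explicit
# Python loop is needed to collect the strings.
texts = root.xpath(".//hudson.model.StringParameterDefinition/defaultValue/text()")
print(texts)  # ['my_customer', '*****']

# string(...) converts the first selected node to its string value, so
# lxml returns one string rather than a list.
single = root.xpath(
    "string(.//hudson.model.StringParameterDefinition[name='project_name']/defaultValue)"
)
print(single)  # *****
```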

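【Solution 2】:

You can try lxml.etree as shown below. I used a loop to select all nodes at the same position.

An example of the required XPath follows. I used a relative XPath, since it is very useful when the node path is long.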

.//hudson.model.StringParameterDefinition/name[contains(text(),'project_name')]/following-sibling::defaultValue

or

.//hudson.model.StringParameterDefinition/name[contains(text(),'project_name')]/following::defaultValue[1]

from lxml import etree as et

data  = """<?xml version='1.0' encoding='UTF-8'?>
<project>
    <properties>
        <hudson.model.ParametersDefinitionProperty>
            <parameterDefinitions>
                <hudson.model.StringParameterDefinition>
                    <name>customer_name</name>
                    <description></description>
                    <defaultValue>my_customer</defaultValue>
                </hudson.model.StringParameterDefinition>
                <hudson.model.StringParameterDefinition>
                    <name>project_name</name>
                    <description></description>
                    <defaultValue>*****</defaultValue>
                </hudson.model.StringParameterDefinition>
            </parameterDefinitions>
        </hudson.model.ParametersDefinitionProperty>
    </properties>
 </project>"""

# Pass bytes: under Python 3, lxml rejects str input that carries an encoding declaration
tree = et.fromstring(data.encode("utf-8"))

# All default values, then the project_name default via each of the two axes
print([i.text for i in tree.xpath(".//hudson.model.StringParameterDefinition/defaultValue")])
print([i.text for i in tree.xpath(".//hudson.model.StringParameterDefinition/name[contains(text(),'project_name')]/following-sibling::defaultValue")])
print([i.text for i in tree.xpath(".//hudson.model.StringParameterDefinition/name[contains(text(),'project_name')]/following::defaultValue[1]")])

Output:

['my_customer', '*****']
['*****']
['*****']
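
One caveat with contains() is that it matches substrings, so it would also pick up any parameter whose name merely includes project_name. The sketch below, assuming lxml is installed, adds a hypothetical third parameter named project_name_suffix (not part of the original file) to show the difference from an exact comparison. The names XML_TEXT, loose and exact are illustrative.

```python
from lxml import etree

# The third parameter is hypothetical and exists only to show the difference.
XML_TEXT = """<project>
    <properties>
        <hudson.model.ParametersDefinitionProperty>
            <parameterDefinitions>
                <hudson.model.StringParameterDefinition>
                    <name>customer_name</name>
                    <defaultValue>my_customer</defaultValue>
                </hudson.model.StringParameterDefinition>
                <hudson.model.StringParameterDefinition>
                    <name>project_name</name>
                    <defaultValue>*****</defaultValue>
                </hudson.model.StringParameterDefinition>
                <hudson.model.StringParameterDefinition>
                    <name>project_name_suffix</name>
                    <defaultValue>other</defaultValue>
                </hudson.model.StringParameterDefinition>
            </parameterDefinitions>
        </hudson.model.ParametersDefinitionProperty>
    </properties>
</project>"""

root = etree.fromstring(XML_TEXT)

# contains() is a substring test, so both project_name and
# project_name_suffix satisfy it.
loose = [e.text for e in root.xpath(
    ".//hudson.model.StringParameterDefinition"
    "/name[contains(text(),'project_name')]/following-sibling::defaultValue"
)]
print(loose)  # ['*****', 'other']

# An equality predicate on the parent matches the exact name only.
exact = [e.text for e in root.xpath(
    ".//hudson.model.StringParameterDefinition[name='project_name']/defaultValue"
)]
print(exact)  # ['*****']
```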

【Comments】: