Extract a specific tag from an XML file using Beautiful Soup in Python (answers)

[Question title]: Extract a specific tag from an XML file using Beautiful Soup in Python
[Posted]: 2020-06-02 13:26:30
[Question description]:

I have an XML file that looks like this (let's call it abc.xml).

<?xml version="1.0" encoding="UTF-8"?>

<properties>
  <product name="XYZ" version="123"/>
  <application-links>
    <application-links>
      <id>111111111111111</id>
      <name>Link_1</name>
      <primary>true</primary>
      <type>applinks.ABC</type>
      <display-url>http://ABC.displayURL</display-url>
      <rpc-url>http://ABC.displayURL</rpc-url>
    </application-links>
  </application-links>
</properties>

My Python code looks like this:

f = open ('file.xml', 'r')
from bs4 import BeautifulSoup
soup = BeautifulSoup(f,'lxml')

print(soup.product)

for applinks in soup.application-links:
    print(applinks)

It prints the following:

<product name="XYZ" version="123"></product>
Traceback (most recent call last):
  File "parse.py", line 7, in <module>
    for applinks in soup.application-links:
NameError: name 'links' is not defined

Could you please help me understand how to print the contents of tags whose names contain a dash/hyphen "-"?

[Question comments]:

Tags: python xml parsing

[Solution 1]:

I don't know whether BeautifulSoup is the best choice here, but I would really recommend using Python's ElementTree module, like this:

>>> import xml.etree.ElementTree as ET
>>> root = ET.parse('file.xml').getroot()
>>> for app in root.findall('*/application-links/*'):
...     print(app.text)
111111111111111
Link_1
true
applinks.ABC
http://ABC.displayURL
http://ABC.displayURL
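
As a self-contained variation on the same idea, here is a minimal sketch that parses the sample from an inline string instead of from file.xml, and collects each inner application-links element into a dict keyed by child tag name. The names SAMPLE_XML and read_links are made up for illustration, and the XML declaration is left out of the inline string for simplicity.

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample document from the question (hypothetical constant name).
SAMPLE_XML = """<properties>
  <product name="XYZ" version="123"/>
  <application-links>
    <application-links>
      <id>111111111111111</id>
      <name>Link_1</name>
      <primary>true</primary>
      <type>applinks.ABC</type>
      <display-url>http://ABC.displayURL</display-url>
      <rpc-url>http://ABC.displayURL</rpc-url>
    </application-links>
  </application-links>
</properties>"""


def read_links(xml_text):
    """Return one dict per inner <application-links> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    links = []
    # Tag names are plain strings in the path, so the hyphen needs no special handling.
    for link in root.findall("application-links/application-links"):
        links.append({child.tag: child.text for child in link})
    return links


links = read_links(SAMPLE_XML)
# Attributes of <product> are available through .attrib.
product = dict(ET.fromstring(SAMPLE_XML).find("product").attrib)

print(links[0]["display-url"])
print(product)
```

Keeping the values in a dict means hyphenated names such as display-url are looked up as ordinary string keys.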

So, to print the value inside the <name> tag, you can do this:

>>> for app in root.findall('*/application-links/name'):
...     print(app.text)
Link_1
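
If you would rather stay with BeautifulSoup, note where the traceback in the question comes from: Python reads soup.application-links as the subtraction soup.application - links, hence the NameError about links. Passing the tag name to find() as a string avoids this. The following is only a sketch under a few assumptions: bs4 is installed, the built-in "html.parser" is used in place of the 'lxml' parser from the question so that lxml is not required, and the sample is inlined under the made-up name SAMPLE_XML without its XML declaration.

```python
from bs4 import BeautifulSoup

# Inline copy of the sample document from the question (hypothetical constant name).
SAMPLE_XML = """<properties>
  <product name="XYZ" version="123"/>
  <application-links>
    <application-links>
      <id>111111111111111</id>
      <name>Link_1</name>
      <primary>true</primary>
      <type>applinks.ABC</type>
      <display-url>http://ABC.displayURL</display-url>
      <rpc-url>http://ABC.displayURL</rpc-url>
    </application-links>
  </application-links>
</properties>"""

soup = BeautifulSoup(SAMPLE_XML, "html.parser")

# Attribute-style access (soup.product) cannot spell a hyphen,
# so the tag name is passed to find() as a string instead.
outer = soup.find("application-links")
inner = outer.find("application-links")

# On a bs4 Tag, .name is the tag's own name ("application-links"),
# so the <name> child also has to be fetched with find().
link_name = inner.find("name").get_text()
link_url = inner.find("display-url").get_text()

print(link_name, link_url)
```

The same find() calls should work with the 'lxml' parser used in the question, since the lookups do not depend on where the tags sit in the tree.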

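[Comments]:

Let me give it a try, and I'll let you know how I get on.
I've never really used ElementTree before, so I need to read up on it. I found this tutorial, datacamp.com/community/tutorials/python-xml-elementtree, and will read through it and then come back to your code here. For now, I'm trying to print the name attribute from application-links/application-links.
Never mind... I've edited the answer to print only the name.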