【Question title】: Index error out of range while using xpath
【Posted】: 2021-02-03 14:04:10
【Question】:

I get the error "list index out of range" on the line key = element.xpath('./@ID')[0] when running this code:

from lxml import etree
from urllib.request import urlopen


url = urlopen('https://raw.githubusercontent.com/ArthurK-GH/Instances/main/sprint01.xml')
doc = etree.parse(url)
root = doc.getroot()

DictShiftType= {}
DictDay = {}

pat = root.xpath('//Pattern')
for element in pat:
    key = element.xpath('./@ID')[0]
    shift = element.xpath('.//ShiftType/text()')
    day = element.xpath('.//Day/text()')
    DictShiftType[key]=shift
    DictDay[key]=day

However, if I copy and paste the part of the XML document I am interested in, read it with fromstring, and run the same lines of code, it works as expected:

patterns = """
  <Patterns>
    <Pattern ID="0" weight="1">
      <PatternEntries>
        <PatternEntry index="0">
          <ShiftType>L</ShiftType>
          <Day>Any</Day>
        </PatternEntry>
        <PatternEntry index="1">
          <ShiftType>D</ShiftType>
          <Day>Any</Day>
        </PatternEntry>
      </PatternEntries>
    </Pattern>
    <Pattern ID="1" weight="1">
      <PatternEntries>
        <PatternEntry index="0">
          <ShiftType>D</ShiftType>
          <Day>Any</Day>
        </PatternEntry>
        <PatternEntry index="1">
          <ShiftType>E</ShiftType>
          <Day>Any</Day>
        </PatternEntry>
        <PatternEntry index="2">
          <ShiftType>D</ShiftType>
          <Day>Any</Day>
        </PatternEntry>
      </PatternEntries>
    </Pattern>
    <Pattern ID="2" weight="1">
      <PatternEntries>
        <PatternEntry index="0">
          <ShiftType>None</ShiftType>
          <Day>Friday</Day>
        </PatternEntry>
        <PatternEntry index="1">
          <ShiftType>Any</ShiftType>
          <Day>Saturday</Day>
        </PatternEntry>
        <PatternEntry index="2">
          <ShiftType>Any</ShiftType>
          <Day>Sunday</Day>
        </PatternEntry>
      </PatternEntries>
    </Pattern>
  </Patterns>
"""

doc = etree.fromstring(patterns)
DictShiftType= {}
DictDay = {}

pat = doc.xpath('//Pattern')
for element in pat:
    key = element.xpath('./@ID')[0]
    shift = element.xpath('.//ShiftType/text()')
    day = element.xpath('.//Day/text()')
    DictShiftType[key]=shift
    DictDay[key]=day
DictDay 

Output:

{'0': ['Any', 'Any'],
 '1': ['Any', 'Any', 'Any'],
 '2': ['Friday', 'Saturday', 'Sunday']}

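Because I have to iterate over several data sets, I cannot copy and paste the XML document into my code. I tried using fromstring instead of parse on the downloaded content, but that did not work. Can you help me find where my mistake is? Thanks.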

【Comments】:

    Tags: python xml parsing xpath lxml


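    【Solution 1】:

    If you inspect the XML structure, you will find other elements named Pattern that have no ID attribute, unlike <Pattern ID="0" weight="1">. For those elements, element.xpath('./@ID') returns an empty list, so indexing it with [0] raises the IndexError. For example: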

    <Pattern>0</Pattern>
    <Pattern>1</Pattern>
    <Pattern>2</Pattern>
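
A minimal sketch of why the lookup fails, assuming a trimmed-down document. The `Wrapper` and `Refs` element names are hypothetical and used only for illustration; only the two kinds of `Pattern` element come from the question. For a `Pattern` without an `ID` attribute, `./@ID` yields an empty list, and `[0]` on that list is what raises the error.

```python
from lxml import etree

# Hypothetical sample: <Wrapper> and <Refs> are made-up names.
# It contains one <Pattern> without an ID and one with an ID.
xml = b"""
<Wrapper>
  <Refs>
    <Pattern>0</Pattern>
  </Refs>
  <Patterns>
    <Pattern ID="0" weight="1"/>
  </Patterns>
</Wrapper>
"""
root = etree.fromstring(xml)

# '//Pattern' matches both elements, in document order.
id_lists = [el.xpath('./@ID') for el in root.xpath('//Pattern')]
print(id_lists)  # [[], ['0']]

# Indexing the empty list reproduces the error from the question.
try:
    id_lists[0][0]
except IndexError as exc:
    print('IndexError:', exc)
```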
    

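    If you simply change the pat XPath to pat = root.xpath('//Patterns/Pattern'), so that only the Pattern elements directly under Patterns are selected, your code will work: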

    from lxml import etree
    from urllib.request import urlopen
    
    
    url = urlopen('https://raw.githubusercontent.com/ArthurK-GH/Instances/main/sprint01.xml')
    doc = etree.parse(url)
    root = doc.getroot()
    
    DictShiftType= {}
    DictDay = {}
    
    pat = root.xpath('//Patterns/Pattern')
    for element in pat:
        key = element.xpath('./@ID')[0]
        shift = element.xpath('.//ShiftType/text()')
        day = element.xpath('.//Day/text()')
        DictShiftType[key]=shift
        DictDay[key]=day
    

    Output:

    {'0': ['Any', 'Any'],
     '1': ['Any', 'Any', 'Any'],
     '2': ['Friday', 'Saturday', 'Sunday']}
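
As a possible alternative to narrowing the path, an XPath predicate can select only the `Pattern` elements that actually carry an `ID` attribute. This is a sketch on a hypothetical inline sample (the `Wrapper` and `Refs` names are made up), not the answer's own method. The `shift_by_id` name is also introduced here for illustration.

```python
from lxml import etree

# Hypothetical sample: <Wrapper> and <Refs> are made-up names.
xml = b"""
<Wrapper>
  <Refs>
    <Pattern>0</Pattern>
  </Refs>
  <Patterns>
    <Pattern ID="0" weight="1">
      <PatternEntries>
        <PatternEntry index="0"><ShiftType>L</ShiftType><Day>Any</Day></PatternEntry>
        <PatternEntry index="1"><ShiftType>D</ShiftType><Day>Any</Day></PatternEntry>
      </PatternEntries>
    </Pattern>
  </Patterns>
</Wrapper>
"""
root = etree.fromstring(xml)

shift_by_id = {}
# The [@ID] predicate skips every <Pattern> that has no ID attribute.
for el in root.xpath('//Pattern[@ID]'):
    shift_by_id[el.get('ID')] = el.xpath('.//ShiftType/text()')

print(shift_by_id)  # {'0': ['L', 'D']}
```

The predicate does not depend on the name of the parent element, which may help if the surrounding structure differs between data sets.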
    

    【Discussion】:
