【Question title】: AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'children'
【Posted】: 2020-06-17 15:18:43
【Question description】:

I have this XML file:

<population>
    <person id="101">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >53</attribute>
        </attributes>
        <plan score="-0.38" selected="yes">
            <activity type="outside" link="81312" facility="outside_208" x="649324.9906891582" y="6866581.699995641" end_time="08:22:00" >
            </activity>
            <leg mode="car" dep_time="08:22:00" trav_time="00:10:13">
                <route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
            </leg>
            <activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="car" dep_time="17:15:22" trav_time="00:07:05">
                <route type="links" start_link="138852" end_link="189898" trav_time="00:07:05" distance="4604.544053407517" vehicleRefId="7262234">138852</route>
            </leg>
            <activity type="outside" link="189898" facility="outside_249" x="648729.9598002436" y="6866057.250182923" end_time="17:20:35" >
            </activity>
        </plan>
        <plan score="-0.38" selected="no">
            <activity type="inside" link="81312" facility="outside_208" x="649324.9906891582" y="6866581.699995641" end_time="08:22:00" >
            </activity>
            <leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
                <route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
            </leg>
            <activity type="shopping" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
                <route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
            </leg>
            <activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="pt" dep_time="17:15:22" trav_time="00:07:05">
                <route type="links" start_link="138852" end_link="189898" trav_time="00:07:05" distance="4604.544053407517" vehicleRefId="7262234">138852</route>
            </leg>
            <activity type="outside" link="189898" facility="outside_249" x="648729.9598002436" y="6866057.250182923" end_time="17:20:35" >
            </activity>
        </plan>
    </person>
    <person id="102">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >53</attribute>
        </attributes>
        <plan score="-0.38" selected="yes">
            <activity type="inside" link="81312" facility="outside_208" x="649324.9906891582" y="6866581.699995641" end_time="08:22:00" >
            </activity>
            <leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
                <route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
            </leg>
            <activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
                <route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
            </leg>
            <activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="pt" dep_time="17:15:22" trav_time="00:07:05">
                <route type="links" start_link="138852" end_link="189898" trav_time="00:07:05" distance="4604.544053407517" vehicleRefId="7262234">138852</route>
            </leg>
            <activity type="outside" link="189898" facility="outside_249" x="648729.9598002436" y="6866057.250182923" end_time="17:20:35" >
            </activity>
        </plan>
    </person>
    <person id="103">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >53</attribute>
        </attributes>
        <plan score="-0.38" selected="yes">
            <activity type="inside" link="81312" facility="outside_208" x="649324.9906891582" y="6866581.699995641" end_time="08:22:00" >
            </activity>
            <leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
                <route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
            </leg>
            <activity type="shopping" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
                <route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
            </leg>
            <activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="pt" dep_time="17:15:22" trav_time="00:07:05">
                <route type="links" start_link="138852" end_link="189898" trav_time="00:07:05" distance="4604.544053407517" vehicleRefId="7262234">138852</route>
            </leg>
            <activity type="outside" link="189898" facility="outside_249" x="648729.9598002436" y="6866057.250182923" end_time="17:20:35" >
            </activity>
        </plan>
    </person>
</population>

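My goal is to create a dataframe with three columns: activity type, leg mode, and route distance. They should be populated by the code below.

I tried this with the following code, but I get the error message shown after it: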

import gzip
import xml.etree.ElementTree as ET
import pandas as pd


data = gzip.open('file.xml.gz', 'r')

root = ET.parse(data).getroot()

from collections import defaultdict
d = defaultdict(list)

for ent in root.findall('./person/plan[@selected="yes"]'):
    if ent.name == 'activity':
        d['type'].append(ent.get('type'))
    elif ent.name == 'leg':
        d['mode'].append(ent.get('mode'))
        for place in ent.children:
            if place.name=='route':
                d['distance'].append(place.get('distance'))
coords=pd.DataFrame(d)


AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'children'

I have read this and this, but I don't know how to apply them to my problem.

Thank you very much for your help!

【Question comments】:

    Tags: python xml elementtree


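    【Solution 1】:

    The solution below may help. I noticed that each plan has one more activity element than leg elements, so an adjustment is needed to keep the values aligned during extraction: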

    import xml.etree.ElementTree as ET
    from itertools import zip_longest, chain
    from collections import defaultdict
    
    import pandas as pd
    
    root = ET.parse('test.xml').getroot()
    
    # elements to search for, and the attribute to read from each
    elements = ['activity', 'leg', 'route']
    tags = ['type', 'mode', 'distance']
    
    box = []
    
    for entry in root.findall(".//plan[@selected='yes']"):
        # creating the defaultdict inside the for loop gives a new
        # dictionary for every iteration, which keeps each extraction
        # aligned with its own ``plan`` element
        d = defaultdict(list)
        for element, tag in zip(elements, tags):
            for ent in entry.findall(f".//{element}"):
                d[f"{element}_{tag}"].append(ent.attrib.get(tag))
        box.append(d)
    
    flatten = chain.from_iterable
    
    # each plan has more activities than legs or routes;
    # zip_longest pairs them up without dropping any entry
    # (the missing values become None)
    
    # list() materializes the rows, since older pandas versions reject a bare iterator
    flat_data = list(flatten(zip_longest(*ent.values()) for ent in box))
    outcome = pd.DataFrame(flat_data, columns=list(d))
    
    outcome
        activity_type   leg_mode    route_distance
    0   outside           car   6046.54932060571
    1   work              car   4604.544053407517
    2   outside           None  None
    3   inside            bike  6046.54932060571
    4   work              bike  6046.54932060571
    5   work              pt    4604.544053407517
    6   outside           None  None
    7   inside            bike  6046.54932060571
    8   shopping          bike  6046.54932060571
    9   work              pt    4604.544053407517
    10  outside           None  None
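
For reference, the original error comes from the fact that `.name` and `.children` are not part of the ElementTree API; an `Element` exposes its tag as `.tag`, and iterating over it yields its direct children. Here is a minimal sketch of the asker's loop rewritten that way. The inline `SAMPLE` string and the choice to attach each leg to the activity that precedes it are my assumptions for the demo, not something taken from the question.

```python
import xml.etree.ElementTree as ET

# Small inline sample with the same shape as the question's file
# (hypothetical data, trimmed down for the demo).
SAMPLE = """
<population>
  <person id="101">
    <plan selected="yes">
      <activity type="outside"/>
      <leg mode="car"><route distance="6046.5">81312</route></leg>
      <activity type="work"/>
    </plan>
    <plan selected="no">
      <activity type="inside"/>
    </plan>
  </person>
</population>
"""

root = ET.fromstring(SAMPLE)

rows = []
for plan in root.findall("./person/plan[@selected='yes']"):
    current = None  # row of the most recent activity within this plan
    for child in plan:  # iterating an Element yields its direct children
        if child.tag == "activity":  # .tag, not .name
            current = {"type": child.get("type"), "mode": None, "distance": None}
            rows.append(current)
        elif child.tag == "leg" and current is not None:
            # assumption: a leg belongs to the activity just before it
            current["mode"] = child.get("mode")
            route = child.find("route")
            if route is not None:
                current["distance"] = route.get("distance")

print(rows)
```

The resulting list of dictionaries can be handed to `pd.DataFrame(rows)` to get the three columns. The last activity of each plan keeps `None` for mode and distance, the same as in the table above.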
    

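    【Discussion】:

    • Thank you very much for this solution! :D
    • This code runs perfectly on my laptop with only a small dataset. When I try to run it on a larger cluster, I get the following error: ``` outcome = pd.DataFrame(flat_data, columns = d) File "/cluster/apps/python/3.6.0/x86_64/lib64/python3.6/site-packages/pandas/core/frame.py", line 325, in __init__ raise TypeError("data argument can't be an iterator") TypeError: data argument can't be an iterator ``` Do you know what causes this?
    • Not sure... Can you trace which line of your code raises the TypeError? A debugger can help.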
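
A likely cause of the `TypeError` reported above, assuming the cluster has an older pandas installed than the laptop: older pandas releases refuse an iterator as the `data` argument, and `chain.from_iterable(...)` returns one. Turning the iterator into a list before building the dataframe should work on both machines. A minimal sketch, where `box` is hypothetical data shaped like the per-plan dictionaries in the answer:

```python
from itertools import chain, zip_longest

# Hypothetical per-plan results, shaped like the entries of `box` in the answer.
box = [
    {
        "activity_type": ["outside", "work", "outside"],
        "leg_mode": ["car", "car"],
        "route_distance": ["6046.5", "4604.5"],
    },
]

# zip_longest pads the shorter lists with None; chain joins the plans together.
flat_data = chain.from_iterable(zip_longest(*ent.values()) for ent in box)

# A chain object is a one-shot iterator; list() turns it into concrete rows.
rows = list(flat_data)
columns = list(box[0])

print(columns)
print(rows)
```

With that in place, `pd.DataFrame(rows, columns=columns)` receives a plain list of tuples rather than an iterator.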