【发布时间】:2021-04-09 19:13:46
【问题描述】:
我对 python 还很陌生,希望在将 XML 文件转换为 Pandas Dataframe 时获得一些帮助。我已经搜索了其他资源,但仍然卡住了。我希望将标签之间的所有字段都放入一个表中。任何帮助是极大的赞赏!谢谢。
下面是我试过的代码,但它不能正常工作。
import xml.etree.ElementTree as ET
import pandas as pd
xml_data = open('5249009-08-34-59-126029.xml', 'r').read()
root = ET.XML(xml_data)
data = []
cols = []
for i, child in enumerate(root):
data.append([subchild.text for subchild in child])
cols.append(child.tag)
df = pd.DataFrame(data).T
df.columns = cols
print(df)
以下是示例输入数据"
<?xml version="1.0"?>
-<RECORDING>
<IDENT>0</IDENT>
<DEVICEID>133242232</DEVICEID>
<DEVICEALIAS>52232009</DEVICEALIAS>
<GROUP>1823481655</GROUP>
<GATE>1011655</GATE>
<ANI>7777777777</ANI>
<DNIS>777777777</DNIS>
<USER1>00:07:53.2322691,00:03:21.34232761</USER1>
<USER2>text</USER2>
<USER3/>
<USER4/>
<USER5>34fc0a8d-d5632c9b1</USER5>
<USER6>000dfsdf98701596638094</USER6>
<USER7>97</USER7>
<USER8>00701596638094</USER8>
<USER9>10155</USER9>
<USER10/>
<USER11/>
<USER12/>
<USER13>Text</USER13>
<USER14>4</USER14>
<USER15>10</USER15>
<CALLSEGMENTID/>
<CALLID>9870</CALLID>
<FILENAME>\\folderpath\folderpath\folderpath\folderpath\2020\Aug\05\5249009\52343109-234234-34-59-1234234029</FILENAME>
<DURATION>201</DURATION>
<STARTYEAR>2020</STARTYEAR>
<STARTMONTH>08</STARTMONTH>
<STARTMONTHNAME>August</STARTMONTHNAME>
<STARTDAY>05</STARTDAY>
<STARTDAYNAME>Wednesday</STARTDAYNAME>
<STARTHOUR>08</STARTHOUR>
<STARTMINUTE>34</STARTMINUTE>
<STARTSECOND>59</STARTSECOND>
<PRIORITY>50</PRIORITY>
<RECORDINGTYPE>S</RECORDINGTYPE>
<CALLDIRECTION>I</CALLDIRECTION>
<SCREENCAPTURE>7</SCREENCAPTURE>
<KEEPCALLFORDAYS>90</KEEPCALLFORDAYS>
<BLACKOUTREMOTEAUDIO>false</BLACKOUTREMOTEAUDIO>
<BLACKOUTS/>
</RECORDING>
【问题讨论】:
-
David Erickson - 感谢您提供此链接,但是,他的数据似乎设置不同并且有多行。我的只有 1 行多列
标签: python xml pandas dataframe