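【Posted】:2012-11-01 03:06:48
【Question】:
I'm trying to aggregate several YouTube feeds, concatenate them, and then parse the combined feed. When I parse a single feed on its own I have no problems, and the code seems to work. However, when I aggregate the feeds into one long string and then call etree.fromstring(aggregate_partner_feed), I get an error. The error is ParseError: unbound prefix, and the etree line (referenced above) is reported as the failing line. Any suggestions on how to fix this?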
import urllib2
import xml.etree.ElementTree as etree

# cnn, teamcoco and buzzfeed hold YouTube usernames (strings) defined elsewhere
aggregated_partners_list = [cnn, teamcoco, buzzfeed]
i = 1
number_of_partners = len(aggregated_partners_list)
aggregate_partner_feed = ''

for entry in aggregated_partners_list:
    # YOUTUBE FEED
    # download the file:
    file = urllib2.urlopen('http://gdata.youtube.com/feeds/api/users/' + entry + '/uploads?v=2&max-results=50')
    # convert to string:
    data = file.read()
    # close file because we don't need it anymore:
    file.close()
    if i == 1:
        # first feed: remove ending </feed>
        data = data[:-7]
    if i > 1 and i != number_of_partners:
        # middle feeds: remove everything before the first <entry> and the ending </feed>
        data = data[data.find('<entry'):]
        data = data[:-7]
    if i == number_of_partners:
        # last feed: only remove everything before the first <entry>
        data = data[data.find('<entry'):]
    # append the current feed to the existing feed
    aggregate_partner_feed += data
    # increment the counter
    i = i + 1

print isinstance(data, basestring)  # returns true
print isinstance(aggregate_partner_feed, basestring)  # returns true

# apply the parsing to the aggregated feed
# entire feed
root = etree.fromstring(aggregate_partner_feed)  # this is the line that gives an error
# all entries
entries = root.findall('{http://www.w3.org/2005/Atom}entry')
# more code that seems to work...
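For reference, here is a minimal sketch of one situation in which ElementTree reports "unbound prefix": an element uses a namespace prefix that is not declared on it or on any of its ancestors. The fragments below are hypothetical stand-ins, not actual YouTube output, and the thread does not establish whether this is what happens with the real feeds. The sketch is written to run on Python 3.

```python
import xml.etree.ElementTree as etree

# Hypothetical fragments: the root declares only the default Atom namespace,
# while the entry uses the "media" prefix.
head = '<feed xmlns="http://www.w3.org/2005/Atom">'
entry = '<entry><media:group/></entry>'
tail = '</feed>'

try:
    etree.fromstring(head + entry + tail)
    error_message = None
except etree.ParseError as exc:
    # The prefix "media" has no xmlns:media declaration in scope.
    error_message = str(exc)

# Declaring the prefix on the root lets the same entry parse.
head_fixed = ('<feed xmlns="http://www.w3.org/2005/Atom" '
              'xmlns:media="http://search.yahoo.com/mrss/">')
root = etree.fromstring(head_fixed + entry + tail)
```

If string concatenation keeps only the first feed's opening <feed> tag, any prefix that is declared only in a later feed's opening tag would be lost in the same way.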
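【Comments】:
- Can you show the value of aggregate_partner_feed?
- You could parse each feed separately with etree and append the parsed entries to a combined tree object, instead of manipulating the XML by hand as a string.
- @J.F.Sebastian How do I append the parsed entries to a combined tree object?
- Using the YouTube Data API instead of parsing the raw feeds might help simplify the code.
- @sharataka: every Element is a collection of its child elements, so you can use, for example, its .append() method.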
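The approach suggested in the comments (parse each feed on its own, then append its entries to one combined tree) could be sketched roughly as follows. The helper name merge_feeds and the sample strings feed_a and feed_b are hypothetical stand-ins for the downloaded feeds; in the question's loop, each string would come from file.read(). The sketch is written to run on Python 3.

```python
import xml.etree.ElementTree as etree

ATOM = '{http://www.w3.org/2005/Atom}'


def merge_feeds(feed_strings):
    """Parse each feed separately and collect every <entry> under the first feed's root."""
    combined_root = None
    for xml_text in feed_strings:
        root = etree.fromstring(xml_text)
        if combined_root is None:
            # Keep the first feed whole; it becomes the combined tree.
            combined_root = root
            continue
        # Move the entries of every later feed into the combined tree.
        for entry in root.findall(ATOM + 'entry'):
            combined_root.append(entry)
    return combined_root


# Hypothetical sample feeds (not real YouTube data); each declares its own prefixes.
feed_a = (
    '<feed xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:yt="http://gdata.youtube.com/schemas/2007">'
    '<title>A</title>'
    '<entry><title>a1</title><yt:statistics viewCount="1"/></entry>'
    '</feed>'
)
feed_b = (
    '<feed xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:media="http://search.yahoo.com/mrss/">'
    '<title>B</title>'
    '<entry><title>b1</title><media:group/></entry>'
    '<entry><title>b2</title></entry>'
    '</feed>'
)

combined = merge_feeds([feed_a, feed_b])
entries = combined.findall(ATOM + 'entry')
titles = [e.findtext(ATOM + 'title') for e in entries]
```

Because every feed is parsed as a complete document, its namespace prefixes are resolved before the entries are moved, so nothing depends on where </feed> ends or on which declarations the first feed's root happens to carry.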
Tags: python xml concatenation feed