How do I aggregate feeds together in a string and then parse them? Answers

【Question title】: How do I aggregate feeds together in a string and then parse them?
【Posted】: 2012-11-01 03:06:48
【Question】:

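I'm trying to aggregate several YouTube feeds, concatenate them, and then parse the combined feed. When I parse a single feed on its own I have no problems, and the code seems to work. However, when I aggregate the feeds into one long string and then call etree.fromstring(aggregate_partner_feed), I get an error. The error is ParseError: unbound prefix, and the traceback points at the etree line mentioned above. Any suggestions on how to fix this?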

aggregated_partners_list = [cnn, teamcoco, buzzfeed]


i = 1 
number_of_partners = len(aggregated_partners_list)
aggregate_partner_feed = '' 

for entry in aggregated_partners_list:
    #YOUTUBE FEED
    #download the file:
    file = urllib2.urlopen('http://gdata.youtube.com/feeds/api/users/'+entry+'/uploads?v=2&max-results=50')
    #convert to string:
    data = file.read()
    #close file because we don't need it anymore:
    file.close()

    if i == 1:
        #remove ending </feed>
        data = data[:-7]

    if i>1 and i != number_of_partners:
        #remove everything before the first <entry> in the new feed and the trailing </feed>
        data = data[data.find('<entry'):]
        data = data[:-7]

    #if last, then only remove everything before first <entry>
    if i == number_of_partners:
        data = data[data.find('<entry'):]

    #append the current feed to the existing feed
    aggregate_partner_feed += data

    #increment the counter  
    i=i+1

print isinstance(data, basestring)                      #prints True
print isinstance(aggregate_partner_feed, basestring)    #prints True

#apply the parsing to the aggregated feed

#entire feed
root = etree.fromstring(aggregate_partner_feed)     #this is the line that gives an error
#all entries
entries = root.findall('{http://www.w3.org/2005/Atom}entry')
#more code that seems to work...
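
For reference, the error named in the question can be reproduced in isolation. The sketch below assumes the standard-library xml.etree.ElementTree parser and Python 3 (the question's code is Python 2); the two sample documents and the `media` namespace URI are illustrative stand-ins, not taken from the actual YouTube feeds. It only shows what "unbound prefix" means: a tag or attribute uses a namespace prefix that has no xmlns declaration in scope. Whether that is what happens in the aggregated string cannot be determined from the post alone.

```python
import xml.etree.ElementTree as etree

# Assumed namespace URI, used here purely for illustration.
MEDIA_NS = 'http://search.yahoo.com/mrss/'

# The "media" prefix is used but never declared.
undeclared = '<feed><media:title>clip</media:title></feed>'

# Same document, with the prefix declared on the root element.
declared = (
    '<feed xmlns:media="%s"><media:title>clip</media:title></feed>' % MEDIA_NS
)

try:
    etree.fromstring(undeclared)
    error_message = None
except etree.ParseError as exc:
    # The parser rejects the undeclared prefix.
    error_message = str(exc)

print(error_message)

# With the declaration in place the document parses, and the tag is
# reported in {namespace}localname form.
root = etree.fromstring(declared)
print(root[0].tag)
```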

【Comments on the question】:

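Can you show the value of aggregate_partner_feed?
Instead of manipulating the XML by hand as a string, you could parse each feed separately with etree and append the parsed entries to a combined tree object.
@J.F.Sebastian How do I append the parsed entries to a combined tree object?
Using the YouTube Data API instead of parsing the raw feeds might help simplify the code.
@sharataka: every element is a collection of its child elements, so you can use, for example, the .append() method.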
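
The last comment can be illustrated with a small sketch. It assumes xml.etree.ElementTree and Python 3, and the two tiny documents are made up for the example. It shows that an Element behaves like a sequence of its direct children and that .append() adds one more child at the end.

```python
import xml.etree.ElementTree as etree

# Two made-up documents standing in for parsed feeds.
parent = etree.fromstring('<feed><entry>one</entry></feed>')
other = etree.fromstring('<feed><entry>two</entry><entry>three</entry></feed>')

# An Element supports len() and iteration over its direct children.
children_before = len(parent)

# Copy the child list first so the loop does not depend on whether
# a given etree implementation moves elements when appending.
for child in list(other):
    parent.append(child)

texts = [child.text for child in parent]
print(children_before)
print(texts)
```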

Tags: python xml concatenation feed

【Solution 1】:

I parsed each feed separately and then used .append, instead of concatenating the strings together and then parsing the result.
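
The answer does not include code, so the following is only a sketch of how that approach might look, assuming xml.etree.ElementTree and Python 3. The helper name `merge_feeds` and the two inline sample feeds are hypothetical; in the question the feed bodies would come from the urllib2 downloads instead.

```python
import xml.etree.ElementTree as etree

ATOM_ENTRY = '{http://www.w3.org/2005/Atom}entry'


def merge_feeds(feed_documents):
    """Parse each feed on its own and gather every entry under the first root.

    Hypothetical helper; returns None when no documents are given.
    """
    combined_root = None
    for document in feed_documents:
        root = etree.fromstring(document)
        if combined_root is None:
            # The first feed supplies the root element and its metadata.
            combined_root = root
            continue
        # Append the entries of every later feed to the combined tree.
        for entry in root.findall(ATOM_ENTRY):
            combined_root.append(entry)
    return combined_root


# Hypothetical stand-ins for the downloaded feed bodies.
feed_a = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<title>a</title><entry><title>a1</title></entry>'
    '</feed>'
)
feed_b = (
    '<feed xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:media="http://search.yahoo.com/mrss/">'
    '<entry><title>b1</title><media:group/></entry>'
    '<entry><title>b2</title></entry>'
    '</feed>'
)

root = merge_feeds([feed_a, feed_b])
entries = root.findall(ATOM_ENTRY)
print(len(entries))
```

Because every feed is parsed as a complete document, each one resolves its own namespace declarations before its entries are moved, and no string slicing of the closing tag is needed.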

【Discussion】: