Why does this Python script only read the last RSS post into the file? Answers

【Question title】: Why does this Python script only read the last RSS post into the file?
【Posted】: 2009-07-22 23:40:00
【Question】:

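I'm trying to fix a Python script that takes the posts from a specific RSS feed, strips them down, and writes them to a text file. As you can see below, there are two main print sections. The first only prints to the shell when the script runs, but it shows all of the posts, which is what I want. The second part is where the problem lies: it writes only the last post of the RSS feed to the text file, not the whole feed like the first section does. I also tried making the second section (the f = open() part) work the same way as the first, using %s instead of a separate print line per variable.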

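If anyone could tell me why the script writes only one post (the last one) of the RSS feed to the text file, yet prints the whole feed in the shell, and what modification I need to make to fix it, I'd really appreciate it :)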

Here is the code:

import urllib
import sys
import xml.dom.minidom

#The url of the feed
address = 'http://www.vg.no/export/Alle/rdf.hbs?kat=nyheter'

#Our actual xml document
document = xml.dom.minidom.parse(urllib.urlopen(address))
for item in document.getElementsByTagName('item'):
    title = item.getElementsByTagName('title')[0].firstChild.data
    link = item.getElementsByTagName('link')[0].firstChild.data
    description = item.getElementsByTagName('description')[0].firstChild.data

    str = link.strip("http://go.vg.no/cgi-bin/go.cgi/rssart/")
    print "\n"
    print "------------------------------------------------------------------"
    print '''"%s"\n\n%s\n\n(%s)''' % (title.encode('UTF8', 'replace'),
                                            description.encode('UTF8','replace'),
                                            str.encode('UTF8','replace'))
    print "------------------------------------------------------------------"
    print "\n"

f = open('lawl.txt','w')
print >>f, "----------------------Nyeste paa VG-------------------------------"
print >>f, title.encode('UTF8','replace')
print >>f, description.encode('UTF8','replace')
print >>f, str.encode('UTF8','replace')
print >>f, "------------------------------------------------------------------"
print >>f, "\n"

【Comments on the question】:

Tags: python xml rss

【Solution 1】:

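Your print >>f statements come after the for loop, so they run only once and operate on whatever was last assigned to title, description, and str, which is the data from the final item in the feed.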

You should open the file before the for loop, then move the print >>f lines inside the loop.

import urllib
import sys
import xml.dom.minidom

#The url of the feed
address = 'http://www.vg.no/export/Alle/rdf.hbs?kat=nyheter'

f = open('lawl.txt','w')

#Our actual xml document
document = xml.dom.minidom.parse(urllib.urlopen(address))
for item in document.getElementsByTagName('item'):
    title = item.getElementsByTagName('title')[0].firstChild.data
    link = item.getElementsByTagName('link')[0].firstChild.data
    description = item.getElementsByTagName('description')[0].firstChild.data

    str = link.strip("http://go.vg.no/cgi-bin/go.cgi/rssart/")
    print "\n"
    print "------------------------------------------------------------------"
    print '''"%s"\n\n%s\n\n(%s)''' % (title.encode('UTF8', 'replace'),
                                            description.encode('UTF8','replace'),
                                            str.encode('UTF8','replace'))
    print "------------------------------------------------------------------"
    print "\n"

    print >>f, "----------------------Nyeste paa VG-------------------------------"
    print >>f, title.encode('UTF8','replace')
    print >>f, description.encode('UTF8','replace')
    print >>f, str.encode('UTF8','replace')
    print >>f, "------------------------------------------------------------------"
    print >>f, "\n"

#Close the file once the loop is done so everything written is flushed to disk
f.close()
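
For readers on Python 3, the same idea can be sketched as follows. This is only an illustration under stated assumptions: SAMPLE_FEED is a made-up two-item feed standing in for the real VG feed (so nothing is fetched over the network), and text_of and write_items are hypothetical helper names, not part of the original script. The point it shows is that the writes sit inside the loop, so they run once per item.

```python
import io
import xml.dom.minidom

# Hypothetical two-item feed used in place of the real VG feed.
SAMPLE_FEED = """<rss><channel>
  <item><title>First post</title><link>http://example.com/1</link><description>One</description></item>
  <item><title>Second post</title><link>http://example.com/2</link><description>Two</description></item>
</channel></rss>"""


def text_of(item, tag):
    """Return the text of the first <tag> element found under item."""
    return item.getElementsByTagName(tag)[0].firstChild.data


def write_items(feed_xml, out):
    """Write every <item> in feed_xml to the file-like object out.

    Returns the number of items written.
    """
    document = xml.dom.minidom.parseString(feed_xml)
    count = 0
    for item in document.getElementsByTagName('item'):
        # These writes are inside the loop, so each item reaches the output.
        out.write("----------------------Nyeste paa VG-------------------------------\n")
        out.write(text_of(item, 'title') + "\n")
        out.write(text_of(item, 'description') + "\n")
        out.write(text_of(item, 'link') + "\n")
        out.write("------------------------------------------------------------------\n\n")
        count += 1
    return count


# Demonstration with an in-memory buffer; a real file would work the same way,
# e.g. `with open('lawl.txt', 'w', encoding='utf-8') as f: write_items(SAMPLE_FEED, f)`.
buffer = io.StringIO()
written = write_items(SAMPLE_FEED, buffer)
print(written)
print(buffer.getvalue())
```

Opening the real file in a with block closes it automatically when the block ends, even if parsing raises an exception partway through.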

【Comments】:

【Solution 2】:

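You iterate over all the posts, assign their attributes to variables, and print them to the terminal.

Then you print the variables (which happen to hold the result of the last assignment) to the file, so you get only one post there.

If you want more than one, you need to iterate there as well.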
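
The effect described here can be reproduced without any feed at all. The snippet below is a minimal sketch using a made-up list of titles (titles, after_loop, and inside_loop are illustrative names, not from the original script); it runs unchanged on Python 2 and Python 3.

```python
# Hypothetical stand-ins for the titles pulled from the feed.
titles = ["post 1", "post 2", "post 3"]

# Recording after the loop: `title` still exists once the loop ends,
# but it holds only the value from the final iteration.
after_loop = []
for title in titles:
    pass
after_loop.append(title)

# Recording inside the loop: one entry per iteration.
inside_loop = []
for title in titles:
    inside_loop.append(title)

print(after_loop)   # ['post 3']
print(inside_loop)  # ['post 1', 'post 2', 'post 3']
```

The file-writing statements in the question behave like the first case: they execute once, after the loop, with the last item's values.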

【Comments】: