python feedparser with Yahoo Weather RSS: answers

【Question title】: python feedparser with yahoo weather rss
【Posted】: 2010-05-20 00:04:19
【Question】:

I'm trying to use feedparser to get some data from Yahoo's weather RSS. It looks like feedparser strips out the yweather namespace data:

http://weather.yahooapis.com/forecastrss?w=24260013&u=c

<yweather:condition  text="Fair" code="34"  temp="23"  date="Wed, 19 May 2010 5:55 pm EDT" />

It looks like feedparser ignores this element completely. Is there any way to get at it?

【Comments】:

Tags: python feedparser

【Solution 1】:

Here is one way to get the data using lxml:

from urllib.request import urlopen  # Python 3; the original Python 2 code used urllib2.urlopen
import lxml.etree

url = "http://weather.yahooapis.com/forecastrss?w=24260013&u=c"
doc = lxml.etree.parse(urlopen(url)).getroot()
# Map the yweather prefix to its namespace URI so the XPath expression can resolve it.
conditions = doc.xpath('*/*/yweather:condition',
                       namespaces={'yweather': 'http://xml.weather.yahoo.com/ns/rss/1.0'})
try:
    condition = conditions[0]
except IndexError:
    print('yweather:condition not found')
else:
    # Print the attributes only when the element was actually found.
    print(condition.items())
    # [('text', 'Fair'), ('code', '33'), ('temp', '16'), ('date', 'Wed, 19 May 2010 9:55 pm EDT')]

The "using XPath with namespaces" section may be particularly helpful.
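
If lxml is not available, the same prefix-to-URI mapping idea can be tried with the standard library. The following is only a sketch: it swaps lxml for `xml.etree.ElementTree`, and `SAMPLE` is a hand-written stand-in for the feed (only the `yweather:condition` element is taken from the question). The `read_condition` helper is a hypothetical name used for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the feed (an assumption); only the
# yweather:condition element is copied from the question.
SAMPLE = """<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0">
  <channel>
    <item>
      <yweather:condition text="Fair" code="34" temp="23" date="Wed, 19 May 2010 5:55 pm EDT" />
    </item>
  </channel>
</rss>"""

# The prefix used in the path is mapped to the namespace URI declared in the feed.
NS = {"yweather": "http://xml.weather.yahoo.com/ns/rss/1.0"}


def read_condition(xml_text):
    """Return the yweather:condition attributes as a dict, or None if absent."""
    root = ET.fromstring(xml_text)
    element = root.find(".//yweather:condition", NS)
    return dict(element.attrib) if element is not None else None


condition = read_condition(SAMPLE)
print(condition)
```

Returning `None` instead of raising keeps the "element not found" case explicit for the caller, which mirrors the `IndexError` check in the lxml version.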

【Comments】:

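【Solution 2】:

For completeness, feedparser supports this as well. The general syntax is the namespace prefix, an underscore, then the tag name (for example, yweather_condition).

In the Yahoo weather example given, one can do: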

import feedparser
d = feedparser.parse('http://weather.yahooapis.com/forecastrss?w=24260013&u=c')
print(d['items'][0]['yweather_condition'])

This yields:

{'date': u'Mon, 18 Jul 2011 7:53 pm EDT', 'text': u'Fair', 'code': u'34', 'temp': u'27'}

The documentation is at http://www.feedparser.org/docs/namespace-handling.html
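
Since the namespaced element may be missing from some entries, it can help to look the key up defensively. The sketch below is not part of feedparser: `namespaced_key` and `get_namespaced` are hypothetical helpers, and a plain dict stands in for a parsed entry (feedparser entries are dict-like), using the values from the output above.

```python
def namespaced_key(prefix, tag):
    """Build a key in the prefix_tag form described above, e.g. 'yweather_condition'."""
    return "{}_{}".format(prefix, tag)


def get_namespaced(entry, prefix, tag, default=None):
    """Look up a namespaced element on a dict-like entry without raising KeyError."""
    return entry.get(namespaced_key(prefix, tag), default)


# Plain dict standing in for d['items'][0] (an assumption made so the
# example runs without network access).
entry = {
    "yweather_condition": {
        "date": "Mon, 18 Jul 2011 7:53 pm EDT",
        "text": "Fair",
        "code": "34",
        "temp": "27",
    }
}

condition = get_namespaced(entry, "yweather", "condition", default={})
# Attribute values arrive as strings, so convert them before doing arithmetic.
temp = int(condition["temp"]) if "temp" in condition else None
print(namespaced_key("yweather", "condition"), temp)
```

With a real feed, `entry` would be `d['items'][0]` from the snippet above.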

【Comments】: