【Question title】: Extracting value between m:properties tags
【Posted】: 2018-10-06 05:36:54
【Question description】:

I want to extract the data values placed between the m:properties tags. How can I do this?

<entry>
<id>http://data.treasury.gov/Feed.svc/DailyTreasuryYieldCurveRateData(7086)</id>
<title type="text"></title>
<updated>2018-04-25T02:39:22Z</updated>
<author>
  <name />
</author>
<link rel="edit" title="DailyTreasuryYieldCurveRateDatum" href="DailyTreasuryYieldCurveRateData(7086)" />
<category term="TreasuryDataWarehouseModel.DailyTreasuryYieldCurveRateDatum" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
<content type="application/xml">
  <m:properties>
    <d:Id m:type="Edm.Int32">7086</d:Id>
    <d:NEW_DATE m:type="Edm.DateTime">2018-04-24T00:00:00</d:NEW_DATE>
    <d:BC_1MONTH m:type="Edm.Double">1.7</d:BC_1MONTH>
    <d:BC_3MONTH m:type="Edm.Double">1.87</d:BC_3MONTH>
    <d:BC_6MONTH m:type="Edm.Double">2.05</d:BC_6MONTH>
    <d:BC_1YEAR m:type="Edm.Double">2.25</d:BC_1YEAR>
    <d:BC_2YEAR m:type="Edm.Double">2.48</d:BC_2YEAR>
    <d:BC_3YEAR m:type="Edm.Double">2.63</d:BC_3YEAR>
    <d:BC_5YEAR m:type="Edm.Double">2.83</d:BC_5YEAR>
    <d:BC_7YEAR m:type="Edm.Double">2.95</d:BC_7YEAR>
    <d:BC_10YEAR m:type="Edm.Double">3</d:BC_10YEAR>
    <d:BC_20YEAR m:type="Edm.Double">3.08</d:BC_20YEAR>
    <d:BC_30YEAR m:type="Edm.Double">3.18</d:BC_30YEAR>
    <d:BC_30YEARDISPLAY m:type="Edm.Double">3.18</d:BC_30YEARDISPLAY>
  </m:properties>
</content>

I tried using requests with BeautifulSoup:

r = requests.get('http://data.treasury.gov/feed.svc/DailyTreasuryYieldCurveRateData?$filter=month(NEW_DATE)%20eq%204%20and%20year(NEW_DATE)%20eq%202018')
soup = BeautifulSoup(r.text, 'lxml')
data = soup.find(id ='http://data.treasury.gov/Feed.svc/DailyTreasuryYieldCurveRateData(7086)')
print(data)

The output is None.

【Comments】:

    Tags: python python-3.x web-scraping beautifulsoup


    【Solution 1】:
    import requests
    from bs4 import BeautifulSoup
    r = requests.get('http://data.treasury.gov/feed.svc/DailyTreasuryYieldCurveRateData?$filter=month(NEW_DATE)%20eq%204%20and%20year(NEW_DATE)%20eq%202018')
    soup = BeautifulSoup(r.text, 'lxml')
    data = soup.find_all("content")
    for i in data:
        print(i)
    

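    【Discussion】:

    • Thanks, Rakesh. If I have multiple m:properties elements, how do I extract them? I want to extract the 10-year US bond yield from this link: data.treasury.gov/feed.svc/…
    • Updated the snippet. You can use soup.find_all("content").
    • I need to get the contents of this 7086 entry. It has the latest bond yield. Sorry for the trouble, I'm new to this.
    • data[0].find("m:properties").find("d:id").text should get you what you want.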
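
The last two comments ask for one specific value (the 10-year yield of entry 7086). As an alternative to the BeautifulSoup calls above, here is a minimal sketch using the standard library's xml.etree.ElementTree, which handles the m: and d: prefixes as real XML namespaces. It makes some assumptions: the namespace URIs in NS are the usual OData Atom ones and are assumed to be declared on the feed's root element (they are not visible in the excerpt in the question), SAMPLE_FEED is a trimmed-down stand-in for the real response, and entry_properties is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the namespace URIs declared on the feed's root
# element (the usual OData Atom ones); the question's excerpt omits them.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
}

# Trimmed-down sample shaped like the entry shown in the question.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry>
    <id>http://data.treasury.gov/Feed.svc/DailyTreasuryYieldCurveRateData(7086)</id>
    <content type="application/xml">
      <m:properties>
        <d:Id m:type="Edm.Int32">7086</d:Id>
        <d:NEW_DATE m:type="Edm.DateTime">2018-04-24T00:00:00</d:NEW_DATE>
        <d:BC_10YEAR m:type="Edm.Double">3</d:BC_10YEAR>
      </m:properties>
    </content>
  </entry>
</feed>"""


def entry_properties(feed_xml):
    """Return one {property name: text} dict per <entry> in the feed."""
    root = ET.fromstring(feed_xml)
    rows = []
    for entry in root.findall("atom:entry", NS):
        props = entry.find("atom:content/m:properties", NS)
        if props is None:
            continue  # entry without an m:properties block
        # Tags look like "{namespace-uri}BC_10YEAR"; keep only the local name.
        rows.append({child.tag.split("}", 1)[-1]: child.text for child in props})
    return rows


rows = entry_properties(SAMPLE_FEED)
# Pick the entry whose d:Id is 7086 and read its 10-year yield.
row_7086 = next(r for r in rows if r["Id"] == "7086")
print(row_7086["BC_10YEAR"])
```

Against the live feed, the same helper could presumably be called as entry_properties(r.content), where r is the requests response from the question. All values come back as strings, so convert with float() where a number is needed.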