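【Posted】: 2015-08-20 14:31:08
【Question】:
Basically, I am using urllib.request to make an API call to PubMed, getting an XML file back, and trying to parse it, with no luck.
I have tried ElementTree and other modules without success. I suspect there may be a problem with the XML object itself.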
#Importing URL request modules for API calls
#Also importing ElementTree, as it seems to be best for XML parsing
import urllib.request
import urllib.parse
import re
import xml.etree.ElementTree as ET
from urllib import request
#Now I can make the API call.
id_request = urllib.request.urlopen('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id=17570568')
#id_request will be an object that I'm not sure I understand?
#id_request Returns: "<http.client.HTTPResponse object at 0x0000000003693FD0>"
#Let's now read this baby in XML format!
id_pubmed = id_request.read()
#If I look at the id_pubmed object, I now have the XML file I want to parse.
You can see the XML that id_pubmed holds/prints here: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id=17570568
My problem is that I cannot get ElementTree to parse it at all. I tried:
tree = ET.parse(id_pubmed)
root = tree.getroot()
as well as various other suggestions from https://docs.python.org/3/library/xml.etree.elementtree.html#module-xml.etree.ElementTree
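For reference, a likely source of the trouble is that ET.parse expects a filename or a file object, whereas id_request.read() returns the document as bytes already in memory. Below is a minimal sketch of the two ways to handle in-memory XML. The sample_bytes value is a hand-written stand-in, not the real PubMed response, and its element names (eSummaryResult, DocSum, Id, Item) are an assumption about the shape of the eSummary output.

```python
import io
import xml.etree.ElementTree as ET

# Hand-written stand-in for the bytes returned by id_request.read().
# The structure and the title are placeholders (assumptions), not real PubMed data.
sample_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<eSummaryResult>'
    b'<DocSum>'
    b'<Id>17570568</Id>'
    b'<Item Name="Title" Type="String">Placeholder title</Item>'
    b'</DocSum>'
    b'</eSummaryResult>'
)

# Option 1: ET.fromstring parses XML held in memory and returns the root Element.
root = ET.fromstring(sample_bytes)

# Option 2: ET.parse wants a filename or file object, so wrap the bytes in BytesIO.
tree = ET.parse(io.BytesIO(sample_bytes))
root_from_tree = tree.getroot()

# Navigate the tree with ElementTree's limited XPath support.
doc_id = root.findtext("DocSum/Id")
title = root.find("DocSum/Item[@Name='Title']").text
print(root.tag, doc_id, title)
```

With the real call, the same idea would be root = ET.fromstring(id_pubmed). Because the response object returned by urlopen is file-like, passing id_request straight to ET.parse before calling read() should also work, though that is not verified here against the live service.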
【Comments】:
Tags: python xml api python-3.x elementtree