XML parse file from HTTP: answers

【Question title】: XML parse file from HTTP
【Posted】: 2010-06-17 01:59:56
【Question】:

I have an XML file at some location, for example

http://example.com/test.xml

I am trying to parse the XML file so I can use it with XPath in my program, like this, but it does not work.

Document doc = builder.parse(new File(url));

How can I get the XML file?

【Question comments】:

Why put a +100 bounty on this? See Nils's answer: you just need to get the XML file as a stream first and then parse it.

Tags: java xml

【Solution 1】:

Try using URLConnection.getInputStream() to get a handle on the XML file.

See the code below, in which I open an XML file and print all of its description fields:

import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class HTTPXMLTest
{
    public static void main(String[] args) 
    {
        try {
            new HTTPXMLTest().start();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private void start() throws Exception
    {
        URL url = new URL("http://localhost:8080/AutoLogin/resource/web.xml");
        URLConnection connection = url.openConnection();

        Document doc = parseXML(connection.getInputStream());
        NodeList descNodes = doc.getElementsByTagName("description");

        for(int i=0; i<descNodes.getLength();i++)
        {
            System.out.println(descNodes.item(i).getTextContent());
        }
    }

    // Builds a DOM tree from the stream; parser errors propagate to the caller.
    private Document parseXML(InputStream stream)
    throws Exception
    {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        return builder.parse(stream);
    }
}
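
Since the question mentions XPath, here is a minimal sketch of running an XPath expression over a document parsed from a stream. The class name XPathFromStreamSketch, the select helper, and the sample XML are made up for illustration; in a real program the stream would come from connection.getInputStream() as in the answer above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of the original answer.
class XPathFromStreamSketch {

    // Parses the stream into a DOM tree and returns the text content of
    // every node matched by the XPath expression.
    static List<String> select(InputStream in, String expression) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in);

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for connection.getInputStream(); the XML content is invented.
        String xml = "<web-app><servlet><description>first</description></servlet>"
                + "<servlet><description>second</description></servlet></web-app>";
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(select(in, "//description")); // prints [first, second]
        }
    }
}
```

The try-with-resources block closes the stream once parsing is done, which matters more when the stream is backed by a network connection.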

【Comments】:

【Solution 2】:

Here is a simple example of fetching data from the URL string "http://www.gettingagile.com/feed/rss2/"

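// The original post omitted its imports; the org.apache.http ones below
// assume Apache HttpClient 4.x is on the classpath.
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class MainClassXml {

    public static void main(String args[]) throws URISyntaxException,
            ClientProtocolException, IOException, MalformedURLException {

        String url = "http://www.gettingagile.com/feed/rss2/";
        System.out.println("Url is created****");
        URL url2 = new URL(url);
        HttpGet httpGet = new HttpGet(url);
        HttpClient httpClient = new DefaultHttpClient();

        // First request: fetch the feed with HttpClient and print it as text.
        HttpResponse httpResponse = httpClient.execute(httpGet);
        HttpEntity entity = httpResponse.getEntity();
        System.out.println("Entity is*****" + entity);
        try {
            String xmlParseString = EntityUtils.toString(entity);
            System.out.println("This String to be Parsed***" + xmlParseString);

            // Second request: the feed is fetched again through
            // HttpURLConnection, and this stream is what gets parsed.
            HttpURLConnection connection = (HttpURLConnection) url2
                    .openConnection();
            InputStream inputStream = connection.getInputStream();

            DocumentBuilderFactory builderFactory = DocumentBuilderFactory
                    .newInstance();
            DocumentBuilder documentBuilder = builderFactory
                    .newDocumentBuilder();
            Document document = documentBuilder.parse(inputStream);
            document.getDocumentElement().normalize();

            // getAttributes() returns null for a Document node, so this prints "null".
            System.out.println("Attributes are***" + document.getAttributes());

            NodeList nodeList = document.getElementsByTagName("rss");
            System.out.println("This is firstnode" + nodeList);
            for (int getChild = 0; getChild < nodeList.getLength(); getChild++) {

                Node Listnode = nodeList.item(getChild);
                System.out.println("Into the for loop"
                        + Listnode.getAttributes().getLength());
                Element firstnoderss = (Element) Listnode;
                System.out.println("ListNodes" + Listnode.getAttributes());
                System.out.println("This is node list length"
                        + nodeList.getLength());

                Node Subnode = nodeList.item(getChild);
                System.out.println("This is list node" + Subnode);
                System.out.println("rss attributes***************");
            }

        } catch (Exception exception) {

            System.out.println("Exception is" + exception);

        }
    }
}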

【Comments】:

【Solution 3】:

Get rid of the new File():

Document doc = builder.parse(url);
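
This works when url is a String, because DocumentBuilder.parse(String) treats its argument as a URI. A minimal sketch follows, using a temporary local file addressed by a file: URI as a stand-in for http://example.com/test.xml so that it runs without network access. The class name ParseUriSketch and the rootElementName helper are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

// Hypothetical helper class, not part of the original answer.
class ParseUriSketch {

    // DocumentBuilder.parse(String) takes a URI, so an http:// or file://
    // address can be passed directly as a string.
    static String rootElementName(String uri) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(uri);
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // A temporary local file stands in for the remote XML document.
        Path tmp = Files.createTempFile("test", ".xml");
        try {
            Files.write(tmp, "<catalog><item/></catalog>".getBytes(StandardCharsets.UTF_8));
            System.out.println(rootElementName(tmp.toUri().toString()));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```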

【Comments】:

【Solution 4】:

More detail, based on laz's answer:

String urlString = "http://example.com/test.xml";
URL url = new URL(urlString);
Document doc = builder.parse(url);

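【Comments】:

builder.parse cannot handle a URL.
Hmm, OK, I made a mistake. But this is how you should do it: first open a connection with the URL, read the content, then parse it. Sorry about that, bro.
This code does not even compile: docs.oracle.com/javase/8/docs/api/javax/xml/parsers/…
My answer is from 2010, and you are referring to Java 8, which was released last year.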
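
For reference, DocumentBuilder offers parse overloads for File, InputSource, InputStream and a String URI, but none for java.net.URL. One way to keep a URL object, sketched below under that assumption, is to open its stream and hand the stream to the parser. The class name ParseUrlObjectSketch is made up, and a temporary file: URL stands in for the HTTP address so the example runs offline.

```java
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

// Hypothetical helper class, not part of the original answer.
class ParseUrlObjectSketch {

    static Document parse(URL url) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // There is no parse(URL) overload, so open the stream explicitly.
        try (InputStream in = url.openStream()) {
            // The second argument is the system ID, which the parser uses
            // as the base for resolving relative references.
            return builder.parse(in, url.toString());
        }
    }

    public static void main(String[] args) throws Exception {
        // A temporary local file stands in for http://example.com/test.xml.
        Path tmp = Files.createTempFile("test", ".xml");
        try {
            Files.write(tmp, "<feed><entry/></feed>".getBytes(StandardCharsets.UTF_8));
            Document doc = parse(tmp.toUri().toURL());
            System.out.println(doc.getDocumentElement().getNodeName());
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Passing url.toString() to builder.parse(String) is the shorter alternative when nothing else needs the stream.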

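【Solution 5】:

It would be much easier with XMLPullParser... you don't have to deal with all this event handling, and you can quickly pick out a few keywords... I use it too... only a few lines of code :)

http://developer.android.com/reference/org/xmlpull/v1/XmlPullParser.html

For HTTP and files, see: Download a file with DefaultHTTPClient and preemptive authentication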

【Comments】:

Isn't that for Android only?
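
On a plain Java SE runtime, the closest built-in counterpart to a pull parser is StAX (javax.xml.stream). It is a different API from XmlPullParser, so the sketch below is a swapped-in technique rather than the one this answer links to. The class name PullParserSketch, the textOf helper, and the sample feed are made up, and the helper assumes the target elements contain only text.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class using StAX, not XmlPullParser.
class PullParserSketch {

    // Pulls events from the stream one at a time and collects the text of
    // every element with the given name. Assumes those elements hold text only.
    static List<String> textOf(InputStream in, String elementName) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<String> values = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    values.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the HTTP response stream; the feed content is invented.
        String xml = "<rss><channel><title>Feed</title>"
                + "<item><title>Post</title></item></channel></rss>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(textOf(in, "title")); // prints [Feed, Post]
    }
}
```

Unlike the DOM approach, this never builds the whole tree in memory, which can help with large feeds.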

【Solution 6】:

File fileXml = new File(url);

DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = parser.parse(fileXml);

That should do it.
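
One caveat worth checking before relying on this: java.io.File interprets its argument as a local filesystem path, so an http:// address passed to it is not fetched over the network. A small sketch of that behavior follows; the class name FileVersusUrlSketch and the isLocalFile helper are made up for illustration.

```java
import java.io.File;

// Hypothetical helper class, not part of the original answer.
class FileVersusUrlSketch {

    // Returns true only if the string names an existing regular file on
    // the local filesystem.
    static boolean isLocalFile(String location) {
        return new File(location).isFile();
    }

    public static void main(String[] args) {
        // File treats the string as a path, not as a URL, so no HTTP
        // request is made when checking it.
        System.out.println(isLocalFile("http://example.com/test.xml"));
    }
}
```

If url holds an HTTP address, the stream-based or String-URI approaches from the other answers are the ones that actually reach the server.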

【Comments】: