【Question title】: Parsing XML from webpage
【Posted】: 2013-03-28 22:01:18
【Question】:

If I copy the XML from this site and paste it into an XML file, I can parse it with Java:

http://api.indeed.com/ads/apisearch?publisher=8397709210207872&q=java&l=austin%2C+tx&sort&radius&st&jt&start&limit&fromage&filter&latlong=1&chnl&userip=1.2.3.4&v=2

However, if possible, I would like to parse it directly from the web page!

Here is my current code:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder; 
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

import java.io.File;
import java.io.IOException;

public class XMLParser {

    public void readXML(String parse) {
        File xml = new File(parse);
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder;
        try {
            dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(xml);
            // System.out.println("Root element :" + doc.getDocumentElement().getNodeName());

            NodeList nList = doc.getElementsByTagName("result");

            System.out.println("----------------------------");

            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                // System.out.println("\nCurrent Element :" + nNode.getNodeName());

                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;

                    System.out.println("job title : "
                            + eElement.getElementsByTagName("jobtitle").item(0).getTextContent());
                    System.out.println("Company: "
                            + eElement.getElementsByTagName("company").item(0).getTextContent());
                    System.out.println("City : "
                            + eElement.getElementsByTagName("city").item(0).getTextContent());
                    System.out.println("State : "
                            + eElement.getElementsByTagName("state").item(0).getTextContent());
                    System.out.println("Country : "
                            + eElement.getElementsByTagName("country").item(0).getTextContent());
                    System.out.println("Date posted : "
                            + eElement.getElementsByTagName("date").item(0).getTextContent());
                    System.out.println("Job summary : "
                            + eElement.getElementsByTagName("snippet").item(0).getTextContent());
                    System.out.println("Latitude : "
                            + eElement.getElementsByTagName("latitude").item(0).getTextContent());
                    System.out.println("longitude : "
                            + eElement.getElementsByTagName("longitude").item(0).getTextContent());
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new XMLParser().readXML("test.xml");
    }
}

Any help would be appreciated.

【Comments】:

    Tags: java xml parsing


    【Solution 1】:

    Pass the parser the URI instead of a local XML file. It will download the document for you.

    Document doc = dBuilder.parse(uriString);
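
    A minimal sketch of that suggestion, assuming only the standard `DocumentBuilder.parse(String uri)` overload. The class name `UriParseSketch`, the helper `parseFromUri`, and the sample XML are made up for illustration. To stay runnable without network access, the demo addresses a temporary local file by its `file:` URI; an `http:` address would be passed in the same way.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class UriParseSketch {

    // Parses whatever the URI string points at (http:, https:, file:, ...) into a DOM tree.
    static Document parseFromUri(String uriString)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder().parse(uriString);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the remote feed: a temporary file with made-up sample content.
        Path tmp = Files.createTempFile("sample", ".xml");
        Files.write(tmp,
                "<response><result><jobtitle>Java Developer</jobtitle></result></response>"
                        .getBytes(StandardCharsets.UTF_8));
        try {
            Document doc = parseFromUri(tmp.toUri().toString());
            System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

    The key difference from the question's code is the argument type: `parse(File)` reads from disk, while `parse(String)` interprets the string as a URI and fetches it.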

      【Comments】:

      【Solution 2】:

      Here is a code snippet:

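      String url = "http://api.indeed.com/ads/apisearch?publisher=8397709210207872&q=java&l=austin%2C+tx&sort&radius&st&jt&start&limit&fromage&filter&latlong=1&chnl&userip=1.2.3.4&v=2";

      try {
          DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
          DocumentBuilder b = f.newDocumentBuilder();
          // parse(String) treats its argument as a URI and downloads the document
          Document doc = b.parse(url);
      } catch (ParserConfigurationException | SAXException | IOException e) {
          e.printStackTrace();
      }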
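
      Once the `Document` is available, the fields can be read the same way as in the question. The sketch below is one possible way to do that with a small helper that tolerates missing elements. The names `ResultFieldsSketch` and `textOf` are hypothetical, and the inline XML is invented sample data, not the real API response.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ResultFieldsSketch {

    // Text of the first descendant element with the given tag, or "" if there is none.
    static String textOf(Element parent, String tag) {
        Node first = parent.getElementsByTagName(tag).item(0);
        return first == null ? "" : first.getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Invented sample data; the second result deliberately lacks a <company>.
        String xml = "<response><results>"
                + "<result><jobtitle>Java Developer</jobtitle><company>Acme</company>"
                + "<city>Austin</city></result>"
                + "<result><jobtitle>QA Engineer</jobtitle><city>Austin</city></result>"
                + "</results></response>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        NodeList results = doc.getElementsByTagName("result");
        for (int i = 0; i < results.getLength(); i++) {
            // getElementsByTagName only returns elements, so the cast is safe.
            Element result = (Element) results.item(i);
            System.out.println(textOf(result, "jobtitle") + " | "
                    + textOf(result, "company") + " | " + textOf(result, "city"));
        }
    }
}
```

      Calling `item(0).getTextContent()` directly throws a `NullPointerException` when a tag is absent, which is why the helper checks for `null` first.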
      

        【Comments】:

        【Solution 3】:

        You need to handle the elements/nodes you want inside the for loop, so that it can scan the XML and find the node you are searching for.

        This reads the XML from the connection's input stream and builds a DOM structure:

                // `connection` is assumed to be an already opened java.net.URLConnection
                DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
                Document doc = builder.parse(connection.getInputStream());
                NodeList nodes = doc.getElementsByTagName("mode");

                for (int i = 0; i < nodes.getLength(); i++) {
                    Element element = (Element) nodes.item(i);

                    // Gets the tag from the XML and its content.
                    // Note: getElementsByTagName only searches the descendants of `element`.
                    NodeList nodeMode = element.getElementsByTagName("mode");
                    Element elemMode = (Element) nodeMode.item(0);
                }
        

        After that, if you want to pick out a value and parse it as an int or whatever type you need, you can do this:

        int currentMode = Integer.parseInt(elemMode.getFirstChild().getTextContent());
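
        The snippet above does not show where `connection` comes from. A possible sketch, assuming an `http:` or `https:` address and a document that contains a numeric `<mode>` element: the class `StreamParseSketch`, its method names, and the timeout values are illustrative choices, not part of the original answer. The demo in `main` parses an in-memory stream so that it runs without network access.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

class StreamParseSketch {

    // Builds a DOM tree from any input stream.
    static Document parse(InputStream in)
            throws ParserConfigurationException, SAXException, IOException {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
    }

    // Opens an HTTP(S) connection with timeouts and parses the response body.
    static Document fetch(String address)
            throws ParserConfigurationException, SAXException, IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        connection.setConnectTimeout(5_000); // arbitrary example values
        connection.setReadTimeout(5_000);
        try (InputStream in = connection.getInputStream()) {
            return parse(in);
        } finally {
            connection.disconnect();
        }
    }

    // Reads the first <mode> element as an int.
    static int readMode(Document doc) {
        Node mode = doc.getElementsByTagName("mode").item(0);
        if (mode == null) {
            throw new IllegalStateException("no <mode> element found");
        }
        return Integer.parseInt(mode.getTextContent().trim());
    }

    public static void main(String[] args) throws Exception {
        byte[] sample = "<status><mode> 3 </mode></status>".getBytes(StandardCharsets.UTF_8);
        Document doc = parse(new ByteArrayInputStream(sample));
        System.out.println("mode = " + readMode(doc));
    }
}
```

        Going through a connection instead of `parse(String uri)` is mainly useful when timeouts, request headers, or status-code checks are needed before parsing.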
        

          【Comments】:

          【Solution 4】:

          This is how I parse data directly from a URL of the form http://www.nbp.pl/kursy/xml/+something:

          static class Kurs {
              public float kurs_sprzedazy;
              public float kurs_kupna;
          }
          
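          // Must be initialized before getData is called, e.g. with
          // DocumentBuilderFactory.newInstance().newDocumentBuilder()
          private static DocumentBuilder dBuilder;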
          
          private static Kurs getData(String filename, String currency) throws Exception {
              Document doc = dBuilder.parse("http://www.nbp.pl/kursy/xml/"+filename+".xml");
          
              doc.getDocumentElement().normalize();
              NodeList nList = doc.getElementsByTagName("pozycja");
          
              for(int i = 0; i < nList.getLength(); i++) {
                  Element nNode = (Element)nList.item(i);
                  if(nNode.getElementsByTagName("kod_waluty").item(0).getTextContent().equals(currency)) {
                      Kurs kurs = new Kurs();
                      String data = nNode.getElementsByTagName("kurs_sprzedazy").item(0).getTextContent();
                      data = data.replace(',', '.'); 
                      kurs.kurs_sprzedazy = Float.parseFloat(data);
                      data = nNode.getElementsByTagName("kurs_kupna").item(0).getTextContent();
                      data = data.replace(',', '.');
                      kurs.kurs_kupna = Float.parseFloat(data);
                      return kurs;
                  }
              }
              return null;
          }
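
          The code above converts decimal-comma values with `replace(',', '.')` followed by `Float.parseFloat`. As an alternative sketch, not the answer author's method, the standard `DecimalFormat` can be told that the decimal separator is a comma, and `BigDecimal` avoids binary floating-point rounding for exchange rates. The class `DecimalCommaSketch` and the method `parseRate` are hypothetical names.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

class DecimalCommaSketch {

    // Parses text such as "4,1234" (comma as decimal separator) into a BigDecimal.
    static BigDecimal parseRate(String text) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(',');
        DecimalFormat format = new DecimalFormat("0.#", symbols);
        format.setGroupingUsed(false);   // a comma is never a thousands separator here
        format.setParseBigDecimal(true); // return BigDecimal instead of Double/Long

        String trimmed = text.trim();
        ParsePosition pos = new ParsePosition(0);
        Number parsed = format.parse(trimmed, pos);
        // Reject input that is not a number or has trailing garbage.
        if (!(parsed instanceof BigDecimal) || pos.getIndex() != trimmed.length()) {
            throw new NumberFormatException("Not a decimal-comma number: " + text);
        }
        return (BigDecimal) parsed;
    }

    public static void main(String[] args) {
        System.out.println(parseRate("4,1234"));
    }
}
```

          If `float` fields are kept as in `Kurs`, the result can still be narrowed with `parseRate(data).floatValue()`.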
          

          【Comments】:
