Parsing an image in a DOM parser using jsoup on Android

【Question title】: Parse image in DOM parser using jsoup in Android
【Posted】: 2016-04-21 20:03:46
【Question】:

I am trying to fetch the RSS feed of this site:

http://www.phonearena.com/feed

Here is my DOMParser class:

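import java.net.MalformedURLException;
import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.jsoup.Jsoup;
import org.jsoup.select.Elements;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// RSSFeed and RSSItem are the asker's own classes (not shown here).
public class DOMParser {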
private RSSFeed _feed = new RSSFeed();

public RSSFeed parseXml(String xml) {

    URL url = null;
    try {
        url = new URL(xml);
    } catch (MalformedURLException e1) {
        e1.printStackTrace();
    }

    try {

        DocumentBuilderFactory dbf;
        dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();


        Document doc = db.parse(new InputSource(url.openStream()));
        doc.getDocumentElement().normalize();

        NodeList nl = doc.getElementsByTagName("item");
        NodeList itemChildren = null;
        Node currentItem = null;
        Node currentChild = null;
        int length = nl.getLength();

        for (int i = 0; i < length; i++) {
            currentItem = nl.item(i);
            RSSItem _item = new RSSItem();

            NodeList nchild = currentItem.getChildNodes();
            int clength = nchild.getLength();


            for (int j = 0; j < clength; j++) {

                currentChild = nchild.item(j);
                String theString = null;
                String nodeName = currentChild.getNodeName();

                theString = nchild.item(j).getFirstChild().getNodeValue();

                if (theString != null) {
                    if ("title".equals(nodeName)) {

                        _item.setTitle(theString);
                    }

                    else if ("description".equals(nodeName)) {

                        _item.setDescription(theString);

                        // Parse the html description to get the image url
                        String html = theString;
                        org.jsoup.nodes.Document docHtml = Jsoup
                                .parse(html);
                        Elements imgEle = docHtml.select("img");
                        _item.setImage(imgEle.attr("src"));
                    }

                    else if ("pubDate".equals(nodeName)) {


                        String formatedDate = theString.replace(" +0000",
                                "");
                        _item.setDate(formatedDate);
                    }

                }
            }


            _feed.addItem(_item);
        }

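    } catch (Exception e) {
        // Log the failure instead of silently discarding it,
        // so parsing errors are visible while debugging.
        e.printStackTrace();
    }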


    return _feed;
}
}

Everything works except the image that I am trying to extract with jsoup.

Can anyone tell me what I am doing wrong or what I am missing?

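【Question comments】:

What does the html string look like? Can you post the value of html at the line String html = theString;?
Make sure your HTML string actually contains the image (img) you want.
Thanks, everyone, for helping me. I will try what you suggested.

Tags: java android dom jsoup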

【Solution 1】:

The variable theString needs to be unescaped before it is passed to Jsoup.

else if ("description".equals(nodeName)) {
    _item.setDescription(theString);

    // Unescape then Parse the html description to get the image url
    Element imgEle = Jsoup.parse( //
            Parser.unescapeEntities( //
                  Parser.xmlParser().parseInput(theString, "").outerHtml(), //
                  true //
            )) //
            .select("img").first();

    if (imgEle != null) {
        _item.setImage(imgEle.attr("src"));
    }
}

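【Comments】:

Sorry for the late reply, Stephen; I was busy. I tried your solution, and everything is parsed except the image. Do you know what the problem might be?
"Everything is parsed except the image": what do you mean by that?
I mean that the title and date are parsed and displayed; only the image is not displayed.
How do you display the title, date, and image?
I use the DOMParser class shown above to get the data. The image is inside a CDATA section in the description tag, and I used a third-party library together with the jsoup library. Only the title and date get parsed, not the image, and that is the problem.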
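
Since the last comment says the image sits in a CDATA section inside the description tag, one thing worth checking is which string the DOM code actually hands to jsoup. The sketch below uses only JDK classes and a made-up sample item (the class name, the SAMPLE string, and the example.com URL are hypothetical, and the real feed's markup may differ). It compares getFirstChild().getNodeValue(), which is what the question's code reads, with getTextContent(). If anything, even a line break, precedes the CDATA section, the first child is only that leading text node and contains no img tag.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class DescriptionTextCheck {

    // Hypothetical sample item: a line break, then a CDATA section with the HTML.
    static final String SAMPLE =
            "<item><description>\n"
            + "<![CDATA[<img src=\"http://example.com/a.jpg\"/> Some text]]>"
            + "</description></item>";

    // Parses the XML string and returns its first <description> element.
    static Element firstDescription(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));
            return (Element) doc.getElementsByTagName("description").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    // What the question's code reads: the value of the first child node only.
    static String firstChildValue(String xml) {
        return firstDescription(xml).getFirstChild().getNodeValue();
    }

    // The concatenated text of all child text and CDATA nodes.
    static String fullText(String xml) {
        return firstDescription(xml).getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("first child: [" + firstChildValue(SAMPLE) + "]");
        System.out.println("full text:   [" + fullText(SAMPLE) + "]");
    }
}
```

If the two values differ for the real feed, passing the getTextContent() result to Jsoup.parse(...) instead of the first child's value may be enough for select("img") to find the image. Whether this is the actual cause in the question is an assumption that has to be checked against the feed itself.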