【Question title】: Extract td and title tags from HTML content in Android?
【Posted】: 2017-09-05 22:29:09
【Question description】:

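I have HTML content in a String variable named content, as shown below. I want to extract the title tag from this HTML string. To fetch the content I am using the status() method below, which uses HttpClient.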

String content="<html>
<head>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> 
<META HTTP-EQUIV="Refresh" CONTENT="300">
<title>Relay Control  - Cabin + Conference Logger</title>
</head>
<tr>
<td valign=top width="17%" height="100%">
<table width="100%" height="100%" align=center border=0 cellspacing=1 cellpadding=0>
    <tr><td valign=top bgcolor="#F4F4F4">
    <table width="100%" cellpadding=1 cellspacing=5>

    <tr><td align=center>

    <table><tr><td><a href="http://www.digital-loggers.com/1P.html"><img src="logo.gif" width=195 height=65 border=0 alt="Digital Loggers, Inc."></a></td>

    <td><b><font size=-1>Ethernet Power Controller</font></b></td></tr></table>
    <hr>
    </td></tr>



<tr><td nowrap><b><a href="/index.htm">Relay Control</a></b></td></tr>
<tr><td nowrap><b><a href="/admin.htm">Setup</a></b></td></tr>
<tr><td nowrap><b><a href="/script.htm">Scripting</a></b></td></tr>


<tr><td nowrap><b><a href="/rtc.htm">Date/Time</a></b></td></tr>
<tr><td nowrap><b><a href="/serial.htm">Serial Ports</a></b></td></tr>

<tr><td nowrap><b><a href="/ap.htm">AutoPing</a></b></td></tr>
<tr><td nowrap><b><a href="/syslog.htm">System Log</a></b></td></tr>
<tr><td nowrap><b><a href="/logout">Logout</a></b></td></tr>
<tr><td nowrap><b><a href="/support.htm">Support</a></b></td></tr>
<tr><td nowrap><b><a href="/help/">Help</a></b></td></tr>




</body>
</html>
";

Now I want to extract the title tag from this HTML content. I am using the method below, but I cannot get the title.

public static String status() {


        StringBuffer stringBuffer = new StringBuffer("");
        BufferedReader bufferedReader = null;
        try {
            HttpClient httpClient = new DefaultHttpClient();
            HttpGet httpGet = new HttpGet();

            URI uri = new URI("http://10.1.1.82/index.htm");
            httpGet.setURI(uri);
            httpGet.addHeader(BasicScheme.authenticate(
                    new UsernamePasswordCredentials("admin", "kirti123"),
                    HTTP.UTF_8, false));

            HttpResponse httpResponse = httpClient.execute(httpGet);

            HttpEntity entity = httpResponse.getEntity();
            Log.e("entity: ", "> " + entity);

            // Read the contents of an entity and return it as a String.
            content = EntityUtils.toString(entity);

            Log.e("content: ", "> " + content);


            //    String result = httpResponse.toString();
            htmlDocument = Jsoup.connect(content).get();
            htmlContentInStringFormat = htmlDocument.title();
            Log.e("title: ", "> " + htmlContentInStringFormat);

            InputStream inputStream = httpResponse.getEntity().getContent();
            bufferedReader = new BufferedReader(new InputStreamReader(
                    inputStream));

            String readLine = bufferedReader.readLine();
            while (readLine != null) {
                stringBuffer.append(readLine);
                stringBuffer.append("\n");
                readLine = bufferedReader.readLine();
            }
        } catch (Exception e) {
            // TODO: handle exception
        } finally {
            if (bufferedReader != null) {
                try {
                    bufferedReader.close();
                } catch (IOException e) {
                    // TODO: handle exception
                }
            }
        }
        return stringBuffer.toString();

    }

So please help me: how can I extract the title tag?

【Question comments】:

Tags: java android html


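【Solution 1】:
import java.util.ArrayList;
import java.util.List;

// Returns the text between each <tagName> and the following </tagName>.
// Only attribute-free opening tags are matched (<title>, but not <td nowrap>).
public String[] GetTags(String html, String tagName) {

    List<String> result = new ArrayList<String>();

    String tagStart = "<" + tagName + ">";
    String tagEnd = "</" + tagName + ">";
    int searchFrom = 0;
    while (true) {
        // Search from index 0 so a tag at the very start of the string is not skipped
        int startIndex = html.indexOf(tagStart, searchFrom);
        if (startIndex < 0) {
            break;
        }
        int contentStart = startIndex + tagStart.length();
        // Look for the closing tag only after the opening tag
        int endIndex = html.indexOf(tagEnd, contentStart);
        if (endIndex < 0) {
            break;
        }
        result.add(html.substring(contentStart, endIndex));
        searchFrom = endIndex + tagEnd.length();
    }

    // toArray() with no argument returns Object[]; casting that to String[]
    // throws ClassCastException, so pass a typed array instead
    return result.toArray(new String[0]);
}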

Try this.
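
The string search in this answer looks for the literal opening tag, so it can find <title> but not cells written with attributes such as <td nowrap> or <td valign=top>, which is what the page in the question uses. Below is a minimal sketch of one way around that using only java.util.regex. The class name TagExtractor and the method extract are hypothetical names chosen for illustration, and the sketch assumes that elements of the same name are not nested inside each other (the nested tables in the question's page would need a real HTML parser). If jsoup is already on the classpath, as the question suggests, note that Jsoup.connect(...) expects a URL; for HTML that is already held in a string, Jsoup.parse(content).title() is the usual call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
class TagExtractor {

    // Returns the inner content of every <tagName ...>...</tagName> pair.
    // Assumption: elements with the same name are not nested in each other.
    static List<String> extract(String html, String tagName) {
        String name = Pattern.quote(tagName);
        // "\\b[^>]*" allows attributes in the opening tag, e.g. <td nowrap>,
        // while "\\b" stops "t" from matching "<title>".
        Pattern pattern = Pattern.compile(
                "<" + name + "\\b[^>]*>(.*?)</" + name + "\\s*>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher matcher = pattern.matcher(html);
        List<String> result = new ArrayList<String>();
        while (matcher.find()) {
            result.add(matcher.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Relay Control</title></head>"
                + "<body><table><tr><td nowrap><b>Setup</b></td></tr></table></body></html>";
        System.out.println(extract(html, "title"));
        System.out.println(extract(html, "td"));
    }
}
```

The returned td content still contains inner markup such as <b>...</b>; stripping that is left out of the sketch.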

【Comments】:
