【Posted】: 2014-01-06 08:31:55
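【Question】:
I have written code that fetches the HTML content of a page as the response, using HtmlUnit. It does not work as expected for some specific URLs, for example
https://communities.netapp.com/welcome
I can retrieve the content of the first page, but I do not get the content that only appears after clicking the "Load more" button.
Here is my code: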
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.net.MalformedURLException;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Sample {
    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {
        String url = "https://communities.netapp.com/welcome";

        // Configure a headless browser that runs JavaScript and follows redirects.
        WebClient client = new WebClient(BrowserVersion.INTERNET_EXPLORER_9);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(true);
        client.getOptions().setCssEnabled(true);
        client.getOptions().setUseInsecureSSL(true);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.setAjaxController(new NicelyResynchronizingAjaxController());

        // Load the page; this only yields the initially rendered content.
        HtmlPage page = client.getPage(url);

        // Dump the page's visible text to a file.
        Writer output = null;
        String text = page.asText();
        File file = new File("D://write6.txt");
        output = new BufferedWriter(new FileWriter(file));
        output.write(text);
        output.close();
        System.out.println("Your file has been written");

        // Other ways of reading the page that were tried:
        // System.out.println("as Text ==" +page.asText());
        // System.out.println("asXML == " +page.asXml());
        // System.out.println("text content ==" +page.getTextContent());
        // System.out.println(page.getWebResponse().getContentAsString());
    }
}
Any suggestions?
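A side note on the file-writing step in the code above: the writer is closed by hand, so the file handle is not released if `write` throws, and `FileWriter` uses the platform default charset. Below is a minimal sketch of a more defensive variant, assuming UTF-8 output is acceptable. The class name `PageTextSaver` and the method `saveText` are hypothetical names chosen for illustration; they are not part of HtmlUnit or of the original code, and the sketch does not address the "Load more" problem itself.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PageTextSaver {

    // Hypothetical helper: writes the given text to the target file as UTF-8.
    // try-with-resources closes the writer even if write() throws.
    public static void saveText(Path target, String text) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the result of page.asText(); a temp file is used here
        // instead of the hard-coded D: path from the question.
        String text = "example page text";
        Path target = Files.createTempFile("page", ".txt");
        saveText(target, text);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

In the original program, the call would presumably look like `PageTextSaver.saveText(Paths.get("D:/write6.txt"), page.asText());`, with `java.nio.file.Paths` imported.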
【Discussion】:
Tags: java javascript ajax parsing htmlunit