【Question title】: HtmlUnit does not get content after click
【Posted】: 2016-11-23 19:33:39
【Question】:

I want to parse a website with HtmlUnit. The basic flow is:

WebClient client = new WebClient(BrowserVersion.CHROME);
client.waitForBackgroundJavaScript(5 * 1000);
HtmlPage page = client.getPage("http://www.exapmle.com"); //here it waits to run js code.

HtmlUnorderedList ul = (HtmlUnorderedList) page.getByXPath("//ul[contains(@class, 'class-name')]").get(0);
HtmlListItem li = (HtmlListItem) ul.getChildNodes().get(1); // I want to click li and get result page. But it takes a little time to execute.

li.click();

client.waitForBackgroundJavaScript(5 * 1000); // This call does not do what I want.

Afterwards, when I inspect the page, I find that its content has not changed.

How can I get the correct result page?

Thanks.

【Comments】:

    Tags: java click htmlunit


    【Solution 1】:

    You can try polling until some JavaScript-driven condition becomes true:

    import java.util.concurrent.TimeUnit;

    int attempts = 20;
    int pollMillis = 500;
    boolean success = false;
    for (int i = 0; i < attempts && !success; i++) {
        TimeUnit.MILLISECONDS.sleep(pollMillis);
        // someJavascriptCondition is a placeholder for your own check
        if (someJavascriptCondition) {
            success = true;
        }
    }
    if (!success) {
        throw new RuntimeException(String.format("Condition not met after %d millis", attempts * pollMillis));
    }
    

    A similar technique is discussed here.
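The polling idea in this answer can be factored into a small reusable helper. The following is only a minimal sketch: the class name `Poller` and the method `pollUntil` are hypothetical, and the condition in `main` is a timer-based stand-in. With HtmlUnit, the lambda would presumably inspect the page instead, for example checking whether a particular element has appeared.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit.
final class Poller {

    /**
     * Checks the condition up to `attempts` times, sleeping `pollMillis`
     * before each check. Returns true as soon as the condition holds,
     * false if every attempt fails or the thread is interrupted.
     */
    public static boolean pollUntil(BooleanSupplier condition, int attempts, long pollMillis) {
        for (int i = 0; i < attempts; i++) {
            try {
                TimeUnit.MILLISECONDS.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
            if (condition.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 120 ms after start.
        boolean ok = pollUntil(() -> System.currentTimeMillis() - start >= 120, 20, 50);
        if (!ok) {
            throw new RuntimeException("Condition not met after " + (20 * 50) + " millis");
        }
        System.out.println("Condition met");
    }
}
```

Passing the check as a `BooleanSupplier` keeps the retry logic separate from whatever signals that the page has finished updating.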

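    【Comments】:

    • I don't have any such JavaScript condition :/
    • Sure you do. Check whether a spinner image has stopped, whether a div has been updated, and so on.
    【Solution 2】: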
    WebClient client = new WebClient();
    HtmlPage page = client.getPage("http://www.exapmle.com");
    client.waitForBackgroundJavaScript(5 * 1000);
    Thread.sleep(10 * 1000); // wait 10 seconds
    HtmlUnorderedList ul = (HtmlUnorderedList) page.getByXPath("//ul[contains(@class, 'class-name')]").get(0);
    HtmlListItem li = (HtmlListItem) ul.getChildNodes().get(1); // the item to click; the result takes a little time to load

    li.click();

    client.waitForBackgroundJavaScript(5 * 1000);
    Thread.sleep(10 * 1000); // wait another 10 seconds
    

    Using Thread.sleep() instead of waitForBackgroundJavaScript worked for me!

    【Comments】:

      【Solution 3】:

      You can use JavaScriptJobManager to check how many JavaScript jobs have not yet finished. Try the following code after calling click().

      JavaScriptJobManager manager = page.getEnclosingWindow().getJobManager();
      while (manager.getJobCount() > 0) {
          System.out.println("Jobs remaining: " + manager.getJobCount());
          Thread.sleep(1000);
      }
      

      You may want to add another way to end the while loop in case your JavaScript jobs never finish. Personally, I started terminating jobs manually:

      JavaScriptJob job = manager.getEarliestJob();
      System.out.println("Stopping job: " + job.getId());
      manager.stopJob(job.getId());
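One way to give the wait loop a second exit is a deadline. The sketch below is an illustration under assumptions, not HtmlUnit API: `BoundedJobWait` and `waitForJobs` are hypothetical names, and the job count is supplied through an `IntSupplier` so the example runs without HtmlUnit. In real code one would presumably pass `manager::getJobCount`.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not part of HtmlUnit.
final class BoundedJobWait {

    /**
     * Waits until pendingJobs reports 0 or timeoutMillis elapses,
     * whichever comes first. Returns the number of jobs still pending
     * when the wait ended (0 means everything finished in time).
     */
    public static int waitForJobs(IntSupplier pendingJobs, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        int remaining = pendingJobs.getAsInt();
        while (remaining > 0 && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                break;
            }
            remaining = pendingJobs.getAsInt();
        }
        return remaining;
    }

    public static void main(String[] args) {
        // Fake job counter that drops by one on every poll.
        int[] fakeJobs = {3};
        int left = waitForJobs(() -> Math.max(0, fakeJobs[0]--), 2_000, 10);
        System.out.println("Jobs left: " + left);
    }
}
```

A nonzero return value tells the caller that the timeout was hit, at which point stopping the leftover jobs manually, as shown above, is one option.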
      

      Hope this helps.

      【Comments】:
