【Question title】: CasperJS doesn't work as expected on a Windows machine
【Posted】: 2017-05-15 18:52:39
【Question】:

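I have a CasperJS script that gives the desired result when I run it on a Linux server, but it does not work when I run it on my laptop.

How should I debug this? Log from the working run: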

[info] [phantom] Starting...
[info] [phantom] Running suite: 3 steps
[debug] [phantom] opening url: http://caspertest.grsrv.com/, HTTP GET
[debug] [phantom] Navigation requested: url=http://caspertest.grsrv.com/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] Navigation requested: url=https://caspertest.grsrv.com/my_app, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] Navigation requested: url=https://caspertest.grsrv.com/my_app/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "https://caspertest.grsrv.com/my_app/"
[debug] [phantom] Navigation requested: url=https://caspertest.grsrv.com/my_app/#/auth, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "https://caspertest.grsrv.com/my_app/#/auth"
[debug] [phantom] Successfully injected Casper client-side utilities
[info] [phantom] Step anonymous 2/3 https://caspertest.grsrv.com/my_app/#/auth (HTTP 200)
[info] [remote] attempting to fetch form element from selector: 'form'
[debug] [remote] Set "null" field value to test
[debug] [remote] Set "null" field value to ****
[debug] [phantom] Capturing page to /home/grsrvadmin/gs/casper/ss.png
[info] [phantom] Capture saved to /home/grsrvadmin/gs/casper/ss.png
[debug] [phantom] Mouse event 'mousedown' on selector: input[id="loginButton"]
[debug] [phantom] Mouse event 'mouseup' on selector: input[id="loginButton"]
[debug] [phantom] Mouse event 'click' on selector: input[id="loginButton"]
[info] [phantom] Step anonymous 2/3: done in 1556ms.
[info] [phantom] Step _step 3/3 https://caspertest.grsrv.com/my_app/#/auth (HTTP 200)
[info] [phantom] Step _step 3/3: done in 1569ms.
[debug] [phantom] Navigation requested: url=https://caspertest.grsrv.com/my_app/#/agreement/r8moskcfv7c80gpcd40fl12nmpf9e0nb, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "https://caspertest.grsrv.com/my_app/#/agreement/r8moskcfv7c80gpcd40fl12nmpf9e0nb"
[debug] [phantom] url changed to "https://caspertest.grsrv.com/my_app/#/agreement/r8moskcfv7c80gpcd40fl12nmpf9e0nb"
[info] [phantom] waitFor() finished in 217ms.
[info] [phantom] Step anonymous 4/4 https://caspertest.grsrv.com/my_app/#/agreement/r8moskcfv7c80gpcd40fl12nmpf9e0nb (HTTP 200)
[debug] [phantom] Mouse event 'mousedown' on selector: input[id="aggr_actionAccept"]
[debug] [phantom] Mouse event 'mouseup' on selector: input[id="aggr_actionAccept"]
[debug] [phantom] Mouse event 'click' on selector: input[id="aggr_actionAccept"]
[info] [phantom] Step anonymous 4/4: done in 1813ms.
[info] [phantom] Done 4 steps in 1826ms
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "about:blank"

Log on the Windows machine:

[info] [phantom] Starting...
[info] [phantom] Running suite: 3 steps
[debug] [phantom] opening url: http://caspertest.grsrv.com/, HTTP GET
[debug] [phantom] Navigation requested: url=http://caspertest.grsrv.com/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] Navigation requested: url=https://caspertest.grsrv.com/my_app, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] Navigation requested: url=https://caspertest.grsrv.com/my_app/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "https://caspertest.grsrv.com/my_app/"
[debug] [phantom] Navigation requested: url=https://caspertest.grsrv.com/my_app/#/auth, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "https://caspertest.grsrv.com/my_app/#/auth"
[debug] [phantom] Successfully injected Casper client-side utilities
[info] [phantom] Step anonymous 2/3 https://caspertest.grsrv.com/my_app/#/auth (HTTP 200)
[info] [remote] attempting to fetch form element from selector: 'form'
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "about:blank"

Script:

var casper = require('casper').create ({
  waitTimeout: 60000,
  stepTimeout: 60000,
  verbose: true,
  logLevel: "debug",
  viewportSize: {
    width: 1366,
    height: 768
  },
  pageSettings: {
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:50.0) Gecko/20100101 Firefox/50.0",
    "loadImages": true,
    "loadPlugins": true,
    "webSecurityEnabled": false,
    "ignoreSslErrors": true
  },
  onWaitTimeout: function() {
    casper.echo('Wait TimeOut Occured');
  },
  onStepTimeout: function() {
    casper.echo('Step TimeOut Occured');
  }
});

casper.start('http://caspertest.grsrv.com/', function() {
    this.fillSelectors('form', {
        'input[id="userName"]': 'test',
        'input[id="userPassword"]': 'test',
    }, false);
    this.capture('ss.png');
    this.click('input[id="loginButton"]')
});


casper.waitForSelector('#aggr_actionAccept', function() {
    this.click('input[id="aggr_actionAccept"]')
});

This is how I run it:

casperjs --ignore-ssl-errors=true test.js

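The resource is hosted at the client's site; on the Windows machine I use a VPN to reach it in the browser. The Linux machine on which the script works is at the client's site itself.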

【Discussion】:

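  • How about adding this.echo(this.getHTML()); this.capture('before_fill.png'); before this.fillSelectors? Maybe the page looks different, or loads more slowly, from the Windows box?
  • <html><head></head><body></body></html> is the response from this.getHTML(). How can I check which options casperjs is running with? I want to know whether it is really ignoring SSL errors. This matters because the site shows an SSL certificate problem when I visit it in a browser. Sorry for the late reply.
  • Yes, I am using that option. I mentioned it in my question.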
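
The diagnostic suggested in the first comment could look roughly like the sketch below. It is written as a plain function that receives the casper instance, so it can also be exercised outside CasperJS. The name inspectBeforeFill and the returned report object are illustrative assumptions; getHTML(), capture(), exists() and echo() are existing CasperJS methods. An empty body or a missing form suggests that the page itself never loaded, rather than that the selectors are wrong.

```javascript
// Hypothetical helper (not part of CasperJS): report what the page looks like
// right before fillSelectors() runs. `casper` is the instance returned by
// require('casper').create(); `formSelector` is the selector about to be filled.
function inspectBeforeFill(casper, formSelector) {
    var html = casper.getHTML();           // HTML of the currently loaded page
    casper.capture('before_fill.png');     // screenshot for visual comparison
    var report = {
        htmlLength: html.length,
        formFound: casper.exists(formSelector)
    };
    casper.echo('HTML length: ' + report.htmlLength +
                ', form found: ' + report.formFound);
    return report;
}
```

Inside the casper.start() callback this could be called as inspectBeforeFill(this, 'form') just before this.fillSelectors(...), skipping the fill when formFound is false.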

Tags: javascript web-scraping phantomjs casperjs


【Solution 1】:

Add a resource.error event handler:
// Log every resource that fails to load, with its error code and description.
casper.on("resource.error", function(resourceError){
    console.log('Unable to load resource (#' + resourceError.id + ' URL: ' + resourceError.url + ')');
    console.log('Error code: ' + resourceError.errorCode + '. Description: ' + resourceError.errorString);
});
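
To make the handler's output easier to act on, the numeric code can be mapped to a short hint. The sketch below is a hypothetical helper, not part of CasperJS or PhantomJS: the name describeResourceError and the hint table are assumptions, and the table covers only a few codes. PhantomJS reports Qt network error codes, in which 6 corresponds to an SSL handshake failure.

```javascript
// Hypothetical lookup table: a few Qt network error codes as surfaced by PhantomJS.
var ERROR_HINTS = {
    1: 'connection refused',
    3: 'host not found',
    4: 'connection timed out',
    6: 'SSL handshake failed: check --ssl-protocol and --ignore-ssl-errors'
};

// Build a one-line description from a resource.error payload
// (fields id, url, errorCode, errorString, as used in the handler above).
// Falls back to the engine's own errorString for codes not in the table.
function describeResourceError(resourceError) {
    var hint = ERROR_HINTS[resourceError.errorCode] || resourceError.errorString;
    return 'Resource #' + resourceError.id + ' (' + resourceError.url + ')' +
           ' failed with code ' + resourceError.errorCode + ': ' + hint;
}
```

Inside the handler, console.log(describeResourceError(resourceError)) would print the combined line.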

This appears to be a known PhantomJS bug (fixed in the 2.5 beta).
You can download the PhantomJS 2.5 beta from this page.

See also:
CasperJS/PhantomJS doesn't load https page
PhantomJS failing to open HTTPS site

You also need to add the following callbacks to catch all errors:
casper
.on("error", function(msg){ this.echo("error: " + msg, "ERROR") })
.on("page.error", function(msg, trace){ this.echo("Page Error: " + msg, "ERROR") })
.on("remote.message", function(msg){ this.echo("Info: " + msg, "INFO") });

【Discussion】:

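  • Yes, I tried taking a screenshot without submitting the form, but I get a blank page. So I printed the HTML and got <html><head></head><body></body></html>. In a normal browser I get a security error, which I ignore, and then I can log in normally.
  • My waitTimeout and stepTimeout were 60 seconds before; later I also tried 120 seconds. No gain. I added the callbacks you mentioned and got nothing except: error: CasperError: Errors encountered while filling form: form not found
  • Sorry, I couldn't check because it was the weekend. I'll update with my findings in a few minutes. By the way, I have already tried --ssl-protocol=any --ignore-ssl-errors=true
  • It still doesn't work, but your event handler gave the reason: Error code: 6. Description: SSL handshake failed. Ideally --ssl-protocol=any should fix it, but it doesn't for me. phantomjs --version: 2.1.1, casperjs --version: 1.1.0-beta5
  • It didn't work :(, I still get Error code: 6. Description: SSL handshake failed. That is expected, since --ssl-protocol=any didn't help. I really feel that casper or phantom is ignoring this option. Thank you, igor, for bearing with me. :)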