【Question title】: Doesn't CasperJS support redirects with JavaScript?
【Posted】: 2016-06-04 21:19:23
【Question description】:

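The site I'm scraping has a form that is submitted with JavaScript. When the button is clicked, an Ajax request is sent, and once the response arrives the page redirects with location.replace: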

$("#send").click(function () {
    var vaaal = encodeURIComponent($("#questionask").val());
    $.ajax({
        url: '/__ajax_post.php',
        data: 'text=' + vaaal + '&news=1',
        cache: false,
        dataType: "xml",
        success: function (xml) {
            if ($(xml).find("error").text() == 1) {
                $.unblockUI();
                alert("error");
                return false;
            }
            if ($(xml).find("num").text()) {
                window.location.replace('/question/' + $(xml).find("num").text());
                return false;
            }
        }
    });
});

After that we are redirected to the site-name/question/2098147 URL, and the page loads with the data I need.

I use this code:

spooky.start('http://sprosi-putina.ru/', function fillForm() {
    this.fill('form[name="askmore"]', { questionask: 'fdgs'}, false);
});

spooky.then(function clickSend() {
    this.mouse.click("#send");
});

spooky.then(function readAnswer() {
    this.emit('answerisready', this.evaluate(function() {
        return document.querySelector('.answer').textContent;
    }));
});

spooky.run();

But after PhantomJS clicks the button, something goes wrong:

[info] [phantom] Starting...
[info] [phantom] Running suite: 4 steps
[debug] [phantom] opening url: http://sprosi-putina.ru/, HTTP GET
[debug] [phantom] Navigation requested: url=http://sprosi-putina.ru/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "http://sprosi-putina.ru/"
[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/html/r20160601/r20151006/zrt_lookup.html, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/ads?client=ca-pub-4784365547122494&output=html&h=90&slotname=2587861763&adk=581066870&w=728&lmt=1465063498&ea=0&flash=0&url=http%3A%2F%2Fsprosi-putina.ru%2F&wgl=0&dt=1465074178527&bdt=301&idt=323&shv=r20160601&cbv=r20151006&saldr=sb&correlator=2508528726017&frm=20&ga_vid=1561585347.1465074179&ga_sid=1465074179&ga_hid=1525633997&ga_fc=0&pv=2&iag=0&icsg=1018&dssz=7&mdo=0&mso=0&u_tz=180&u_his=1&u_java=0&u_h=768&u_w=1024&u_ah=768&u_aw=1024&u_cd=32&u_nplug=0&u_nmime=0&dff=verdana&dfs=16&adx=0&ady=0&biw=400&bih=300&eid=20040014%2C575144605%2C4087318&oid=3&rx=0&eae=4&fc=216&pc=1&brdim=0%2C0%2C0%2C0%2C1024%2C0%2C0%2C0%2C400%2C300&vis=1&rsz=%7C%7C%7C&abl=CS&ppjl=u1&pfx=0&fu=1040&bc=1&ifi=1&dtd=418, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Successfully injected Casper client-side utilities
[info] [phantom] Step fillForm 2/4 http://sprosi-putina.ru/ (HTTP 200)
[info] [phantom] Step fillForm 2/4: done in 2384ms.
[info] [phantom] Step _step 3/5 http://sprosi-putina.ru/ (HTTP 200)
[info] [phantom] Step _step 3/5: done in 2405ms.
[info] [phantom] wait() finished waiting for 1000ms.
[info] [remote] attempting to fetch form element from selector: 'form[name="askmore"]'
attempting to fetch form element from selector: 'form[name="askmore"]'
[debug] [remote] Set "questionask" field value to fdgs
Set "questionask" field value to fdgs
[info] [phantom] Step clickSend 4/5 http://sprosi-putina.ru/ (HTTP 200)
[debug] [phantom] Navigation requested: url=http://sprosi-putina.ru/, type=LinkClicked, willNavigate=true, isMainFrame=true
[info] [phantom] Step clickSend 4/5: done in 3458ms.
[debug] [phantom] url changed to "http://sprosi-putina.ru/"
[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/html/r20160601/r20151006/zrt_lookup.html, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/ads?client=ca-pub-4784365547122494&output=html&h=90&slotname=2587861763&adk=581066870&w=728&lmt=1465063500&ea=0&flash=0&url=http%3A%2F%2Fsprosi-putina.ru%2F&wgl=0&dt=1465074180587&bdt=15&idt=9&shv=r20160601&cbv=r20151006&saldr=sb&correlator=2171302658049&frm=20&ga_vid=1561585347.1465074179&ga_sid=1465074181&ga_hid=862582667&ga_fc=0&pv=2&iag=0&icsg=1018&dssz=7&mdo=0&mso=0&u_tz=180&u_his=1&u_java=0&u_h=768&u_w=1024&u_ah=768&u_aw=1024&u_cd=32&u_nplug=0&u_nmime=0&dff=verdana&dfs=16&adx=0&ady=0&biw=400&bih=300&eid=20040014%2C575144605%2C828064225&oid=3&rx=0&eae=4&fc=216&pc=1&brdim=0%2C0%2C0%2C0%2C1024%2C0%2C0%2C0%2C400%2C300&vis=1&rsz=%7C%7C%7C&abl=CS&ppjl=u1&pfx=0&fu=1040&bc=1&ifi=1&dtd=31, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Successfully injected Casper client-side utilities
[info] [phantom] Step readAnswer 5/5 http://sprosi-putina.ru/ (HTTP 200)
null
[info] [phantom] Step readAnswer 5/5: done in 3681ms.
[info] [phantom] Done 5 steps in 3698ms

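As you can see from the log, instead of going to /question/<number> we first go to some Google ads and then end up on the same page we started from, the main page (http://sprosi-putina.ru/). Does CasperJS not handle JavaScript redirects correctly, or what?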

【Question comments】:

    Tags: javascript node.js web-scraping phantomjs casperjs


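    【Solution 1】:

    CasperJS does not pick up the redirect because it happens much later than the click: the Ajax response has to arrive first. If the page is dynamic, use one of the wait* functions, such as waitForSelector: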

    spooky.waitForSelector('.answer', function readAnswer() {
        this.echo(this.evaluate(function() {
            return document.querySelector('.answer').textContent;
        }));
    });
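
A slightly fuller sketch of the same idea, combining the question's steps with the wait. It assumes SpookyJS forwards the optional `onTimeout` and `timeout` arguments of CasperJS's `waitForSelector` unchanged. `registerSteps`, the `answertimeout` event name, and the 15000 ms value are illustrative and not from the original post.

```javascript
// Hypothetical helper: registers the steps on any object exposing the
// CasperJS-style start/then/waitForSelector/run methods, such as a
// SpookyJS instance.
function registerSteps(spooky) {
    spooky.start('http://sprosi-putina.ru/', function fillForm() {
        this.fill('form[name="askmore"]', { questionask: 'fdgs' }, false);
    });

    spooky.then(function clickSend() {
        this.mouse.click('#send');
    });

    // Poll until the redirected page has rendered `.answer`, instead of
    // reading the DOM right after the click.
    spooky.waitForSelector('.answer', function readAnswer() {
        this.emit('answerisready', this.evaluate(function () {
            return document.querySelector('.answer').textContent;
        }));
    }, function onTimeout() {
        // Report the current URL instead of letting the run exit silently.
        this.emit('answertimeout', this.getCurrentUrl());
    }, 15000); // assumed timeout in ms; the CasperJS default is 5000

    spooky.run();
}
```

With a timeout handler in place, a failed wait reports which URL the page was on, which helps tell "the redirect never happened" apart from "the redirect happened but `.answer` is missing".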
    

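    【Comments】:

    • That doesn't work. [error] [phantom] Wait timeout of 5000ms expired, exiting. Wait timeout of 5000ms expired, exiting.
    • This works with CasperJS 1.1-beta5 and PhantomJS 2.1.1. I get the '.answer' output.
    • Could you send me all the debug logs that CasperJS and PhantomJS print during execution?
    • I've added the output to the link. The encoding is a separate issue. Feel free to ask a new question.
    • It finally worked when I added viewportSize: { width: 1200, height: 1200 } to the casper options object. Strange...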
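
A minimal sketch of where the setting from the last comment might go when using SpookyJS. The ad request in the question's log reports `biw=400&bih=300`, which matches PhantomJS's default 400×300 viewport. A page laid out for a wider window may place the button differently at that size, which is one possible reason the larger viewport helped. `buildSpookyOptions`, the `child.transport` value, and the `waitTimeout` value are assumptions for illustration. `viewportSize`, `waitTimeout`, `logLevel`, and `verbose` are standard CasperJS options.

```javascript
// Hypothetical helper: builds the options object passed to `new Spooky(...)`.
function buildSpookyOptions() {
    return {
        child: { transport: 'http' }, // assumed transport
        casper: {
            logLevel: 'debug',
            verbose: true,
            // Larger viewport, as reported in the last comment
            viewportSize: { width: 1200, height: 1200 },
            // Assumed value: raises the default 5000 ms timeout of wait* calls
            waitTimeout: 15000
        }
    };
}
```

The result would be passed as the first argument, for example `new Spooky(buildSpookyOptions(), function (err) { /* register steps here */ })`.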