【Question title】:Selenium Crawler Timeout Issues C#
【Posted】:2015-05-23 20:08:16
【Question description】:

I am running the following code, and I get an error partway through execution.

  public void GetCategoriesSelenium() {
            string javascript = System.IO.File.ReadAllText(@"GetCategory.js");
            CrawlerWebSeleniumJS.ExecuteScript("var finished;");
            CrawlerWebSeleniumJS.ExecuteScript("var all_categories;");
            CrawlerWebSeleniumJS.ExecuteScript("finished = false;");
            CrawlerWebSeleniumJS.ExecuteScript("all_categories = [];");

            CrawlerWebSelenium.Manage().Timeouts().SetScriptTimeout(TimeSpan.FromDays(1));
            CrawlerWebSelenium.Manage().Timeouts().SetPageLoadTimeout(TimeSpan.FromDays(1));
            CrawlerWebSelenium.Manage().Timeouts().ImplicitlyWait(TimeSpan.FromDays(1));

            AddToConsole("CRAWLER: GET - Categories");

            try {
                CrawlerWebSeleniumJS.ExecuteScript(javascript);
                }
            catch {
                }

            int ready = 2;

            for (int i = 0; i < ready; i++) {
                try {
                    if (CrawlerWebSeleniumJS.ExecuteScript("return finished").ToString() == "True") {
                        i = i++ + ready++;
                        }
                    else {
                        ready++;
                        }
                    }
                catch {

                    }
                }
            AddToCatsTreeSelenium();
            }

$('.p-pstctgry-lnk-ctgry').each(function (i) {
    var idBits = this.id.split('_');
    var theId = idBits[1];
    var theTitle = this.text;
    var subcategories = [];
    //initiate ajax request for json results
    $.ajax({
        async: false,
        type: 'GET',
        dataType: 'json',
        url: 'URL REMOVED',
        data: {
            nodeType: 'cat',
            level1id: theId
        }
    }).done(function (theJSON1) {
        var thelength1 = Object.keys(theJSON1['items']).length;
        //loop through found subs
        for (var i = 0; i < thelength1; i++) {
            //start of next recursive block to copy and paste inside
            var subsubcategories = [];
            //initiate ajax request for sub json results
            $.ajax({
                async: false,
                type: 'GET',
                dataType: 'json',
                url: 'URL REMOVED',
                data: {
                    nodeType: 'cat',
                    level1id: theId,
                    level2id: theJSON1['items'][i]['id']
                }
            }).done(function (theJSON2) {
                var thelength2 = Object.keys(theJSON2['items']).length;
                for (var k = 0; k < thelength2; k++) {
                    //start of next recursive block to copy and paste inside
                    var subsubsubcategories = [];
                    //initiate ajax request for sub json results
                    if ((theJSON2['items'][k]['id'] != 'OFFER') && (theJSON2['items'][k]['id'] != 'WANTED')) {
                        $.ajax({
                            async: false,
                            type: 'GET',
                            dataType: 'json',
                            url: 'URL REMOVED',
                            data: {
                                nodeType: 'cat',
                                level1id: theId,
                                level2id: theJSON1['items'][i]['id'],
                                level3id: theJSON2['items'][k]['id']
                            }
                        }).done(function (theJSON3) {
                            var thelength3 = Object.keys(theJSON3['items']).length;
                            for (var l = 0; l < thelength3; l++) {
                                console.log('---' + theJSON3['items'][l]['value'] + ' ' + theJSON3['items'][l]['id']);
                                //store this subsub
                                subsubsubcategories.push({
                                    title: theJSON3['items'][l]['value'],
                                    id: theJSON3['items'][l]['id'],
                                    sub: ''
                                });
                            }
                            //end done theJSON
                        });
                    }
                    //end of next recursive block to copy and paste inside
                    console.log('--' + theJSON2['items'][k]['value'] + ' ' + theJSON2['items'][k]['id']);
                    //store this subsub
                    subsubcategories.push({
                        title: theJSON2['items'][k]['value'],
                        id: theJSON2['items'][k]['id'],
                        sub: subsubsubcategories
                    });
                }
                //end done theJSON
            });
            console.log('-' + theJSON1['items'][i]['value'] + ' ' + theJSON1['items'][i]['id']);
            //store this sub with -> subsub
            subcategories.push({
                title: theJSON1['items'][i]['value'],
                id: theJSON1['items'][i]['id'],
                sub: subsubcategories
            });
            //end of next recursive block to copy and paste inside

            //end sub loop
        }
        console.log('' + theTitle + ' ' + theId);
        //store this cat with -> sub -> subsub
        all_categories.push({
            title: theTitle,
            id: theId,
            sub: subcategories
        });
        console.log(all_categories);
        //end first json subcat loop
    });
    //end main cat scan loop
});
finished = true;

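The first block above is the method I run; the second block is the plain JavaScript that it executes through Selenium.

So, problem one: while the code runs, Selenium locks up. I can understand that, since the process takes about 4 minutes. After 60 seconds, though, it times out with this error:

The HTTP request to the remote WebDriver server for URL timed out after 60 seconds.

This is really annoying and locks up the system. I know a very quick and easy way around it (Thread.Sleep(300000)), but that is disgusting...

My thinking is that it is probably running one JavaScript query and waiting for it to finish, while I keep hammering Selenium with more JavaScript requests, which time out as expected.

Any other ideas?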

【Question discussion】:

    Tags: javascript c# selenium selenium-webdriver


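    【Solution 1】:

    The driver's constructor should have an overload that takes a TimeSpan specifying the timeout the .NET bindings' HTTP client uses when communicating with the remote end. Setting that to a suitably large value should be enough to let the operation complete.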
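
As a rough illustration of the overload described above, the sketch below builds a driver with a longer command timeout. The question does not say which browser is in use, so ChromeDriver is an assumption here, as are the 10-minute value and the helper name CrawlerDriverFactory; other driver classes in the .NET bindings offer a similar TimeSpan parameter.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

// Hypothetical helper, not part of the original post.
public static class CrawlerDriverFactory
{
    public static IWebDriver Create()
    {
        // Assumption: Chrome is the browser; the question does not say.
        ChromeDriverService service = ChromeDriverService.CreateDefaultService();
        ChromeOptions options = new ChromeOptions();

        // Timeout for each HTTP command sent from the .NET bindings to the
        // driver executable. The 60 seconds in the error message is the
        // default; 10 minutes is an arbitrary value chosen to cover a
        // script that runs for about 4 minutes.
        TimeSpan commandTimeout = TimeSpan.FromMinutes(10);

        return new ChromeDriver(service, options, commandTimeout);
    }
}
```

The returned driver can be cast to IJavaScriptExecutor and used where the question uses CrawlerWebSeleniumJS. The Manage().Timeouts() calls in the question can stay, but they are separate settings and do not replace this one.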

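    【Discussion】:

    • Isn't that what this code is doing? CrawlerWebSelenium.Manage().Timeouts().SetScriptTimeout(TimeSpan.FromDays(1)); CrawlerWebSelenium.Manage().Timeouts().SetPageLoadTimeout(TimeSpan.FromDays(1)); CrawlerWebSelenium.Manage().Timeouts().ImplicitlyWait(TimeSpan.FromDays(1));
    • No, not at all. None of those timeouts affects the HTTP communication between the language bindings and the so-called "remote end" implemented by the driver. I think the constructor overload is what you are looking for.
    • Cheers, mate. I'll dig into it a bit more and see what I can find.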