【Question title】: How to slow down cURL requests when rotating proxies?
【Posted】: 2019-04-06 18:30:29
【Question】:

I am using cURL with proxy rotation:

$url = 'https://www.stubhub.com/';
$proxiesArray = array();
$curl = curl_init();
for ($i = 0; $i <= count($proxiesArray) - 1; $i++) {

    //CURL options.
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($curl, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
    curl_setopt($curl, CURLOPT_HTTPPROXYTUNNEL, TRUE);
    curl_setopt($curl, CURLOPT_PROXY, $proxiesArray[$i]);
    curl_setopt($curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
    curl_setopt( $curl, CURLOPT_AUTOREFERER, TRUE );
    curl_setopt( $curl, CURLOPT_HEADER, FALSE );
    curl_setopt( $curl, CURLOPT_CONNECTTIMEOUT, 0 );
    curl_setopt( $curl, CURLOPT_TIMEOUT, 0 );
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, TRUE );
    curl_setopt( $curl, CURLOPT_URL, trim($url) );
    curl_setopt($curl, CURLOPT_REFERER, trim($url));
    curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, TRUE );
    curl_setopt($curl, CURLOPT_VERBOSE, TRUE);

    //CURL info.
    $data = curl_exec( $curl );
    $info = curl_getinfo( $curl );
    $error = curl_error( $curl );
    $all = array($data, $info, $error);

    //If success.
    if (empty($error))  {
        echo '<pre>';
        print_r($all);
        echo '</pre>';
        break;
    }

    //Wait for 2 seconds.
    sleep(2);
}
curl_close( $curl );

But I get redirected to a reCAPTCHA page that shows this message:

Due to high volume of activity from your computer, our anti-robot software has blocked your access to stubhub.com. Please solve the puzzle below and you will immediately regain access.

To slow down the requests, I tried:

curl_setopt($curl,CURLOPT_MAX_RECV_SPEED_LARGE,10);

And also:

curl_setopt($curl, CURLOPT_PROGRESSFUNCTION, function() {
    sleep(2);
    return 0;
});

But I get the same message. How can I slow the process down so that it looks like a real request coming from a browser?

【Comments】:

    Tags: php curl web-scraping https proxy


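    【Answer 1】:

    I think your problem comes from something else.

    To make a request look like it comes from a browser, you should send browser-like headers with it.

    For example, I suggest adding a User-Agent to your code and changing it on every request!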

    Example user agent: User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0
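
    A minimal sketch of that suggestion, assuming a small hand-written pool of User-Agent strings. The names USER_AGENTS, pickUserAgent, browserHeaders and applyBrowserIdentity are hypothetical helpers, not part of cURL, and the header values are only illustrative. It uses the standard CURLOPT_USERAGENT, CURLOPT_HTTPHEADER and CURLOPT_ENCODING options:

```php
<?php
// Hypothetical pool of User-Agent strings; replace with values of your own.
const USER_AGENTS = [
    'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:61.0) Gecko/20100101 Firefox/61.0',
];

/** Pick one User-Agent at random from the given pool. */
function pickUserAgent(array $agents = USER_AGENTS): string
{
    return $agents[array_rand($agents)];
}

/** Headers a browser commonly sends; the exact values here are assumptions. */
function browserHeaders(): array
{
    return [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.5',
    ];
}

/** Apply a random User-Agent and browser-like headers to a cURL handle. */
function applyBrowserIdentity($curl): string
{
    $agent = pickUserAgent();
    curl_setopt($curl, CURLOPT_USERAGENT, $agent);
    curl_setopt($curl, CURLOPT_HTTPHEADER, browserHeaders());
    // An empty string asks cURL to advertise every encoding it supports.
    curl_setopt($curl, CURLOPT_ENCODING, '');
    return $agent;
}
```

    Inside the loop from the question, calling applyBrowserIdentity($curl) just before curl_exec($curl) would give each proxy attempt a different User-Agent. Whether that is enough to avoid the anti-robot page depends on the site, so treat it as a starting point rather than a guaranteed fix.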

    【Comments】:
