How can I slow down cURL requests when rotating proxies?

Asked: 2018-11-02 13:49:54

Tags: php curl web-scraping https proxy

I am using cURL with proxy rotation:

$url = 'https://www.stubhub.com/';
$proxiesArray = array();
$curl = curl_init();
for ($i = 0; $i <= count($proxiesArray) - 1; $i++) {

    //CURL options.
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($curl, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
    curl_setopt($curl, CURLOPT_HTTPPROXYTUNNEL, TRUE);
    curl_setopt($curl, CURLOPT_PROXY, $proxiesArray[$i]);
    curl_setopt($curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
    curl_setopt( $curl, CURLOPT_AUTOREFERER, TRUE );
    curl_setopt( $curl, CURLOPT_HEADER, FALSE );
    curl_setopt( $curl, CURLOPT_CONNECTTIMEOUT, 0 );
    curl_setopt( $curl, CURLOPT_TIMEOUT, 0 );
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, TRUE );
    curl_setopt( $curl, CURLOPT_URL, trim($url) );
    curl_setopt($curl, CURLOPT_REFERER, trim($url));
    curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, TRUE );
    curl_setopt($curl, CURLOPT_VERBOSE, TRUE);

    //CURL info.
    $data = curl_exec( $curl );
    $info = curl_getinfo( $curl );
    $error = curl_error( $curl );
    $all = array($data, $info, $error);

    //If success.
    if (empty($error))  {
        echo '<pre>';
        print_r($all);
        echo '</pre>';
        break;
    }

    //Wait for 2 seconds.
    sleep(2);
}
curl_close( $curl );

But I get redirected to a reCAPTCHA page with this message:

Due to high volume of activity from your computer, our anti-robot software has blocked your access to stubhub.com. Please solve the puzzle below and you will immediately regain access.

To slow the requests down, I tried:

curl_setopt($curl,CURLOPT_MAX_RECV_SPEED_LARGE,10);

And also:

curl_setopt($curl, CURLOPT_PROGRESSFUNCTION, function() {
    sleep(2);
    return 0;
});

But I still get the same message. How can I slow the process down so that it looks like a real request made by a browser?

1 answer:

Answer 0 (score: 0)

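I think your problem is caused by something else.

To make requests that look like a browser's, you should send browser-style headers with each request.

For example, I suggest adding a User-Agent to your code and changing it on every single request.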

Example user agent: User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0
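
A minimal sketch of that suggestion, assuming the loop from the question. The helper names (pickUserAgent, browserLikeHeaders, applyBrowserIdentity) and the User-Agent pool are hypothetical and only illustrate the idea. CURLOPT_USERAGENT and CURLOPT_HTTPHEADER are standard PHP cURL options. Sending these headers makes the request look more like a browser's, but it does not guarantee that the site's anti-bot check will let it through.

```php
<?php
// Hypothetical pool of User-Agent strings; maintain your own list.
$userAgents = [
    'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
];

// Return one User-Agent chosen at random from the pool.
function pickUserAgent(array $pool): string
{
    return $pool[array_rand($pool)];
}

// Headers a desktop browser typically sends; the values are illustrative.
function browserLikeHeaders(): array
{
    return [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.5',
    ];
}

// Apply a fresh User-Agent and the headers to an existing cURL handle.
// Returns the User-Agent that was chosen so it can be logged.
function applyBrowserIdentity($curl, array $pool): string
{
    $userAgent = pickUserAgent($pool);
    curl_setopt($curl, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($curl, CURLOPT_HTTPHEADER, browserLikeHeaders());
    return $userAgent;
}
```

Inside the for loop from the question, calling applyBrowserIdentity($curl, $userAgents) before curl_exec($curl) would give each proxy attempt a different User-Agent.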