Scraping Facebook with Goutte and Symfony2 - login cookies error

Time: 2015-11-01 01:13:35

Tags: php facebook symfony curl goutte

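I am using Goutte with Symfony2 to scrape web data. I am simply trying to log in to Facebook and return the HTML of the page I land on after logging in.

Here is my code: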

<?php


namespace junk\scraperBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\Controller;
use Goutte\Client;


class ThingController extends Controller
{
    public function somethingAction($something)
    {
        // make a request to an external site
        $client = new Client();
        $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_SSL_VERIFYHOST, FALSE);
        $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_SSL_VERIFYPEER, FALSE);
        $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_RETURNTRANSFER, TRUE);
        $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_FOLLOWLOCATION, TRUE);
        $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_COOKIESESSION, TRUE);
        $client->setHeader('User-Agent', "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36");
        $crawler = $client->request('GET', 'https://www.facebook.com');

        // select the form and fill in some values
        $form = $crawler->selectButton('Log In')->form();
        $form['email'] = 'email@junk.com';
        $form['pass'] = 'password';

        // submit that form
        $crawler = $client->submit($form);
        echo $crawler->html();

        return $this->render('scraperBundle:Thing:index.html.twig');
    }
} // END class ThingController

The problem is that I get this error:

Cookies Required Cookies are not enabled on your browser. Please enable cookies in your browser preferences to continue.

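I think the problem is in the cURL configuration options. With only the CURLOPT_SSL_VERIFYHOST and CURLOPT_SSL_VERIFYPEER options set, I can successfully load other HTTPS pages such as GitHub, but I cannot figure out how to make it work for Facebook.

Any suggestions?

Thanks!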

1 Answer:

Answer 0: (score: 0)

You can try it like this:

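public function somethingAction($something)
{
    // one file per run, used both to load cookies and to store them
    $cookie_file = '/tmp/' . uniqid() . 'cookie';

    $client = new Client();

    // ...

    // read previously stored cookies from the file
    $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_COOKIEFILE, $cookie_file);
    // write received cookies back to the same file
    $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_COOKIEJAR, $cookie_file);

    // ...
}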
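
To see what these two options do outside of Goutte, here is a minimal sketch in plain PHP cURL. It assumes the curl extension is loaded; the helper name buildCookieOptions is hypothetical and not part of Goutte or Symfony. No request is sent, it only shows the same cookie file being used for both loading and saving.

```php
<?php

// Hypothetical helper for illustration only (not a Goutte or Symfony API):
// returns cURL options that load cookies from, and save cookies to, one file.
function buildCookieOptions(string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        // Turns on cURL's cookie handling and loads any cookies stored in the file.
        CURLOPT_COOKIEFILE     => $cookieFile,
        // Cookies known to the handle are written here when the handle is released.
        CURLOPT_COOKIEJAR      => $cookieFile,
    ];
}

// Create an empty temporary file to act as the cookie store.
$cookieFile = tempnam(sys_get_temp_dir(), 'cookie');
$options = buildCookieOptions($cookieFile);

// Apply the options to a handle; no URL is set and nothing is requested here.
$ch = curl_init();
$applied = curl_setopt_array($ch, $options);

// Release the handle so cURL can flush its cookies to the jar file.
unset($ch);
```

Whether this is enough for Facebook is not something I can confirm; the point is only that both options should point at the same writable file so that cookies set by one response are sent with the next request.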

Hope it helps.
