Question

Using PHP, how can I get this data: http://www.google.com/search?q=xbox+360&tbm=shop&hl=en&aq=f

Specifically, I want to grab the "$160", "$210", and so on from that page and store them in a variable.

Here is what I tried:

<?php

    # don't forget the library
    include('simple_html_dom.php');

    # this is the global array we fill with article information
    $Prices = array();

    getPrices('http://www.google.com/search?q=xbox+360&tbm=shop&hl=en&aq=f');

function getPrices($page) {
    global $Prices;

    $html = new simple_html_dom();
    $html->load_file($page);

    $items = $html->find('#leftnav');   

    foreach($items as $post) {
        # remember comments count as nodes
        $Prices[] = $post->children(0)->outertext;
    }
}

?>


<html>
<head>
</head>
<body>
    <div id="main">
<?php
    foreach($Prices as $item) {
        echo $item[0];
        #echo $item[1];
    }
?>
    </div>
</body>
</html>

But the output is: <

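Maybe this is a harder way to go about it than necessary. Does anyone know an easier way to extract the first value, or, say, all of the values on the first page?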

Answer 1

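Web scraping really is a gray area (http://en.wikipedia.org/wiki/Web_scraping#Legal_issues). Google has put various measures in place to stop automated scrapers. At first they will present a CAPTCHA to try to block your tool, and if you keep scraping they can go as far as blocking your IP.

If you want to get prices, I suggest using the Google Search API for Shopping: http://code.google.com/apis/shopping/search/v1/getting_started.html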
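
As for the lone "<" in the output: `outertext` is a string, and indexing a string with `[0]` returns only its first character, which for an HTML fragment is "<". If you already have markup that you are permitted to process, a rough sketch of pulling out the dollar amounts could look like the following. The sample markup and the `extractPrices()` helper are made up for illustration; they are not part of simple_html_dom and do not reflect Google's real page structure.

```php
<?php
// Hypothetical sample markup, standing in for HTML you are allowed to process.
$sampleHtml = '<div class="result"><b>$160</b> Xbox 360 4GB</div>'
            . '<div class="result"><b>$210.50</b> Xbox 360 250GB</div>';

// Indexing a string with [0] yields its first character only.
// For an HTML fragment that character is "<".
$firstChar = $sampleHtml[0];

/**
 * Return every dollar amount such as "$160" or "$1,299.99" found in $html.
 * extractPrices() is an illustrative helper, not a library function.
 */
function extractPrices(string $html): array
{
    // Remove the tags so the pattern only sees the visible text.
    $text = strip_tags($html);

    // A "$", then digits with optional thousands groups, then optional cents.
    preg_match_all('/\$\d+(?:,\d{3})*(?:\.\d{2})?/', $text, $matches);

    // $matches[0] holds the full matches (an empty array if none were found).
    return $matches[0];
}

$prices = extractPrices($sampleHtml);
echo $prices[0], PHP_EOL; // prints $160
```

In the original script, echoing `$item` instead of `$item[0]` would print the whole fragment. A regular expression like this is fragile against layout changes, so treat it as a fallback rather than a replacement for an official API.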
