Question

For example, I only want to capture the data for the 30 most recent events shown in the scrolling feed at this URL:

http://hazmat.globalincidentmap.com/home.php#

Any idea how to capture it?

Answer 1

Which language are you using? In Java, you can fetch the page's HTML content like this:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class FetchPage {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://hazmat.globalincidentmap.com/home.php");
            // try-with-resources closes the reader and the underlying stream.
            // Assumption: the page is UTF-8; adjust the charset if it is not.
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = br.readLine()) != null) {
                    // Here you need to parse the HTML lines until
                    // you find something you want, for example
                    // "eventdetail.php?ID", and then read the content of
                    // the <td> tag or whatever you want to do.
                }
            }
        } catch (MalformedURLException mue) {
            mue.printStackTrace();
        } catch (IOException ioe) {  // thrown by openStream() or readLine()
            ioe.printStackTrace();
        }
    }
}
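
The parsing step described in the comments above could look roughly like the following. This is only a sketch: it assumes each event appears as an `<a>` element whose `href` contains `eventdetail.php?ID=<number>`, and that the page lists the newest events first, neither of which has been verified against the live page. The class name `EventLinkExtractor` and the `extract` method are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EventLinkExtractor {
    // Assumption: an event link looks like <a href="...eventdetail.php?ID=123...">text</a>.
    // Group 2 captures the numeric ID, group 3 the raw content between the tags.
    private static final Pattern EVENT_LINK = Pattern.compile(
        "<a\\s[^>]*href\\s*=\\s*[\"']([^\"']*eventdetail\\.php\\?ID=(\\d+)[^\"']*)[\"'][^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /**
     * Returns at most `limit` events in document order.
     * Each entry is a two-element array: { id, link text with inner tags removed }.
     */
    static List<String[]> extract(String html, int limit) {
        List<String[]> events = new ArrayList<>();
        Matcher m = EVENT_LINK.matcher(html);
        while (events.size() < limit && m.find()) {
            String id = m.group(2);
            // Strip nested tags such as <b> and trim surrounding whitespace.
            String text = m.group(3).replaceAll("<[^>]+>", "").trim();
            events.add(new String[] { id, text });
        }
        return events;
    }
}
```

To use it, collect the lines read in the loop into a single string (for example with a `StringBuilder`) and call `EventLinkExtractor.extract(html, 30)`. A regular expression is fragile against markup changes; a real HTML parser would be more robust if one is available.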

An example in PHP:

$c = curl_init('http://hazmat.globalincidentmap.com/home.php');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);

$html = curl_exec($c);

// curl_exec() returns false on failure
if ($html === false)
    die(curl_error($c));

// HTTP status code of the response (200 means success)
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);

curl_close($c);

Then parse the contents of the $html variable.
