Web scraper

Time: 2009-07-03 05:55:08

Tags: java

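How can I download a URL in Java and save it to a local directory? More precisely, I want an offline view of the URLs I download (especially the HTML content).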

1 answer:

Answer 0: (score: 1)

Here is some code that pulls the HTML into a string. Note that it does not fetch the linked content (images, etc.), only the HTML! Enjoy :)

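import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

try
{
    URL url = new URL("http://www.stackoverflow.com");
    URLConnection connection = url.openConnection();

    // Decode the response as text. UTF-8 is an assumption here;
    // the actual charset is reported in the Content-Type header.
    StringBuilder html = new StringBuilder();
    try (BufferedReader input = new BufferedReader(
            new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)))
    {
        String line;
        while ((line = input.readLine()) != null)
        {
            // readLine() strips the line terminator, so put it back
            html.append(line).append('\n');
        }
    }

    //Now you can do what you please with html.toString()
    //(save it locally, parse, etc...)
}
catch (IOException e)
{
    //Error handling: at minimum, report the failure
    e.printStackTrace();
}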
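
The question also asks about saving the page to a local directory, which the snippet above leaves as a comment. A minimal sketch of that step follows, assuming Java 7 or later. The class name PageSaver, the method save, and any target path you pass in are hypothetical names chosen for illustration, not part of the original answer. It copies the raw bytes, so no charset has to be guessed, and like the code above it saves only the HTML, not images or stylesheets.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PageSaver {
    /** Copies the raw bytes behind url into target and returns the number of bytes written. */
    public static long save(URL url, Path target) throws IOException {
        // Create the target directory (and any missing parents) first
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // Stream the response straight to disk, overwriting an existing file
        try (InputStream in = url.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java PageSaver <url> <target-file>");
            return;
        }
        long bytes = save(new URL(args[0]), Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

For example, running it with the arguments http://www.stackoverflow.com and offline/index.html would write the page into an offline directory under the current working directory.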