Question

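I am trying to download a PDF file with HttpClient. The download completes, but the pages of the PDF are blank. If I print the response, I can see the bytes on the console, but when I write them to a file, the resulting file is blank.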

FileUtils.writeByteArrayToFile(new File(outputFilePath), bytes);

The files do show the correct sizes, 103 KB and 297 KB, but they are just blank!

I also tried an output stream:

FileOutputStream fileOutputStream = new FileOutputStream(outFile);
fileOutputStream.write(bytes);

I also tried UTF-8 encoding:

Writer out = new BufferedWriter( new OutputStreamWriter(
                new FileOutputStream(outFile), "UTF-8"));
        String str = new String(bytes, StandardCharsets.UTF_8);
        try {
            out.write(str);
        } finally {
            out.close();
        }

Nothing has worked for me. Any suggestions are much appreciated.

Update: I am using DefaultHttpClient.

HttpGet httpget = new HttpGet(targetURI);
HttpResponse response = null;
String htmlContents = null;
try {
    httpget = new HttpGet(url);
    response = httpclient.execute(httpget);
    InputStreamReader dataStream=new InputStreamReader(response.getEntity().getContent());
    byte[] bytes = IOUtils.toByteArray(dataStream);
...

Answer 1

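Below is the method I use to download a PDF file from a specific URL. It takes two String parameters: a URL string (for example: "https://www.ibm.com/support/knowledgecenter/SSWRCJ_4.1.0/com.ibm.safos.doc_4.1/Planning_and_Installation.pdf") and the path of the destination folder to download the PDF (or any other file) into. If the destination path does not exist in the local file system, it is created automatically: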

public boolean downloadFile(String urlString, String destinationFolderPath) {
    boolean result = false; // will turn to true if download is successful
    if (!destinationFolderPath.endsWith("/") && !destinationFolderPath.endsWith("\\")) {
        destinationFolderPath+= "/";
    }
    // If the destination path does not exist then create it.
    File foldersToMake = new File(destinationFolderPath);
    if (!foldersToMake.exists()) {
        foldersToMake.mkdirs();
    }

    try {
        // Open Connection
        URL url = new URL(urlString);
        // Get just the file Name from URL
        String fileName = new File(url.getPath()).getName();
        // Try with Resources....
        try (InputStream in = url.openStream(); FileOutputStream outStream = 
                    new FileOutputStream(new File(destinationFolderPath + fileName))) {

            // Read from resource and write to file...
            int length = -1;
            byte[] buffer = new byte[1024]; // buffer for portion of data from connection
            while ((length = in.read(buffer)) > -1) {
                outStream.write(buffer, 0, length);
            }
        }
        // File successfully downloaded
        result = true;
    } 
    catch (MalformedURLException ex) { ex.printStackTrace(); } 
    catch (IOException ex) { ex.printStackTrace(); }
    return result;
}

Answer 2

You do:

InputStreamReader dataStream=new InputStreamReader(response.getEntity().getContent());
byte[] bytes = IOUtils.toByteArray(dataStream);

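As already mentioned in the comments, using a Reader class corrupts binary data such as PDF files. A Reader decodes bytes into characters using a charset, and byte sequences that are not valid in that charset are replaced rather than preserved. You should therefore not wrap the content in an InputStreamReader.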
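To make the corruption concrete, here is a minimal sketch using only the JDK (Java 9+ for readAllBytes). The class name, the helper methods, and the sample bytes are made up for illustration; the sample mimics the start of a PDF followed by bytes that are not valid UTF-8. It compares reading the same data as raw bytes with reading it through an InputStreamReader:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReaderCorruptionDemo {

    // Reads everything as raw bytes; no character decoding takes place.
    static byte[] readAsBytes(InputStream in) throws IOException {
        return in.readAllBytes(); // Java 9+
    }

    // Decodes the bytes to chars through a Reader, then encodes them back to bytes.
    static byte[] readThroughReader(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        }
        return text.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: "%PDF" followed by four bytes that are not valid UTF-8.
        byte[] original = {'%', 'P', 'D', 'F',
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        byte[] raw = readAsBytes(new ByteArrayInputStream(original));
        byte[] viaReader = readThroughReader(new ByteArrayInputStream(original));

        System.out.println("raw bytes intact:  " + Arrays.equals(original, raw));
        System.out.println("via Reader intact: " + Arrays.equals(original, viaReader));
    }
}
```

The ASCII part of the data survives the round trip, which may be why a damaged PDF can still open; the binary part does not.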

Since your content can be used to construct an InputStreamReader, I assume response.getEntity().getContent() returns an InputStream. Such an InputStream can usually be passed directly to IOUtils.toByteArray.

So:

InputStream dataStream=response.getEntity().getContent();
byte[] bytes = IOUtils.toByteArray(dataStream);

should already work for you!
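If the goal is only to save the response to disk, the intermediate byte array can be skipped as well. The following is a sketch under assumptions, not part of the original answer: the class and method names are hypothetical, and a ByteArrayInputStream stands in for the stream returned by response.getEntity().getContent() so that the example runs without an HTTP library. It copies an InputStream to a file byte for byte with java.nio.file.Files.copy:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SaveStreamToFile {

    // Copies the stream to the target file without any character decoding,
    // closes the stream, and returns the number of bytes written.
    static long save(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for response.getEntity().getContent(); the bytes are made up.
        byte[] fakePdf = {'%', 'P', 'D', 'F',
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        Path target = Files.createTempFile("download", ".pdf");
        long written = save(new ByteArrayInputStream(fakePdf), target);

        System.out.println(written + " bytes written to " + target);
    }
}
```

In the real code, the stream from the HTTP response would be passed to save in place of the ByteArrayInputStream.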

Blank pages in PDF after downloading from the web
