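PHP curl: check whether a file exists before downloading

Date: 2013-02-05 03:55:33

Tags: php curl

I am writing a PHP program that downloads a PDF from a backend and saves it to a local drive. How can I check whether the file exists before downloading it?

Currently I use curl (see the code below) for both the check and the download, but it still downloads a file that is 1 KB in size.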

$url = "http://wedsite/test.pdf";
$path = "C:\\test.pdf";
downloadAndSave($url,$path);

function downloadAndSave($urlS,$pathS)
    {
        $fp = fopen($pathS, 'w');

        $ch = curl_init($urlS);

        curl_setopt($ch, CURLOPT_FILE, $fp);
        $data = curl_exec($ch);

        $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        echo $httpCode;
        //If 404 is returned, then file is not found.
        if ($httpCode == 404)
        {
            echo $httpCode;
            echo $urlS; 
        }

        fclose($fp);

    }

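I want to check whether the file exists before downloading it. Any idea how to do that?

4 answers:

Answer 0: (score: 6)

You can do this with a separate curl HEAD request: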

curl_setopt($ch, CURLOPT_NOBODY, true);
$data = curl_exec($ch);

$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

When you actually want to download, set CURLOPT_NOBODY back to false.
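
A minimal sketch of how the HEAD check and the download might be combined, assuming PHP 7 or later with the curl extension. The function names isSuccessStatus, remoteFileExists and downloadIfExists are made up for illustration and are not part of curl. The local file is opened only after the check passes, so a missing remote file should not leave a stray local file behind:

```php
<?php
// Sketch only: all three function names are illustrative, not a curl API.

/** Treat only 2xx responses as "the file exists". */
function isSuccessStatus(int $httpCode): bool
{
    return $httpCode >= 200 && $httpCode < 300;
}

/** Send a HEAD request (no body) and report whether the final status is 2xx. */
function remoteFileExists(string $url): bool
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, no body transferred
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects to the final status
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // keep curl from echoing anything
    curl_exec($ch);
    $httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return isSuccessStatus($httpCode);
}

/** Open the local file only after the HEAD check has passed. */
function downloadIfExists(string $url, string $path): bool
{
    if (!remoteFileExists($url)) {
        return false;                                // nothing is written to disk
    }
    $fp = fopen($path, 'w');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);             // stream the body into the file
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $ok = curl_exec($ch);
    fclose($fp);
    return $ok !== false;
}
```

Some servers answer HEAD differently from GET, or reject it altogether, so a failed check is a strong hint rather than proof that the file is missing.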

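Answer 1: (score: 2)

Since you are using HTTP to fetch a resource on the internet, what you really want to check is whether the return code is a 404.

Some PHP installations are said to let you call file_exists($url) directly, but do not rely on it: the manual lists the http:// wrapper as not supporting stat(), which file_exists() needs. http://www.php.net/manual/en/wrappers.http.php

Here is a function that works much like file_exists, but for URLs (despite its name, it relies on get_headers() rather than the curl extension):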

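<?php function curl_exists($url) {
  $file_headers = @get_headers($url);
  // get_headers() returns false on failure; otherwise element 0 is the status line.
  if ($file_headers === false || $file_headers[0] == 'HTTP/1.1 404 Not Found') {
    $exists = false;
  }
  else {
    $exists = true;
  }
  return $exists;
} ?>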

Source: http://www.php.net/manual/en/function.file-exists.php#75064

Sometimes PHP is installed without the curl extension. In that case you can still use the socket functions in the PHP core:

<?php function url_exists($url) {
       $a_url = parse_url($url);
       if (!isset($a_url['port'])) $a_url['port'] = 80;
       $errno = 0;
       $errstr = '';
       $timeout = 30;
       // gethostbyname() returns its argument unchanged when the lookup fails.
       if(isset($a_url['host']) && $a_url['host']!=gethostbyname($a_url['host'])){
           $fid = fsockopen($a_url['host'], $a_url['port'], $errno, $errstr, $timeout);
           if (!$fid) return false;
           // Fall back to "/" so the request line is valid when the URL has no path.
           $page = isset($a_url['path'])  ?$a_url['path']:'/';
           $page .= isset($a_url['query'])?'?'.$a_url['query']:'';
           fputs($fid, 'HEAD '.$page.' HTTP/1.0'."\r\n".'Host: '.$a_url['host']."\r\n\r\n");
           $head = fread($fid, 4096);
           fclose($fid);
           // Accept only a 200 or 302 status line that also carries a Content-Type header.
           if (preg_match('#^HTTP/\S+\s+(200|302)\s#i', $head)) {
            return stripos($head, 'Content-Type') !== false;
           }
           return false;
       } else {
           return false;
       }
   } ?>

Source: http://www.php.net/manual/en/function.file-exists.php#73175

A faster function can be found here: http://www.php.net/manual/en/function.file-exists.php#76246

Answer 2: (score: 1)

Call this before the download function and you are done:


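As a rough illustration of calling a check before the download function, the guard below takes the check and the download step as callables. The name downloadWhenAvailable and both parameters are hypothetical, chosen only for this sketch, and PHP 7 or later is assumed:

```php
<?php
// Sketch only: downloadWhenAvailable() and its callables are hypothetical names.

/**
 * Run $download only when $exists reports that the URL is available.
 * Returns true when the download step was run, false when it was skipped.
 */
function downloadWhenAvailable(string $url, string $path, callable $exists, callable $download): bool
{
    if (!$exists($url)) {
        return false;            // skipped: no local file is created
    }
    $download($url, $path);      // e.g. the question's downloadAndSave
    return true;
}
```

Any URL check from the other answers can be passed as the first callable, and the question's downloadAndSave can be passed as the second, for example downloadWhenAvailable($url, $path, 'url_exists', 'downloadAndSave').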

Answer 3: (score: 0)

In the first example above (the get_headers() one), $file_headers[0] may contain something other than "HTTP/1.1 404 Not Found", for example:

HTTP/1.1 404 Document+%2Fdb%2Fscotbiz%2Freports%2FR20131212%2Exml+not+found

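So it is important to use a different test, such as a regular expression, because comparing the whole status line with "==" is unreliable.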
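
One possible shape for such a test is to pull the three-digit status code out of the status line and compare only that. This is a sketch assuming PHP 7.1 or later; statusCodeFromLine is an illustrative helper name, not a built-in:

```php
<?php
// Sketch only: statusCodeFromLine() is an illustrative helper, not a PHP built-in.

/** Extract the numeric status code from an HTTP status line, or null if it does not parse. */
function statusCodeFromLine(string $statusLine): ?int
{
    // Matches e.g. "HTTP/1.1 404 ..." and "HTTP/2 200"; the reason phrase is ignored.
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})(?:\s|$)#', $statusLine, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

$line = 'HTTP/1.1 404 Document+%2Fdb%2Fscotbiz%2Freports%2FR20131212%2Exml+not+found';
var_dump(statusCodeFromLine($line) === 404);   // bool(true)
```

With get_headers(), the line to pass in would be $file_headers[0], after checking that the call did not return false.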