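Read a file up to a specific line in PHP

Time: 2014-10-12 13:44:06

Tags: php fgets

I am trying to read a large file (about 5 million lines), and it keeps hitting the memory limit. Is there a way to read the file up to a specific line, increment a counter, and then continue from the next line?

Below is the code I am using. How can I add a pointer to the line where fgets should start?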

$handle = @fopen("large_file.txt", "r");
if ($handle) {
    while (($buffer = fgets($handle, 4096)) !== false) {
        // get the content of the line
    }
}

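I am not trying to read just one specific line. I want to read from line 1 to line 10,000, then start again at line 10,001 for another 10,000 lines, and so on.

3 answers:

Answer 0: (score: 0)

I think the answer is in "Reading specific line of a file in php".

You can use seek() to jump to a specific line position: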

$file = new SplFileObject('yourfile.txt');
$file->seek(123); // seek to the 124th line (line numbers are 0-based)
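
Note that seek() only positions the iterator; the line still has to be fetched with current(). A minimal sketch of that, assuming a helper named read_line_at() (a hypothetical name, not part of PHP) and 0-based line numbers:

```php
<?php
// Hypothetical helper: return the line at a 0-based index without its
// trailing newline, or null if the file has fewer lines than that.
function read_line_at($path, $index) {
    $file = new SplFileObject($path);
    $file->seek($index);              // position the iterator on line $index
    if ($file->key() !== $index) {    // the seek stopped early: no such line
        return null;
    }
    // current() returns the line the iterator now points at.
    return rtrim($file->current(), "\r\n");
}
```

Opening the file and seeking on every call is fine for a one-off lookup, but it is not a good fit for walking through millions of lines in batches.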

Answer 1: (score: 0)

Try this function. It iterates over the file and fetches it line by line.

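$file = new SplFileObject('yourfile.txt');

echo getLineRange($file, 1, 10000);

// Returns lines $start..$end (1-based, inclusive) as a single string.
// The file object is passed in because a function cannot see the
// global $file variable.
function getLineRange(SplFileObject $file, $start, $end) {
    $tmp = "";

    // SplFileObject numbers lines from 0, so line $start is at offset $start - 1.
    // LimitIterator seeks to that offset and stops after the requested count.
    foreach (new LimitIterator($file, $start - 1, $end - $start + 1) as $line) {
        $tmp .= $line;
    }
    return $tmp;
}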

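Answer 2: (score: 0)

You can process a large file in chunks in PHP using fseek() / ftell(), saving the context between chunks. (SplFileObject::seek() can seek to a line directly, but it seems to have performance issues with large files.)

Assuming you have some kind of batch processor available, the following example should give you an idea of the approach. It is untested, but derived from code in production.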

<?php

$context = array(
    'path' => 'path/to/file',
    'limit' => 1000,
    'line' => NULL,
    'position' => NULL,
    'size' => NULL,
    'percentage' => 0,
    'complete' => FALSE,
    'error' => FALSE,
    'message' => NULL,
);

// $context is taken by reference so that the saved position, line count
// and error state survive between calls.
function do_chunk(&$context) {
    $handle = fopen($context['path'], 'r');
    if (!$handle) {
        $context['error'] = TRUE;
        $context['message'] = 'Cannot open file for reading: ' . $context['path'];
        return;
    }
    // One-time initialization of file parameters.
    if (!isset($context['size'])) {
        $fstat = fstat($handle);
        $context['size'] = $fstat['size'];
        $context['position'] = 0;
        $context['line'] = 0;
    }
    // Seek to position for current chunk.
    $ret = fseek($handle, $context['position']);
    if ($ret === -1) {
        $context['error'] = TRUE;
        $context['message'] = 'Cannot seek to ' . $context['position'];
        fclose($handle);
        return;
    }

    $k = 1;
    do {
        $raw_line = fgets($handle);
        // Compare against FALSE explicitly: a line containing only "0"
        // is falsy in PHP but is still valid data.
        if ($raw_line !== FALSE) {
            $context['line']++;
            // Strip newline.
            $line = rtrim($raw_line, "\r\n");
            // Code to process line here. my_process_line() is a placeholder
            // that you supply; it must return array($error, $message).
            list($error, $message) = my_process_line($context, $line);
            if ($error) {
                $context['error'] = TRUE;
                $context['message'] = $message;
                fclose($handle);
                return;
            }
        } elseif (!feof($handle)) {
            $context['error'] = TRUE;
            $context['message'] = 'Unexpected error reading ' . $context['path'];
            fclose($handle);
            return;
        }
    }
    while ($k++ < $context['limit'] && $raw_line !== FALSE);

    // Save position of next chunk.
    $position = ftell($handle);
    if ($position !== FALSE) {
        $context['position'] = $position;
    } else {
        $context['error'] = TRUE;
        $context['message'] = 'Cannot retrieve file pointer in ' . $context['path'];
        fclose($handle);
        return;
    }

    if ($raw_line === FALSE) {
        // fgets() hit end of file, so there is nothing left to process.
        $context['complete'] = TRUE;
        $context['percentage'] = 1;
    } else {
        $context['percentage'] = $context['position'] / $context['size'];
    }

    fclose($handle);
}


// Batch driver for testing only - use a batch processor in production.
// Keep calling do_chunk() until the file is finished or an error occurs.
while (!$context['complete'] && !$context['error']) {
    do_chunk($context);
}
if ($context['error']) {
    print 'error: ' . $context['message'];
} else {
    print 'complete';
}
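
For the case in the question (lines 1 to 10,000, then 10,001 to 20,000, and so on), the same fseek() / ftell() idea can be reduced to a much smaller sketch. The helper names read_batch() and each_batch() below are hypothetical, and the code assumes that one batch of lines fits comfortably in memory:

```php
<?php
// Hypothetical helper: read up to $limit lines starting at byte $offset.
// Returns array($lines, $nextOffset, $done), where $nextOffset is the byte
// position to pass in on the next call.
function read_batch($path, $offset, $limit) {
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    if (fseek($handle, $offset) === -1) {
        fclose($handle);
        throw new RuntimeException("Cannot seek to $offset");
    }
    $lines = array();
    // Stop when the batch is full or fgets() reports end of file.
    while (count($lines) < $limit && ($raw = fgets($handle)) !== false) {
        $lines[] = rtrim($raw, "\r\n");
    }
    $next = ftell($handle);   // where the next batch starts
    $done = feof($handle);    // true once a read has reached end of file
    fclose($handle);
    return array($lines, $next, $done);
}

// Usage sketch: call $callback once per batch of at most $batchSize lines.
function each_batch($path, $batchSize, $callback) {
    $offset = 0;
    do {
        list($lines, $offset, $done) = read_batch($path, $offset, $batchSize);
        if ($lines) {
            $callback($lines);
        }
    } while (!$done);
}
```

Because only the byte offset is carried between calls, it can be stored (for example in a session or a job queue) and the next batch picked up in a separate request. Seeking by byte offset also avoids re-reading the earlier lines, which a line-number seek has to do.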