Question

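This was my original question. I got stuck, tried to solve the problem by experimenting, and got stuck again.

I need to extract a candidate's name and ID from a PDF. After using pdfparser to extract the text, I downloaded the resulting HTML page using PHP.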

// main.php
<head>
// other meta tags
<?= Html::csrfMetaTags() ?>
</head>

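Doing this puts the information I need on lines 12 and 13 of the page source, and that holds for every PDF I need to process. So after downloading the HTML file, I used the code below to view the source of the HTML file.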

<?php
$filename = 'filename.html';
header('Content-disposition: attachment; filename=' . $filename);
header('Content-type: text/html');
// ... the rest of your file
?>
<?php

// Include Composer autoloader if not already done.
include 'C:\Users\amite\Downloads\pdfparser-master (1)\pdfparser-master\vendor\autoload.php';

// Parse pdf file and build necessary objects.
$parser = new  \Smalot\PdfParser\Parser();
$pdf    = $parser->parseFile('C:\Users\amite\Desktop\Data\001.ApplicationForm-CSE-2015-1-omokop (3).pdf');

$text = $pdf->getText();
echo $text;


?>

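Now when I run the program above, I get the page source of the HTML file I downloaded. I now need to extract the data from lines 12 and 13. The output of the program is as follows: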

<?php
show_source("filename.html");
?>

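Apart from the HTML tags, the information I need sits on lines 12 and 13 without any tags around it. If you need any clarification, please ask and I will explain. How should I extract the text from lines 12 and 13? If there is another way to do this, please tell me. I am stuck again; if the question is unclear, I will clarify or improve it. Please help.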

Answer 1

Is this what you need?

<?php
$str = "1text
 2text
3text
4text 
5text 
6text
7text 
8text 
9text
10text 
11text 
12text
13text
";
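// Split the string into lines, then take 2 elements starting at index 11
// (array indices are 0-based, so these are lines 12 and 13).
$k = array_slice(explode("\n", $str), 11, 2);
print_r($k);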
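
Since pdfparser already returns the text as a string, the same slicing idea could be applied to that string directly, without the HTML download step. The following is only a sketch: `extract_lines` is a hypothetical helper (not part of pdfparser), and the sample string stands in for whatever `$pdf->getText()` returns. It assumes PHP 7.4 or later for the arrow function.

```php
<?php
/**
 * Return $count lines starting at the 1-based line number $first.
 * Hypothetical helper for illustration; not part of pdfparser.
 */
function extract_lines(string $text, int $first, int $count): array
{
    // \R matches \n, \r\n and \r, so Windows line endings are handled too.
    $lines = preg_split('/\R/', $text);

    // array_slice uses 0-based offsets, so line 12 is offset 11.
    $slice = array_slice($lines, $first - 1, $count);

    // Remove stray spaces around each extracted line.
    return array_map('trim', $slice);
}

// Sample data standing in for the text returned by $pdf->getText().
$sample = implode("\n", array_map(fn($n) => "{$n}text ", range(1, 13)));

$result = extract_lines($sample, 12, 2);
print_r($result);
```

Splitting on `\R` instead of `"\n"` avoids leftover `\r` characters when the text uses Windows line endings.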

Answer 2

Read the file into an array with $source = file('filename.html'); and pick out lines 12 and 13 by their array indices, 11 and 12, for example: echo $source[11]; // line 12
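
A slightly fuller sketch of this approach is shown here. `read_lines` is a hypothetical helper name, and the temporary file only stands in for `filename.html` so that the snippet runs on its own. The `strip_tags` call is an optional extra, on the assumption that some lines may carry HTML markup.

```php
<?php
/**
 * Read the given 1-based line numbers from a file.
 * Hypothetical helper for illustration.
 */
function read_lines(string $path, array $lineNumbers): array
{
    // FILE_IGNORE_NEW_LINES drops the trailing newline from each element.
    $source = file($path, FILE_IGNORE_NEW_LINES);
    if ($source === false) {
        throw new RuntimeException("Cannot read $path");
    }

    $out = [];
    foreach ($lineNumbers as $n) {
        // Line n lives at index n - 1; skip lines the file does not have.
        if (isset($source[$n - 1])) {
            // strip_tags removes any HTML markup left on the line.
            $out[$n] = trim(strip_tags($source[$n - 1]));
        }
    }
    return $out;
}

// Demo: a temporary file stands in for filename.html.
$tmp = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($tmp, implode("\n", array_map(fn($n) => "line $n", range(1, 13))));

$picked = read_lines($tmp, [12, 13]);
unlink($tmp);
print_r($picked);
```

The result is keyed by line number, so `$picked[12]` and `$picked[13]` hold the two lines. The `isset` check avoids an undefined-index notice when a file has fewer than 13 lines.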

Extracting data from specific lines of an HTML page using PHP
