Get the td values for each tr of a table

Time: 2014-01-19 18:51:34

Tags: php parsing dom xpath html-parsing

How can I use DOM to extract the inner value of each td for every tr in a table? I have a table like this:

<table>
   <tbody>
      <tr class="rowData">
         <td class="cellData">
            <a href="#"><span> DATA 1 </span></a>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 1 a </div></div>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 1 b </div></div>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 1 c </div></div>
         </td>
      </tr>
      <tr class="rowData">
         <td class="cellData">
            <a href="#"><span> DATA 2 </span></a>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 2 a </div></div>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 2 b </div></div>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 2 c </div></div>
         </td>
      </tr>
      <tr class="rowData">
         <td class="cellData">
            <a href="#"><span> DATA 3 </span></a>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 3 a </div></div>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 3 b </div></div>
         </td>
         <td class="cellData">
            <div class="div1"><div class="div2"> DATA 3 c </div></div>
         </td>
      </tr>
   </tbody>
</table>

What I want to get, for each row, is:

<label> DATA n </label>
<input value="DATA n a">
<input value="DATA n b">
<input value="DATA n c">

I am stuck with this code:

$html = file_get_contents($link);
$html2 = (preg_replace('/\s+/', ' ', $html));
$doc = new DOMDocument();
$doc->loadHTML($html2);
$xpath = new DOMXPath($doc);
$tables = $doc->getElementsByTagName('table');
foreach($xpath->query('.//tbody/tr[@class="rowData"]') as $node){
}
foreach($xpath->query('.//tbody/tr/td/div/div[@class="div2"]') as $node){
}
foreach($xpath->query('.//tbody/tr/td/a/span') as $node){
echo $node->nodeValue;
}

Can anyone help me?

2 Answers:

Answer 0: (score: 0)

Here is one possible solution, actually two, but the commented-out one is too ugly. :)

$html = file_get_contents($link);
$html2 = (preg_replace('/\s+/', ' ', $html));
$doc = new DOMDocument();
$doc->loadHTML($html2);

$elements = $doc->getElementsByTagName('tr');
foreach ($elements as $node) {
    // Each td wraps its text in two nested divs, so items 1, 3 and 5 are the
    // inner divs; items 0, 2 and 4 (the outer divs) return the same text.
    $divs = $node->getElementsByTagName('div');
    $inputs1 = $divs->item(1);
    $inputs2 = $divs->item(3);
    $inputs3 = $divs->item(5);

    // Assumes the first child of the tr is the first td, not a whitespace text node.
    echo '<label>' . $node->firstChild->nodeValue . '</label>';
    echo '<input value="' . $inputs1->nodeValue . '">';
    echo '<input value="' . $inputs2->nodeValue . '">';
    echo '<input value="' . $inputs3->nodeValue . '">';

    // Alternative: walk the siblings, skipping the whitespace text node
    // between the tds. Ugly as hell, but it is working :)
    /*
    echo '<input value="' . $node->firstChild->nextSibling->nextSibling->nodeValue . '">';
    echo '<input value="' . $node->firstChild->nextSibling->nextSibling->nextSibling->nextSibling->nodeValue . '">';
    echo '<input value="' . $node->firstChild->nextSibling->nextSibling->nextSibling->nextSibling->nextSibling->nextSibling->nodeValue . '">';
    */

    echo '<br>';
}
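
The positional lookup above depends on every td holding exactly two nested divs. As a rough alternative sketch, the inner divs could be selected by their class instead of by index. The helper name div2Values and the inline sample HTML are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: collects the trimmed text of every <div class="div2">
// found inside a single <tr>.
function div2Values(DOMElement $row): array
{
    $values = [];
    foreach ($row->getElementsByTagName('div') as $div) {
        // Skip the outer wrapper divs (class "div1"); they only repeat the inner text.
        if ($div->getAttribute('class') === 'div2') {
            $values[] = trim($div->nodeValue);
        }
    }
    return $values;
}

// Inline sample standing in for file_get_contents($link) (assumption).
$html = '<table><tbody><tr class="rowData">'
      . '<td class="cellData"><a href="#"><span> DATA 1 </span></a></td>'
      . '<td class="cellData"><div class="div1"><div class="div2"> DATA 1 a </div></div></td>'
      . '<td class="cellData"><div class="div1"><div class="div2"> DATA 1 b </div></div></td>'
      . '</tr></tbody></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);

foreach ($doc->getElementsByTagName('tr') as $row) {
    print_r(div2Values($row));
}
```

Matching on the class keeps the loop working if a row has more or fewer data cells, at the cost of an exact string comparison on the class attribute.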

Answer 1: (score: 0)

I think this code is fairly self-explanatory. XPath is used three times: to find all table rows, to get the label, and to get all input values.

foreach($xpath->query('.//tbody/tr[@class="rowData"]') as $row) {
    echo '<label>'.$xpath->query('td[1]/a/span', $row)->item(0)->textContent."</label>\n";
    foreach($xpath->query('td[position() > 1]/div/div', $row) as $col) {
        echo '<input value="'.trim($col->textContent).'" />'."\n";
    }
}
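
The snippet above relies on the $xpath object created in the question. For reference, here is a minimal self-contained sketch of the same three-query approach. The function name renderRows, the inline sample HTML, the added htmlspecialchars escaping, and the null check are assumptions added for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: renders each tr.rowData as a label followed by inputs.
function renderRows(string $html): string
{
    // Real-world HTML often triggers parser warnings; collect them silently.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $out = '';
    // Query 1: every data row, whether or not the markup contains a tbody.
    foreach ($xpath->query('//tr[@class="rowData"]') as $row) {
        // Query 2: the label text in the first cell, relative to this row.
        $label = $xpath->query('td[1]/a/span', $row)->item(0);
        if ($label === null) {
            continue; // row without the expected label cell
        }
        $out .= '<label>' . htmlspecialchars(trim($label->textContent)) . "</label>\n";
        // Query 3: the inner div of every remaining cell.
        foreach ($xpath->query('td[position() > 1]/div/div', $row) as $cell) {
            $out .= '<input value="' . htmlspecialchars(trim($cell->textContent)) . "\" />\n";
        }
    }
    return $out;
}

// Inline sample standing in for file_get_contents($link) (assumption).
$sample = '<table><tbody><tr class="rowData">'
        . '<td class="cellData"><a href="#"><span> DATA 1 </span></a></td>'
        . '<td class="cellData"><div class="div1"><div class="div2"> DATA 1 a </div></div></td>'
        . '</tr></tbody></table>';

echo renderRows($sample);
```

Passing $row as the second argument to query() makes the relative paths evaluate against that row only, which is what keeps each label grouped with its own inputs.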