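I'm new to PHP and I'm trying to scrape data from a website using regular expressions, but I'm having trouble extracting the rental and details content from inside the divs. Here is my code. Can anyone help?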
<?php
header('content-type: text/plain');
$contents= file_get_contents('http://www.hassconsult.co.ke/index.php?option=com_content&view=article&id=22&Itemid=29');
$contents = preg_replace('/\s{1,}/',' ',$contents);
$contents = preg_replace('/&nbsp;/','',$contents);
//print $contents."\n";
$records = preg_split('/<span class="style8"/',$contents);
for ($ix=1; $ix < count($records); $ix++){
$tmp = $records[$ix];
preg_match('/href="(.*?)"/',$tmp, $match_url);
preg_match('/>(.*?)<\/span>/',$tmp,$match_name);
preg_match('/<div[^>]+class ?= ?"style10"[^>]*>(\s*(<div.*(?2).*<\/div>\s*)*)<\/div>/Us',$tmp,$match_rental);//error is here
print_r($match_url);
print_r($match_name);
print_r($match_rental);
print $tmp."\n";
exit ();
}
//print count($records)."\n";
//print_r($records);
//if ($contents===false)
//print 'FALSE';
//print_r(htmlentities($contents));
?>
Here is a sample of the content:
>HILLVIEW CROSSROADS4 BED HOUSE</span></div></td>
</tr>
<tr>
<td width="57%" style="padding-left:20px;"><div align="left" class="style10" style="color:#007AC7;">
<div align="left">
Rental;
USD 4,500
</div>
</div></td>
<td width="43%" align="right"><div align="right" class="style10" style="color:#007AC7;">
<div align="right">
No.
834
</div>
</div></td>
</tr>
<tr>
<td colspan="2" style="padding-left:20px;color:#000000;">
<div align="justify" style="font-family:Arial, Helvetica, sans-serif;font-size:11px;color:333300;">Artistically designed 4 bed (all
ensuite) house on half acre of well-tended gardens. Lounge with fireplace opening to terrace, opulent master suite, family room, study. Good finishes, SQ, carport, extra water storage
and generator. <a href="/index.php?option=com_content&view=article&id=27&Itemid=74&send=5&ref_no=834/II&t=2">....Details</a> </div></td>
</tr>
</table></td>
</tr>
</table>
<br>
Answer 0 (score: 2)
That site doesn't offer good CSS selectors, but it's still not hard to get the data with XPath:
$dom = new DOMDocument();
@$dom->loadHTMLFile('http://www.hassconsult.co.ke/index.php?option=com_content&view=article&id=22&Itemid=29');
$xpath = new DOMXPath($dom);
foreach($xpath->query("//div[@id='ad']/table") as $table) {
// title
echo $xpath->query(".//span[@class='style8']", $table)->item(0)->nodeValue . "\n";
// price
echo $xpath->query(".//div[@class='style10']/div", $table)->item(0)->nodeValue . "\n";
// description
echo $xpath->query(".//div[@align='justify']", $table)->item(0)->nodeValue . "\n";
}
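
For anyone who wants to try the XPath approach without hitting the live site, here is a rough sketch that runs the same queries against an inline copy of the sample from the question. A few things are assumptions on my part: the opening `<span class="style8">` is reconstructed from the split pattern in the question, the description and the details href are shortened, `firstText` is a hypothetical helper (not part of PHP), and `//table` stands in for `//div[@id='ad']/table` because the sample doesn't include the outer div. It also returns null instead of failing when a listing is missing a field.

```php
<?php
// Inline copy of the sample markup (trimmed; the opening span and the
// shortened href are assumptions, see the note above).
$html = <<<'HTML'
<table>
<tr><td><div><span class="style8">HILLVIEW CROSSROADS4 BED HOUSE</span></div></td></tr>
<tr>
<td><div align="left" class="style10" style="color:#007AC7;">
<div align="left">
Rental;
USD 4,500
</div>
</div></td>
<td><div align="right" class="style10" style="color:#007AC7;">
<div align="right">
No.
834
</div>
</div></td>
</tr>
<tr><td colspan="2"><div align="justify">Artistically designed 4 bed house. <a href="/index.php?option=com_content">....Details</a> </div></td></tr>
</table>
HTML;

// Hypothetical helper: text of the first node matching $query (relative to
// $context), with whitespace runs collapsed, or null when nothing matches.
function firstText(DOMXPath $xpath, string $query, DOMNode $context): ?string
{
    $nodes = $xpath->query($query, $context);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim(preg_replace('/\s+/', ' ', $nodes->item(0)->textContent));
}

// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

$listings = [];
foreach ($xpath->query('//table') as $table) {
    $listings[] = [
        'title'       => firstText($xpath, ".//span[@class='style8']", $table),
        'rental'      => firstText($xpath, ".//div[@class='style10']/div", $table),
        'description' => firstText($xpath, ".//div[@align='justify']", $table),
        'details_url' => firstText($xpath, ".//div[@align='justify']//a/@href", $table),
    ];
}

print_r($listings);
```

Against the real page you would presumably swap `loadHTML($html)` for `loadHTMLFile(...)` and put the `//div[@id='ad']/table` query back; I have only tried the idea on the sample above.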