Regular expression does not work when the string is repeated

Time: 2015-11-20 21:34:45

Tags: php preg-match

$string = '<td class="t_ip">85.185.244.101</td><td class="t_port">           <script type="text/javascript">           //<![CDATA[             document.write(HttpSocks^Xinemara^47225);           //]]>           </script>         </td><td class="t_type">         4         </td>';

$regex = "/<td class=\"t_ip\">\\s*((?:[0-9]{1,3}\\.){3}[0-9]{1,3})(?:.|\\n)*<td class=\"t_port\">(?:.|\\n)*\^([0-9]{1,5})(?:.|\\n)*<td class=\"t_type\">\\s*([0-9])/";
preg_match($regex, $string, $matches);
$newString = $matches[1] . ':' . $matches[2] . ' ' . $matches[3];
print_r($newString);

The regular expression:

$regex = "/<td class=\"t_ip\">\\s*((?:[0-9]{1,3}\\.){3}[0-9]{1,3})(?:.|\\n)*<td class=\"t_port\">(?:.|\\n)*\^([0-9]{1,5})(?:.|\\n)*<td class=\"t_type\">\\s*([0-9])/";

It extracts the information in this form:

85.185.244.101:22088 4

But it does not work when the block is repeated two or more times:

$string = '<td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td><td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td><td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td>';

What has to change to make it work?

2 Answers:

Answer 0: (score: 1)

I would use a parser rather than a regex; regular expressions do not cope well with HTML. You could do it like this:

<?php
$string = '<td class="t_ip">85.185.244.101</td><td class="t_port">           <script type="text/javascript">           //<![CDATA[             document.write(HttpSocks^Xinemara^47225);           //]]>           </script>         </td><td class="t_type">         4         </td>';
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($string);
libxml_use_internal_errors(false);
$cells = $doc->getElementsByTagName('td');
foreach ($cells as $cell) {
    if (preg_match('/\bt_(ip|type)\b/', $cell->getAttribute('class'), $type)) {
        echo $type[1] . "=" . trim($cell->nodeValue) . "\n";
    }
}

Output:

ip=85.185.244.101
type=4
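
For the repeated string from the question, the same parser approach can be extended to group the cells into one record per row and to pull the port out of the script text. The following is only a sketch under a few assumptions: the helper name `parseProxyCells` is hypothetical, the cells are wrapped in `<table><tr>` so the fragment is ordinary table markup, each group ends with a `t_type` cell, and the port is the last `^`-separated number inside `document.write(...)`.

```php
<?php
// Hypothetical helper: returns one ['ip' => ..., 'port' => ..., 'type' => ...]
// record per t_ip / t_port / t_type group found in $html.
function parseProxyCells(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    // Assumption: wrap the bare cells in a row so the parser sees table markup.
    $doc->loadHTML('<table><tr>' . $html . '</tr></table>');
    libxml_clear_errors();
    libxml_use_internal_errors(false);

    $records = [];
    $current = [];
    foreach ($doc->getElementsByTagName('td') as $cell) {
        if (!preg_match('/\bt_(ip|port|type)\b/', $cell->getAttribute('class'), $type)) {
            continue; // not one of the cells we care about
        }
        $value = trim($cell->nodeValue);
        if ($type[1] === 'port') {
            // Assumption: the port is the number between the last ^ and the closing ).
            $value = preg_match('/\^([0-9]{1,5})\)/', $value, $p) ? $p[1] : null;
        }
        $current[$type[1]] = $value;
        if ($type[1] === 'type') { // a t_type cell closes the current group
            $records[] = $current;
            $current = [];
        }
    }
    return $records;
}

// Usage: three copies of the row from the question.
$row = '<td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td>';
foreach (parseProxyCells(str_repeat($row, 3)) as $r) {
    echo $r['ip'] . ':' . $r['port'] . ' ' . $r['type'] . "\n";
}
```

Grouping on the `t_type` cell keeps the loop to a single pass; if the real page puts each group in its own `<tr>`, iterating over rows instead would probably be more robust.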

If you need to validate the IP, you can add the following:

if ($type[1] == 'ip') {
    $ip = trim($cell->nodeValue); // surrounding whitespace would fail validation
    if (filter_var($ip, FILTER_VALIDATE_IP)) {
        echo 'valid ip: ' . $ip . "\n";
    } else {
        echo 'invalid ip: ' . $ip . "\n";
    }
}

I don't know where the 22088 in the output you provided comes from.
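
If a regex is still preferred for the repeated string, two things in the original pattern likely get in the way: `preg_match` stops after the first match, and the greedy `(?:.|\n)*` runs from the first `t_ip` to the last `t_port` and `t_type` in the string. A minimal sketch of one possible fix, using `preg_match_all` with lazy `.*?` and the `s` modifier, is shown here. The function name `extractProxies` is hypothetical, and the pattern assumes the port is always followed by `)`.

```php
<?php
// Hypothetical helper: returns one "ip:port type" line per row found in $html.
function extractProxies(string $html): array
{
    // .*? is lazy, so each part stops at the nearest following marker
    // instead of the last one in the string (which greedy .* would reach).
    // The s modifier lets . match newlines, replacing (?:.|\n).
    $regex = '~<td class="t_ip">\s*((?:[0-9]{1,3}\.){3}[0-9]{1,3})'
           . '.*?<td class="t_port">.*?\^([0-9]{1,5})\)'
           . '.*?<td class="t_type">\s*([0-9])~s';

    // PREG_SET_ORDER yields one array per match: [full, ip, port, type].
    preg_match_all($regex, $html, $matches, PREG_SET_ORDER);

    $lines = [];
    foreach ($matches as $m) {
        $lines[] = $m[1] . ':' . $m[2] . ' ' . $m[3];
    }
    return $lines;
}

// Usage: three copies of the row from the question.
$row = '<td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td>';
echo implode("\n", extractProxies(str_repeat($row, 3))), "\n";
// prints "85.185.244.101:47225 4" three times, one per line
```

This keeps the capture groups of the original pattern, so the `$matches[1] . ':' . $matches[2] . ' ' . $matches[3]` formatting carries over unchanged for each row.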

Answer 1: (score: 0)
