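Check whether a string contains an IPv6 address in PHP

Time: 2017-09-07 20:47:14

Tags: php regex ipv6

Using PHP, I need to check whether a string contains an IPv6 address, and then extract that IPv6 address.

I have a regex that matches the string if it consists entirely of an IPv6 address: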

$matches = [];
$regex = '/^(((?=.*(::))(?!.*\3.+\3))\3?|([\dA-F]{1,4}(\3|:\b|$)|\2))(?4){5}((?4){2}|(((2[0-4]|1\d|[1-9])?\d|25[0-5])\.?\b){4})\z/i';
preg_match($regex, $ipv6, $matches);

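What I'm stuck on is adding wildcards on either side, so that I can match the following:

Ultimately I need this so that I can wrap the IPv6 address in square brackets, making it conform to RFC 3986 (e.g. http://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]/something/page.html).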

4 answers:

Answer 0 (score: 0)

You need to use a different regular expression, such as this one:

(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))

After that, you can wrap the IPv6 address in your link:

<?php
$ipv6 = 'http://2001:0db8:85a3:0000:0000:8a2e:0370:7334/something/page.html';
$matches = [];
$regex = '(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))';
if (preg_match($regex, $ipv6, $matches)) {
    $result = str_replace($matches[0], '[' .  $matches[0] . ']', $ipv6);
}

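Answer 1 (score: 0)

I haven't fully tested the code, so I can't be 100% sure that it works; however, I ran it against several different URLs and it appears to work correctly.

I have borrowed parts of other answers:

Answer

Answer with domain names

This is what I came up with: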

(?(DEFINE)
  (?<scheme>[a-z][a-z0-9+.-]*)
  (?<userpass>([^:@\/](:[^:@\/])?@))
  (?<domain>[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+(-[a-z0-9]+)*)+)
  (?<ip>(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])))
  (?<host>((?&domain)|(?&ip)))
  (?<port>(:[\d]{1,5}))
  (?<path>([^?;\#]*))
  (?<query>(\?[^\#;]*))
  (?<anchor>(\#.*))
)
^(?:(?&scheme):\/\/)?(?&userpass)?(?<address>(?&host))(?&port)?\/?(?&path)?(?&query)?(?&anchor)?$

Follow this link to see it in use.

Answer without domain names

The regex above matches URLs that contain a valid host (whether a domain name or an IP address). If you want to match only IP addresses, use the following regex (it makes one simple change to the define group named host: I removed the reference to the define group named domain):

(?(DEFINE)
  (?<scheme>[a-z][a-z0-9+.-]*)
  (?<userpass>([^:@\/](:[^:@\/])?@))
  (?<domain>[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+(-[a-z0-9]+)*)+)
  (?<ip>(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])))
  (?<host>(?&ip))
  (?<port>(:[\d]{1,5}))
  (?<path>([^?;\#]*))
  (?<query>(\?[^\#;]*))
  (?<anchor>(\#.*))
)
^(?:(?&scheme):\/\/)?(?&userpass)?(?<address>(?&host))(?&port)?\/?(?&path)?(?&query)?(?&anchor)?$

Follow this link to see it in use.

For those who prefer a long, hard-to-read pattern, you can use the regex below, which is equivalent to the one above.

^(?:[a-z][a-z0-9+.-]*:\/\/)?(?:[^:@\/](?::[^:@\/])?@)?(?<address>(?:[0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|(?:[0-9a-fA-F]{1,4}:){1,7}:|(?:[0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|(?:[0-9a-fA-F]{1,4}:){1,5}(?::[0-9a-fA-F]{1,4}){1,2}|(?:[0-9a-fA-F]{1,4}:){1,4}(?::[0-9a-fA-F]{1,4}){1,3}|(?:[0-9a-fA-F]{1,4}:){1,3}(?::[0-9a-fA-F]{1,4}){1,4}|(?:[0-9a-fA-F]{1,4}:){1,2}(?::[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:(?:(?::[0-9a-fA-F]{1,4}){1,6})|:(?:(?::[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(?::[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(?:ffff(?::0{1,4}){0,1}:){0,1}(?:(?:25[0-5]|(?:2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(?:25[0-5]|(?:2[0-4]|1{0,1}[0-9]){0,1}[0-9])|(?:[0-9a-fA-F]{1,4}:){1,4}:(?:(?:25[0-5]|(?:2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(?:25[0-5]|(?:2[0-4]|1{0,1}[0-9]){0,1}[0-9]))(?::[\d]{1,5})?\/?(?:[^?;\#]*)?(?:\?[^\#;]*)?(?:\#.*)?$

Note: both answers use the i (case-insensitive) and x (ignore whitespace) modifiers.
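For use from PHP, the pattern needs delimiters, and the two modifiers go after the closing delimiter. The following is only a minimal sketch: to keep it readable, the ip group is a simplified stand-in that accepts just the full eight-group form, the userpass, domain, path, query and anchor groups are left out, and the ~ delimiter and the variable names are my own choices rather than part of the answer.

```php
<?php
// Simplified sketch: <ip> accepts only the full eight-group form.
// The longer alternation from the answer could be substituted for it.
$regex = '~
(?(DEFINE)
  (?<scheme>[a-z][a-z0-9+.-]*)
  (?<ip>(?:[0-9a-f]{1,4}:){7}[0-9a-f]{1,4})
  (?<port>:\d{1,5})
)
^(?:(?&scheme)://)?(?<address>(?&ip))(?&port)?(?:[/?\#].*)?$
~xi';

$url = 'http://2001:0db8:85a3:0000:0000:8a2e:0370:7334/something/page.html';

if (preg_match($regex, $url, $matches) === 1) {
    // The named group is available under its name in $matches.
    echo $matches['address'], "\n";
}
```

Because x is active, the line breaks and indentation inside the pattern are ignored, which is what allows the define groups to be laid out one per line.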

Answer 2 (score: 0)

Two separate regex matches might work better, because your regex seems very complicated.

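// Example input; this check only handles the full, uncompressed form.
$your_text = 'http://2001:0db8:85a3:0000:0000:8a2e:0370:7334/something/page.html';

// Step 1: capture the 39 characters after the scheme
// (8 groups of 4 hex digits plus 7 colons).
$regex1 = '/^https?:\/\/([0-9a-f:]{39})/i';

if (preg_match($regex1, $your_text, $matches1)) {
    // Step 2: count the 4-digit hex groups; a full address has exactly 8.
    $regex2 = '/[0-9a-f]{4}:?/i';

    if (preg_match_all($regex2, $matches1[1], $matches2) === 8) {
        echo $your_text . ' qualifies!!';
    }
}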

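Answer 3 (score: 0)

You don't need to read and understand a regex to verify that a string is a valid IPv6 address. The PHP function filter_var() can do the heavy lifting for you: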

echo(filter_var('2001:0db8:85a3:0000:0000:8a2e:0370:7334', FILTER_VALIDATE_IP));
# 2001:0db8:85a3:0000:0000:8a2e:0370:7334

echo(filter_var('2001:0db8:85a3::8a2e:0370:7334', FILTER_VALIDATE_IP));
# 2001:0db8:85a3::8a2e:0370:7334

echo(filter_var('192.168.0.1', FILTER_VALIDATE_IP));
# 192.168.0.1

var_dump(filter_var('192.168.0.1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV6));
# bool(false)

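filter_var() returns the input value if it is valid (according to the filter passed as the second argument and the options passed as the third argument), and FALSE otherwise.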
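Since the question is specifically about IPv6, the check can be wrapped in a small boolean helper. This is a sketch; the function name isIpv6 is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: true only for a bare, valid IPv6 address.
function isIpv6(string $value): bool
{
    // FILTER_FLAG_IPV6 restricts FILTER_VALIDATE_IP to IPv6 addresses.
    return filter_var($value, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false;
}

var_dump(isIpv6('2001:0db8:85a3::8a2e:0370:7334')); // bool(true)
var_dump(isIpv6('192.168.0.1'));                    // bool(false)
var_dump(isIpv6('[::1]'));                          // bool(false)
```

The last call shows that the square brackets used in URLs are not part of the address itself, so they have to be removed before validating.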

If the IP address is the host part of a URL, you can use the PHP function parse_url() to extract it:

print_r(parse_url('http://2001:0db8:85a3:0000:0000:8a2e:0370:7334/something/page.html'));
# Array
# (
#     [scheme] => http
#     [host] => 2001:0db8:85a3:0000:0000:8a2e:0370:7334
#     [path] => /something/page.html
# )
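For URLs that already follow RFC 3986 and put the address in square brackets, parse_url() and filter_var() can be combined. The sketch below assumes bracketed input; the helper name ipv6HostOf is hypothetical.

```php
<?php
// Hypothetical helper: return the IPv6 host of a URL without brackets,
// or null when the host is missing or is not an IPv6 address.
function ipv6HostOf(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }

    // parse_url() keeps the brackets around an IPv6 host; strip them.
    $bare = trim($host, '[]');

    $valid = filter_var($bare, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false;
    return $valid ? $bare : null;
}

var_dump(ipv6HostOf('http://[2001:db8::1]:8080/index.html'));
var_dump(ipv6HostOf('http://example.com/'));
```

With the brackets present, the port separator is unambiguous, which is the reason RFC 3986 requires them around IPv6 literals.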

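The last string in the examples (2001:0db8:85a3:0000:0000:8a2e:0370:7334/something/page.html) is not a URL. It is just some random text that looks like an incomplete (and invalid) URL. I don't have a simple solution for it :-(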
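One possible approach for such free-form text, offered only as a sketch, is to find candidates with a deliberately loose pattern and let filter_var() make the final decision. The function name bracketIpv6 and the candidate pattern are my own assumptions; the pattern ignores IPv4-mapped forms and zone identifiers, and it cannot tell a trailing port apart from the last group of an unbracketed address.

```php
<?php
// Hypothetical helper: wrap every unbracketed IPv6 literal in $text
// in square brackets.
function bracketIpv6(string $text): string
{
    // Loose candidate: at least two colon-terminated groups of 0-4 hex
    // digits plus an optional final group, not adjacent to word
    // characters, colons or brackets.
    $candidate = '/(?<![\[\w:])(?:[0-9a-f]{0,4}:){2,}[0-9a-f]{0,4}(?![\w:\]])/i';

    $result = preg_replace_callback(
        $candidate,
        static function (array $m): string {
            // filter_var() decides whether the candidate really is IPv6.
            $ok = filter_var($m[0], FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false;
            return $ok ? '[' . $m[0] . ']' : $m[0];
        },
        $text
    );

    // preg_replace_callback() returns null on error; keep the input then.
    return $result ?? $text;
}

echo bracketIpv6('http://2001:0db8:85a3:0000:0000:8a2e:0370:7334/something/page.html'), "\n";
echo bracketIpv6('2001:0db8:85a3:0000:0000:8a2e:0370:7334/something/page.html'), "\n";
```

Addresses that are already bracketed are skipped by the lookarounds, and colon-separated text that is not an address, such as a time of day, is rejected by filter_var() and left unchanged.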
