How can I truncate a match to n characters in a preg_replace replacement?

Time: 2014-05-07 16:52:30

Tags: php regex

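I want to shorten long anchor text like this:

http://a.com/abcdefghijklmnopqrstuvwxyz http://b.com/abcdefghijklmnopqrstuvwxyz truncated to 10 characters

to this:

a.com/abcd... b.com/abcd... truncated to 10 characters

How can I shorten a preg_replace match?

I made a start on Regex101.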

PHP:

$str = 'Truncate <a href="http://a.com/abcdefghijklmnopqrstuvwxyz">http://a.com/abcdefghijklmnopqrstuvwxyz</a>
        and <a href="http://b.com/abcdefghijklmnopqrstuvwxyz">http://b.com/abcdefghijklmnopqrstuvwxyz</a> to 10 characters';
$str = preg_replace('~<a href="(https?://[^"]+)".*?>(.*?)</a>~', '<a href="$1">$2</a>', $str);

echo $str; // Truncate <a href="http://a.com/abcdefghijklmnopqrstuvwxyz">http://a.com/abcdefghijklmnopqrstuvwxyz</a> and <a href="http://b.com/abcdefghijklmnopqrstuvwxyz">http://b.com/abcdefghijklmnopqrstuvwxyz</a> to 10 characters

Desired result:

Truncate <a href="http://a.com/abcdefghijklmnopqrstuvwxyz">a.com/abcd...</a>
and <a href="http://b.com/abcdefghijklmnopqrstuvwxyz">b.com/abcd...</a> to 10 characters

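4 answers:

Answer 0 (score: 1)

Change the regex so that it also captures the part of the URL after the scheme (domain and path), then use preg_replace_callback():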

$pattern = '~<a href="(https?://([^"]+))".*?>(.*?)</a>~';    
$str = preg_replace_callback($pattern, function ($m) {
    $text = (strlen($m[2]) > 10) ? substr($m[2], 0, 10) . '...' : $m[2];
    return sprintf('<a href="%s">%s</a>', $m[1], $text);
}, $str);

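The callback above hard-codes the limit of 10. As a minimal sketch of the "n characters" case from the question title, the same pattern can be wrapped in a function that takes the limit as a parameter. The function name `truncate_anchor_text` and the second sample link are assumptions made for illustration; they are not part of the original answer. Note that `strlen()` and `substr()` count bytes, so this is only safe for ASCII URLs.

```php
<?php
// Hypothetical helper (name is illustrative): shorten the visible text of each
// http/https link to at most $n bytes and leave the href untouched.
function truncate_anchor_text(string $html, int $n = 10): string
{
    $pattern = '~<a href="(https?://([^"]+))".*?>(.*?)</a>~';
    $result = preg_replace_callback($pattern, function (array $m) use ($n): string {
        // $m[1] = full URL, $m[2] = URL without the scheme
        $text = (strlen($m[2]) > $n) ? substr($m[2], 0, $n) . '...' : $m[2];
        return sprintf('<a href="%s">%s</a>', $m[1], $text);
    }, $html);
    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$html = 'See <a href="http://a.com/abcdefghijklmnopqrstuvwxyz">http://a.com/abcdefghijklmnopqrstuvwxyz</a>'
      . ' and <a href="https://b.com/x" class="ext">https://b.com/x</a>';

echo truncate_anchor_text($html, 10), "\n";
// See <a href="http://a.com/abcdefghijklmnopqrstuvwxyz">a.com/abcd...</a> and <a href="https://b.com/x">b.com/x</a>
```

Two side effects are worth knowing: the rebuilt tag contains only the href, so extra attributes such as class are dropped, and the visible text is always taken from the href rather than from the original anchor text.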

Answer 1 (score: 0)

$str = 'Truncate <a href="http://a.com/abcdefghijklmnopqrstuvwxyz">http://a.com/abcdefghijklmnopqrstuvwxyz</a>
        and <a href="http://b.com/abcdefghijklmnopqrstuvwxyz">http://b.com/abcdefghijklmnopqrstuvwxyz</a> to 10 characters';
$str = preg_replace_callback('~<a href="(https?://[^"]+)".*?>(.*?)</a>~', 'truncate_link', $str);

echo $str;

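function truncate_link($matches) {
    // $matches[0] is the whole <a>...</a> match; the groups start at index 1.
    $link = $matches[1];                                   // full URL from the href
    $text = preg_replace('~^https?://~', '', $matches[2]); // anchor text without the scheme
    if (strlen($text) > 10) {                              // compare the length, not strlen() of a boolean
        $text = substr($text, 0, 10) . '...';
    }
    return "<a href=\"$link\">$text</a>";
}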

Answer 2 (score: 0)

I think this can also be done without a callback function:

$str = preg_replace(
    '~https?://([^<>]{10})[^<>]+(?=</a>)~', 
    '$1...', 
    $str);
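To see how this callback-free pattern behaves with other limits, here is a small sketch that builds the quantifier from a parameter. The wrapper name `truncate_link_text_regex` and the short sample link are assumptions for illustration only. Because the pattern requires at least one character beyond the first n, link text that is n characters or shorter (after the scheme) is left completely unchanged, scheme included.

```php
<?php
// Hypothetical wrapper (name is illustrative): $n is interpolated into the
// {n} quantifier of the pattern from the answer above.
function truncate_link_text_regex(string $html, int $n = 10): string
{
    // Matches a URL that is directly followed by </a>, i.e. the anchor text,
    // not the href (the href is followed by a quote and ">").
    $pattern = sprintf('~https?://([^<>]{%d})[^<>]+(?=</a>)~', $n);
    // preg_replace() returns null on a regex error; fall back to the input.
    return preg_replace($pattern, '$1...', $html) ?? $html;
}

$long  = '<a href="http://a.com/abcdefghijklmnopqrstuvwxyz">http://a.com/abcdefghijklmnopqrstuvwxyz</a>';
$short = '<a href="http://a.com/x">http://a.com/x</a>';

echo truncate_link_text_regex($long, 10), "\n";  // text becomes a.com/abcd...
echo truncate_link_text_regex($short, 10), "\n"; // unchanged, scheme is kept
```

This keeps any extra attributes on the tag, since only the anchor text is touched.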

Answer 3 (score: 0)

Here is one more solution. It rewrites only those hyperlinks whose URL (after the scheme) is longer than 10 characters:

$str = preg_replace('~(<a href="https?://)([^"]{10})([^"]+?").*?>.*?</a>~', '$1$2$3>$2...</a>', $str);

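The protocol (http or https) is preserved in the href by this replacement expression.

If, for short URLs, only the attributes between the end of the URL (the second double quote) and the > of the opening A tag should be removed, a second expression can be used: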

$str = preg_replace('~(<a href="https?://[^"]{1,10}").*?(>.*?</a>)~', '$1$2', $str);
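As a rough sketch of how the two expressions work together, the snippet below applies them one after the other to a string with one long and one short link. The sample markup (the target and class attributes and the b.com/x URL) is made up for illustration.

```php
<?php
$str = '<a href="http://a.com/abcdefghijklmnopqrstuvwxyz" target="_blank">http://a.com/abcdefghijklmnopqrstuvwxyz</a>'
     . ' and <a href="https://b.com/x" class="ext">https://b.com/x</a>';

// Step 1: URLs longer than 10 characters get truncated link text;
// extra attributes of the opening tag are dropped as a side effect.
$str = preg_replace(
    '~(<a href="https?://)([^"]{10})([^"]+?").*?>.*?</a>~',
    '$1$2$3>$2...</a>',
    $str
);

// Step 2: URLs of 1 to 10 characters keep their text; only the extra
// attributes between the closing quote of the href and ">" are removed.
$str = preg_replace(
    '~(<a href="https?://[^"]{1,10}").*?(>.*?</a>)~',
    '$1$2',
    $str
);

echo $str, "\n";
// <a href="http://a.com/abcdefghijklmnopqrstuvwxyz">a.com/abcd...</a> and <a href="https://b.com/x">https://b.com/x</a>
```

The short link keeps its scheme in the visible text, which differs from the preg_replace_callback() approach, where the scheme is stripped from every link.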