I get double characters when using preg_replace...?

Time: 2010-12-16 16:42:41

Tags: php string preg-replace

When I use the following script, I get double characters. Why?

$clean_lastname = "Dür";
$clean_lastname = preg_replace("/[ùúûü]/", "u", $clean_lastname);
echo $clean_lastname;

Output: Duur

I want it to be Dur

I am still doing something wrong... How do I pass the values of an array into the preg function?

$clean_lastname = "Boerée";
$l = 0;
$pattern = array('[ÀÁÂÃÄÅ]','[Ç]','[ÈÉÊË]','[ÌÍÎÏ]','[Ñ]','[ÒÓÔÕÖØ]','[Ý]','[ß]','[àáâãäå]','[ç]','[èéêë]','[ìíîï]','[ñ]','[òóôõöø]','[ùúûü]','[ýÿ]');
$replace = array(A,C,E,I,N,O,Y,S,a,c,e,i,n,o,u,y);

foreach ($pattern as $wierdchar)
{
    $clean_lastname = preg_replace('/$wierdchar/u', '$replace[$l]', $clean_lastname);
    $l++;
}

//$clean_lastname = preg_replace('/[èéêë]/u', 'e', $clean_lastname);

//$clean_lastname = strtr($clean_lastname, "ùúûü","uuuu");
echo $clean_lastname;

4 answers:

Answer 0: (score: 2)

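The only situation in which I can imagine this happening is when your two strings (the input string and the pattern) have different character encodings, or both are UTF-8 but you did not specify that properly.

In the latter case, "Dür" is equivalent to "D\xC3\xBCr" (ü is encoded as the two-byte sequence 0xC3BC), and the pattern "/[ùúûü]/" is equivalent to "/[\xC3\xB9\xC3\xBA\xC3\xBB\xC3\xBC]/". Since each byte specified by a \xHH escape sequence is treated as a single character, this produces the following:

echo preg_replace("/[\xC3\xB9\xC3\xBA\xC3\xBB\xC3\xBC]/", "u", "D\xC3\xBCr");  // Duur

So when working with UTF-8, make sure to set the u modifier flag so that both the pattern and the input string are treated as UTF-8 encoded:

"/[ùúûü]/u"

Edit: Now that you have clarified your intention and seem to be trying to implement some kind of transliteration, you should take a look at iconv and its ability to transliterate.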
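A minimal sketch of what transliteration with iconv might look like, assuming the input is UTF-8 and the iconv extension is available. The helper name transliterate_name is hypothetical and chosen only for illustration. The exact output of //TRANSLIT depends on the iconv implementation and the current locale (ü may become "u", "ue", or "?"), so no particular result is promised here.

```php
<?php
// Hypothetical helper for illustration only.
function transliterate_name(string $name): string
{
    // //TRANSLIT asks iconv to approximate characters that ASCII cannot
    // represent. What it produces varies by implementation and locale.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $name);

    // iconv() returns false on failure; fall back to the unchanged input.
    return $ascii === false ? $name : $ascii;
}

echo transliterate_name("Dür"), "\n";
```

If the result has to be identical across servers, an explicit replacement table is more predictable than relying on the locale.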


Answer 1: (score: 1)

<?php
    // Each accented "u" and the plain letter that replaces it, paired by position.
    $accented = array("ù","ú","û","ü");
    $plain = array("u","u","u","u");
    $clean_lastname = "Dür";
    echo str_replace($accented, $plain, $clean_lastname);
?>

Answer 2: (score: 1)

$clean_lastname = str_replace(array('ù', 'ú', 'û', 'ü', 'Ù', 'Ú', 'Û', 'Ü'), array('u', 'u', 'u', 'u', 'U', 'U', 'U', 'U'), $clean_lastname);

// OR, to fix your original problem:

$clean_lastname = preg_replace('/[ùúûü]/u', 'u', $clean_lastname);
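For the array question, preg_replace accepts arrays for both the pattern and the replacement, paired by position, so no loop or string interpolation is needed. A minimal sketch, assuming the source file and the input are UTF-8. The function name strip_accents and the shortened table are illustrative, not taken from the question.

```php
<?php
// Each pattern carries its own delimiters and the /u modifier, so it is
// matched as UTF-8 rather than byte by byte.
$patterns     = array('/[ÀÁÂÃÄÅ]/u', '/[ÈÉÊË]/u', '/[àáâãäå]/u', '/[èéêë]/u', '/[ùúûü]/u');
$replacements = array('A', 'E', 'a', 'e', 'u');

// Hypothetical helper: applies every pattern/replacement pair in order.
function strip_accents(string $name, array $patterns, array $replacements): string
{
    $result = preg_replace($patterns, $replacements, $name);

    // preg_replace() returns null on error (for example, invalid UTF-8 input).
    return $result === null ? $name : $result;
}

echo strip_accents("Boerée", $patterns, $replacements), "\n"; // Boeree
echo strip_accents("Dür", $patterns, $replacements), "\n";    // Dur
```

The replacement values are quoted strings here; bare words such as A or C would be read by PHP as undefined constants.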

Answer 3: (score: 1)

Stick with the original strtr

$clean_lastname = "Dür Dùr Dúr Dûr";
$clean_lastname = strtr($clean_lastname, "ùúûü", "uuuu");
echo $clean_lastname;
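One caveat, assuming the script is saved as UTF-8: the two-string form of strtr translates single bytes, and each of these accented letters occupies two bytes, so the letters are not mapped as whole characters. The array form of strtr replaces whole substrings instead. A minimal sketch of that variant ($map is a name chosen here for illustration):

```php
<?php
// Array form of strtr: each key is replaced as a whole substring, so the
// two-byte UTF-8 sequences are matched as units.
$map = array('ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u');

$clean_lastname = "Dür Dùr Dúr Dûr";
$clean_lastname = strtr($clean_lastname, $map);
echo $clean_lastname, "\n"; // Dur Dur Dur Dur
```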