How to remove unwanted characters but keep specific ones

Date: 2014-04-08 07:22:45

Tags: php regex preg-replace

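Hi, I still can't solve this. I am using preg_replace and have searched without finding a solution. I need to remove the unknown characters from a string but keep the newlines.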

    $summary = "ASDASDASDASDSASD 
    [BS][BS][BS] hello
    this is a new line
[BS][BS][BS] 
this is another new line"; 
    // [BS] is an unknown character; you may have encountered it before in Notepad++.
    // See screenshot, taken from Notepad++
    // The output in the browser is a series of whitespaces.
    // I can't paste the unknown symbol here. 

    echo preg_replace('/[\x00-\x1F\x80-\xFF]/','', $summary); 

    // Output: ASDASDASDASDSASD hello this is a new line this is another new line

    // Expected output:
    // ASDASDASDASDSASD
    //     hello
    //     this is a new line
    // this is another new line

I would appreciate any help.

This is the BS I am talking about

2 answers:

Answer 0: (score: 2)

Looking at http://www.asciitable.com/, I think the regex should be something like this:

/[\x00-\x08\x0B-\x0C\x0E-\x1F\x7F-\xFF]/

These ranges (effectively a character blacklist) leave out the ASCII tab, line feed, and carriage return, which you probably want to keep.

PS: BS is how Notepad++ displays the backspace character (ASCII 0x08).
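
As a rough illustration, the pattern above could be applied to the string from the question as follows. This is only a sketch: it assumes the [BS] placeholders really are backspace bytes (0x08), and the function name `stripUnwantedBytes` is made up for the example.

```php
<?php
// Hypothetical helper (not from the question): strips control bytes and
// bytes 0x7F-0xFF, but keeps tab (0x09), line feed (0x0A) and
// carriage return (0x0D).
function stripUnwantedBytes(string $text): string
{
    $result = preg_replace('/[\x00-\x08\x0B-\x0C\x0E-\x1F\x7F-\xFF]/', '', $text);

    // preg_replace returns null on error; fall back to the unmodified input.
    return $result ?? $text;
}

// "\x08" stands in for each [BS] that Notepad++ displays (an assumption).
$summary = "ASDASDASDASDSASD\n\x08\x08\x08 hello\nthis is a new line\n\x08\x08\x08\nthis is another new line";

$clean = stripUnwantedBytes($summary);
echo $clean;
```

The backspace bytes are removed while all four line feeds survive. Keep in mind that browsers collapse newlines in HTML, so the result may still need `nl2br()` or a `<pre>` element to show the line breaks on a page.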

Answer 1: (score: 1)

Here it is:

echo preg_replace('/[\x00-\x09\x0B-\x0C\x0E-\x1F\x80-\xFF]/','', $summary);

Because 0D and 0A (\x0D and \x0A, both of which fall inside \x00-\x1F) are CR+LF. You need to exclude them, which means defining multiple ranges.
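
A small check of what this pattern keeps and removes, written as a sketch with an invented test string. Note that the first range, \x00-\x09, includes the tab character (0x09), so tabs are stripped along with the backspace, while CR and LF are kept.

```php
<?php
// Pattern from this answer: everything in \x00-\x1F except LF (0x0A)
// and CR (0x0D), plus all bytes from 0x80 to 0xFF.
$pattern = '/[\x00-\x09\x0B-\x0C\x0E-\x1F\x80-\xFF]/';

// Invented sample: a tab, a CR+LF pair and a backspace between letters.
$input = "a\tb\r\nc\x08d";

$output = preg_replace($pattern, '', $input);

// Print as hex so that the invisible bytes can be inspected.
echo bin2hex($output); // 61620d0a6364
```

The hex dump shows that only the letters and the CR+LF pair (0d 0a) remain. If tabs should survive as well, the first range would have to end at \x08 instead of \x09.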