Replace leading spaces with tabs

Date: 2017-02-01 14:21:31

Tags: php regex

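I want to replace every 4 spaces at the start of a line with a tab, but stop as soon as any text is encountered.

My initial regex of / {4}+/ (or /[ ]{4}+/ for readability) clearly works, but obviously every run of four spaces gets replaced, wherever it occurs in the line.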

$string  = '        this is some text -->    <-- are these tabs or spaces?';
$string .= "\n    and this is another line singly indented";
// I wrote 4 spaces, a tab, then 4 spaces here but unfortunately it will not display
$string .= "\n    \t    and this is third line with tabs and spaces";

$pattern = '/[ ]{4}+/';
$replace = "\t";

$new_str = preg_replace( $pattern , $replace , $string );

echo '<pre>'. $new_str .'</pre>';

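This is the example I started with. The regex converts the indentation well, except that the 4 spaces between the --> and <-- are also replaced with a tab. I really want the text after the indentation to remain unchanged.

My best effort so far is: (^) start of line, ([ ]{4}+) the pattern, (.*?[;\s]*) up until the first non-space \s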

$pattern = '/^[ ]{4}+.*?[;\s]*/m';

Which ... almost works, except that the indentation is now lost. Can someone help me understand what I am missing here?

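[Edit]

For clarity, what I am trying to do is change the indentation at the start of the text from spaces to tabs. I really don't understand why this is confusing to anyone.

To be as clear as possible (using the value of $string above):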

First line has 8 spaces at the start, some text with 4 spaces in the middle.
I am looking for 2 tabs at the start and no change to spaces in the text.

Second line has 4 spaces at the start.
I am looking to have only 1 tab at the start of the line.

Third line has 4 spaces, 1 tab and 4 spaces.
I am looking to have 3 tabs at the start of the line.

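2 answers:

Answer 0 (score: 0)

If you are not a regex guru, this will probably make the most sense to you and be easier to adapt to similar use cases (it is not the most efficient code, but it is the most "readable", IMHO):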

// preg_replace_callback replaces every regex match with the result of
// applying the given anonymous function to the $matches array
function spaces2tabs($s_with_spaces) {
    // before anything else, replace existing tabs with 4 spaces
    // to permit homogeneous translation (note: this also expands tabs
    // that appear later in the line, not only in the indentation)
    $s_with_spaces = str_replace("\t", '    ', $s_with_spaces);
    return preg_replace_callback(
        '/^([ ]+)/m',
        function ($ms) {
            // $ms[0] - is the full match
            // $ms[1] - is the first (...) group from the regex

            // ...here you can add extra logic to handle
            // leading spaces not a multiple of 4
            // (as written, any remainder is dropped)

            return str_repeat("\t", intdiv(strlen($ms[1]), 4));
        },
        $s_with_spaces
    );
}

// example (using dots to make spaces visible for explaining)
$s_with_spaces = <<<EOS
no indent
....4 spaces indent
........8 spaces indent
EOS;
// turn the placeholder dots back into real spaces
$s_with_spaces = str_replace('.', ' ', $s_with_spaces);
$s_with_tabs = spaces2tabs($s_with_spaces);
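
The comment inside the callback leaves open what to do with leading spaces that are not a multiple of 4. One possible way to fill that in is sketched below. The helper name indent_to_tabs and the $width parameter are made up for illustration and are not part of the answer. It assumes a tab always counts as exactly $width spaces (no tab stops), and it keeps any leftover spaces after the tabs instead of dropping them.

```php
<?php
// Hypothetical helper (not from the answer): converts leading indentation
// to tabs and keeps leftover spaces (fewer than $width) after the tabs.
function indent_to_tabs(string $text, int $width = 4): string
{
    return preg_replace_callback(
        '/^[ \t]+/m', // leading run of spaces and/or tabs on each line
        function (array $m) use ($width): string {
            // Assumption: every tab counts as exactly $width spaces
            $expanded = str_replace("\t", str_repeat(' ', $width), $m[0]);
            $spaces   = strlen($expanded);

            // Whole groups become tabs, the remainder stays as spaces
            return str_repeat("\t", intdiv($spaces, $width))
                 . str_repeat(' ', $spaces % $width);
        },
        $text
    );
}

$in  = "      six spaces\n    \t    mixed    text";
$out = indent_to_tabs($in);

// Show tabs as a visible \t marker when printing
echo str_replace("\t", '\t', $out), "\n";
```

Because only the matched indentation is rewritten, spaces and tabs later in the line are left alone.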

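If you want a high-performance one-liner that is harder to understand or tweak, the solutions from the regex gurus in the comments above should work :)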
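
The comments referred to are not reproduced on this page, so the following is only a guess at what a single-pattern solution could look like, not a quote from them. It relies on the PCRE \G anchor, which matches only where the previous match ended, so the replacement cannot jump to spaces in the middle of a line. The pattern itself is an assumption. The sample string is the one from the question.

```php
<?php
// Assumed pattern, not taken from the comments mentioned above.
// (?:^|\G)     start of a line (m flag) or the end of the previous match
// (?:\t| {4})  one existing tab, or one group of 4 spaces
$pattern = '/(?:^|\G)(?:\t| {4})/m';

$string  = '        this is some text -->    <-- are these tabs or spaces?';
$string .= "\n    and this is another line singly indented";
$string .= "\n    \t    and this is third line with tabs and spaces";

// Each leading tab or 4-space group becomes exactly one tab
$new_str = preg_replace($pattern, "\t", $string);

echo '<pre>' . $new_str . '</pre>';
```

Leading spaces that do not fill a complete group of 4 are left in place after the tabs.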

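P.S. In general, preg_replace_callback (and its equivalent in JavaScript) is a great "Swiss Army knife" for structured text processing. I have, shamefully, even used it to write parsers for mini-languages ;)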

Answer 1 (score: 0)

I would do it this way.

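$str = "...";
// \G matches at the start of the line or where the previous match ended,
// so only the leading run of 4-space groups is replaced
$pattern = '/\G[ ]{4}/';
$replace = "\t";

$multiStr = explode("\n", $str);
foreach ($multiStr as &$line) {
    // expand existing tabs first so mixed indentation is handled uniformly
    $line = str_replace("\t", "    ", $line);
    $line = preg_replace($pattern, $replace, $line);
}
unset($line); // drop the reference left over from the foreach

$results = implode("\n", $multiStr);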

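Please check the code thoroughly, since I wrote it quickly and from intuition: I could not run a PHP server to test it :( It should still help you work this out.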