Question

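So I actually store an HTML field, but I want to add some pseudo-tags to make publishing easier. I.e. I want to wrap titles/headers in this markup: <<...>>, e.g. <<My Header>>. Then I would enumerate them, format them, and display the text below each one.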

E.g.:

<<News>>
Breaking news on Sunday.
Have been taking hostages.
<<General Information>>
We would want to recieve our blabla.
And you want it.
<<User Suggestions>>
Yeah we want it so much...

should display:

<H1 class="whatever" ID="Product_Header_1">News</H1>
Breaking news on Sunday.
Have been taking hostages.
<H1 class="whatever" ID="Product_Header_2">General Information</H1>
We would want to recieve our blabla.
And you want it.
<H1 class="whatever" ID="Product_Header_3">User Suggestions</H1>
Yeah we want it so much...

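It should then return an array containing the actual headers and their numbers, so I can use it elsewhere on the page for references.

It seems we could just replace them directly, but the enumeration and the return value could be a problem, and it may fail when a tag is left unclosed.

Alternatively, split the text into an array and then process it manually, which looks like the better approach.

This is what I have tried so far: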

$TEXT_A=preg_split('/<<([^>]+)>>/', $TEXT);

foreach($TEXT_A as $key => $val){
    if ($key>0) echo "<br>-!-";
    echo $val;
}

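where $TEXT is the HTML text with the pseudo-tags.

The problem is that the split does not return the regexp matches themselves, so I am stuck on how to extract them. Maybe I need to write a custom function that returns an array of texts and headers instead of a regular split, but I do not know where to start...

Please help.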

Answer 1

Just use

$text_a = preg_split('/<<([^>]+)>>/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

You will find the header names at the odd indices of $text_a. Assuming you want to ignore the content before the first header:

$n = count($text_a);
$head_a = array();
$body_a = array();
for ($i = 1; $i < $n; $i += 2) {
   $head_a[] = $text_a[$i];
   $body_a[] = $text_a[$i + 1]; // trim()?
}
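To tie this back to the question, here is a minimal sketch that turns the split result into the numbered H1 markup and also returns the header list. The function name render_pseudo_headers is hypothetical, and unlike the loop above it keeps any text that appears before the first header; both choices are assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper: splits $text on <<...>> pseudo-tags and returns
// [rendered HTML, array of header number => header title].
function render_pseudo_headers(string $text): array
{
    $parts = preg_split('/<<([^>]+)>>/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $html = $parts[0];      // assumption: text before the first header is kept as-is
    $headers = [];
    $n = count($parts);

    // Odd indices hold the captured header names, even indices the bodies.
    for ($i = 1; $i < $n; $i += 2) {
        $number = intdiv($i + 1, 2);    // 1, 2, 3, ...
        $headers[$number] = $parts[$i];
        $html .= '<H1 class="whatever" ID="Product_Header_' . $number . '">'
            . $parts[$i] . '</H1>' . $parts[$i + 1];
    }

    return [$html, $headers];
}

$TEXT = "<<News>>\nBreaking news on Sunday.\n<<General Information>>\nAnd you want it.\n";
[$html, $headers] = render_pseudo_headers($TEXT);
echo $html;
print_r($headers);
```

An unclosed tag such as "<<News" simply does not match the pattern, so it stays in the output as literal text rather than breaking the numbering.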

Answer 2

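Here is a working solution using preg_replace_callback. It uses a non-greedy capture group and a positive lookahead, (?=<<|$), to capture the "body" text. The positive lookahead says "assert that what follows is either the opening delimiter << or the end of the string ($)".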

$count = 0;
$TEXT_A = preg_replace_callback( '/<<([^>]+)>>(.*?)(?=<<|$)/s', 
    function( $matches) use (&$count) {
        $count++;
        return '<H1 class="whatever" ID="Product_Header_' . $count . '">' . $matches[1] . '</H1>' . "\n" . trim( $matches[2]) . "\n\n"; 
}, $TEXT);
echo htmlentities( $TEXT_A);

I pass the result through htmlentities to show the generated HTML, but you can of course remove that call to see the HTML as interpreted by the browser:

<H1 class="whatever" ID="Product_Header_1">News</H1>
Breaking news on Sunday.
Have been taking hostages.

<H1 class="whatever" ID="Product_Header_2">General Information</H1>
We would want to recieve our blabla.
And you want it.

<H1 class="whatever" ID="Product_Header_3">User Suggestions</H1>
Yeah we want it so much...

Edit

Here is a solution without an anonymous function:

function do_replacement( $matches){
    static $count = 0;
    $count++;
    return '<H1 class="whatever" ID="Product_Header_' . $count . '">' . $matches[1] . '</H1>' . "\n" . trim( $matches[2]) . "\n\n";
}
$TEXT_A = preg_replace_callback( '/<<([^>]+)>>(.*?)(?=<<|$)/s', 'do_replacement', $TEXT);
echo htmlentities( $TEXT_A);

Final edit

This edit includes a global array that captures the replaced headers.

$custom_array = array();
function do_replacement( $matches){
    global $custom_array;
    static $count = 0;
    $count++;
    $custom_array[$count] = $matches[1];
    return '<H1 class="whatever" ID="Product_Header_' . $count . '">' . $matches[1] . '</H1>' . "\n" . trim( $matches[2]) . "\n\n";
}
$TEXT_A = preg_replace_callback( '/<<([^>]+)>>(.*?)(?=<<|$)/s', 'do_replacement', $TEXT);
echo htmlentities( $TEXT_A);
var_dump( $custom_array);
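If the global array and the static counter are a concern, one possible variation is to collect the headers through a closure that captures an array by reference. This is only a sketch: the wrapper name replace_pseudo_headers and its by-reference $headers parameter are hypothetical and not part of the answer above. The regex and the replacement string are the same as in the answer.

```php
<?php
// Hypothetical wrapper: returns the rendered HTML and fills $headers
// with header number => header title.
function replace_pseudo_headers(string $text, array &$headers): string
{
    $headers = [];  // reset on every call, so numbering always starts at 1

    return preg_replace_callback(
        '/<<([^>]+)>>(.*?)(?=<<|$)/s',
        function (array $m) use (&$headers): string {
            $number = count($headers) + 1;
            $headers[$number] = $m[1];
            return '<H1 class="whatever" ID="Product_Header_' . $number . '">'
                . $m[1] . '</H1>' . "\n" . trim($m[2]) . "\n\n";
        },
        $text
    );
}

$TEXT = "<<News>>\nBreaking news on Sunday.\n<<User Suggestions>>\nYeah we want it so much...";
$headers = [];
$html = replace_pseudo_headers($TEXT, $headers);
echo htmlentities($html);
var_dump($headers);
```

Because the counter is derived from the array instead of a static variable, calling the function a second time on another text restarts the numbering rather than continuing from the previous call.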

Answer 3

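It sounds like you want to write your documents in a markup format rather than in HTML.

This is a very common requirement, and people have come up with many solutions. It is fine if you want to create your own markup format, but if you want to save some time, you may want to consider an existing one.

BBCode, Markdown and Wikicode come to mind.

Markdown is the format used for questions and comments on this site.
BBCode is used in various forms by a lot of forum software and the like.
Wikicode is the markup used by Wikipedia and other wiki sites.

Parsers for all of these are available in PHP, as well as in other languages.

For example, there is a BBCode parser in PHP's PECL library - see: http://php.net/manual/en/book.bbcode.php. If you are able to install PECL libraries on your server, you can use these BBCode parsing functions in PHP without having to include anything at runtime.

If you cannot go the PECL route, other BBCode parsers exist as well. Try this one, for example: http://nbbc.sourceforge.net/

Wiki markup parsers: Which wiki markup parser does Wikipedia use?

Markdown parser: http://michelf.com/projects/php-markdown/

Hope that helps.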

Answer 4

Not a regex, but...:

$s = '<<News>>
Breaking news on Sunday.
Have been taking hostages.
<<General Information>>
We would want to recieve our blabla.
And you want it.
<<User Suggestions>>
Yeah we want it so much...';

$s = str_replace('>>', '</H1>', $s); // every ">>" becomes a closing tag
$i = 1;
while (strpos($s, '<<') !== false)
{
    $s = str_replace_one('<<', '<H1 class="whatever" ID="Product_Header_' . $i . '">', $s);
    $i++;
}

function str_replace_one($find, $replace, $subject) 
{
    return implode($replace, explode($find, $subject, 2));
}


echo $s;

Answer 5

Why not use preg_replace_callback?

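echo preg_replace_callback('/<<([^>]+)>>/', function($match) {
    static $key=0; // note: header IDs start at 0 here
    // $match[1] holds the header text captured between << and >>
    $html = (($key > 0) ? '<br>-!-' : '') . '<H1 class="whatever" ID="Product_Header_'.$key.'">'.$match[1].'</H1>';
    $key++;
    return $html;
}, $TEXT);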

PHP: I need something like split(), but
