Question

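I'm developing a template system and have run into a problem.

The plan is to create HTML documents containing [@tags]. I could use str_replace (looping over all possible replacements), but I want to push this a bit further ;-)

I want to allow nested tags, and allow parameters on each tag: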

[@title|You are looking at article [@articlenumber] [@articlename]]

With preg_match_all I'd like to get the following result:

[0] title|You are looking at article [@articlenumber] [@articlename]
[1] articlenumber
[2] articlename

My script will split on | to get the parameters. The output of my script would be something like:

<div class='myTitle'>You are looking at article 001 MyProduct</div>

The problem is that I'm not practiced with regular expressions. All my patterns get almost what I want, but they have trouble with the nested params.

\[@(.*?)\]

stops at the first ], the one that closes articlenumber.
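
For illustration only, here is that first pattern run through Python's re module rather than PHP's preg_match_all (the variable names are made up for this sketch). The lazy quantifier behaves the same way on this input: the first match ends at the ] that closes articlenumber, so the outer tag is cut short.

```python
import re

text = "[@title|You are looking at article [@articlenumber] [@articlename]]"

# Lazy .*? grows one character at a time and stops at the first "]"
# that lets the overall match succeed.
lazy_matches = re.findall(r"\[@(.*?)\]", text)
print(lazy_matches)
# ['title|You are looking at article [@articlenumber', 'articlename']
```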

\[@(.*?)(((?R)|.)*?)\]

is closer, but it doesn't capture articlenumber; https://regex101.com/r/UvH7zi/1

I hope someone can help me! Thanks in advance!

Answer 1

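You can't do this with standard Python regular expressions. You are looking for a feature similar to the "balancing groups" available in the .NET regex engine, which allows nested matches.

Take a look at PyParsing, whose nestedExpr helper allows nested expressions: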

import pyparsing as pp
text = '{They {mean to {win}} Wimbledon}'
print(pp.nestedExpr(opener='{', closer='}').parseString(text))

The output is:

[['They', ['mean', 'to', ['win']], 'Wimbledon']]

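Unfortunately, this doesn't work well with your example. I think you need a better grammar.

You can experiment with a QuotedString definition, but the problem remains.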

import pyparsing as pp
single_value = pp.QuotedString(quoteChar="'", endQuoteChar="'")
parser = pp.nestedExpr(opener="[", closer="]",
                       content=single_value,
                       ignoreExpr=None)

example = "['@title|You are looking at article' ['@articlenumber'] ['@articlename']]"
print(parser.parseString(example, parseAll=True))

Answer 2

I'm typing this on my phone, so there may be some mistakes, but you can easily get what you want by adding a lookahead to your expression:

(?=\\[(@(?:\\[(?1)\\]|.)*)\\])

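Edit: Yes, it works, here you go: https://regex101.com/r/UvH7zi/4

Because (?=) doesn't consume any characters, the pattern finds and captures the contents of all "[@*]" substrings in the subject, recursively checking whether the content itself contains balanced groups, if any.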

Answer 3

Here is my code:

@\w+\|[\w\s]+\[@(\w+)]\s+\[@(\w+)]

https://regex101.com/r/UvH7zi/3

Answer 4

For now I've created a parser:

- Get all opening tags and put their strpos in an array
- Loop through all start positions of the opening tags
- Look for the next closing tag; is it before the next opening tag? Then the tag is complete
- If the closing tag comes after an opening tag, skip that one and look for the next (and keep checking for opening tags in between)
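
A scan like this can be sketched in Python in a few lines. This is not the poster's 50-line PHP code, just one possible take on the same idea, and it swaps the repeated position lookups for a stack of still-open tags. The name find_tags and its parameters are made up for this sketch.

```python
def find_tags(text, opener="[@", closer="]"):
    """Return (start, end, content) for every complete tag, ordered by start."""
    stack = []   # start offsets of tags that are opened but not yet closed
    found = []
    i = 0
    while i < len(text):
        if text.startswith(opener, i):
            stack.append(i)
            i += len(opener)
        elif text.startswith(closer, i) and stack:
            # This closer completes the most recently opened tag.
            start = stack.pop()
            end = i + len(closer)
            found.append((start, end, text[start + len(opener):i]))
            i = end
        else:
            i += 1
    return sorted(found)


text = "[@title|You are looking at article [@articlenumber] [@articlename]]"
tag_contents = [content for _, _, content in find_tags(text)]
print(tag_contents)
```

On the question's example this yields the outer title tag first, then articlenumber and articlename. One limitation of the sketch: a plain [ that is not part of [@ is not tracked, so a stray ] inside a tag would close that tag early.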

This way I can find all complete tags and replace them. But it takes about 50 lines of code and multiple loops, so a single preg_match would be nicer ;-)
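
If the goal is only the replacement, one shorter alternative (again a Python sketch, not the poster's code) is to expand the innermost tags repeatedly, using a pattern that matches only tags with no brackets inside. The handlers table and the render function are hypothetical names, and the sketch assumes every tag name has a handler and that handler output never contains [@ itself.

```python
import re

# Matches a tag whose content contains no "[" or "]", i.e. an innermost tag.
INNERMOST = re.compile(r"\[@([^\[\]]*)\]")


def render(template, handlers):
    """Expand tags from the inside out until nothing changes."""
    def expand(match):
        # Split "name|parameter" on the first "|"; param is "" when absent.
        name, _, param = match.group(1).partition("|")
        return handlers[name](param)

    previous = None
    while previous != template:
        previous = template
        template = INNERMOST.sub(expand, template)
    return template


handlers = {
    "articlenumber": lambda _: "001",
    "articlename": lambda _: "MyProduct",
    "title": lambda text: f"<div class='myTitle'>{text}</div>",
}

html = render(
    "[@title|You are looking at article [@articlenumber] [@articlename]]",
    handlers,
)
print(html)
```

The first pass replaces articlenumber and articlename, which leaves the title tag free of nested brackets, so the second pass can expand it.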

Preg_match_all with nested matches
