Regular expression with a nested "["

Time: 2012-05-08 02:57:50

Tags: python regex

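I am familiar with regular expressions, but this complicated example has me frustrated. I am trying to understand what this line of code does: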

r'/(\\.|[^[/\\\n]|\[(\\.|[^\]\\\n])*])+/([gim]+\b|\B)'

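It comes from a snippet that tries to detect regular expression literals such as /\s+/. I can follow it up to the part with the nested [ and \[(.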

(I need to port this code from Python to Java, and I am having trouble understanding how the pattern above works and why it does not work in Java.)

1 answer:

Answer 0: (score: 5)

Here is a broken-down version that may help:

/                      # Match an opening slash
(                      # Followed by one or more...
  \\.                  #    Backslash followed by any character
  |                    #   or...
  [^[/\\\n]            #    Something that's not a [, /, \, or newline
  |                    #   or...
  \[                   #    A literal [, followed by any number of...
    (
      \\.              #     a backslash followed by any character
      |                #     or...
      [^\]\\\n]        #     something that's not a ], \, or newline
    )*
  ]                    #    and ending with a ]
)+
/                      # And a closing slash
(
  [gim]+\b             # Followed by one or more of the flags g, i, m, ending at a word boundary
  |
  \B                   # or a position that is not a word boundary (no word character follows the slash)
)
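
To try the breakdown out, here is a minimal Python sketch that compiles it with re.VERBOSE. The names REGEX_LITERAL and is_regex_literal are hypothetical, the groups are made non-capturing, and "[" and "]" are escaped inside and after the character classes; those are assumptions made for readability and portability, not part of the original pattern.

```python
import re

# Hypothetical name. Same structure as the breakdown above, but with "[" and
# "]" escaped and non-capturing groups (assumptions, not the original text).
REGEX_LITERAL = re.compile(r"""
    /                          # opening slash
    (?:                        # body: one or more of
        \\.                    #   an escape: backslash plus any character
      | [^\[/\\\n]             #   any character except [, /, backslash, newline
      | \[                     #   a character class, opened by a literal [
        (?: \\. | [^\]\\\n] )* #   holding escapes or anything but ], backslash, newline
        \]                     #   and closed by a literal ]
    )+
    /                          # closing slash
    (?: [gim]+\b | \B )        # flags, or no word character right after the slash
""", re.VERBOSE)


def is_regex_literal(text):
    """Return True if the whole string looks like a /pattern/flags literal."""
    return REGEX_LITERAL.fullmatch(text) is not None


# A slash inside [...] or after a backslash does not end the literal.
for sample in [r"/\s+/", "/[/]/g", r"/a\/b/gi", "//", "/abc/x"]:
    print(sample, is_regex_literal(sample))
```

The third alternative is what lets a "/" appear inside a character class without ending the literal. As for the Java port, one likely cause of trouble is that java.util.regex treats an unescaped "[" inside a character class as the start of a nested class (a union), so [^[/\\\n] is not read the way Python reads it. Escaping it as \[, as in the sketch, should be accepted by both engines, but that has not been checked against the original Java code.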