Regex to exclude matches based on a pattern

Date: 2014-01-08 22:37:35

Tags: regex regex-negation

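I am trying to create a regex (Perl-compatible, but not Perl itself) that meets the following criteria:

  • It must not contain "R" followed by any number of digits (case-insensitive), at a word boundary
  • There may or may not be other words and/or whitespace around it
  • If the R# is enclosed in parentheses, it should not count as a hit (the string should still match)

The regex I have come up with so far is: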

^(.(?!\b(?:r)\d*\b))*$

Below is a sample table. Some cases work and some fail.

For the input strings below:

  • \n = newline
  • \s = space
  • \t = tab

Results

+-------------------------------+---------------+--------------+
|         Input string          | Desired Match | Actual Match |
+-------------------------------+---------------+--------------+
| Some text                     | yes           | yes          |
| Some textr1                   | yes           | yes          |
| Some text default(r3)         | yes           | NO           |
| Some text default(abc r3)     | yes           | NO           |
| Some text default(r3 xyz)     | yes           | NO           |
| Some text default(abc r3 xyz) | yes           | NO           |
| Some text r12 default(r3)     | no            | no           |
| Some text r1                  | no            | no           |
| Some r1 text                  | no            | no           |
| \sR12 Some text               | no            | no           |
| Some text r1 somethingElse    | no            | no           |
| R1                            | no            | YES          |
| \s\sR2                        | no            | no           |
| R3\s\s                        | no            | YES          |
| \tr4                          | no            | no           |
| \t\sR5\t                      | no            | no           |
+-------------------------------+---------------+--------------+
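
The "Actual Match" column can be reproduced with a short script. This is only a sketch: it assumes the pattern is run case-insensitively (the table rejects "R12" even though the pattern has a lowercase "r") and that \s and \t in the table stand for a real space and a real tab. The helper name `asker_matches` is made up for this illustration. It shows the two failure modes: the lookahead is only tested after each consumed character, so it is never tested at the very start of the string, and nothing in the pattern knows about parentheses.

```python
import re

# Pattern from the question. IGNORECASE is an assumption inferred from the table.
ASKER_PATTERN = re.compile(r"^(.(?!\b(?:r)\d*\b))*$", re.IGNORECASE)


def asker_matches(text):
    """Return True if the question's pattern matches the whole string."""
    return ASKER_PATTERN.match(text) is not None


samples = [
    "Some text",
    "Some textr1",
    "Some text default(r3)",  # desired: match, but the lookahead sees r3
    "Some text r1",
    "R1",                      # desired: no match, but position 0 is never checked
    "R3  ",
    "\tr4",
]

for text in samples:
    print(f"{text!r:28} {'yes' if asker_matches(text) else 'no'}")
```

Moving the lookahead in front of the dot, as in `^((?!\b(?:r)\d*\b).)*$`, would also test position 0, but it still would not exempt an R# inside parentheses.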

Can anyone provide a working regex?

Mike V.

1 Answer:

Answer 0: (score: 4)

You can use this pattern:

(?i)^(?>[^r(]++|(?<!\\[ts])\Br|r(?![0-9])|(\((?>[^()]++|(?1))*\))|\()++$

Pattern details:

(?i)                  # modifier: case insensitive
^                     # anchor: beginning of the string
(?>                   # open an atomic group
    [^r(]++           # all characters except r and opening parenthesis
  |                   # OR
    (?<!\\[ts])\Br    # r without word boundary and not preceded by \t or \s
  |                   # OR
    r(?![0-9])        # r (with word boundary or preceded by \t or \s) not followed by a digit
  |                   # OR
    (                 # (nested or not parenthesis): open the capture group n°1
        \(            # literal: (
        (?>           # open an atomic group
            [^()]++   # all characters except parenthesis
          |           # OR
            (?1)      # (recursion): repeat the subpattern of the capture group n°1
        )*            # repeat the atomic group (the last) zero or more times
        \)            # literal: )
    )                 # close the first capturing group
  |                   # OR
    \(                # for possible isolated opening parenthesis
)++                   # repeat the first atomic group one or more times
$                     # anchor: end of the string

Note: if the \t and \s in your post are not literal, you can remove (?<!\\[ts])
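
The pattern above relies on recursion with (?1), which Python's built-in `re` module does not support. For readers working in such an engine, here is a sketch of a different technique that aims at the same "Desired Match" column: first remove balanced parenthesized groups, then look for a leftover R# word. It is not a translation of the pattern above. The function names are made up, and it assumes that \s and \t in the table mean real whitespace and that "at a word boundary" means a boundary on both sides of the R#.

```python
import re

# An "r" at the start of a word, followed by one or more digits, ending the word.
FORBIDDEN = re.compile(r"\br\d+\b", re.IGNORECASE)

# A parenthesized group that contains no further parentheses.
INNERMOST_PARENS = re.compile(r"\([^()]*\)")


def strip_parenthesized(text):
    """Remove balanced (...) groups, innermost first, until none are left.

    Each group is replaced by a space so that the text on either side
    is not glued together into a new word.
    """
    previous = None
    while previous != text:
        previous = text
        text = INNERMOST_PARENS.sub(" ", text)
    return text


def is_acceptable(text):
    """Return True if no R# remains outside balanced parentheses."""
    return FORBIDDEN.search(strip_parenthesized(text)) is None


for sample in ["Some text default(abc r3 xyz)", "Some text r12 default(r3)", "R1"]:
    print(f"{sample!r:34} {'yes' if is_acceptable(sample) else 'no'}")
```

An unbalanced opening parenthesis is left in place, so an R# after it is still rejected. This resembles the isolated `\(` branch of the pattern above, although edge cases may differ.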
