Java regex pattern matching

Time: 2012-06-12 08:55:17

Tags: java regex

I need to check a pattern against some text (I have to check whether my pattern occurs in a number of texts).

Here is my example:

String pattern = "^[a-zA-Z ]*toto win(\\W)*[a-zA-Z ]*$";    
if("toto win because of".matches(pattern))
 System.out.println("we have a winner");
else
 System.out.println("we DON'T have a winner");

In my tests the pattern must match, but with my regex it does not. These must match:

" toto win bla bla"

"toto win because of"
"toto win. bla bla"


"here. toto win. bla bla"
"here? toto win. bla bla"

"here %dfddfd . toto win. bla bla"

These must not match:

" -toto win bla bla"
" pretoto win bla bla"

I tried with my regex, but it does not work.

Can you point out what I am doing wrong?

5 answers:

Answer 0 (score: 1)

Just change your code to use String pattern = "\\s*toto win[\\w\\s]*";

\W matches a non-word character; \w matches a word character (a-zA-Z_0-9).

[\\w\\s]* matches any number of word characters and whitespace characters after "toto win".
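To see how this first pattern behaves with whole-input matching (which is what `String.matches` does), here is a minimal sketch. The class name `FirstPatternCheck` and the helper `matchesWhole` are made up for illustration; the sample strings come from the question.

```java
import java.util.regex.Pattern;

class FirstPatternCheck {
    // Pattern from this answer, written as a Java string literal.
    static final Pattern PATTERN = Pattern.compile("\\s*toto win[\\w\\s]*");

    // Hypothetical helper: true only if the entire input matches, like String.matches.
    static boolean matchesWhole(String input) {
        return PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            " toto win bla bla",
            "toto win because of",
            "toto win. bla bla",
            " -toto win bla bla"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + matchesWhole(s));
        }
    }
}
```

Since `[\\w\\s]*` contains no punctuation, "toto win. bla bla" is rejected by this version, as is " -toto win bla bla".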

Update

To reflect your new requirements, this expression will work:

"((.*\\s)+|^)toto win[\\w\\s\\p{Punct}]*"

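((.*\\s)+|^) matches either anything followed by at least one whitespace character, or the start of the line.

[\\w\\s\\p{Punct}]* matches any combination of word characters (letters, digits, underscore), whitespace, and punctuation.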
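As a rough check of the updated expression against every sample in the question, the following sketch runs it through `String.matches`. The class name `UpdatedPatternCheck` and the helper `matchesWhole` are hypothetical names chosen for this example.

```java
class UpdatedPatternCheck {
    // Updated pattern from this answer, as a Java string literal.
    static final String PATTERN = "((.*\\s)+|^)toto win[\\w\\s\\p{Punct}]*";

    // Hypothetical helper: String.matches succeeds only if the whole input matches.
    static boolean matchesWhole(String input) {
        return input.matches(PATTERN);
    }

    public static void main(String[] args) {
        String[] shouldMatch = {
            " toto win bla bla",
            "toto win because of",
            "toto win. bla bla",
            "here. toto win. bla bla",
            "here? toto win. bla bla",
            "here %dfddfd . toto win. bla bla"
        };
        String[] shouldNotMatch = {
            " -toto win bla bla",
            " pretoto win bla bla"
        };
        for (String s : shouldMatch) {
            System.out.println("expected true:  " + matchesWhole(s));
        }
        for (String s : shouldNotMatch) {
            System.out.println("expected false: " + matchesWhole(s));
        }
    }
}
```

One thing to keep in mind: a nested quantifier such as `(.*\\s)+` can backtrack heavily on long inputs that fail to match, so this form is probably best kept to short strings like the ones here.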

Answer 1 (score: 1)

This works:

(?im)^[?.\s%a-z]*?\btoto win\b.+$
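A minimal sketch of using this pattern from Java follows. It assumes `Matcher.find()` is the intended call, and it passes `Pattern.CASE_INSENSITIVE | Pattern.MULTILINE` explicitly, which is equivalent to the inline `(?im)` prefix. The class name `InlineFlagsCheck` and the helper `found` are made up for illustration.

```java
import java.util.regex.Pattern;

class InlineFlagsCheck {
    // Same regex as above without the leading (?im); the flags are passed explicitly instead.
    static final Pattern PATTERN = Pattern.compile(
            "^[?.\\s%a-z]*?\\btoto win\\b.+$",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    // Hypothetical helper: true if the pattern matches somewhere in the input.
    static boolean found(String input) {
        return PATTERN.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            " toto win bla bla",
            "here %dfddfd . toto win. bla bla",
            "nothing here\nTOTO WIN again",   // exercises the case-insensitive flag
            " -toto win bla bla",
            " pretoto win bla bla",
            "toto win"                          // nothing after "win", so .+ cannot match
        };
        for (String s : samples) {
            System.out.println(found(s) + " <- " + s.replace("\n", "\\n"));
        }
    }
}
```

Because the pattern ends in `.+`, an input that stops right after "toto win" is not matched; the later updates use `.*` instead.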

Explanation

"(?im)" +         // Match the remainder of the regex with the options: case insensitive (i); ^ and $ match at line breaks (m)
"^" +             // Assert position at the beginning of a line (at beginning of the string or after a line break character)
"[?.\\s%a-z]" +    // Match a single character present in the list below
                     // One of the characters “?.”
                     // A whitespace character (spaces, tabs, and line breaks)
                     // The character “%”
                     // A character in the range between “a” and “z”
   "*?" +            // Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
"\\b" +            // Assert position at a word boundary
"toto\\ win" +     // Match the characters “toto win” literally
"\\b" +            // Assert position at a word boundary
"." +             // Match any single character that is not a line break character
   "+" +             // Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"$"               // Assert position at the end of a line (at the end of the string or before a line break character)

Update 1

(?im)^[?~`'!@#$%^&*+.\s%a-z]*? toto win\b.*$

Update 2

(?im)^[^-]*?\btoto win\b.*$

Update 3

(?im)^.*?(?<!-)toto win\b.*$

Explanation

"(?im)" +       // Match the remainder of the regex with the options: case insensitive (i); ^ and $ match at line breaks (m)
"^" +           // Assert position at the beginning of a line (at beginning of the string or after a line break character)
"." +           // Match any single character that is not a line break character
   "*?" +          // Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
"(?<!" +        // Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
   "-" +           // Match the character “-” literally
")" +
"toto\\ win" +   // Match the characters “toto win” literally
"\\b" +          // Assert position at a word boundary
"." +           // Match any single character that is not a line break character
   "*" +           // Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
"$"             // Assert position at the end of a line (at the end of the string or before a line break character)

The regex needs to be escaped (every backslash doubled) before it can be used as a string literal in Java code.
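To illustrate the escaping, here is a sketch with Update 2 and Update 3 written as Java string literals: each `\b` in the raw regex becomes `\\b` in source code. The class name `EscapedPatterns` and the helper `found` are hypothetical, and `Matcher.find()` is an assumption about how the patterns are meant to be applied.

```java
import java.util.regex.Pattern;

class EscapedPatterns {
    // Raw regex: (?im)^[^-]*?\btoto win\b.*$
    static final Pattern UPDATE_2 = Pattern.compile("(?im)^[^-]*?\\btoto win\\b.*$");

    // Raw regex: (?im)^.*?(?<!-)toto win\b.*$
    static final Pattern UPDATE_3 = Pattern.compile("(?im)^.*?(?<!-)toto win\\b.*$");

    // Hypothetical helper: true if the pattern matches somewhere in the input.
    static boolean found(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "here. toto win. bla bla",
            " -toto win bla bla",
            " pretoto win bla bla"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\""
                    + " update2=" + found(UPDATE_2, s)
                    + " update3=" + found(UPDATE_3, s));
        }
    }
}
```

In this sketch both versions reject " -toto win bla bla", but Update 3 accepts " pretoto win bla bla": its lookbehind only rules out a preceding hyphen, whereas Update 2 keeps the `\b` in front of "toto". That is worth checking against the actual requirements.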

Answer 2 (score: 0)

You are missing the whitespace between "win" and the next word in your pattern.

Try this: \\stoto\\swin\\s\\w

You can try out your regex here: http://gskinner.com/RegExr/
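The answer does not say how the pattern should be applied, so the sketch below assumes `Matcher.find()` (with `String.matches` it would have to cover the whole input). The class name `WhitespacePatternCheck` and the helper `found` are invented for this example.

```java
import java.util.regex.Pattern;

class WhitespacePatternCheck {
    // Pattern from this answer: whitespace, "toto", whitespace, "win", whitespace, one word character.
    static final Pattern PATTERN = Pattern.compile("\\stoto\\swin\\s\\w");

    // Hypothetical helper: true if the pattern matches somewhere in the input.
    static boolean found(String input) {
        return PATTERN.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            " toto win bla bla",
            "toto win because of",   // no whitespace before "toto"
            "toto win. bla bla",     // "." follows "win" instead of whitespace
            " -toto win bla bla",
            " pretoto win bla bla"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + found(s));
        }
    }
}
```

Under that assumption the pattern rejects both negative samples, but it also misses "toto win because of" and "toto win. bla bla", because it insists on whitespace before "toto" and directly after "win".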

Answer 3 (score: 0)

The following regex

^[a-zA-Z. ]*toto win[a-zA-Z. ]*$

will match

 toto win bla bla
toto win because of
toto win. bla bla

and will not match

-toto win bla bla
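The claims above can be checked with a short sketch using `String.matches`. The class name `LetterDotSpaceCheck` and the helper `matchesWhole` are hypothetical; the last two inputs are taken from the question rather than from this answer.

```java
class LetterDotSpaceCheck {
    // Regex from this answer; it contains no backslashes, so no extra escaping is needed.
    static final String PATTERN = "^[a-zA-Z. ]*toto win[a-zA-Z. ]*$";

    // Hypothetical helper: String.matches succeeds only if the whole input matches.
    static boolean matchesWhole(String input) {
        return input.matches(PATTERN);
    }

    public static void main(String[] args) {
        String[] samples = {
            " toto win bla bla",
            "toto win because of",
            "toto win. bla bla",
            "-toto win bla bla",
            "here? toto win. bla bla",   // from the question: "?" is not in the character class
            " pretoto win bla bla"       // from the question: letters before "toto" are allowed
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + matchesWhole(s));
        }
    }
}
```

The four inputs listed in the answer behave as described. Note, though, that this regex rejects "here? toto win. bla bla" and accepts " pretoto win bla bla", which is the opposite of what the question asks for in both cases.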

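Answer 4 (score: 0)

It would be easier if you included the actual requirements instead of a list of things that should (not) match. I strongly suspect that "toto winabc" should not match, but I can't be sure, because you neither included such an example nor explained the requirements. Anyway, this works for all of your current examples: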

import java.util.regex.Pattern;

// Wrapper class (the name is arbitrary) so that the snippet compiles on its own.
public class TotoWinTest {

    // Inputs from the question that must match.
    static String[] matchThese = new String[] {
            " toto win bla bla",
            "toto win because of",
            "toto win. bla bla",
            "here. toto win. bla bla",
            "here? toto win. bla bla",
            "here %dfddfd . toto win. bla bla"
    };

    // Inputs from the question that must not match.
    static String[] dontMatchThese = new String[] {
            " -toto win bla bla",
            " pretoto win bla bla"
    };

    public static void main(String[] args) {
        // either the start of the input or a whitespace character, followed by "toto win"
        Pattern p = Pattern.compile("(^|\\s)toto win");

        System.out.println("Should match:");
        for (String s : matchThese) {
            // find() looks for the pattern anywhere in the input
            System.out.println(p.matcher(s).find());
        }

        System.out.println("Shouldn't match:");
        for (String s : dontMatchThese) {
            System.out.println(p.matcher(s).find());
        }
    }
}
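Two details of this approach may be worth spelling out: `find()` searches for the pattern anywhere in the input, while `matches()` requires the pattern to cover the entire input. The sketch below also adds a trailing `\b` to reject "toto winabc"; that variant is an assumption based on the suspicion voiced above, not a stated requirement. The class name `FindVersusMatches` and its helpers are hypothetical.

```java
import java.util.regex.Pattern;

class FindVersusMatches {
    // Pattern from this answer.
    static final Pattern ORIGINAL = Pattern.compile("(^|\\s)toto win");

    // Assumed variant: a trailing word boundary so that "toto winabc" is rejected.
    static final Pattern WITH_BOUNDARY = Pattern.compile("(^|\\s)toto win\\b");

    // True if the pattern occurs anywhere in the input.
    static boolean found(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    // True only if the pattern covers the entire input.
    static boolean matchesWhole(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "toto win because of";
        System.out.println("find():    " + found(ORIGINAL, input));
        System.out.println("matches(): " + matchesWhole(ORIGINAL, input));

        System.out.println("original,      toto winabc: " + found(ORIGINAL, "toto winabc"));
        System.out.println("with boundary, toto winabc: " + found(WITH_BOUNDARY, "toto winabc"));
        System.out.println("with boundary, toto win. bla bla: "
                + found(WITH_BOUNDARY, "toto win. bla bla"));
    }
}
```

With `find()` there is no need to describe the text before and after "toto win", which is why this pattern can stay so short.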