Match the smallest possible group with a Java regex

Asked: 2015-11-17 05:23:35

Tags: java regex

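I'm trying to figure out how to make this regex work the way I need it to. Basically I have a bunch of songs and their lyrics. I loop through each song's lyrics to see whether they match the search phrase I'm looking for, and I return the length of the matched string as a kind of match rating.

For example, here is part of the lyrics of one song: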

 "you gave up the love you got & that is that
 she loves me now she loves you not and that where its at"

I'm using this regex to find a match:

 (?mi)(\bShe\b).*(\bloves\b).*(\byou\b)

However, it captures this group:

 "she loves me now she loves you"

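I want to capture the smallest possible group, which is just "she loves you".

How can I capture the smallest group?

Below is some of my code. I take the phrase and split it into an array of words, then check that the lyrics contain each word; if one is missing we can bail out early. Then I build the string that will become the regex.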

 static int rankPhrase(String lyrics, String lyricsPhrase){
    //This takes in song lyrics and the phrase we are searching for

    //Split the phrase up into separate words
    String[] phrase = lyricsPhrase.split("[^a-zA-Z]+");

    //Start to build the regex
    StringBuilder regex = new StringBuilder("(?im)"+"(\\" + "b" + phrase[0] + "\\b)");

    //loop through each word in the phrase
    for(int i = 1; i < phrase.length; i++){

        //Check to see if this word exists in the lyrics first
        if(lyrics.contains(phrase[i])){

            //add this to the regex we will search for
            regex.append(".*(\\b" + phrase[i] + "\\b)");

        }else{
            //if the word isn't found, return a rank of
            //-1, meaning the song doesn't contain the phrase
            return -1;
        }

    }

    //Create the pattern
    Pattern p = Pattern.compile(regex.toString());
    Matcher m = p.matcher(lyrics);


    //Check to see if it can find a match
    if(m.find()){

        //Store this match in a string
        String match = m.group();

3 answers:

Answer 0: (score: 2)

(\bShe\b)(?:(?!\b(?:she|loves|you)\b).)*(\bloves\b)(?:(?!\b(?:she|loves|you)\b).)*(\byou\b)

You can use a lookahead here. See the demo.

https://regex101.com/r/hE4jH0/11

For Java, use

(\\bShe\\b)(?:(?!\\b(?:she|loves|you)\\b).)*(\\bloves\\b)(?:(?!\\b(?:she|loves|you)\\b).)*(\\byou\\b)
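To connect this with the question's code, here is a minimal sketch of how the same lookahead-guarded pattern could be built for any list of words. The class name `TemperedMatch` and the helpers `buildPattern` and `firstMatch` are made up for illustration, and the `CASE_INSENSITIVE` flag is an assumption carried over from the `(?mi)` in the question so that `She` also matches `she`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemperedMatch {

    // Hypothetical helper: builds word1 GAP word2 GAP word3 ... where GAP
    // may not step onto the start of any of the phrase words.
    static Pattern buildPattern(String... words) {
        // Alternation of all phrase words, e.g. she|loves|you (quoted as literals).
        StringBuilder alt = new StringBuilder();
        for (String w : words) {
            if (alt.length() > 0) {
                alt.append('|');
            }
            alt.append(Pattern.quote(w));
        }
        // Consume one character at a time, but only while no phrase word starts here.
        String gap = "(?:(?!\\b(?:" + alt + ")\\b).)*";

        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                regex.append(gap);
            }
            regex.append("(\\b").append(Pattern.quote(words[i])).append("\\b)");
        }
        // Assumption: case-insensitive matching, as in the question's (?mi).
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstMatch(String lyrics, String... words) {
        Matcher m = buildPattern(words).matcher(lyrics);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String lyrics = "you gave up the love you got & that is that\n"
                + "she loves me now she loves you not and that where its at";
        System.out.println(firstMatch(lyrics, "She", "loves", "you")); // she loves you
    }
}
```

Because the gap refuses to cross another occurrence of `she`, `loves` or `you`, the attempt starting at the first `she` fails and the match comes from the second one.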

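Answer 1: (score: 1)

Java's regex matcher only works in the forward direction. What you need to do is iterate over the set of all matches found and pick the shortest one.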
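One possible way to do that iteration, as a rough sketch rather than the only approach: wrap the phrase in a lookahead so that `find()` consumes nothing and every start position is tried, use reluctant `.*?` gaps so each start yields its earliest-ending candidate, and keep the shortest. The class name `ShortestMatch` and the helper `shortest` are hypothetical, and case-insensitive matching is assumed as in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShortestMatch {

    // Hypothetical helper: shortest span on one line that contains the
    // words in order, or null if no such span exists.
    static String shortest(String lyrics, String... words) {
        StringBuilder inner = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                inner.append(".*?"); // reluctant gap between consecutive words
            }
            inner.append("\\b").append(Pattern.quote(words[i])).append("\\b");
        }
        // The lookahead matches zero characters, so after each hit find()
        // moves on by one position and overlapping candidates are all seen.
        Pattern p = Pattern.compile("(?=(" + inner + "))", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(lyrics);

        String best = null;
        while (m.find()) {
            String candidate = m.group(1);
            if (best == null || candidate.length() < best.length()) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String lyrics = "you gave up the love you got & that is that\n"
                + "she loves me now she loves you not and that where its at";
        System.out.println(shortest(lyrics, "she", "loves", "you")); // she loves you
    }
}
```

On the lyrics from the question there are two candidates, "she loves me now she loves you" and "she loves you", and the loop keeps the second.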

Answer 2: (score: 0)

Here you need to use a negative lookahead:

Pattern.compile("\\bShe\\b(?:(?!\\bshe\\b).)*?\\bloves\\b(?:(?!\\b(?:you|loves)\\b).)*\\byou\\b");
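As a small sketch of how this pattern could feed the length-based rating described in the question: the class name `RankWithLookahead` and the method `rank` are made up for illustration, and the `CASE_INSENSITIVE` flag is an added assumption, since without it `She` would not match the lowercase `she` in the sample lyrics.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RankWithLookahead {

    // Pattern from the answer above; the flag is an assumption (see lead-in).
    static final Pattern P = Pattern.compile(
            "\\bShe\\b(?:(?!\\bshe\\b).)*?\\bloves\\b(?:(?!\\b(?:you|loves)\\b).)*\\byou\\b",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: length of the first match, or -1 if there is none.
    static int rank(String lyrics) {
        Matcher m = P.matcher(lyrics);
        return m.find() ? m.group().length() : -1;
    }

    public static void main(String[] args) {
        String lyrics = "you gave up the love you got & that is that\n"
                + "she loves me now she loves you not and that where its at";
        System.out.println(rank(lyrics)); // 13, the length of "she loves you"
    }
}
```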