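I am trying to group two sub-sentences of any reasonable length that are separated by a specific word ("AND" in the example), where the second one may be optional. Some examples: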
CASE1:
foo sentence A AND foo sentence B
should give
"foo sentence A" --> matching group 1
"AND" --> matching group 2 (optionally)
"foo sentence B" --> matching group 3
CASE2:
foo sentence A
should give
"foo sentence A" --> matching group 1
"" --> matching group 2 (optionally)
"" --> matching group 3
I tried the following regex
(.*) (AND (.*))?$
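and it works, but only if in CASE2 I put a space at the very end of the string; otherwise the pattern does not match. If I instead include the space before "AND" inside the parenthesised group, then in CASE1 the matcher puts the whole string into the first group. I wondered about lookahead and lookbehind assertions, but I'm not sure they would help me. Any suggestions? Thanks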
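The behaviour described in the question can be reproduced with a small test. This is only a sketch: the class name `OriginalRegexDemo` and the helper `firstGroup` are made up for illustration and are not part of the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OriginalRegexDemo {
    // The pattern from the question: group 1 must be followed by a literal space.
    static final Pattern ORIGINAL = Pattern.compile("(.*) (AND (.*))?$");

    // Returns group 1 if the pattern is found, or null if nothing matches.
    static String firstGroup(String input) {
        Matcher m = ORIGINAL.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // CASE1: the separator is present.
        System.out.println(firstGroup("foo sentence A AND foo sentence B"));
        // CASE2 without a trailing space: no match, so null is printed.
        System.out.println(firstGroup("foo sentence A"));
        // CASE2 with a trailing space: the mandatory space can now be matched.
        System.out.println(firstGroup("foo sentence A "));
    }
}
```

The mandatory space after the first group is what makes CASE2 fail: without "AND", nothing in the input can satisfy it unless the string happens to end in a space.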
Answer 0: (score: 2)
How about using
String split[] = sentence.split("AND");
This splits the sentence on your word and gives you an array of the sub-parts.
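A minimal sketch of the split approach, assuming the pieces should also be trimmed (the answer itself does not say so). The names `SplitDemo` and `splitAndTrim` are hypothetical.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits on the literal separator "AND" and trims the whitespace
    // that the plain split leaves around each piece.
    static String[] splitAndTrim(String sentence) {
        String[] parts = sentence.split("AND");
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAndTrim("foo sentence A AND foo sentence B")));
        System.out.println(Arrays.toString(splitAndTrim("foo sentence A")));
    }
}
```

Note that a bare "AND" also splits inside longer words: "BRAND new" would come back as two pieces. The separator itself is not kept in the result.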
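Answer 1: (score: 2)
This regex returns the requested parts of the string in the requested groups. The "and" is optional; if it is not found in the string, the entire string is placed in group 1. All the \s*? are intended to make the captured groups trim their surrounding whitespace automatically.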
^\s*?\b(.*?)\b\s*?(?:\b(and)\b\s*?\b(.*?)\b\s*?)?$
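Group 0 gets the entire matched string
Group 1 gets the string before the "and"; if there is no "and", the entire string appears here
Group 2 gets the "and"
Group 3 gets the string after the "and"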
Case 1
import java.util.regex.Pattern;
import java.util.regex.Matcher;
class Module1{
    public static void main(String[] asd){
        String sourcestring = "foo sentence A AND foo sentence B";
        Pattern re = Pattern.compile("^\\s*?\\b(.*?)\\b\\s*?(?:\\b(and)\\b\\s*?\\b(.*?)\\b\\s*?)?$",Pattern.CASE_INSENSITIVE);
        Matcher m = re.matcher(sourcestring);
        if(m.find()){
            for( int groupIdx = 0; groupIdx < m.groupCount()+1; groupIdx++ ){
                System.out.println( "[" + groupIdx + "] = " + m.group(groupIdx));
            }
        }
    }
}
$matches Array:
(
[0] => foo sentence A AND foo sentence B
[1] => foo sentence A
[2] => AND
[3] => foo sentence B
)
Case 2, using the same regex
import java.util.regex.Pattern;
import java.util.regex.Matcher;
class Module1{
    public static void main(String[] asd){
        String sourcestring = "foo sentence A";
        Pattern re = Pattern.compile("^\\s*?\\b(.*?)\\b\\s*?(?:\\b(and)\\b\\s*?\\b(.*?)\\b\\s*?)?$",Pattern.CASE_INSENSITIVE);
        Matcher m = re.matcher(sourcestring);
        if(m.find()){
            for( int groupIdx = 0; groupIdx < m.groupCount()+1; groupIdx++ ){
                System.out.println( "[" + groupIdx + "] = " + m.group(groupIdx));
            }
        }
    }
}
$matches Array:
(
[0] => foo sentence A
[1] => foo sentence A
)
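The question asks for empty strings in groups 2 and 3 when there is no "AND", whereas in Java `Matcher.group` returns null for a group that did not take part in the match. A possible wrapper around the same pattern is sketched below; the class `GroupDefaults` and the method `parts` are assumptions for illustration. It also calls `trim()` defensively, in case a group still carries surrounding whitespace.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDefaults {
    // Same pattern as in the answer, compiled once.
    static final Pattern RE = Pattern.compile(
        "^\\s*?\\b(.*?)\\b\\s*?(?:\\b(and)\\b\\s*?\\b(.*?)\\b\\s*?)?$",
        Pattern.CASE_INSENSITIVE);

    // Returns {group 1, group 2, group 3}; groups that did not
    // participate in the match are reported as "" instead of null.
    static String[] parts(String input) {
        String[] out = {"", "", ""};
        Matcher m = RE.matcher(input);
        if (m.find()) {
            for (int i = 1; i <= 3; i++) {
                String g = m.group(i);
                out[i - 1] = (g == null) ? "" : g.trim();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("foo sentence A AND foo sentence B")));
        System.out.println(Arrays.toString(parts("foo sentence A")));
    }
}
```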
Answer 2: (score: 2)
I would use this regex:
^(.*?)(?: (AND) (.*))?$
Explanation
The regular expression:
(?-imsx:^(.*?)(?: (AND) (.*))?$)
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      AND                      'AND'
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
                             ' '
----------------------------------------------------------------------
    (                        group and capture to \3:
----------------------------------------------------------------------
      .*                       any character except \n (0 or more
                               times (matching the most amount
                               possible))
----------------------------------------------------------------------
    )                        end of \3
----------------------------------------------------------------------
  )?                       end of grouping
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
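To see how this pattern behaves in Java, here is a short sketch. The class `LazyRegexDemo` and the method `groups` are illustrative names only; the pattern string is the one given in this answer.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyRegexDemo {
    // Reluctant first group, then an optional " AND " followed by the rest.
    static final Pattern RE = Pattern.compile("^(.*?)(?: (AND) (.*))?$");

    // Returns groups 1..3, or null if the input does not match at all.
    // Groups inside the optional part are null when "AND" is absent.
    static String[] groups(String input) {
        Matcher m = RE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(groups("foo sentence A AND foo sentence B")));
        System.out.println(Arrays.toString(groups("foo sentence A")));
    }
}
```

Because the first group is reluctant, it stops at the first " AND " it can find; when there is none, it grows to the end of the string and groups 2 and 3 stay null.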
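Answer 3: (score: 0)
Change your regex so that the space after the first sentence is optional, and make the first group reluctant so that it does not swallow the whole string:
(.*?\\S) ?(AND (.*))?$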
Or you can use split() to consume the AND together with any surrounding whitespace:
String[] sentences = sentence.split("\\s*AND\\s*");
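A runnable sketch of the whitespace-consuming split. The limit argument of 2 is an assumption added here so that only the first "AND" acts as the separator; the answer itself does not use it. `WhitespaceSplitDemo` is a made-up class name.

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Splits on "AND" plus any whitespace around it, at most once,
    // so the result has one or two elements and needs no trimming.
    static String[] split(String sentence) {
        return sentence.split("\\s*AND\\s*", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("foo sentence A AND foo sentence B")));
        System.out.println(Arrays.toString(split("foo sentence A")));
    }
}
```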
Answer 4: (score: 0)
Your case 2 is a bit odd...
But I would do it like this:
String[] parts = sentence.split("(?<=AND)|(?=AND)");
Then check parts.length. If length == 1, it is case 2: the array contains only the sentence, and you can add empty strings as your "group 2/3".
In case 1 you can use parts directly:
[foo sentence A , AND, foo sentence B]
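The length check and padding described in this answer could look roughly like the following. It is a sketch under assumptions: the class `LookaroundSplitDemo`, the method `threeParts`, and the extra `trim()` call are not from the answer.

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Splits just before and just after "AND", so the separator is kept
    // as its own element, then trims the pieces and pads to three entries.
    static String[] threeParts(String sentence) {
        String[] parts = sentence.split("(?<=AND)|(?=AND)");
        String[] out = {"", "", ""};
        for (int i = 0; i < parts.length && i < out.length; i++) {
            out[i] = parts[i].trim();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(threeParts("foo sentence A AND foo sentence B")));
        System.out.println(Arrays.toString(threeParts("foo sentence A")));
    }
}
```

The lookbehind and lookahead match positions rather than characters, which is why "AND" survives the split instead of being consumed.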