Splitting a string with a Java regular expression

Time: 2014-10-03 16:52:33

Tags: java regex

I have a string:

Snt:It was the most widespread day of environmental action in the planet's history
====================
-----------
Snt:Five years ago, I was working for just over minimum wage
====================
-----------

I want to split the string on this two-line delimiter:

====================
-----------

and also remove the Snt: prefix from the start of each sentence. What is the best way to do this?

I tried this regex, but it did not work:

String[] content1 =content.split("\\n\\====================\\n\\-----------\\n");

Thanks in advance.

4 answers:

Answer 0: (score: 3)

How about this?
Pattern p = Pattern.compile("^Snt:(.*)$", Pattern.MULTILINE);
Matcher m = p.matcher(str);

while (m.find()) {
    String sentence = m.group(1);
}

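Rather than attacking the problem with split and then doing extra parsing, this simply finds each line that starts with "Snt:" and captures whatever follows it.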
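For completeness, here is a minimal self-contained sketch of that idea. The class name SntLines and the helper extract are made up for illustration, and the input is the sample string from the question, assumed to use \n line endings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; only the pattern comes from the answer above.
public class SntLines {
    // With MULTILINE, ^ and $ match at the start and end of every line,
    // so the delimiter lines are simply skipped because they lack the prefix.
    private static final Pattern SNT_LINE = Pattern.compile("^Snt:(.*)$", Pattern.MULTILINE);

    // Collects the text after "Snt:" on each line that starts with it.
    public static List<String> extract(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SNT_LINE.matcher(text);
        while (m.find()) {
            sentences.add(m.group(1));
        }
        return sentences;
    }

    public static void main(String[] args) {
        String str = "Snt:It was the most widespread day of environmental action in the planet's history\n"
                + "====================\n"
                + "-----------\n"
                + "Snt:Five years ago, I was working for just over minimum wage\n"
                + "====================\n"
                + "-----------";
        // Prints a two-element list holding both sentences without the prefix.
        System.out.println(extract(str));
    }
}
```

Note that this variant never looks at the delimiter lines at all, so it would also pick up a stray "Snt:" line that is not followed by a delimiter.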

Answer 1: (score: 2)

Because of the way the data is structured, I would flip the concept from a split to a matcher. That lets you match cleanly on Snt:

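import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it is only here so the snippet compiles.
public class SntMatcher {
    private static final String VAL = "Snt:It was the most widespread day of environmental action in the planet's history\n"
            + "====================\n"
            + "-----------\n"
            + "Snt:Five years ago, I was working for just over minimum wage\n"
            + "====================\n"
            + "-----------";

    public static void main(String[] args) {
        List<String> phrases = new ArrayList<String>();
        // One match per record: prefix, sentence, both delimiter lines, trailing whitespace.
        Matcher mat = Pattern.compile("Snt:(.+?)\n={20}\n-{11}\\s*").matcher(VAL);
        while (mat.find()) {
            phrases.add(mat.group(1));
        }

        System.out.printf("Value: %s%n", phrases);
    }
}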

I used the regex: "Snt:(.+?)\n={20}\n-{11}\\s*"

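This assumes that the first word in the file is Snt:. It then captures the phrase that follows, up to the delimiter, and consumes any trailing whitespace, which leaves the matcher positioned at the start of the next record.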

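The advantage of this approach is that each match corresponds to exactly one record, instead of the expression matching the tail of one record and possibly the beginning of the next.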
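To make the pieces of that expression easier to follow, the same regex can be written with Pattern.COMMENTS, which ignores whitespace and allows # comments inside the pattern. This is only an annotated sketch of the expression above; the class name AnnotatedRecordPattern and the method extract are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class; the pattern is the one from the answer, just spread over lines.
public class AnnotatedRecordPattern {
    // In COMMENTS mode, whitespace in the pattern is ignored and '#' starts a comment,
    // so each part of the expression can be labelled.
    private static final Pattern RECORD = Pattern.compile(
            "Snt:      # literal record prefix\n"
          + "(.+?)     # group 1: the sentence; '.' does not cross line breaks\n"
          + "\\n={20}  # a newline, then the line of 20 '=' characters\n"
          + "\\n-{11}  # a newline, then the line of 11 '-' characters\n"
          + "\\s*      # trailing whitespace, including the newline before the next record\n",
            Pattern.COMMENTS);

    // Returns the captured sentence of every complete record in the input.
    public static List<String> extract(String text) {
        List<String> phrases = new ArrayList<>();
        Matcher mat = RECORD.matcher(text);
        while (mat.find()) {
            phrases.add(mat.group(1));
        }
        return phrases;
    }

    public static void main(String[] args) {
        String val = "Snt:It was the most widespread day of environmental action in the planet's history\n"
                + "====================\n"
                + "-----------\n"
                + "Snt:Five years ago, I was working for just over minimum wage\n"
                + "====================\n"
                + "-----------";
        System.out.println(extract(val));
    }
}
```

Because \\s* also matches zero characters, the last record is matched even when the input has no newline after the final delimiter line.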

Answer 2: (score: 1)

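Because there is no newline after the last line, your regex does not match the final ==/-- delimiter. You need to add the end-of-line anchor $ as an alternative to the trailing \n in the regex.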

String s = "Snt:It was the most widespread day of environmental action in the planet's history\n" +
"====================\n" +
"-----------\n" +
"Snt:Five years ago, I was working for just over minimum wage\n" +
"====================\n" +
"-----------";
String m = s.replaceAll("(?m)^Snt:", "");
String[] tok = m.split("\\n\\====================\\n\\-----------(?:\\n|$)");
System.out.println(Arrays.toString(tok));

Output:

[It was the most widespread day of environmental action in the planet's history, Five years ago, I was working for just over minimum wage]
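To see what the missing final newline does, the following sketch runs the question's split expression and the corrected one side by side on the same input. It assumes, as this answer does, that the string does not end with a newline; the class name SplitComparison and the constant names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical comparison harness; both regexes are copied from the thread.
public class SplitComparison {
    // Regex from the question: requires a newline after the dashes.
    public static final String ORIGINAL = "\\n\\====================\\n\\-----------\\n";
    // Regex from this answer: accepts either a newline or the end of input.
    public static final String WITH_END = "\\n\\====================\\n\\-----------(?:\\n|$)";

    public static void main(String[] args) {
        String s = "Snt:It was the most widespread day of environmental action in the planet's history\n"
                + "====================\n"
                + "-----------\n"
                + "Snt:Five years ago, I was working for just over minimum wage\n"
                + "====================\n"
                + "-----------";

        // The last delimiter has no trailing newline, so it stays glued to the second token.
        String[] before = s.split(ORIGINAL);
        // Here the last delimiter is consumed and the resulting empty tail is dropped by split.
        String[] after = s.split(WITH_END);

        System.out.println(Arrays.toString(before));
        System.out.println(Arrays.toString(after));
    }
}
```

Both calls return two tokens; the difference is that with the question's regex the second token still ends with the ==== and ---- lines. Neither call removes the Snt: prefix, which is why the answer strips it with replaceAll first.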

Answer 3: (score: 0)

Matcher m = Pattern.compile("([^=\\-]+)([=\\-]+[\\t\\n\\s]*)+").matcher(str);   
while (m.find()) {
    String match = m.group(1);
    System.out.println(match);
}