Java: splitting a string on spaces, except inside quotes

Time: 2016-01-07 00:30:48

Tags: java regex string

I would greatly appreciate some help with Java code that splits the following inputs:

word1 key="value with space" word3 -> [ "word1", "key=\"value with space\"", "word3" ]
word1 "word2 with space" word3 -> [ "word1", "word2 with space", "word3" ]
word1 word2 word3 -> [ "word1", "word2", "word3" ]

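The first sample input is the hard one: its second word has quotes in the middle of the token rather than at the start. I have found several ways to handle the middle example, as described in Split string on spaces in Java, except if between quotes (i.e. treat "hello world" as one token).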

3 answers:

Answer 0: (score: 1)

Instead of using a regular expression, you can simply iterate over the string:

// Requires: import java.util.ArrayList; import java.util.List;
public static String[] splitWords(String str) {
    List<String> array = new ArrayList<>();
    boolean inQuote = false; // Marker telling us if we are between quotes
    int previousStart = -1;  // The index of the beginning of the current word, or -1 if none
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (Character.isWhitespace(c)) {
            if (previousStart != -1 && !inQuote) {
                // end of word
                array.add(str.substring(previousStart, i));
                previousStart = -1;
            }
        } else {
            // possibly new word
            if (previousStart == -1) previousStart = i;
            // toggle state of quote
            if (c == '"')
                inQuote = !inQuote;
        }
    }
    // Add last segment if there is one
    if (previousStart != -1)
        array.add(str.substring(previousStart));
    return array.toArray(new String[0]);
}

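The advantage of this method is that it handles quotes correctly wherever they appear, including in the middle of a word, which is what you need. For example, the following is treated as a single token: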

a"b c"d"e f"g
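To check this, the method can be dropped into a small class and run against the inputs from the question. This is only a sketch: the class name `SplitWordsDemo` is a placeholder, and the method body is repeated so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitWordsDemo {
    // Same logic as splitWords above, repeated so this example is self-contained.
    static String[] splitWords(String str) {
        List<String> words = new ArrayList<>();
        boolean inQuote = false; // true while we are between two double quotes
        int start = -1;          // start index of the current word, or -1 if none
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends a word only when we are outside quotes.
                if (start != -1 && !inQuote) {
                    words.add(str.substring(start, i));
                    start = -1;
                }
            } else {
                if (start == -1) start = i;
                if (c == '"') inQuote = !inQuote;
            }
        }
        if (start != -1) words.add(str.substring(start));
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] inputs = {
            "word1 key=\"value with space\" word3",
            "word1 \"word2 with space\" word3",
            "word1 word2 word3",
            "a\"b c\"d\"e f\"g"
        };
        for (String input : inputs) {
            System.out.println(Arrays.toString(splitWords(input)));
        }
    }
}
```

Note that the quote characters are kept in the resulting tokens; if you want them stripped from a token such as "word2 with space", that needs an extra step.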

Answer 1: (score: 0)

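This can be done with a mix of regular expressions and replacement. First find the text enclosed in quotes and replace the spaces inside it with a non-space placeholder. Then split the string on spaces and swap the placeholder back to spaces in each piece.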

    String s1 = "word1 key=\"value with space\" word3";

    List<String> list = new ArrayList<String>();
    Matcher m = Pattern.compile("\"([^\"]*)\"").matcher(s1);
    while (m.find())
        s1 = s1.replace(m.group(1), m.group(1).replace(" ", "||")); // replaces the spaces between quotes with ||

    for(String s : s1.split(" ")) {
        list.add(s.replace("||", " ")); // switch back the text to a space.
        System.out.println(s.replace("||", " ")); // just to see output
    }
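One caveat with the snippet above: `s1.replace(m.group(1), ...)` replaces every occurrence of the quoted text, and "||" could in principle already appear in the input. A possible variant, sketched here under the assumption that the NUL character never occurs in the input, masks only the matched spans by using `Matcher.appendReplacement`. The class name `PlaceholderSplit` and the `PLACEHOLDER` constant are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSplit {
    // Assumption: this character never occurs in the input text.
    private static final char PLACEHOLDER = '\u0000';
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    static List<String> split(String input) {
        // Step 1: mask the spaces inside each quoted span, and nowhere else.
        Matcher m = QUOTED.matcher(input);
        StringBuffer masked = new StringBuffer();
        while (m.find()) {
            String span = m.group().replace(' ', PLACEHOLDER);
            // quoteReplacement keeps '\' and '$' in the span from being interpreted.
            m.appendReplacement(masked, Matcher.quoteReplacement(span));
        }
        m.appendTail(masked);

        // Step 2: split on runs of spaces, then restore the masked spaces.
        List<String> result = new ArrayList<>();
        for (String part : masked.toString().split(" +")) {
            if (!part.isEmpty()) {
                result.add(part.replace(PLACEHOLDER, ' '));
            }
        }
        return result;
    }
}
```

Calling `PlaceholderSplit.split("word1 key=\"value with space\" word3")` should give the three tokens from the question, with the quotes kept inside the second one.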

Answer 2: (score: 0)

The split can be done by using a lookahead in the regular expression:

String[] words = input.split(" +(?=(([^\"]*\"){2})*[^\"]*$)");

Here is some test code:

String[] inputs = { "word1 key=\"value with space\" word3","word1 \"word2 with space\" word3", "word1 word2 word3"};
for (String input : inputs) {
    String[] words = input.split(" +(?=(([^\"]*\"){2})*[^\"]*$)");
    System.out.println(Arrays.toString(words));
}

Output:

[word1, key="value with space", word3]
[word1, "word2 with space", word3]
[word1, word2, word3]
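
As for why this works: the lookahead `(?=(([^\"]*\"){2})*[^\"]*$)` only lets a run of spaces match when an even number of double quotes remains between that point and the end of the input, which means the spaces are outside any quoted span. The sketch below spells that condition out as a plain loop next to the regex. It assumes the quotes in the input are balanced; `LookaheadSplit` and `evenQuotesFrom` are hypothetical names used only for illustration.

```java
import java.util.Arrays;

class LookaheadSplit {
    // Split on runs of spaces that are followed by an even number of quotes.
    static String[] split(String input) {
        return input.split(" +(?=(([^\"]*\"){2})*[^\"]*$)");
    }

    // Hypothetical helper: the condition the lookahead tests, written as a loop.
    // Returns true when an even number of quotes occurs from index to the end.
    static boolean evenQuotesFrom(String input, int index) {
        int quotes = 0;
        for (int i = index; i < input.length(); i++) {
            if (input.charAt(i) == '"') quotes++;
        }
        return quotes % 2 == 0;
    }

    public static void main(String[] args) {
        String input = "word1 key=\"value with space\" word3";
        System.out.println(Arrays.toString(split(input)));
        // Index 6 follows the space after "word1": outside quotes, so a split point.
        System.out.println(evenQuotesFrom(input, 6));
        // Index 17 follows the space after "value": inside quotes, so not a split point.
        System.out.println(evenQuotesFrom(input, 17));
    }
}
```

Because the lookahead rescans the rest of the string at every candidate space, this is probably best kept for short inputs; for long ones, a single pass over the characters avoids the repeated scanning.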