Extract substrings between quotes, ignoring \"

Time: 2012-09-12 09:18:19

Tags: java regex

My file contains lines such as:

"This is a string." = "This is a string's content."
" Another \" example \"" = " New example."
"My string
can have several lines." = "My string can have several lines."

I need to extract the substrings:

This is a string.
This is a string's content.
 Another \" example \"
 New example.
My string
can have several lines.
My string can have several lines.

Here is my code:

String regex = "\".*?\"\\s*?=\\s*?\".*?\"";
Pattern pattern = Pattern.compile(regex,Pattern.DOTALL);
Matcher matcher = pattern.matcher(file);

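At the moment I can get the left and right parts around "=". But when a substring contains \", my regex does not do the right thing.

Can someone help me write the correct regex? I tried \"^[\\"] instead of \", but it doesn't work.

Thank you.

3 answers:

Answer 0 (score: 3)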

List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile(
    "\"          # Match a quote\n" +
    "(           # Capture in group number 1:\n" +
    " (?:        # Match either...\n" +
    "  \\\\.     # an escaped character\n" +
    " |          # or\n" +
    "  [^\"\\\\] # any character except quotes or backslashes\n" +
    " )*         # Repeat as needed\n" +
    ")           # End of capturing group\n" +
    "\"          # Match a quote", 
    Pattern.COMMENTS);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group(1));
} 
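
To try the pattern above on the sample lines from the question, a minimal self-contained sketch could look like this. The class name QuotedStrings and the helper extract are made up for illustration, and the regex is the same one written on a single line without Pattern.COMMENTS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedStrings {
    // A quote, then any run of (escaped character | character that is
    // neither a quote nor a backslash), then the closing quote.
    private static final Pattern QUOTED =
            Pattern.compile("\"((?:\\\\.|[^\"\\\\])*)\"");

    // Hypothetical helper: returns the contents of every quoted string, in order.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The three sample lines from the question, as a Java string.
        String file =
                "\"This is a string.\" = \"This is a string's content.\"\n"
              + "\" Another \\\" example \\\"\" = \" New example.\"\n"
              + "\"My string\ncan have several lines.\" = \"My string can have several lines.\"\n";
        for (String s : extract(file)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Pattern.DOTALL is not needed for the multi-line entry, because the negated class [^"\\] already matches line breaks. The extracted text keeps its backslashes (for example \" stays \"), so unescaping would be a separate step.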

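Answer 1 (score: 0)

Sorry, I'm somewhere I can't test this, but does

\".*?(?:[^\\\\]\")\\s*=\\s*\".*?(?:[^\\\\]\")

work?

I just replaced \" with (?:[^\\\\]\") so that a quote is not matched when the character before it is \.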
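
One way to probe this suggestion is to run it on a few inputs. The sketch below assumes the pattern is compiled with Pattern.DOTALL as in the question, and writes the negated class as [^\\\\], since a Java string literal needs four backslashes to produce one literal backslash in the regex. The class name EscapedBackslashCheck and the helper matches are hypothetical.

```java
import java.util.regex.Pattern;

class EscapedBackslashCheck {
    // The "previous character is not a backslash" pattern as a Java literal.
    static final Pattern PREV_CHAR = Pattern.compile(
            "\".*?(?:[^\\\\]\")\\s*=\\s*\".*?(?:[^\\\\]\")", Pattern.DOTALL);

    // Hypothetical helper: true if the pattern finds a key = value pair.
    static boolean matches(String line) {
        return PREV_CHAR.matcher(line).find();
    }

    public static void main(String[] args) {
        // Key containing escaped quotes, taken from the question.
        System.out.println(matches("\" Another \\\" example \\\"\" = \" New example.\""));
        // Key ending in an escaped backslash: "a\\" = "b"
        System.out.println(matches("\"a\\\\\" = \"b\""));
        // Empty key: "" = "x"
        System.out.println(matches("\"\" = \"x\""));
    }
}
```

Under these assumptions the first input matches, while the other two do not: the pattern requires a non-backslash character immediately before each closing quote, so it rejects a string that ends in an escaped backslash and a string that is empty.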

Answer 2 (score: -1)

/"([^"\\]*(?:\\.[^"\\]*)*)"/

Source. Also see this previous question.
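
The regex in this answer is written with /.../ delimiters rather than as a Java string. A rough Java translation, used here to pair each key with its value, might look like the sketch below. The names KeyValueStrings, STR, PAIR and parse are assumptions for illustration, and using the pattern twice around "=" is one possible way to apply it to the question's file format, not something this answer states.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyValueStrings {
    // One quoted string in "unrolled loop" form:
    // ordinary characters, then any number of (escaped char + ordinary characters).
    private static final String STR = "\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"";

    // "key" = "value": group 1 is the key, group 2 is the value.
    // DOTALL only affects the dot after a backslash, so an escaped line break is accepted too.
    private static final Pattern PAIR =
            Pattern.compile(STR + "\\s*=\\s*" + STR, Pattern.DOTALL);

    // Hypothetical helper: collects the pairs in file order.
    static Map<String, String> parse(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String file = "\" Another \\\" example \\\"\" = \" New example.\"\n"
                    + "\"My string\ncan have several lines.\" = \"My string can have several lines.\"\n";
        for (Map.Entry<String, String> e : parse(file).entrySet()) {
            System.out.println("[" + e.getKey() + "] -> [" + e.getValue() + "]");
        }
    }
}
```

A LinkedHashMap keeps the pairs in the order they appear; if keys can repeat in the file, a list of pairs would be a safer container, because a later duplicate overwrites the earlier value here.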
