Match everything between two strings

Asked: 2015-03-13 05:43:07

Tags: regex r

Suppose I have this string:

string <- "I2-1-EX-1-I3-1-EX-1-I2-1-I1-1-EX-1-I3-1-I2-1-EX-1-I2-1-I2-1-I1-1-I3-1-N2-1-I1-1-I1-1-I2-1-N2-1-N3-1-I1-1-NR-1-FA-1-NR-1-I3-1-I1-1-NR-1-N1-1-EX-1-QU-1-I3-1-NR-1-FA-1-EX-1-QU-1-NR-1-I2-1-I2-1-I2-1-NR-1-TR-1-I1-1-I2-1-I3-1-NR-1-I1-1-I1-1-EX-1-NR-1-NR-1-I1-1-NR-1-NR-1-I3-1-I2-1-NR-1-I1-1-QU-1-QU-1-I1-1-TR-1-QU-1-NR-1-NR-1-QU-1-TR-1-NR-1-I1-1-TR-1-I1-1-FA-1-I1-1-I2-1-QU-1-TR-1-FA-1-EX-1-QU-1-QU-1-QU-1-NR-1-QU-1-I1-1-TR-1-FA-1-QU-1-FA-1-FA-1-TR-1-FA-1-QU-1-EX-1-QU-1-I1-1-QU-1-QU-1-FA-1-FA-1-QU-1-QU-1-FA-1-FA-1-I3-1-NR-1-FA-1-I1-1-I2-1-FA-1-QU-1-FA-1-I2-1-FA-1-NR-1-I1-1-NR-1-TR-1-NR-1-EX-1-NR-1-NR-1-EX-1-TR-1-I3-1-I1-1-NR-1-NR-1-FA-1-I1-1-TR-1-EX-1-NR-1-NR-1-I1-1-I1-1-NR-1-I1-1-NR-1-EX-1-EX-1-EX-1-NR-1-NR-1-NR-1-FA-1-FA"

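I want to match everything that occurs between two tokens that contain "I". For example, starting from the beginning of the string, this means matching: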

-EX-
-EX-
-EX-
-EX-
-N2-
-N2-1-N3-
-NR-1-FA-1-NR-
etc...

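How can I do this matching with a regular expression (ideally one that works in R)?

I tried something like (?=<1|2|3).*(?=I), but it does not seem to work. My reasoning was that every I token ends in 1, 2, or 3, so that digit would be the left-hand boundary for a lookbehind to find, and I would be the right-hand boundary for a lookahead to find.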

2 answers:

Answer 0: (score: 4)

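It looks like you are trying to get all the characters that sit between I[123]-1 and 1-I[123]. \K keeps the text matched so far out of the overall regex match. (?:(?!I[123]).)*? matches any single character, lazily and zero or more times, only as long as I[123] does not begin at that character; if an I[123] token is reached first, the match fails.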

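The pieces described in this answer can be combined into one pattern and run from R. The following is a minimal sketch, not the answer's exact pattern: the way the pieces are assembled in `pattern`, the shortened input, and the variable names are assumptions made for illustration. `perl = TRUE` is needed because \K and lookarounds are PCRE features.

```r
# Shortened prefix of the question's string, used here only for illustration.
string <- "I2-1-EX-1-I3-1-EX-1-I2-1-I1-1-EX-1-I3-1-I2-1-EX-1-I2-1-I2-1-I1-1-I3-1-N2-1-I1-1-I1-1-I2-1-N2-1-N3-1-I1-1"

# Assumed assembly of the pieces described above:
#   I[123]-1            left boundary: an I token followed by its count
#   \K                  discard the boundary from the reported match
#   (?:(?!I[123]).)*?   lazily consume characters, never stepping onto an I token
#   (?=1-I[123])        right boundary: the count and the next I token (not consumed)
pattern <- "I[123]-1\\K(?:(?!I[123]).)*?(?=1-I[123])"

# gregexpr() locates every match; regmatches() extracts the matched text.
matches <- regmatches(string, gregexpr(pattern, string, perl = TRUE))[[1]]
print(matches)
```

On this shortened input the call is expected to return "-EX-" four times, followed by "-N2-" and "-N2-1-N3-", which agrees with the list in the question. Adjacent I tokens (for example I2-1-I1-1) produce no match, because the tempered group cannot step onto the next I token.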

Answer 1: (score: 2)

> strsplit(string, "I\\d-\\d")
[[1]]
 [1] ""                                                   
 [2] "-EX-1-"                                             
 [3] "-EX-1-"                                             
 [4] "-"                                                  
 [5] "-EX-1-"                                             
 [6] "-"                                                  
 [7] "-EX-1-"                                             
 [8] "-"                                                  
 [9] "-"                                                  
[10] "-"                                                  
[11] "-N2-1-"                                             
[12] "-"                                                  
[13] "-"                                                  
[14] "-N2-1-N3-1-"                                        
[15] "-NR-1-FA-1-NR-1-"                                   
[16] "-"                                                  
[17] "-NR-1-N1-1-EX-1-QU-1-"                              
[18] "-NR-1-FA-1-EX-1-QU-1-NR-1-"                         
[19] "-"                                                  
[20] "-"                                                  
[21] "-NR-1-TR-1-"                                        
[22] "-"                                                  
[23] "-"                                                  
[24] "-NR-1-"                                             
[25] "-"                                                  
[26] "-EX-1-NR-1-NR-1-"                                   
[27] "-NR-1-NR-1-"                                        
[28] "-"                                                  
[29] "-NR-1-"                                             
[30] "-QU-1-QU-1-"                                        
[31] "-TR-1-QU-1-NR-1-NR-1-QU-1-TR-1-NR-1-"               
[32] "-TR-1-"                                             
[33] "-FA-1-"                                             
[34] "-"                                                  
[35] "-QU-1-TR-1-FA-1-EX-1-QU-1-QU-1-QU-1-NR-1-QU-1-"     
[36] "-TR-1-FA-1-QU-1-FA-1-FA-1-TR-1-FA-1-QU-1-EX-1-QU-1-"
[37] "-QU-1-QU-1-FA-1-FA-1-QU-1-QU-1-FA-1-FA-1-"          
[38] "-NR-1-FA-1-"                                        
[39] "-"                                                  
[40] "-FA-1-QU-1-FA-1-"                                   
[41] "-FA-1-NR-1-"                                        
[42] "-NR-1-TR-1-NR-1-EX-1-NR-1-NR-1-EX-1-TR-1-"          
[43] "-"                                                  
[44] "-NR-1-NR-1-FA-1-"                                   
[45] "-TR-1-EX-1-NR-1-NR-1-"                              
[46] "-"                                                  
[47] "-NR-1-"                                             
[48] "-NR-1-EX-1-EX-1-EX-1-NR-1-NR-1-NR-1-FA-1-FA" 

If you want to restrict the digits to the range 1 to 3, use this pattern instead: "I[1-3]-[1-3]"
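The split output still contains an empty first element, bare "-" separators where two I tokens are adjacent, and a trailing "1-" on each piece. A minimal sketch of one way to clean that up with the restricted pattern follows; the shortened input, the cleanup steps, and the variable names are assumptions for illustration and are not part of the answer.

```r
# Shortened prefix of the question's string, used here only for illustration.
string <- "I2-1-EX-1-I3-1-EX-1-I2-1-I1-1-EX-1-I3-1-I2-1-EX-1-I2-1-I2-1-I1-1-I3-1-N2-1-I1-1"

# Split on I tokens whose digits are restricted to 1-3.
pieces <- strsplit(string, "I[1-3]-[1-3]")[[1]]

# Drop the empty leading element and the bare "-" left between adjacent I tokens.
pieces <- pieces[!pieces %in% c("", "-")]

# Strip the trailing "1-" so each piece stops at the dash before the count.
trimmed <- sub("1-$", "", pieces)
print(trimmed)
```

With the full string, the last element of the split follows the final I token rather than sitting between two of them, so it would need to be dropped separately.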