Regex with an optional group part

Time: 2015-11-04 16:02:20

Tags: regex

I have these two kinds of strings to match and split into groups:

<133>[S=88121248] [SID:1073710562] (   lgr_psbrdif)(72811810  )   #38:OpenChannel:on Trunk 0 BChannel:9 CID=38 with VoiceCoder: g711Alaw64k20 VbdCoder: InvalidCoder255 DetectorSide: 0 FaxModemDet NO_FAX_MODEM_DETECTED 

<133>[S=88209541] (     sip_stack)(73281971  )   TcpTransportObject#430::DispatchQueueEvent(EVENT_RECEIVER_DISCONNECT) - Closing connection  

I need to match both and capture specific groups. I am using this pattern:

<(.*)>\[S=(.*)\] (\[SID:(.*?)\])?(.*)

What I get:

Match0: <133>[S=88121248] [SID:1073710562] ......the full line  
Group1: 133  
Group2: 88121248] [SID:1073710562  
Group3:  
Group4:  
Group5: ......the full line  

Match1: <133>[S=88209541] ......the full line  
Group1: 133  
Group2: 88209541  
Group3:   
Group4:  
Group5: ......the full line  

What I need:

Match0: <133>[S=88121248] [SID:1073710562] ......the full line  
Group1: 133  
Group2: 88121248  
Group3: 1073710562  
Group4:  
Group5: ......the full line  


Match1: <133>[S=88209541] ......the full line  
Group1: 133  
Group2: 88209541  
Group3:  
Group4:  
Group5: ......the full line  

To sum up: both strings match, but the grouping is wrong. The second string is matched and grouped correctly; the first is not.

1 answer:

Answer 0 (score: 2)

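You made the classic mistake of using the greedy star .*, which matches more than you intended. In the first string, \[S=(.*)\] runs on to the last "] " in the line, so group 2 swallows the [SID:...] part and the optional group is left with nothing to match.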
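The over-matching can be reproduced with a short Python sketch. It uses the asker's original pattern unchanged; the log line is the first sample from the question with its tail trimmed for brevity, and the variable names are only illustrative.

```python
import re

# The asker's original pattern: every group relies on a greedy .*
GREEDY = re.compile(r"<(.*)>\[S=(.*)\] (\[SID:(.*?)\])?(.*)")

# First sample line from the question, tail trimmed for brevity.
line = "<133>[S=88121248] [SID:1073710562] (   lgr_psbrdif)(72811810  )   #38:OpenChannel"

m = GREEDY.match(line)
print(m.group(2))  # 88121248] [SID:1073710562
print(m.group(4))  # None, because the optional SID group never took part in the match
```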

To match anything between two delimiters, it is better to use a negated character class, e.g. <([^>]*)> for whatever lies between < and >.
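As a small illustration of the difference, the sketch below compares the greedy form with the negated character class. The input string is hypothetical, chosen only because it contains a second ">" further along.

```python
import re

# Hypothetical input with a second ">" later in the text.
text = "<133> payload -> more"

greedy = re.match(r"<(.*)>", text).group(1)      # runs on to the LAST ">"
negated = re.match(r"<([^>]*)>", text).group(1)  # cannot cross a ">"

print(greedy)   # 133> payload -
print(negated)  # 133
```

Because [^>]* cannot consume the closing delimiter at all, the capture stops at the first ">" no matter what follows.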

So this will work:

^<([^>]*)>\[S=([^\]]*)\]\s+(?:\[SID:([^\]]*)\]\s+)?(.*)

Breakdown:

^<([^>]*)>                # something between < and > at the start of the line
\[S=([^\]]*)\]\s+         # something between "[S=" and "]"
(?:\[SID:([^\]]*)\]\s+)?  # something between "[SID:" and "]", optional
(.*)                      # rest of the string

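Note the non-capturing parentheses (?:...), which keep the unused wrapper group out of the result, so the pattern yields four groups instead of five.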
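For a quick check outside a regex tester, here is a minimal Python sketch that applies the same pattern to both kinds of line. It assumes Python's re module; the pattern is written in re.VERBOSE mode so it can carry the breakdown as comments, and the sample lines are the ones from the question with their tails trimmed for brevity. LOG_LINE and results are illustrative names.

```python
import re

# Same pattern as in the answer, laid out with re.VERBOSE so comments are allowed.
LOG_LINE = re.compile(
    r"""
    ^<([^>]*)>                 # group 1: text between < and >
    \[S=([^\]]*)\]\s+          # group 2: text between [S= and ]
    (?:\[SID:([^\]]*)\]\s+)?   # group 3: text between [SID: and ], optional
    (.*)                       # group 4: rest of the line
    """,
    re.VERBOSE,
)

# Sample lines from the question, tails trimmed for brevity.
lines = [
    "<133>[S=88121248] [SID:1073710562] (   lgr_psbrdif)(72811810  )   #38:OpenChannel:on Trunk 0",
    "<133>[S=88209541] (     sip_stack)(73281971  )   TcpTransportObject#430::DispatchQueueEvent",
]

results = [LOG_LINE.match(line).groups() for line in lines]
for groups in results:
    print(groups)
```

In Python, an optional group that did not participate in the match is reported as None, which corresponds to the "n/a" entry a regex tester shows for the second line.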

Matches:

MATCH 1
1.  [1-4]   `133`
2.  [8-16]  `88121248`
3.  [23-33] `1073710562`
4.  [35-218]    `(   lgr_psbrdif)(72811810  )   #38:OpenChannel:on Trunk 0 BChannel:9 CID=38 with VoiceCoder: g711Alaw64k20 VbdCoder: InvalidCoder255 DetectorSide: 0 FaxModemDet NO_FAX_MODEM_DETECTED `

MATCH 2
1.  [220-223]   `133`
2.  [227-235]   `88209541`
3.  n/a
4.  [237-360]   `(     sip_stack)(73281971  )   TcpTransportObject#430::DispatchQueueEvent(EVENT_RECEIVER_DISCONNECT) - Closing connection  `