RegEx pattern does not match as expected

Time: 2014-09-14 12:16:22

Tags: python regex

I am using the regular expression

[^A-Za-z](email,|help|BGN|won't|go|corner|issues|disconected|We|group|No|send|Bv|connecting|has|Pittsburgh,|Many|(Akustica,|Toluca|cannot|Restarting|they|not|PI2|one|condition|entire|LAN|experincing|bar|Exchange,|server|Are|PA)|OutLook|right|says|Rose|Montalvo|back|computer|are|Jane|thier|Disconnected|Nrd|and/or|network|for|Appears|e-mail|unable|Connected|then|Broadview,|issue|email|shows|available|be|we|exchange|error|address|based|My|Microsoft|received|working|created|receive|impacted|WIFI|through|connection|including|or|IL|outlook|via|facility|Everyone's|servers|Also|message|"The|your|Status|doesn't|service|SI-MBX82.de.bosch.com,|next|appears|"disconnected"|Encryption|eMail/file|today|"Waiting|"send/receive"|but|it|trying|SAP|disconnected|e-mails|this|getting|can|of|connect|Incorrect|manually|is|site|an|folder"|cant|Other|have|in|Receiving|if|Plant|no|SI-MBX80.de.bosch.com|that|when|online|persists."|Customer|administrator|users|update|applications|"Disconnected"|SI-MBX81.de.bosch.com|The|on|lower|Some|It|contact|In|the|having)[^A-Za-z]

and applying it, but it fails to find "Jane" in the sentence
 "Issue with eMail/file Encryption Incorrect email address created for Jane Rose Montalvo."

even though Jane appears in the pattern above.

What could be the reason?

3 answers:

Answer 0 (score: 2)

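The problem is that your regex consumes the non-letter character (here, a space) before and after each word as part of the match. Take this input:

Hello Jane

Once Hello is matched, the space after it has already been consumed by that match, so Jane cannot match: no unconsumed space is left in front of it. You should make the delimiter an assertion rather than part of the match.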

Use (?<=[^a-zA-Z]) instead of a plain [^a-zA-Z] in front of the group. See the demo.

http://regex101.com/r/lU7jH1/9
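To make the difference concrete, here is a minimal sketch that runs both variants against the sentence from the question. The three-word list `for|Jane|Rose` is a shortened stand-in for the full alternation, and the variable names are made up for illustration.

```python
import re

sentence = ("Issue with eMail/file Encryption Incorrect email address "
            "created for Jane Rose Montalvo.")

# Hypothetical shortened word list standing in for the long alternation.
words = "for|Jane|Rose"

# Original style: the delimiter on both sides is consumed by the match.
consuming = re.compile(r"[^A-Za-z](" + words + r")[^A-Za-z]")

# Suggested fix: the leading delimiter is only asserted via a lookbehind,
# so a delimiter consumed by the previous match can still be "seen".
asserting = re.compile(r"(?<=[^A-Za-z])(" + words + r")[^A-Za-z]")

consumed_hits = consuming.findall(sentence)
asserted_hits = asserting.findall(sentence)

print(consumed_hits)   # "Jane" is skipped
print(asserted_hits)   # "Jane" is found
```

With the consuming pattern, the space after "for" belongs to the previous match, so "Jane" has no leading delimiter left to match against; the lookbehind version does not have that limitation.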

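Answer 1 (score: 2)

This happens because the matches overlap: adjacent words share the delimiter character between them. Put the whole pattern inside a lookahead and use the capturing group to retrieve the overlapping words: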

(?=[^A-Za-z](email,|help|BGN|won't|go|corner|issues|disconected|We|group|No|send|Bv|connecting|has|Pittsburgh,|Many|(Akustica,|Toluca|cannot|Restarting|they|not|PI2|one|condition|entire|LAN|experincing|bar|Exchange,|server|Are|PA)|OutLook|right|says|Rose|Montalvo|back|computer|are|Jane|thier|Disconnected|Nrd|and/or|network|for|Appears|e-mail|unable|Connected|then|Broadview,|issue|email|shows|available|be|we|exchange|error|address|based|My|Microsoft|received|working|created|receive|impacted|WIFI|through|connection|including|or|IL|outlook|via|facility|Everyone's|servers|Also|message|"The|your|Status|doesn't|service|SI-MBX82\.de\.bosch\.com,|next|appears|"disconnected"|Encryption|eMail/file|today|"Waiting|"send/receive"|but|it|trying|SAP|disconnected|e-mails|this|getting|can|of|connect|Incorrect|manually|is|site|an|folder"|cant|Other|have|in|Receiving|if|Plant|no|SI-MBX80\.de\.bosch\.com|that|when|online|persists\."|Customer|administrator|users|update|applications|"Disconnected"|SI-MBX81\.de\.bosch.com|The|on|lower|Some|It|contact|In|the|having)[^A-Za-z])

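As a rough illustration of how such a lookahead pattern is consumed from Python, the sketch below uses a hypothetical four-word list instead of the full alternation. Because the whole match is zero-width, the word has to be read from group 1 rather than from the match itself.

```python
import re

sentence = "created for Jane Rose Montalvo."

# Hypothetical shortened word list standing in for the long alternation.
words = "for|Jane|Rose|Montalvo"

# Everything sits inside (?=...), so no characters are consumed and
# neighbouring words can share the delimiter between them.
overlapping = re.compile(r"(?=[^A-Za-z](" + words + r")[^A-Za-z])")

# The match itself is empty; the captured word is in group 1.
lookahead_hits = [m.group(1) for m in overlapping.finditer(sentence)]
print(lookahead_hits)
```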

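Answer 2 (score: 0)

If for some reason you cannot or do not want to modify your pattern, and you still want to capture overlapping matches, you can call the compiled pattern's search method in a loop, moving the start of each search to the character just after the start of the previous match.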

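import re

# s is the sentence and p is the pattern string from the question.

# recursive: take one match, then search again starting one character
# past the start of that match
def foo(s, p, start=0):
    m = p.search(s, start)
    if not m:
        return ''
    return m.group() + foo(s, p, m.start() + 1)

# iterative: the same idea written as a loop
def foo1(s, p):
    result = ''
    m = p.search(s, 0)
    while m:
        result += m.group()
        m = p.search(s, m.start() + 1)
    return result

print(foo(s, re.compile(p)))
print(foo1(s, re.compile(p)))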

>>> 

 eMail/file  Encryption  Incorrect  email  address  created  for  Jane  Rose  Montalvo.
 eMail/file  Encryption  Incorrect  email  address  created  for  Jane  Rose  Montalvo.
>>> 
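If separate matches are more useful than one concatenated string, the same restart-after-start idea can be written as a generator. This is only a sketch: the function name `overlapping_matches` and the shortened word list are assumptions for illustration, not part of the original answer.

```python
import re

def overlapping_matches(pattern, text):
    """Yield each match, restarting one character after the match start."""
    pos = 0
    while True:
        m = pattern.search(text, pos)
        if m is None:
            return
        yield m.group()
        # Step past the start (not the end) so shared delimiters are reused.
        pos = m.start() + 1

# Hypothetical shortened version of the unmodified, consuming pattern.
pattern = re.compile(r"[^A-Za-z](for|Jane|Rose)[^A-Za-z]")
loop_hits = list(overlapping_matches(pattern, "created for Jane Rose Montalvo."))
print(loop_hits)
```

Each yielded string still includes the surrounding delimiter characters, because the pattern itself is left unchanged.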