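Regex to match a string with or without a word

Time: 2015-07-18 13:35:53

Tags: regex elasticsearch fluentd

I am trying to parse modsecurity audit logs, which are reported in two different formats. The first looks like this: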

[modsecurity] [client 111.222.333.444 [domain somedomain.com] [403] [/apache/20150718/20150718-1412/20150718-141258-VapQ2kDQOQ1qTs5mQAsHDQAAAIs]  [file \"/etc/httpd/modsecurity.d/10_asl_rules.conf\"] [line \"93\"] [id \"392301\"] [rev \"7\"] [msg \"Atomicorp.com WAF Rules: Request Containing Content, but Missing Content-Type header\"] [severity \"NOTICE\"] [tag \"no_ar\"] Access denied with connection close (phase 1). Match of \"rx ^0$\" against \"REQUEST_HEADERS:Content-Length\" required.

and the second like this:

[modsecurity] [client 111.222.333.444] [domain somedomain.com] [200] [/apache/20150718/20150718-1429/20150718-142952-VapUz0DQOQ1qTs5mQAsHfAAAAIg]  [file "/etc/httpd/modsecurity.d/localrules.conf"] [line "3"] [id "999999"] [msg "My WAF Rules - Blocking Wordpress Login Attempt by Country Code"] Warning. Matched phrase "CN" at GEO:COUNTRY_CODE.

I use this regex to match the second format:

 /^\[(?<app>\w+)\](\s+)\[client (?<src_ip>\d+.\d+.\d+.\d+)\](\s+)\[domain (?<domain>.*)\](\s+)\[(?<rcode>\d+)\](\s+)\[(?<audit_data>.*)\](\s+)\[(?<modsec_file>.*)\](\s+)\[line "(?<modsec_line>\d+)"\](\s+)\[id "(?<modsec_ruleid>\d+)"\](\s+)\[msg "(?<modsec_msg>.*)"\].*$/

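However, I would like to match both formats, so that if the value

[rev "\d+"]

is not present it does not matter, as long as everything else matches. Is this possible?

Thanks.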

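1 answer:

Answer 0 (score: 0)

This regex should work for you. The optional non-capturing group (?:\[rev "\d+"\]\s+)? matches the rev field when it is present and is skipped when it is not: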

/^\[(?<app>\w+)\]\s+\[client (?<src_ip>\d+.\d+.\d+.\d+)\]\s+\[domain (?<domain>.*?)\]\s+\[(?<rcode>\d+)\]\s+\[(?<audit_data>.*?)\]\s+\[(?<modsec_file>.*?)\]\s+\[line "(?<modsec_line>\d+)"\]\s+\[id "(?<modsec_ruleid>\d+)"\]\s+(?:\[rev "\d+"\]\s+)?\[msg "(?<modsec_msg>.*)"\].*$/

RegEx Demo
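
To check the optional group outside Fluentd, here is a minimal Ruby sketch. It copies the regex from the answer unchanged and runs it against the second log line from the question. The line containing [rev "7"] is a hypothetical variant built for this test by inserting that field after the id; it is not taken from a real log. The constant names MODSEC_RE, WITHOUT_REV and WITH_REV are made up for this example.

```ruby
# Regex from the answer, copied verbatim (Ruby named-capture syntax).
MODSEC_RE = /^\[(?<app>\w+)\]\s+\[client (?<src_ip>\d+.\d+.\d+.\d+)\]\s+\[domain (?<domain>.*?)\]\s+\[(?<rcode>\d+)\]\s+\[(?<audit_data>.*?)\]\s+\[(?<modsec_file>.*?)\]\s+\[line "(?<modsec_line>\d+)"\]\s+\[id "(?<modsec_ruleid>\d+)"\]\s+(?:\[rev "\d+"\]\s+)?\[msg "(?<modsec_msg>.*)"\].*$/

# Second log line from the question (no [rev ...] field).
WITHOUT_REV = '[modsecurity] [client 111.222.333.444] [domain somedomain.com] [200] [/apache/20150718/20150718-1429/20150718-142952-VapUz0DQOQ1qTs5mQAsHfAAAAIg]  [file "/etc/httpd/modsecurity.d/localrules.conf"] [line "3"] [id "999999"] [msg "My WAF Rules - Blocking Wordpress Login Attempt by Country Code"] Warning. Matched phrase "CN" at GEO:COUNTRY_CODE.'

# Hypothetical variant: the same line with [rev "7"] inserted after the id.
WITH_REV = WITHOUT_REV.sub('[id "999999"]', '[id "999999"] [rev "7"]')

[WITHOUT_REV, WITH_REV].each do |line|
  m = MODSEC_RE.match(line)
  # Print the rule id and message when the line matches.
  puts m ? "#{m[:modsec_ruleid]}: #{m[:modsec_msg]}" : 'no match'
end
```

Both lines should match and yield the same rule id and message. Two caveats about the pattern itself: the dots in the src_ip part are unescaped, so they accept any character rather than only a literal dot, and the greedy .* in modsec_msg runs to the last "] on the line, so a line with further quoted fields after msg (such as severity or tag) would pull them into the message. Changing that capture to .*? is one possible adjustment.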