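I am trying to search each line of a log file for a specific string, and when a line matches I need to get the host information for that particular error.
Consider the following log entries: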
05-05-2014 00:02:02,771 [HttpProxyServer-thread-1314] ERROR fd - Empty user name specified in NTLM authentication. Prompting for auth again.
Host=tools.google.com, Port=80, Client ip=/10.253.168.128, port=37271, User-Agent: Google Update/1.3.23.9;winhttp;cup-ecdsa
05-05-2014 00:02:02,771 [HttpProxyServer-thread-2156] ERROR fd - Empty user name specified in NTLM authentication. Prompting for auth again.
Host=tools.google.com, Port=80, Client ip=/10.253.168.148, port=37273, User-Agent: Google Update/1.3.23.9;winhttp;cup-ecdsa
05-05-2014 00:02:02,802 [HttpProxyServer-thread-604] ERROR fd - Empty user name specified in NTLM authentication. Prompting for auth again.
Host=tools.google.com, Port=80, Client ip=/10.253.168.92, port=37280, User-Agent: Google Update/1.3.23.9;winhttp;cup
Here is my code:
for line in log_file:
    if bool(re.search( r'Empty user name specified in NTLM authentication. Prompting for auth again.', line)):
        host = re.search(r'Host=(\D+.\D+.\D+,)', line).group(1)
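The problem is that the host information is not on the same line as the error; it is on the next line. How can I make re.search(r'Host=(\D+.\D+.\D+,)', line).group(1) search the line after the one that "line" currently refers to?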
Answer 0 (score: 2)
Just insert
line = next(log_file)
between the two statements you currently have in your for loop.
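A minimal sketch of that suggestion, wrapped in a function so it can be tried without a real log file. The names hosts_after_errors, ERROR_TEXT and HOST_RE are made up for illustration, and the pattern Host=([^,\s]+) is an assumption (it takes everything up to the first comma) rather than the pattern from the question. It also assumes log_file is an iterator such as an open file, since next() will not work on a plain list.

```python
import io
import re

# Assumed marker text and Host pattern; adjust to your own log format.
ERROR_TEXT = "Empty user name specified in NTLM authentication"
HOST_RE = re.compile(r"Host=([^,\s]+)")


def hosts_after_errors(log_file):
    """Yield the Host value from the line that follows each error line."""
    for line in log_file:
        if ERROR_TEXT in line:
            # Advance the same iterator the for loop is using. The default
            # "" avoids StopIteration when the error is the last line.
            line = next(log_file, "")
            match = HOST_RE.search(line)
            if match:
                yield match.group(1)


sample = io.StringIO(
    "05-05-2014 00:02:02,771 [HttpProxyServer-thread-1314] ERROR fd - "
    "Empty user name specified in NTLM authentication. "
    "Prompting for auth again.\n"
    "Host=tools.google.com, Port=80, Client ip=/10.253.168.128, port=37271\n"
)
print(list(hosts_after_errors(sample)))  # prints ['tools.google.com']
```

With a real file you would pass the open file object instead of the io.StringIO sample.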
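Answer 1 (score: 0)
Either write a regular expression that matches two consecutive lines, from which you can extract the host information for each match, and loop over the matches instead of reading the file line by line; or add a flag that is set when a line matches the error, and, if the flag is set for a given line, extract the host information and reset the flag instead of testing for the error.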
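Both ideas could look roughly like the sketch below. The function names and the Host=([^,\s]+) pattern are assumptions for illustration, and the two-line version assumes the whole log fits in memory and that Host= starts the line right after the error.

```python
import re

# One pattern covering both lines: the error text, the rest of that line,
# the newline, then the Host field at the start of the following line.
PAIR_RE = re.compile(
    r"Empty user name specified in NTLM authentication\.[^\n]*\n"
    r"Host=([^,\s]+)"
)


def hosts_by_two_line_regex(text):
    """Return the Host values found directly after an error line in text."""
    return PAIR_RE.findall(text)


def hosts_by_flag(lines):
    """Return the Host values using a flag set by the previous line."""
    hosts = []
    expect_host = False
    for line in lines:
        if expect_host:
            # The previous line was the error, so look for the host here
            # instead of testing for the error again.
            match = re.search(r"Host=([^,\s]+)", line)
            if match:
                hosts.append(match.group(1))
            expect_host = False  # reset whether or not a host was found
        elif "Empty user name specified in NTLM authentication" in line:
            expect_host = True
    return hosts


sample = (
    "05-05-2014 00:02:02,771 [HttpProxyServer-thread-1314] ERROR fd - "
    "Empty user name specified in NTLM authentication. "
    "Prompting for auth again.\n"
    "Host=tools.google.com, Port=80, Client ip=/10.253.168.128, port=37271\n"
)
print(hosts_by_two_line_regex(sample))
print(hosts_by_flag(sample.splitlines()))
```

The flag version reads one line at a time, so it is probably the better fit for large log files; the two-line regex is shorter but needs the text as a single string, for example from log_file.read().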
Answer 2 (score: 0)
Try this:
>>> import re
>>> fp = open('log_file')
>>> line = fp.readline()
>>> while line:
...     if 'Empty user name specified in NTLM authentication. Prompting for auth again.' in line:
...         host = re.search(r'Host=(\D+.\D+.\D+,)', fp.readline()).group(1)
...         #                                        ^^^^^^^^^^^^^
...         # this makes re search in the next line
...         print(host)
...     line = fp.readline()
...
tools.google.com,
tools.google.com,
tools.google.com,
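Note that the output above ends with a comma because the comma is inside the capture group, and the dots in the pattern are unescaped, so they match any character. Since \D cannot match a digit, the pattern also finds nothing when the host starts with a digit, such as an IP address. A possible tighter version is sketched below; extract_host is a made-up helper name, and the pattern assumes the Host value always ends at the first comma or whitespace.

```python
import re

# Assumption: the Host value runs up to the first comma or whitespace.
HOST_RE = re.compile(r"Host=([^,\s]+)")


def extract_host(line):
    """Return the Host value from a log line, or None if there is none."""
    match = HOST_RE.search(line)
    return match.group(1) if match else None


print(extract_host("Host=tools.google.com, Port=80"))  # prints tools.google.com
print(extract_host("Host=10.0.0.1, Port=80"))          # prints 10.0.0.1
```

Returning None instead of calling .group(1) directly also avoids an AttributeError when a line has no Host field.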