Build a list from specific elements of the output

Time: 2017-10-31 21:15:03

Tags: python

I am building a script that pulls data from server logs. Each line of the data shows a timestamp and the number of occurrences.

From the data below, I am trying to build a list that contains only the numbers after the dash:

20:52:37 - 3

20:52:38 - 8

20:52:39 - 28

20:52:40 - 58

20:52:41 - 59

20:52:42 - 51

20:52:43 - 37

20:52:44 - 22

20:52:45 - 4

20:52:47 - 14

20:52:48 - 15

20:52:49 - 12

20:52:50 - 4

20:52:51 - 5

20:52:52 - 12

20:52:53 - 5

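I tried splitting the output and then appending only the elements I need, but I keep running into errors. I tried splitting on the dash and the newline character and then picking out the right position for each number. The result I want looks like this: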

[3,8,28,etc.,etc.]

2 answers:

Answer 0: (score: 1)

You can use re.findall:

import re
s = """
 20:52:37 - 3

 20:52:38 - 8

 20:52:39 - 28

 20:52:40 - 58

 20:52:41 - 59
 ....
 """

# list() is needed on Python 3, where map() returns a lazy iterator
data = list(map(int, re.findall(r'(?<=\s-\s)\d+', s)))

Output:

[3, 8, 28, 58, 59...]
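
As a rough sketch of a stricter variant, the pattern below anchors on the whole `HH:MM:SS - N` line, so numbers that appear elsewhere in the log are not picked up. The names `LINE_RE` and `extract_counts` and the sample text are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Matches one full "HH:MM:SS - N" line; the single group captures N.
LINE_RE = re.compile(r'^[ \t]*\d{2}:\d{2}:\d{2} - (\d+)[ \t]*$', re.MULTILINE)


def extract_counts(text):
    """Return the integers after the dash, in order of appearance."""
    return [int(n) for n in LINE_RE.findall(text)]


sample = """
 20:52:37 - 3

 20:52:38 - 8

 20:52:39 - 28
"""

print(extract_counts(sample))  # [3, 8, 28]
```

Anchoring with `^` and `$` under `re.MULTILINE` trades some flexibility for safety: lines that do not follow the timestamp format are simply ignored.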

Answer 1: (score: 0)

To strip the trailing newline, you can use rstrip():

res = []

with open('server.log') as f:
    lines = (line.rstrip() for line in f)  # to remove trailing newlines
    lines = (line for line in lines if line)  # to remove blank lines
    res = [int(line.split(' - ')[-1]) for line in lines]

Output:

>>> res
[3, 8, 28, 58, 59, 51, 37, 22, 4, 14, 15, 12, 4, 5, 12, 5]
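
If the log might contain lines that do not match the expected format, a more tolerant version of the same split-based idea could look like the sketch below. The generator name `parse_counts` and the sample lines are hypothetical, and silently skipping malformed lines is an assumption about the desired behaviour.

```python
def parse_counts(lines):
    """Yield the integer after ' - ' for each well-formed line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # rpartition splits on the last ' - '; sep is '' when it is absent
        _, sep, tail = line.rpartition(' - ')
        if sep and tail.isdecimal():
            yield int(tail)


sample_lines = ["20:52:37 - 3\n", "\n", "20:52:38 - 8\n", "garbage\n"]
print(list(parse_counts(sample_lines)))  # [3, 8]
```

Because it accepts any iterable of strings, the same function works on an open file object as well as on a list of lines.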