Parsing response times minute by minute in Python

Time: 2013-09-27 19:07:05

Tags: python parsing

I have an input file that looks like this, and I want to compute the response time per minute.

datapoint,time,transaction,PT,Responsetime,errorcode
a06i0000003uNQOAA2,2013-09-26T19:15:55.873+0000,EditMode,57,109.877193,0
a06i0000003uNQOAA2,2013-09-26T19:15:55.875+0000,Update,58,733.741379,0
a06i0000003uNQOAA2,2013-09-26T19:15:55.875+0000,ViewObject,94,386.893617,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.889+0000,EditMode,110,109.209091,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.889+0000,Update,109,743.660550,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.890+0000,ViewObject,181,376.198895,0
a06i0000003uNQOAA2,2013-09-26T19:16:55.904+0000,EditMode,162,109.080247,0
a06i0000003uNQOAA2,2013-09-26T19:16:55.904+0000,Update,161,738.683230,0
a06i0000003uNQOAA2,2013-09-26T19:16:55.904+0000,ViewObject,266,372.627820,0
a06i0000003uNQOAA2,2013-09-26T19:17:25.918+0000,EditMode,212,108.580189,0
a06i0000003uNQOAA2,2013-09-26T19:17:25.919+0000,Update,213,735.244131,0
a06i0000003uNQOAA2,2013-09-26T19:17:25.919+0000,ViewObject,350,362.394286,0
a06i0000003uNQOAA2,2013-09-26T19:17:55.933+0000,EditMode,263,107.954373,0
a06i0000003uNQOAA2,2013-09-26T19:17:55.933+0000,Update,264,732.598485,0
a06i0000003uNQOAA2,2013-09-26T19:17:55.934+0000,ViewObject,431,359.965197,0
a06i0000003uNQOAA2,2013-09-26T19:18:25.947+0000,EditMode,314,107.815287,0
a06i0000003uNQOAA2,2013-09-26T19:18:25.948+0000,Update,315,733.292063,0
a06i0000003uNQOAA2,2013-09-26T19:18:25.948+0000,ViewObject,516,360.098837,0
a06i0000003uNQOAA2,2013-09-26T19:18:55.961+0000,EditMode,368,107.559783,0
a06i0000003uNQOAA2,2013-09-26T19:18:55.961+0000,Update,366,731.808743,0
a06i0000003uNQOAA2,2013-09-26T19:18:55.962+0000,ViewObject,600,359.780000,0
a06i0000003uNQOAA2,2013-09-26T19:19:25.975+0000,EditMode,418,107.406699,0
a06i0000003uNQOAA2,2013-09-26T19:19:25.976+0000,Update,419,731.613365,0
a06i0000003uNQOAA2,2013-09-26T19:19:25.976+0000,ViewObject,686,358.169096,0
a06i0000003uNQOAA2,2013-09-26T19:19:55.989+0000,EditMode,470,107.265957,0
a06i0000003uNQOAA2,2013-09-26T19:19:55.990+0000,Update,467,732.107066,0
a06i0000003uNQOAA2,2013-09-26T19:19:55.990+0000,ViewObject,768,360.317708,0
a06i0000003uNQOAA2,2013-09-26T19:20:26.003+0000,EditMode,521,107.149712,0
a06i0000003uNQOAA2,2013-09-26T19:20:26.004+0000,Update,521,733.990403,0
a06i0000003uNQOAA2,2013-09-26T19:20:26.004+0000,ViewObject,853,361.735053,0
a06i0000003uNQOAA2,2013-09-26T19:20:56.018+0000,EditMode,572,107.117133,0
a06i0000003uNQOAA2,2013-09-26T19:20:56.018+0000,Update,572,733.139860,0
a06i0000003uNQOAA2,2013-09-26T19:20:56.018+0000,ViewObject,937,361.497332,0
a06i0000003uNQOAA2,2013-09-26T19:21:26.032+0000,EditMode,623,106.855538,0
a06i0000003uNQOAA2,2013-09-26T19:21:26.032+0000,Update,623,732.057785,0
a06i0000003uNQOAA2,2013-09-26T19:21:26.032+0000,ViewObject,1020,361.191176,0
a06i0000003uNQOAA2,2013-09-26T19:21:56.046+0000,EditMode,674,107.112760,0
a06i0000003uNQOAA2,2013-09-26T19:21:56.046+0000,Update,674,731.721068,0
a06i0000003uNQOAA2,2013-09-26T19:21:56.046+0000,ViewObject,1106,360.622966,0
a06i0000003uNQOAA2,2013-09-26T19:22:26.059+0000,EditMode,724,107.041436,0

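This is the program I came up with, but it gives me the response times for the whole file rather than for each individual minute. I'm not sure where I went wrong. Any pointers would be greatly appreciated.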

import numpy as np
from scipy import stats

rtlist = []
reqpslist = []

newFile = open('100ulog.csv','r')
FILE = newFile.readlines()
newFile.close()


for line in FILE:
    newline1 = line.split(":")
    newline2 = line.split(",")
    min = newline1[1]
    if newline1[1] == min:
        rtlist.append(newline2[4])
        reqpslist.append(newline2[3])
        print rtlist

    else:
        rtlist[:] = []
        min = min+1

1 answer:

Answer 0: (score: 0)

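I'm going to guess at what you want. If you edit your question, I'll edit my answer. You want the response times grouped by minute. Let's start by parsing the whole file and pulling out the interesting parts: a) the minute, b) PT, c) the response time.

We'll use re: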

>>> import re
>>> data = open('100ulog.csv','r').read()
>>> lst = re.findall(r'.+?,.+?T\d+:(\d+):.+?,.+?,(\d+),(\d+\.\d+),', data)
>>> # This will return a list of interesting tuples like: [('16', '181', '376.198895'),...]

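Now we can do whatever we like with it. Say we want to build a dictionary whose keys are the minutes and whose values are lists of (PT, response time) tuples (we'll use collections.defaultdict):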

>>> from collections import defaultdict
>>> dic = defaultdict(list)
>>> for item in lst:
...     dic[int(item[0])].append(item[1:3])
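
From there, one way to get a single number per minute is to average the response times in each group. This is only a sketch: it assumes a plain arithmetic mean is what you're after, it uses Python 3 syntax, and `mean_response_by_minute` is a helper name I made up for illustration. The sample `dic` is filled in by hand with a few rows from the question so the snippet runs on its own.

```python
from collections import defaultdict

# Hand-built sample shaped like the `dic` above:
# minute -> list of (PT, response time) string tuples.
dic = defaultdict(list)
dic[15] = [('57', '109.877193'), ('58', '733.741379'), ('94', '386.893617')]
dic[16] = [('110', '109.209091'), ('109', '743.660550'), ('181', '376.198895')]

def mean_response_by_minute(groups):
    """Return {minute: arithmetic mean of the response times in that minute}."""
    averages = {}
    for minute, rows in groups.items():
        # The regex captured strings, so convert before doing arithmetic.
        times = [float(rt) for _pt, rt in rows]
        averages[minute] = sum(times) / len(times)
    return averages

averages = mean_response_by_minute(dic)
for minute in sorted(averages):
    print(minute, round(averages[minute], 3))
```

If you need something other than the mean (a maximum, a percentile, a PT-weighted average), only the line that computes `averages[minute]` has to change.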

Edit

Example:

>>> data
'a06i0000003uNQOAA2,2013-09-26T19:15:55.873+0000,EditMode,57,109.877193,0\na06i0
000003uNQOAA2,2013-09-26T19:15:55.875+0000,Update,58,733.741379,0\na06i0000003uN
QOAA2,2013-09-26T19:15:55.875+0000,ViewObject,94,386.893617,0\na06i0000003uNQOAA
2,2013-09-26T19:16:25.889+0000,EditMode,110,109.209091,0\na06i0000003uNQOAA2,201
3-09-26T19:16:25.889+0000,Update,109,743.660550,0\na06i0000003uNQOAA2,2013-09-26
T19:16:25.890+0000,ViewObject,181,376.198895,0\na06i0000003uNQOAA2,2013-09-26T19
:16:55.904+0000,EditMode,162,109.080247,0\na06i0000003uNQOAA2,2013-09-26T19:16:5
5.904+0000,Update,161,738.683230,0\na06i0000003uNQOAA2,2013-09-26T19:16:55.904+0
000,ViewObject,266,372.627820,0\na06i0000003uNQOAA2,2013-09-26T19:17:25.918+0000
,EditMode,212,108.580189,0\na06i0000003uNQOAA2,2013-09-26T19:17:25.919+0000,Upda
te,213,735.244131,0\na06i0000003uNQOAA2,2013-09-26T19:17:25.919+0000,ViewObject,
350,362.394286,0\na06i0000003uNQOAA2,2013-09-26T19:17:55.933+0000,EditMode,263,1
07.954373,0\na06i0000003uNQOAA2,2013-09-26T19:17:55.933+0000,Update,264,732.5984
85,0\na06i0000003uNQOAA2,2013-09-26T19:17:55.934+0000,ViewObject,431,359.965197,
0\na06i0000003uNQOAA2,2013-09-26T19:18:25.947+0000,EditMode,314,107.815287,0\na0
6i0000003uNQOAA2,2013-09-26T19:18:25.948+0000,Update,315,733.292063,0\na06i00000
03uNQOAA2,2013-09-26T19:18:25.948+0000,ViewObject,516,360.098837,0\na06i0000003u
NQOAA2,2013-09-26T19:18:55.961+0000,EditMode,368,107.559783,0\na06i0000003uNQOAA
2,2013-09-26T19:18:55.961+0000,Update,366,731.808743,0\na06i0000003uNQOAA2,2013-
09-26T19:18:55.962+0000,ViewObject,600,359.780000,0'
>>> import re
>>> from collections import defaultdict
>>> lst = re.findall(r'.+?,.+?T\d+:(\d+):.+?,.+?,(\d+),(\d+\.\d+),', data)
>>> dic = defaultdict(list)
>>> for item in lst:
...     dic[int(item[0])].append(item[1:3])
...
>>> dic
defaultdict(<type 'list'>, {16: [('110', '109.209091'), ('109', '743.660550'), (
'181', '376.198895'), ('162', '109.080247'), ('161', '738.683230'), ('266', '372
.627820')], 17: [('212', '108.580189'), ('213', '735.244131'), ('350', '362.3942
86'), ('263', '107.954373'), ('264', '732.598485'), ('431', '359.965197')], 18:
[('314', '107.815287'), ('315', '733.292063'), ('516', '360.098837'), ('368', '1
07.559783'), ('366', '731.808743'), ('600', '359.780000')], 15: [('57', '109.877
193'), ('58', '733.741379'), ('94', '386.893617')]})
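
One caveat with keying on the minute alone: if the log spans more than an hour, 19:15 and 20:15 would land in the same bucket. A possible variation, assuming the timestamps always have the `YYYY-MM-DDTHH:MM:SS...` shape shown in the question, is to key on the timestamp truncated to the minute and let the csv module do the splitting. This is Python 3, `group_by_full_minute` is a name invented here, and the sample string stands in for the real file.

```python
import csv
import io
from collections import defaultdict

# A few rows from the question; for the real file, pass
# open('100ulog.csv', newline='') instead of the StringIO below.
sample = """datapoint,time,transaction,PT,Responsetime,errorcode
a06i0000003uNQOAA2,2013-09-26T19:15:55.873+0000,EditMode,57,109.877193,0
a06i0000003uNQOAA2,2013-09-26T19:15:55.875+0000,Update,58,733.741379,0
a06i0000003uNQOAA2,2013-09-26T19:16:25.889+0000,EditMode,110,109.209091,0
"""

def group_by_full_minute(lines):
    """Map 'YYYY-MM-DDTHH:MM' -> list of response times as floats."""
    groups = defaultdict(list)
    for row in csv.DictReader(lines):
        # The first 16 characters cover the date, hour and minute.
        minute_key = row['time'][:16]
        groups[minute_key].append(float(row['Responsetime']))
    return groups

groups = group_by_full_minute(io.StringIO(sample))
for key in sorted(groups):
    print(key, groups[key])
```

Because the keys sort lexicographically in time order, `sorted(groups)` walks the minutes chronologically.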