Python: return the data from the last 30 days

Time: 2016-11-07 03:14:31

Tags: python django python-2.7 python-3.x

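I have this sample log file, and I need to count the entries from the last month, the last three months, and the last year. Here are a few lines from the log file: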

10/14/2015 10:04:25 AM Following file:<open file 'dirs/tmp/bundle_21241.dat.json', mode 'r' at 0x8b73498> has invalid json which is ignored
11/15/2015 10:42:53 PM Following file:<open file 'dirs/tmp/bundle_21241.dat.json', mode 'r' at 0xa314498> has invalid json which is ignored
10/21/2015 10:16:42 AM Following hmac:94e301ff67773de56194165451535ba223cd27588221363290fbfcb96d9d0539  with is already in database so dropping
11/21/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 19:11:47+0000+ 12.61 0.430  1686.00
10/21/2015 10:16:42 AM Following hmac:c35330404902c0b1bb5c6d0718407ea12b25a464433bd1e69152ccc0e0b89c9f  with is already in database so dropping
10/17/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 19:30:21+0000+ 12.61 0.010  1686.00
10/11/2015 10:16:42 AM Following hmac:8df71a9f6b6f0a0adb48c052767045f37ec34fce9c002a1c0c5ebc38ed500bf8  with is already in database so dropping
10/15/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 19:45:40+0000+ 12.61 0.018  1686.00
12/21/2015 10:16:42 AM Following hmac:fda9f5756461a8bc2922c55e75a31cf4915e6b0d016ecb786666624a0f04a02f  with is already in database so dropping
12/10/2015 10:16:42 AM The data for the duplicate Hmac is : HF 13300100012015-06-15 20:01:01+0000+ 12.60 0.048  1686.00
07/21/2015 10:16:42 AM Following hmac:84d9cdb2145b7c3e0fa2d099070b7bd291c652f30ca25c69240e33ebbd2b8677  with is already in database so dropping

Here is my code:

from datetime import date
from datetime import time
from datetime import datetime
from datetime import timedelta
import os

def fileCount(fileName):

    with open(fileName) as FileObj:

        Count = 0
        today_date = date.today()
        One_Year = str(today_date -  timedelta(days=365))
        One_Month = str(today_date -  timedelta(days=30))
        Three_Months = str(today_date -  timedelta(days=90))

        while True:

            line = FileObj.readline()

            record_date = ('-'.join(line[:10].split('/'))).split(" ")

            if not line:

                break

            if "Following hmac" in line:

                try:
                    convert_date = datetime.strptime(record_date[0], '%m-%d-%Y')

                    #print "Difference is ", todayDate -  convert_date.date()

                    #print convert_date.date()

                    date_diff = str(today_date - convert_date.date())

                    #print dateDiff[:8]

                    if date_diff[:8] < One_Month:

                        Count += 1

                        #print "Last 30 Days Failed HMAC is ", Count

                    else:

                        continue

                #print convert_date.date()

                except ValueError:

                    print 'This line has a problem:', record_date


        print "The Total Number of Failed HMAC is ", Count      

# Call The function
def main():

    filePath = 'file.txt'

    fileCount(filePath)

if __name__ == "__main__":

    main()

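I am new to programming and don't know much about date arithmetic. My code currently produces an answer, but the values do not look correct. The goal is to go through every line and count the lines that fall within the last 30, 60, and 365 days. My code currently only contains the test for the last 30 days, and it returns the wrong value.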

1 answer:

Answer 0: (score: 1)

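You need to convert everything to datetime objects before you can compare the items. Handling the different ranges is also easier if you define them in a list and use Python's Counter() to tally each one. That also makes it easier to add more ranges later.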
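To see why the conversion matters, here is a minimal sketch of what the comparison in the question effectively does. The fixed value of `today` is an assumption chosen only to make the result reproducible; the log date is taken from the sample lines. The question compares `str(timedelta)` against `str(date)`, which is a character-by-character string comparison and says nothing about how old the entry is.

```python
from datetime import date, datetime, timedelta

# Assumed fixed "today" so the result is reproducible (not from the question).
today = date(2015, 12, 25)
log_date = datetime.strptime("07/21/2015", "%m/%d/%Y").date()

# What the question's code effectively compares: two strings.
date_diff = str(today - log_date)            # "157 days, 0:00:00"
one_month = str(today - timedelta(days=30))  # "2015-11-25"
string_says_recent = date_diff[:8] < one_month  # "157 days" < "2015-11-25"

# Comparing date objects instead gives the intended result.
really_recent = log_date >= today - timedelta(days=30)

print("{} {}".format(string_says_recent, really_recent))  # True False
```

The string test reports a 157-day-old entry as recent simply because the character "1" sorts before "2", while the date comparison correctly rejects it.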

from datetime import datetime, timedelta
from collections import Counter


def fileCount(fileName):
    log_entry_counts = Counter()
    today = datetime.today()

    date_ranges = [
        ('three months', today - timedelta(days=90)),
        ('month', today - timedelta(days=30)),
        ('year', today - timedelta(days=365))]

    with open(fileName) as f_input:
        for line in f_input:
            if "Following hmac" in line:
                log_date = datetime.strptime(line[:10], '%m/%d/%Y')

                for text, dr in date_ranges:
                    if log_date >= dr:
                        log_entry_counts[text] += 1

    total = 0

    for text, count in log_entry_counts.items():
        # print() with a single formatted string works on Python 2.7 and 3.x
        print("Failed HMAC in the last {}: {}".format(text, count))
        total += count

    print("Total failed HMAC: {}".format(total))

fileCount('input.txt')

This gives you output like:

Failed HMAC in the last three months: 1
Failed HMAC in the last month: 1
Failed HMAC in the last year: 2
Total failed HMAC: 4
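
Note that the ranges overlap: an entry from the last month also falls within the last three months and the last year, so adding the per-range counts together counts some lines more than once. A possible variant is sketched below, assuming you also want the number of distinct matching lines. The names `count_recent`, `RANGES`, and `marker`, the shortened hmac values in the sample, and the fixed reference date are all hypothetical and used for illustration only. It compares `date` objects rather than `datetime` objects, so the cutoff does not depend on the time of day the script runs, and it takes the reference date as a parameter so the result can be checked.

```python
from collections import Counter
from datetime import date, datetime, timedelta

# Window lengths in days, mirroring the ranges used in the answer.
RANGES = [("month", 30), ("three months", 90), ("year", 365)]


def count_recent(lines, today, marker="Following hmac"):
    """Return (per-window counts, number of distinct matching lines)."""
    counts = Counter()
    matched = 0
    for line in lines:
        if marker not in line:
            continue
        try:
            log_date = datetime.strptime(line[:10], "%m/%d/%Y").date()
        except ValueError:
            continue  # skip lines that do not start with MM/DD/YYYY
        matched += 1
        for label, days in RANGES:
            if log_date >= today - timedelta(days=days):
                counts[label] += 1
    return counts, matched


# Hypothetical sample lines in the same layout as the log file.
sample = [
    "12/21/2015 10:16:42 AM Following hmac:aaaa  with is already in database so dropping",
    "10/21/2015 10:16:42 AM Following hmac:bbbb  with is already in database so dropping",
    "07/21/2015 10:16:42 AM Following hmac:cccc  with is already in database so dropping",
    "11/21/2015 10:16:42 AM The data for the duplicate Hmac is : HF 12.61 0.430  1686.00",
]

counts, matched = count_recent(sample, date(2015, 12, 25))
for label, _ in RANGES:
    print("Failed HMAC in the last {}: {}".format(label, counts[label]))
print("Distinct failed HMAC lines: {}".format(matched))
```

With the assumed reference date of 2015-12-25, this reports 1 entry for the last month, 2 for the last three months, 3 for the last year, and 3 distinct matching lines. To read from a file, pass the open file object as `lines` and `date.today()` as `today`.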