Extract customer IDs and their total orders

Time: 2016-11-17 04:55:02

Tags: python shell scripting

I have a log file with lines like these:
"01-01-2012 01:13:36 sometext date customerid:1768 orders:3 apples" 
"01-09-2013 01:18:34 sometext date customerid:1567678 orders:4 oranges" 
"08-10-2000 08:08:28 sometext date customerid:156 orders:5 grapes" 

How can I write a Python program that reports each customer ID together with its total number of orders? Thanks for your help.

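Note: I can extract the customer IDs and the orders with Python built-ins (startswith and so on) and store them in separate lists, but I cannot produce a report that pairs each customer ID with its total orders.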

2 answers:

Answer 0: (score: 0)

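import re

# Capture the digits after "customerid:" and "orders:" in each log line.
rex = re.compile(r"sometext date customerid:(\d+) orders:(\d+)")
output = []
for data in logs:
    b = rex.search(data)
    if b:  # skip lines that do not match the expected layout
        output.append({"customer_id": b.group(1), "orders": b.group(2)})

print(output)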

logs is the data from the log file (open the file and call readlines() to read it).
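
The snippet above collects one dictionary per log line but does not add up the orders for each customer. A minimal sketch of that aggregation step follows. It assumes every relevant line contains "customerid:<digits> orders:<digits>" as in the sample. The function name total_orders and the fourth sample line are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the sample lines: "customerid:<digits> orders:<digits>".
PATTERN = re.compile(r"customerid:(\d+)\s+orders:(\d+)")


def total_orders(logs):
    """Return {customer_id: summed orders} for an iterable of log lines."""
    totals = defaultdict(int)
    for line in logs:
        match = PATTERN.search(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        customer_id, orders = match.groups()
        totals[customer_id] += int(orders)
    return dict(totals)


sample = [
    '"01-01-2012 01:13:36 sometext date customerid:1768 orders:3 apples"',
    '"01-09-2013 01:18:34 sometext date customerid:1567678 orders:4 oranges"',
    '"08-10-2000 08:08:28 sometext date customerid:156 orders:5 grapes"',
    # Hypothetical extra line, added so that one customer appears twice.
    '"02-01-2012 09:00:00 sometext date customerid:1768 orders:2 pears"',
]

report = total_orders(sample)
for customer_id, total in report.items():
    print(customer_id, total)
```

The customer IDs stay strings here because they are only used as dictionary keys. Convert them with int() if numeric sorting is needed.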

Answer 1: (score: 0)

data = {}
with open('log.txt', 'r') as f:
    for line in f:
        # Use the first purely numeric token in the line as the user id.
        # Raises IndexError when a line has no standalone numeric token.
        id_user = [int(s) for s in line.split() if s.isdigit()][0]
        if id_user not in data:
            data[id_user] = []
        data[id_user].append(line)

for id_user, lines in data.items():
    print(id_user, len(lines))

Edited after the OP's comment:

data = {}
with open('log.txt', 'r') as f:
    for line in f:
        # Find the "customerid:<id>" token and keep the part after the colon.
        customer_id = [s for s in line.split() if s.find('customerid') != -1][0].split(':')[1]
        if customer_id not in data:
            data[customer_id] = []
        data[customer_id].append(line)

for customer_id, lines in data.items():
    print(customer_id, len(lines))
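
The edited code counts how many log lines each customer has. If "total orders" instead means the sum of the orders: values, a variant along the following lines might work. It is only a sketch: the helper names parse_fields and orders_per_customer are hypothetical, the second sample line is invented, and it assumes the key:value token layout shown in the question.

```python
from collections import Counter


def parse_fields(line):
    """Collect the key:<digits> tokens of one log line into a dict."""
    fields = {}
    # Drop surrounding whitespace and the double quotes seen in the sample.
    for token in line.strip().strip('"').split():
        key, sep, value = token.partition(":")
        # Timestamps such as 01:13:36 are skipped because "13:36" is not numeric.
        if sep and value.isdigit():
            fields[key] = int(value)
    return fields


def orders_per_customer(lines):
    """Sum the orders value per customer id."""
    totals = Counter()
    for line in lines:
        fields = parse_fields(line)
        if "customerid" in fields and "orders" in fields:
            totals[fields["customerid"]] += fields["orders"]
    return totals


lines = [
    "01-01-2012 01:13:36 sometext date customerid:1768 orders:3 apples",
    # Hypothetical second entry for the same customer.
    "03-01-2012 10:00:00 sometext date customerid:1768 orders:4 pears",
    "08-10-2000 08:08:28 sometext date customerid:156 orders:5 grapes",
]

totals = orders_per_customer(lines)
# most_common() lists customers from the largest total to the smallest.
for customer_id, total in totals.most_common():
    print(customer_id, total)
```

To run this against the real file, pass the open file object instead of the list, for example orders_per_customer(open('log.txt')).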