Using Python

Time: 2019-06-27 11:25:19

Tags: python

Based on all your comments, I have revised the question as follows.

For example, say there are 3 log files in a folder: 20190626.txt, 20190625.txt, and 20190624.txt.

The log format is shown below; assume the content of all the .txt files is the same:
2019-06-26 server2 rcd[308]: Loaded 12 packages in 'ximian-red-carpet' 20190626
2019-06-26 server2 rcd[308]: id=304 COMPLETE Download 20190626
2019-06-26 server2 rcd[308]: Unable to downloaded licenses info 20190626
2019-06-26 server2 /USR/SBIN/CRON[6808]: (root) CMD ( /usr/lib/sa/sa1 ) 20190626
2019-06-26 server2 /USR/SBIN/CRON[6837]: (root) CMD ( /usr/lib/sa/sa1 ) 20190626

I want to replace "2019-06-26" in each file with the date taken from that file's name. The file names are in the form "20190625", "20190624", but the replacement should be written as "2019-06-25", "2019-06-24", and so on.

Then in 20190624.txt, the content will become:
2019-06-24 server2 rcd[308]: Loaded 12 packages in 'ximian-red-carpet' 20190624
2019-06-24 server2 rcd[308]: id=304 COMPLETE Download 20190624
2019-06-24 server2 rcd[308]: Unable to downloaded licenses info 20190624
2019-06-24 server2 /USR/SBIN/CRON[6808]: (root) CMD ( /usr/lib/sa/sa1 ) 20190624
2019-06-24 server2 /USR/SBIN/CRON[6837]: (root) CMD ( /usr/lib/sa/sa1 ) 20190624

The remaining logs in the files should be handled the same way.

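So far, the only logic I can think of is:

  1. For each file in the directory,
  2. open the file,
  3. save the file name (YYYYMMDD),
  4. change YYYYMMDD to YYYY-MM-DD,
  5. in all log files, replace every "2019-06-26" with "YYYY-MM-DD" and every "20190626" with "YYYYMMDD",
  6. save the file.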
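Steps 3 to 5 can be sketched on plain strings before any file is touched. This is a minimal sketch, assuming every file name has the form YYYYMMDD.txt; the helper names `dashed_date` and `redate` are hypothetical and do not come from the original post.

```python
import re


# Hypothetical helper: not part of the original post.
def dashed_date(filename):
    """Turn 'YYYYMMDD.txt' into 'YYYY-MM-DD' (steps 3 and 4)."""
    match = re.fullmatch(r"(\d{4})(\d{2})(\d{2})\.txt", filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return "-".join(match.groups())


# Hypothetical helper: the default old dates are the ones used in the question.
def redate(text, filename, old_dashed="2019-06-26", old_compact="20190626"):
    """Replace both spellings of the old date with the file's own date (step 5)."""
    new_dashed = dashed_date(filename)
    new_compact = new_dashed.replace("-", "")
    return text.replace(old_dashed, new_dashed).replace(old_compact, new_compact)


line = "2019-06-26 server2 rcd[308]: id=304 COMPLETE Download 20190626"
print(redate(line, "20190624.txt"))
# prints: 2019-06-24 server2 rcd[308]: id=304 COMPLETE Download 20190624
```

Keeping the text transformation separate from the file handling makes it easy to check on a single log line first.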

As for the code, I tried the following:

    #!/usr/bin/python
    import os

    folder = 'C:/Users/xxx'
    olddate = '2019-06-26'

    # Visit every file in the folder.
    for newfilename in os.listdir(folder):
        # os.listdir returns bare names, so join them with the folder.
        path = os.path.join(folder, newfilename)

        with open(path) as f:
            # This is wrong: newfilename is "2019062X.txt", but I want
            # newdate to be "20xx-xx-xx" according to the file name.
            newdate = newfilename
            rtext = f.read().replace(olddate, newdate)
        with open(path, "w") as f:
            f.write(rtext)

Thanks for your help!

1 Answer:

Answer 0: (score: 0)

You can use regular expressions from Python's re module.

To replace text using a regular expression, use the re.sub function:

    import re

    # Old file data
    oldfilename = "20480824.txt"
    oldtext = " \nblabla foo 2048-08-24 this \n\
    2048-08-24 foo bar \n\
    2048-08-24: Socket created..."

    # New file data
    newfilename = "20481023.txt"

    # Derive the dashed dates from the file names. The replacement is a raw
    # string so that \g<1> is not treated as a string escape sequence.
    olddate = re.sub(r'(\d{4})(\d{2})(\d{2})\.txt', r'\g<1>-\g<2>-\g<3>', oldfilename)
    newdate = re.sub(r'(\d{4})(\d{2})(\d{2})\.txt', r'\g<1>-\g<2>-\g<3>', newfilename)
    # re.escape makes the old date match literally.
    newtext = re.sub(re.escape(olddate), newdate, oldtext)

    print("---olddate : " + olddate)
    print("---newdate : " + newdate)
    print("---oldtext : " + oldtext)
    print("---newtext : " + newtext)

The output is:

    ---olddate : 2048-08-24
    ---newdate : 2048-10-23
    ---oldtext :
    blabla foo 2048-08-24 this
    2048-08-24 foo bar
    2048-08-24: Socket created...
    ---newtext :
    blabla foo 2048-10-23 this
    2048-10-23 foo bar
    2048-10-23: Socket created...
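To apply the same re.sub idea to the files in a folder, something along these lines might work. It is only a sketch under stated assumptions: every log file is named YYYYMMDD.txt, the old date is known in advance, and the function name `redate_folder` is hypothetical. The demonstration runs in a temporary directory so that no real logs are modified.

```python
import re
import tempfile
from pathlib import Path

# Assumed file name layout: YYYYMMDD.txt
NAME_PATTERN = re.compile(r"(\d{4})(\d{2})(\d{2})\.txt")


# Hypothetical helper: not part of the original answer.
def redate_folder(folder, olddate):
    """Rewrite each YYYYMMDD.txt in folder; return the names that were changed."""
    old_compact = olddate.replace("-", "")
    changed = []
    for path in sorted(Path(folder).iterdir()):
        match = NAME_PATTERN.fullmatch(path.name)
        if match is None:
            continue  # skip anything that is not a dated log file
        newdate = "-".join(match.groups())
        text = path.read_text()
        # Replace the dashed form first, then the compact form.
        newtext = re.sub(re.escape(olddate), newdate, text)
        newtext = re.sub(re.escape(old_compact), path.stem, newtext)
        if newtext != text:
            path.write_text(newtext)
            changed.append(path.name)
    return changed


# Demonstration on a throwaway directory with the sample from the question.
with tempfile.TemporaryDirectory() as tmp:
    sample = "2019-06-26 server2 rcd[308]: id=304 COMPLETE Download 20190626\n"
    for name in ("20190626.txt", "20190625.txt", "20190624.txt"):
        (Path(tmp) / name).write_text(sample)
    changed = redate_folder(tmp, "2019-06-26")
    result = (Path(tmp) / "20190624.txt").read_text()

print(changed)          # ['20190624.txt', '20190625.txt']
print(result, end="")   # 2019-06-24 server2 rcd[308]: id=304 COMPLETE Download 20190624
```

The file 20190626.txt is left untouched because its content already carries the date in its name. On real logs it would be prudent to keep a backup copy before overwriting files in place.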