Fixing broken lines in a txt file with Python

Time: 2015-11-01 20:30:56

Tags: python-3.x

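I'm new to programming. I'm trying to parse and reformat 'broken' lines in a txt file (the file contains rogue LFs instead of the Windows \r\n line endings). Using Python 3.4 and reading posts like these, I've managed to read the source file and create a file that contains only the broken lines, with all the LFs removed, so it is one long line. Now I need to read that line, count the delimiters, which have the format '<|>', add a newline after the 36th one, then count the next 36, add another newline, and so on. I've tried a few different things, but I'm not sure whether I need .tell() and then .seek() to insert the \n. Any suggestions on how to insert a newline after every 36th delimiter?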

my_count = 36 # define the number of delimiters to count
LineNumber = 1 # define line counter 
FileName = 'Broken_Registrations.txt'  # variable to define filename
target = open('Target.txt','w',encoding='utf-8') # open a file to write fixed lines
with open(FileName,encoding="utf8") as file:
    for line in file:                            # open file read
        cnt=line.count('<|>')                    # count delimiters
        if cnt == mycount:                       # count until mycount then
            target.write(line).append("\n")  # write line and append new line char
print('DONE!')  # let me know when you finished         
target.close() # close the file opened outside of the with

1 answer:

Answer 0 (score: 0)

OK, I managed it, and it turned out to be simple all along. There may be more efficient ways to do this, but this works for me:

#import pdb
#pdb.set_trace()
my_count = 36  # number of '<|>' delimiters expected on a complete line
LineNumber = 1  # line counter (not used below)
FileName = 'Broken_Registrations.txt'  # source file name
# open the source for reading and the target for writing the fixed lines;
# the with statement closes both files when the block ends
with open(FileName, encoding='utf-8') as file, \
        open('Target.txt', 'w', encoding='utf-8') as target:
    for line in file:  # iterate over the source line by line
        cnt = line.count('<|>')  # count the delimiters on this line
        if cnt == my_count:  # keep only lines with exactly my_count delimiters
            line = line.rstrip()  # strip trailing whitespace, including the newline
            target.write(line + "\n")  # write the line followed by a single newline
print('DONE!')  # report completion
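
For the approach described in the question (one long line with a newline inserted after every 36th '<|>'), here is a minimal sketch rather than a definitive fix. It works on the text in memory, so .tell() and .seek() are not needed. The function names break_after_every_nth and fix_file are made up for illustration, and it assumes the whole file fits comfortably in memory.

```python
def break_after_every_nth(text, delimiter='<|>', n=36):
    """Return text with a newline inserted right after every n-th delimiter.

    Hypothetical helper for illustration; not part of the original post.
    """
    pieces = []   # completed output lines
    count = 0     # delimiters seen so far
    start = 0     # index where the current output line begins
    pos = 0       # index where the next search begins
    while True:
        idx = text.find(delimiter, pos)
        if idx == -1:          # no more delimiters
            break
        pos = idx + len(delimiter)
        count += 1
        if count % n == 0:     # n-th delimiter reached: close the line here
            pieces.append(text[start:pos])
            start = pos
    if start < len(text):      # keep any leftover text as a final line
        pieces.append(text[start:])
    return '\n'.join(pieces)


def fix_file(source_name, target_name, delimiter='<|>', n=36):
    """Read the one-long-line file and write the re-broken lines (assumed workflow)."""
    with open(source_name, encoding='utf-8') as source:
        text = source.read().replace('\n', '')  # drop any stray LFs first
    with open(target_name, 'w', encoding='utf-8') as target:
        target.write(break_after_every_nth(text, delimiter, n) + '\n')
```

With the file names from the post, this would be called as fix_file('Broken_Registrations.txt', 'Target.txt'). Note that it only gives sensible records if every record really ends with its 36th delimiter, which is what the question implies but which I can't verify from here.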