Deleting lines from a text file - python

Time: 2014-10-28 02:21:10

Tags: python

I have a text file named test.txt that contains:

1.  DOC2454556
2.  PEO123PEO123PEO123PEO123
3.  PEO123PEO14PEO123P45O124
4.  PEO123PEO153PEO16PEO1563
5.  SIRFORHE
6.  DOCHELLO
7.  PEO123PEO123PEO123PEO123
8.  PEO123PEO123PEO123PEO123
9.  PEO123PEO123PEO123PEO123
10. SIRFORHE
11. DOC29993
12. PEO193PEO123PEO323PEO123
13. PEO623PEO14PEO142P45O124
14. PEO153PEO143PEO16PEO1563
15. SIRFORHE

This is my code:

f = open("C:/Users/JohnDoe/Desktop/test.txt", "r")
print(f.read())
f.close()

This gives me the output:

1.  DOC2454556
2.  PEO123PEO123PEO123PEO123
3.  PEO123PEO14PEO123P45O124
4.  PEO123PEO153PEO16PEO1563
5.  SIRFORHE
6.  DOCHELLO
7.  PEO123PEO123PEO123PEO123
8.  PEO123PEO123PEO123PEO123
9.  PEO123PEO123PEO123PEO123
10. SIRFORHE
11. DOC29993
12. PEO193PEO123PEO323PEO123
13. PEO623PEO14PEO142P45O124
14. PEO153PEO143PEO16PEO1563
15. SIRFORHE

I want to write a procedure that produces only the output below and deletes everything else:

6.  DOCHELLO
7.  PEO123PEO123PEO123PEO123
8.  PEO123PEO123PEO123PEO123
9.  PEO123PEO123PEO123PEO123
10. SIRFORHE

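It should keep only the lines from DOCHELLO through the first SIRFORHE that follows it, which means lines 6 to 10. Basically, it keeps a specific range of lines and deletes all the others.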

4 answers:

Answer 0: (score: 1)

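For a simple line-selection problem this answer is definitely overkill, but it illustrates a nice property of Python: a very general behavior or processing pattern can often be expressed in a way that goes far beyond the original use case. Rather than creating one-off tools, you can create very flexible, highly reusable meta-tools.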

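So without further ado, here is a generator that returns only the lines of a file bounded by two terminal strings, with a general preprocessing facility: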

import os

def bounded_lines(filepath, start=None, stop=None,
                  preprocess=lambda l: l.rstrip('\n')):
    """
    Generator that yields lines from a given file.
    If start is specified, emits no lines until the
    start line is seen. If stop is specified, stops
    emitting lines after the stop line is seen.
    (The start and stop lines are themselves emitted.)
    Each line can be pre-processed before comparison
    or yielding. By default, this just strips the
    trailing newline off. But you can specify
    arbitrary transformations, such as stripping
    spaces off the string, folding its case, or
    just whatever. Pass preprocess=None to compare
    and yield the raw lines.
    """
    if preprocess is None:
        preprocess = lambda x: x
    filepath = os.path.expanduser(filepath)
    with open(filepath) as f:
        # find start indicator, yield it
        # (skipped entirely when no start is given)
        if start is not None:
            for rawline in f:
                line = preprocess(rawline)
                if line == start:
                    yield line
                    break
        # yield lines until stop indicator found, yield
        # it and stop; a bare return ends the generator
        # (raising StopIteration here is an error since
        # Python 3.7)
        for rawline in f:
            line = preprocess(rawline)
            yield line
            if line == stop:
                return


for l in bounded_lines('test.txt', 'DOCHELLO', 'SIRFORHE'):
    print(l)
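
The generator above only prints the selected lines, while the question also asks for everything else to be removed from the file. Below is a minimal sketch of that last step, assuming the marker lines match exactly once the trailing newline is stripped. The names select_bounded and keep_range_in_place are hypothetical helpers introduced here for illustration; they are not part of the code above.

```python
import os
import tempfile


def select_bounded(lines, start, stop):
    """Yield lines from the first `start` line through the next `stop` line."""
    it = iter(lines)
    # Skip everything up to the start marker, then emit the marker itself.
    for line in it:
        if line.rstrip('\n') == start:
            yield line
            break
    # Emit lines up to and including the stop marker.
    for line in it:
        yield line
        if line.rstrip('\n') == stop:
            return


def keep_range_in_place(path, start, stop):
    """Rewrite `path` so that it contains only the selected lines."""
    with open(path) as src:
        kept = list(select_bounded(src, start, stop))
    # Write to a temporary file in the same directory and swap it in,
    # so a failure mid-write does not leave a truncated original.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    with os.fdopen(fd, 'w') as dst:
        dst.writelines(kept)
    os.replace(tmp_path, path)
    return kept
```

Calling keep_range_in_place('test.txt', 'DOCHELLO', 'SIRFORHE') would then overwrite test.txt with just the selected block. If the start marker never appears, the file ends up empty, so it may be worth checking the returned list before relying on the result.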

Answer 1: (score: 0)

The following code will work for your purpose:

# True while we are inside a DOCHELLO..SIRFORHE block
printLine = False
with open("test.txt") as f:
    for line in f:
        line = line.rstrip('\n')
        if 'DOCHELLO' in line:
            printLine = True
        elif 'SIRFORHE' in line:
            if printLine:
                # print the closing line, then stop printing
                print(line)
                printLine = False
        if printLine:
            print(line)

Answer 2: (score: 0)

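The code has not been tested.

Use a boolean to control line capture. When DOCHELLO appears, set the boolean to true, then add every line up to SIRFORHE to the list of lines we want.

When SIRFORHE appears, we add the list to a dictionary and then clear the list. This way you can tell in which order the blocks of lines were recorded.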

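# Open the file (the input file from the question) and read its lines
with open("C:/Users/JohnDoe/Desktop/test.txt", "r") as f:
    flines = f.readlines()

# Create some variables
start = False       # True while we are capturing lines
start_counter = 0   # number of DOCHELLO markers seen so far
data_seq = {}       # captured blocks, keyed by occurrence number
lines = []

# Record wanted lines
for line in flines:
    line = line.rstrip('\n')  # drop the newline so the comparisons can match
    if line == 'DOCHELLO':
        start_counter += 1
        start = True
    elif line == 'SIRFORHE' and start:
        lines.append(line)  # keep the closing SIRFORHE line as well
        data_seq[start_counter] = lines
        lines = []  # Reset lines
        start = False

    if start:
        lines.append(line)

# Output the recorded lines
for i, lines in data_seq.items():
    print()
    print('On the %d occurrence of capturing' % i)
    print(lines)
    print()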

Answer 3: (score: -1)

This is by no means the most efficient way, but it gets the job done.

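start = 6
end = 10
# input file from the question
f = open("C:/Users/JohnDoe/Desktop/test.txt", "r")
output = open("C:/Users/JohnDoe/Desktop/result.txt", "w")
lineCounter = 1  # line numbers are 1-based
for line in f:
    if lineCounter >= start:
        output.write(line)
        # compare the counter, not the line text, with the end number
        if lineCounter == end:
            break
    lineCounter += 1
f.close()
output.close()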
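
When the range is known by line number, as it is here, the counter loop can also be expressed with itertools.islice from the standard library. The following is a minimal sketch under that assumption; lines_in_range and copy_line_range are hypothetical names introduced for illustration, and line numbers are treated as 1-based and inclusive.

```python
from itertools import islice


def lines_in_range(lines, first, last):
    """Return an iterator over lines numbered first..last (1-based, inclusive)."""
    # islice uses 0-based, half-open indices, hence first - 1 and last.
    return islice(lines, first - 1, last)


def copy_line_range(src_path, dst_path, first, last):
    """Copy only lines first..last of src_path into dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(lines_in_range(src, first, last))
```

For the file in the question, copy_line_range("test.txt", "result.txt", 6, 10) would write lines 6 through 10 to result.txt. Unlike the marker-based answers, this only works if the position of the block in the file is fixed.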