Parsing specific data from a text file

Date: 2018-07-20 11:11:33

Tags: python python-3.x

I am parsing a text file that has this format:

    ...some lines before this...
    MY TEST MATRIX (ROWS)
     2X+00  2X+00  1X+00  
     2X+00  2X+00  1K+00  
     2X+00  2X+00  1X+00
    MY TEST END
     2Y+00  2Y+00  1E+00  
     2Y+00  2Z+00  1E+00  
     2Y+00  2F+00  1E+00
    STOP
    ---some lines after this

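I am trying to read the values between "MY TEST MATRIX" and "MY TEST END" into one array, and the values between "MY TEST END" and "STOP" into another array.

This is what I have written so far: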

    file_open = open("%s" % filename, "r")
    all_lines = file_open.readlines()
    for line in all_lines:
        line = line.strip()
        if line[0] != "MY TEST MATRIX (ROWS)":

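Unfortunately, this reads all of the lines. I was wondering whether anyone could share some ideas on how to read the numeric data between these markers into arrays. Any suggestions would be helpful.

1 answer:

Answer 0: (score: 0)

Use re.findall

For example: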

import re
s = """...some lines before this...
    MY TEST MATRIX (ROWS)
     2X+00  2X+00  1X+00  
     2X+00  2X+00  1K+00  
     2X+00  2X+00  1X+00
    MY TEST END
     2Y+00  2Y+00  1E+00  
     2Y+00  2Z+00  1E+00  
     2Y+00  2F+00  1E+00
    STOP
    ---some lines after this"""

firstValue = re.findall(r"(?<=MY TEST MATRIX).*?(?=MY TEST END)", s, flags=re.DOTALL)
print([i.strip() for i in firstValue])

secondValue = re.findall(r"(?<=MY TEST END).*?(?=STOP)", s, flags=re.DOTALL)
print([i.strip() for i in secondValue])

Output:

['(ROWS)\n     2X+00  2X+00  1X+00  \n     2X+00  2X+00  1K+00  \n     2X+00  2X+00  1X+00']
['2Y+00  2Y+00  1E+00  \n     2Y+00  2Z+00  1E+00  \n     2Y+00  2F+00  1E+00']
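
Each captured string is still one block of text, and the first one begins with the leftover "(ROWS)" from the header line. A minimal sketch of one way to turn the blocks into rows of values, assuming the header is always exactly "MY TEST MATRIX (ROWS)" and the values are separated by whitespace. The helper name block_to_rows is hypothetical and not part of the answer:

```python
import re

s = """...some lines before this...
    MY TEST MATRIX (ROWS)
     2X+00  2X+00  1X+00
     2X+00  2X+00  1K+00
     2X+00  2X+00  1X+00
    MY TEST END
     2Y+00  2Y+00  1E+00
     2Y+00  2Z+00  1E+00
     2Y+00  2F+00  1E+00
    STOP
    ---some lines after this"""


def block_to_rows(block):
    """Split a captured block into rows, each row a list of string tokens."""
    # Hypothetical helper: blank or whitespace-only lines are skipped.
    return [line.split() for line in block.splitlines() if line.strip()]


# Assumption: the header always ends with "(ROWS)", so it is folded into the
# lookbehind (Python lookbehinds must be fixed-width, which this one is).
first_match = re.search(r"(?<=MY TEST MATRIX \(ROWS\)).*?(?=MY TEST END)", s, flags=re.DOTALL)
second_match = re.search(r"(?<=MY TEST END).*?(?=STOP)", s, flags=re.DOTALL)

first_rows = block_to_rows(first_match.group()) if first_match else []
second_rows = block_to_rows(second_match.group()) if second_match else []

print(first_rows)
print(second_rows)
```

The tokens are kept as strings because sample values such as 2X+00 are not valid numbers. If the real file holds values like 2E+00, each token could be passed to float().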

Without re

Demo:

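firstValue = [[]]
secondValue = [[]]
checkFirst = False   # True while inside the MATRIX ... END block
checkSecond = False  # True while inside the END ... STOP block
# filename is the path to the input file, as in the question
with open(filename, "r") as infile:
    for line in infile:
        line = line.strip()
        if line.startswith("MY TEST MATRIX"):
            checkFirst = True
            continue  # skip the marker line itself
        if line.startswith("MY TEST END"):
            checkFirst = False
            checkSecond = True
            continue  # skip the marker line itself
        if line.startswith("STOP"):
            checkSecond = False
            continue

        if checkFirst:
            firstValue[-1].append(line)

        if checkSecond:
            secondValue[-1].append(line)

print(firstValue)
print(secondValue)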
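
A variant of the same idea wrapped in a function may be easier to reuse. This is only a sketch under a few assumptions: the marker lines start with the literal texts shown in the question, values are separated by whitespace, and nothing after STOP is needed. The function name read_blocks is hypothetical, and io.StringIO stands in for the real file so that the example runs on its own:

```python
import io


def read_blocks(lines):
    """Return (first, second): rows of tokens found between the markers."""
    first, second = [], []
    target = None  # list currently being filled; None while outside any block
    for raw in lines:
        line = raw.strip()
        if line.startswith("MY TEST MATRIX"):
            target = first       # rows after this marker go to the first list
        elif line.startswith("MY TEST END"):
            target = second      # rows after this marker go to the second list
        elif line.startswith("STOP"):
            break                # assumption: nothing after STOP is needed
        elif target is not None and line:
            target.append(line.split())
    return first, second


# Stand-in for open(filename): any iterable of lines works.
sample = io.StringIO(
    "...some lines before this...\n"
    "MY TEST MATRIX (ROWS)\n"
    " 2X+00  2X+00  1X+00\n"
    " 2X+00  2X+00  1K+00\n"
    " 2X+00  2X+00  1X+00\n"
    "MY TEST END\n"
    " 2Y+00  2Y+00  1E+00\n"
    " 2Y+00  2Z+00  1E+00\n"
    " 2Y+00  2F+00  1E+00\n"
    "STOP\n"
    "---some lines after this\n"
)

first, second = read_blocks(sample)
print(first)
print(second)
```

With a real file, the call would be read_blocks(infile) inside a with open(filename) block. A single target variable replaces the two boolean flags, so the two blocks cannot both be active at once.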