Python script to search a text file for a string and insert another text file before the match

Time: 2017-07-14 22:38:40

Tags: python text-parsing

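I'm new to Python and far from being able to write my own scripts. For my work with LilyPond, I need a script that parses a text file, searches for a string, and inserts another text file before the match. I have been searching for such a script, but I did not find one.

So I ended up combining snippets I found here and on other sites and came up with this script, which works: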

#!/usr/bin/env python

# usage:
# $ python thisfile.py text.txt searchstring insert.txt

import sys

f2 = open(sys.argv[3])
data = f2.read()
f2.close()

with open(sys.argv[1], "r+") as f1:
    a = [x.rstrip() for x in f1]
    index = 0
    for item in a:
        if item.startswith(sys.argv[2]):
            a.insert(index, data)
            break
        index += 1
    f1.seek(0)
    f1.truncate()
    for line in a:
        f1.write(line + "\n")
Although I got this working, I don't yet understand in detail what is happening in it, whether it is any good, and, if not, how to make it better.

1 Answer:

Answer 0 (score: 1)

I'll try to explain what each line does:

#!/usr/bin/env python

# usage:
# $ python thisfile.py text.txt searchstring insert.txt

# This line imports the sys module, used here to read command-line arguments
import sys

# This opens the file provided as the 3rd argument.
# By default, it will open the file just for reading.
f2 = open(sys.argv[3])
# This just reads the contents of the file
# and stores them in-memory as `data`
data = f2.read()
# This closes the file.
# Essentially telling the OS that you are done with the file.
# After this, trying to read the file again would raise an error.
f2.close()

# The `with ... as ...:` statement provides another way of opening a file.
# The file is closed automatically when the program leaves this block (even when an error occurs)
#
# the second parameter "r+" passed to `open`
# tells the OS that you'll need to read and write to this file.
with open(sys.argv[1], "r+") as f1:
    # This uses a list comprehension to convert `f1` into a list
    # where every line in f1 becomes an element of the list (`for x in f1` means `for line in f1`)
    # additionally, `rstrip()` is called on every line
    # this strips all trailing whitespace from each line (including the newline character)
    a = [x.rstrip() for x in f1]

    index = 0
    # this will iterate through every element of `a` (essentially every line of `f1`)
    for item in a:
        # if this line starts with the given pattern
        if item.startswith(sys.argv[2]):
            # insert the data read in the beginning into the current index of `a`
            # this pushes all remaining lines forward
            a.insert(index, data)
            # since the match was found, break out of the for loop
            break
        # at every iteration of the for loop, keep track of the current index
        index += 1

    # this returns to the beginning of the file
    # since `a = [x.rstrip() for x in f1]` moved you to the end of the file
    f1.seek(0)
    # truncate() cuts the file off at the current position.
    # since we just returned to the beginning, this empties the file
    f1.truncate()
    # for every line in `a` (which now includes the data read from `f2`)
    for line in a:
        # write it to `f1`, reintroducing the new line character "\n"
        f1.write(line + "\n")