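Search for a string in a txt file / else print that it is not there

Date: 2017-05-23 22:39:56

Tags: python for-loop if-statement search text

I've run into a problem. I'm trying to write a program that combs through a configuration file for "certain" search terms: if they match, it should print "it's there", and if not, it should print "it's not here". Here is what I have so far: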

import os
import sys
import fnmatch
import re

check = ["test1", "test2", "test3"]

for f in filter(os.path.isfile, sys.argv[1:]):  # each existing file passed as an argument
    for line in open(f).readlines():  # read line by line
        if re.match(check[0], line):  # pattern matches at the start of the line
            print(check[0], "is in place")
        elif re.search(check[0], line):  # pattern appears elsewhere in the line
            print(check[0], "is not in place")
    for line in open(f).readlines():
        if re.match(check[1], line):
            print(check[1], "is in place")
        elif re.search(check[1], line) is None:  # true for every line without the term
            print(check[1], "is not in place")

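So the problem is that if I print from an else branch, it prints for every line (all 1500 of them), because the loop runs line by line. Is there a way to search the whole document instead of searching line by line?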

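5 Answers:

Answer 0 (score: 1)

Yes, you can do this with read(). Note, however, that if your file is large, loading the whole file into memory at once may not be a good idea.

Also, you are looping over the same file several times. Try to avoid that by iterating over the file only once and checking every value in the check list against each line. In addition, try to avoid regular expressions when a plain substring test is enough, since they can be slow. Something like this would also work: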

for line in open(f).readlines():
    for check_value in check:
        if check_value in line:
            print("{} is in place.".format(check_value))
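
The loop above only reports the values that were found. To also report the missing ones without printing once per line, one option is to collect the hits during the single pass and print afterwards. This is a minimal sketch, not part of the original answer; the helper names find_present and describe are made up for illustration.

```python
def find_present(lines, check):
    """Return the subset of `check` values that occur in at least one line."""
    found = set()
    for line in lines:
        for value in check:
            if value in line:  # plain substring test, no regex needed
                found.add(value)
    return found


def describe(check, found):
    """Build one status message per value, in the order given by `check`."""
    return [
        "{} is {}".format(value, "in place" if value in found else "not in place")
        for value in check
    ]
```

Any iterable of strings works as lines, so an open file object can be passed directly, for example find_present(infile, check) inside a with open(f) as infile: block.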

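Answer 1 (score: 1)

Use the else clause of the for loop together with a break statement. Also note that you can simply iterate over the file itself; there is no need to read all the lines explicitly. (I also added with to make sure the file gets closed.)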

with open(f) as infile:
    for line in infile:
        if re.match(check[0], line):
            print(check[0], "is in place")
            break     # stop after finding one match
    else:             # we got to the end of the file without a match
        print(check[0], "is not in place")

You can even write it as one of those popular generator expressions:

with open(f) as infile:
    if any(re.match(check[0], line) for line in infile):
        print(check[0], "is in place")
    else:
        print(check[0], "is not in place")

Since the messages being printed are so similar, you can condense the code even further:

with open(f) as infile:
    print(check[0], "is" if any(re.match(check[0], line) for line in infile) else "is not", "in place")
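
One thing to watch when extending this to several terms: a file object is exhausted after the first any() call, so a second term would see no lines at all. A possible workaround, sketched here under the assumption that the file fits in memory as a list of lines, is to materialize the lines once. The function name terms_at_line_start is hypothetical.

```python
import re


def terms_at_line_start(lines, terms):
    """Map each term to True if some line starts with a match for it.

    Terms are treated as regular expressions, and re.match only matches
    at the beginning of a line, as in the snippets above.
    """
    lines = list(lines)  # keep the lines so every term can rescan them
    return {
        term: any(re.match(term, line) for line in lines)
        for term in terms
    }
```

Because re.match is anchored at the start of the line, an indented occurrence does not count; re.search would be the choice if the term may appear anywhere in the line.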

Answer 2 (score: 0)

To read the whole file, you can use read() instead of readlines():

with open(f) as fil:
    lines = fil.read()

If what you are looking for in the file is just a plain string, you don't need re:

if check[0] in lines:
    print(check[0], "is in place")
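
Note that read() holds the whole file in memory. For very large files, one alternative is to memory-map the file and search the mapping. The following is only a sketch of that idea, not something from the original answer: the helper name file_contains is made up, and it assumes the file and the search term are both UTF-8 encoded.

```python
import mmap


def file_contains(path, needle):
    """Return True if the UTF-8 encoded `needle` occurs anywhere in the file."""
    with open(path, "rb") as handle:
        try:
            with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
                return mapped.find(needle.encode("utf-8")) != -1
        except ValueError:
            # mmap refuses to map an empty file; an empty file contains nothing
            return False
```

For a 1500-line configuration file, the plain read() approach is simpler and perfectly adequate.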

Answer 3 (score: 0)

I guess you can read the file into a string and use a simple if x in ..., i.e.:

with open("text_contains.txt") as f:
    text =  f.read().lower() # remove .lower() for caseSensiTive matching
for x in ["test1", "test2", "test3"]:
    if x in text:
        print("{} is in place".format(x))
    else:
        print("{} is not in place".format(x))

Answer 4 (score: 0)

If you really need to read the file line by line (I assume you need the lines on which the term occurs), then:

import os
import sys
import fnmatch
import re

searchTerms = ["test1", "test2", "test3"]
occurrences = {}

# Initialise occurrences list for each term:

for term in searchTerms:
    occurrences[term] = []

# Read line by line and check if any of the terms is present in that specific
# line. If it is, save the occurrence.

for f in filter(os.path.isfile, sys.argv[1:]):
    for line in open(f).readlines():
        for term in searchTerms:
            if re.search(term, line):  # search anywhere in the line, not only at its start
                occurrences[term].append(line)

# For each term, print all the lines with occurrences, if any, or 'not found'
# otherwise:

for term in searchTerms:
    if len(occurrences[term]) > 0:
        print("'%s' found in lines: %s" % (term, ", ".join(occurrences[term])))
    else:
        print("'%s' not found" % term)
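
If the line numbers are more useful than the line contents, enumerate can record them during the same pass. This is a small sketch under that assumption; the function name find_line_numbers is hypothetical, and it uses a plain substring test rather than a regular expression.

```python
def find_line_numbers(lines, terms):
    """Map each term to the 1-based numbers of the lines that contain it."""
    hits = {term: [] for term in terms}
    for number, line in enumerate(lines, start=1):
        for term in terms:
            if term in line:
                hits[term].append(number)
    return hits
```

A term that maps to an empty list was not found anywhere, which gives the "not found" case without printing once per line.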

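However, if you only need to check whether the term is present, regardless of which line it is on, just use read to read the whole file at once: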

for f in filter(os.path.isfile, sys.argv[1:]):
    with open(f) as file:
        text = file.read()

        for term in searchTerms:
            if re.search(term, text):  # re.match would only test the very start of the file
                print("'%s' found" % term)
            else:
                print("'%s' not found" % term)