Searching a list of strings for multiple keywords

Time: 2015-08-06 18:32:14

Tags: python

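I have two Python lists: one is a list of keywords, the other a list of file names. I need to parse the list of file names based on the keywords I have. I want Python to match each file name against the keywords and then perform an operation depending on which keyword matched.

What I have looks like this: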

keywords = ["_CMD_","_COMM_","_RETRANSMIT_"]
file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt']

for f_name in file_list:
  for keyword in keywords:
    if keyword in f_name:
      pass  # perform operation based on which keyword matched
    else:
      pass  # print an error

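The problem I'm running into is that, because it iterates over the keywords, it prints an error for every keyword until it reaches one that is in the file name, and only then performs the operation. I only want the error printed if none of the keywords are found in the file name being searched.

I tried using any(), but it seems to stop checking files once it finds a match. For example, using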

for keyword in keywords:
  if any(keyword in f_name for f_name in file_list):
    print f_name
    print keyword

returns

2B_CMD_2015.txt
_CMD_
2B_CMD_2015.txt
_RETRANSMIT_

which is not right.

Edit: I also tried using regular expressions, but I'm not sure I did it correctly:

import re

for keyword in keywords:
  for item in wordlist:
    if re.search(keyword, item) is not None:
      print(keyword)
      print(item)
    else:
      print("nope")

returns:

nope
nope
nope
_CMD_
2B_CMD_2015.txt
_CMD_
2C_CMD_2015.txt
nope
nope
nope
_RETRANSMIT_
_RETRANSMIT_2015.txt
nope
nope
nope

Can anyone help me solve this? I feel like it shouldn't be this difficult.

5 answers:

Answer 0 (score: 3)

Consider using for-else instead of if-else:

for f_name in file_list:
  for keyword in keywords:
    if keyword in f_name:
      print("Found keyword %s in name %s" % (keyword, f_name))
      break
  else:
    print("Found no keyword")

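Note the indentation level: the else block pairs with the for, not the if. Also note that the if block must end with break, because the else block is skipped only when the loop is exited with break.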
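To make the control flow concrete, here is a minimal sketch that wraps the same for-else loop in a function. The name first_keyword is a hypothetical helper introduced for illustration; it is not part of the answer above.

```python
def first_keyword(f_name, keywords):
    """Return the first keyword found in f_name, or None if none match.

    first_keyword is a hypothetical helper used only for this sketch.
    """
    for keyword in keywords:
        if keyword in f_name:
            break          # match found: the else clause below is skipped
    else:
        return None        # loop finished without break: no keyword matched
    return keyword


keywords = ["_CMD_", "_COMM_", "_RETRANSMIT_"]
file_list = ["2B_CMD_2015.txt", "2C_CMD_2015.txt", "RETRANSMIT_2015.txt"]

results = [(f_name, first_keyword(f_name, keywords)) for f_name in file_list]
for f_name, keyword in results:
    if keyword is None:
        print("Found no keyword in name %s" % f_name)
    else:
        print("Found keyword %s in name %s" % (keyword, f_name))
```

With the lists from the question, the third file is reported as having no keyword, because "_RETRANSMIT_" has a leading underscore that "RETRANSMIT_2015.txt" lacks.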

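Answer 1 (score: 1)

for-else can help you here. The else clause is executed only if the inner for loop is not interrupted by break, and break is reached only when a match is found. Note that this means only the first match is considered; the loop does not look for further matches.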

keywords = ["_CMD_","_COMM_","_RETRANSMIT_"]
file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt']

for f_name in file_list:
  for keyword in keywords:
    if keyword in f_name:
      # perform operation based on which keyword matched
      break
  else:
    pass  # print an error
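If every matching keyword matters rather than just the first, one option is to collect the matches per file instead of breaking. The following is a sketch under that assumption; matching_keywords and the extra file name "A_CMD_COMM_2015.txt" are hypothetical and were added only to show a file with two matches.

```python
keywords = ["_CMD_", "_COMM_", "_RETRANSMIT_"]
# The last entry is a made-up file name that contains two keywords.
file_list = ["2B_CMD_2015.txt", "RETRANSMIT_2015.txt", "A_CMD_COMM_2015.txt"]


def matching_keywords(f_name, keywords):
    """Return every keyword that occurs in f_name, in keyword-list order."""
    return [keyword for keyword in keywords if keyword in f_name]


all_matches = {f_name: matching_keywords(f_name, keywords) for f_name in file_list}

for f_name in file_list:
    found = all_matches[f_name]
    if found:
        print("%s matches %s" % (f_name, ", ".join(found)))
    else:
        print("error: no keyword found in %s" % f_name)
```

An empty list plays the role of the else clause here: the error is reported once per file, only when nothing matched.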

Answer 2 (score: 1)

The basic way to do this is to set a flag:

for f_name in file_list:
    flag = False
    for keyword in keywords:
        if keyword in f_name:
            flag = True
            # perform operation based on which keyword matched
    if not flag:
        pass  # print an error
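When only the first match is needed, the flag can be replaced by the built-in next() with a default value. This is a sketch of that alternative, not the answer's own code; the outcomes list and its message strings are assumptions made for illustration.

```python
keywords = ["_CMD_", "_COMM_", "_RETRANSMIT_"]
file_list = ["2B_CMD_2015.txt", "2C_CMD_2015.txt", "RETRANSMIT_2015.txt"]

outcomes = []
for f_name in file_list:
    # next() yields the first keyword found in f_name,
    # or the default None when no keyword matches.
    keyword = next((kw for kw in keywords if kw in f_name), None)
    if keyword is None:
        outcomes.append("error: no keyword in " + f_name)
    else:
        outcomes.append(keyword + " matched " + f_name)

for line in outcomes:
    print(line)
```

Unlike the flag loop, which keeps going and acts on every matching keyword, this stops at the first match for each file.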

Answer 3 (score: 0)

Use any to filter the list, then work with the result:

keywords = ["_CMD_","_COMM_","_RETRANSMIT_"]
file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt']
filtered = [file_name for file_name in file_list if any(keyword in file_name for keyword in keywords)]
if filtered:
    # do stuff with 'filtered'
    print("processing files...")
else:
    print("error")

Example:

>>> keywords = ["_CMD_","_COMM_","_RETRANSMIT_"]
>>> file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt']
>>> filtered = [file_name for file_name in file_list if any(keyword in file_name for keyword in keywords)]
>>> filtered
['2B_CMD_2015.txt', '2C_CMD_2015.txt']
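The filter above reports an error only when no file at all matches. If an error is wanted for each file without a keyword, as the question asks, one possible extension is to build both lists. This is a sketch; has_keyword, matched and unmatched are names introduced here, not part of the answer.

```python
keywords = ["_CMD_", "_COMM_", "_RETRANSMIT_"]
file_list = ["2B_CMD_2015.txt", "2C_CMD_2015.txt", "RETRANSMIT_2015.txt"]


def has_keyword(file_name):
    # any() iterates over the keywords for ONE file name and
    # returns True as soon as one of them is found in it.
    return any(keyword in file_name for keyword in keywords)


matched = [file_name for file_name in file_list if has_keyword(file_name)]
unmatched = [file_name for file_name in file_list if not has_keyword(file_name)]

for file_name in unmatched:
    print("error: no keyword found in %s" % file_name)
```

Note the direction of the loops: any() runs over the keywords inside a loop over the files, which is the reverse of the attempt in the question.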

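Answer 4 (score: 0)

I would suggest making keywords a list of tuples that associates each keyword with a handler. You can use the for..else construct to handle files that match no keyword. Consider for example: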

def handleCmd(fn):
    print("handleCmd: " + fn)

def handleComm(fn):
    print("handleComm: " + fn)

def handleRetransmit(fn):
    print("handleRetransmit: " + fn)

keywords = [ ( "_CMD_", handleCmd ),
             ( "_COMM_", handleComm ),
             ( "RETRANSMIT_", handleRetransmit ),
           ]


file_list = ['2B_CMD_2015.txt','2C_CMD_2015.txt','RETRANSMIT_2015.txt','bogus.t>

for fn in file_list:
    for kw, handle in keywords:
        if kw in fn:
            handle(fn)
            break
    else:
        print("OH NOE")

prints

handleCmd: 2B_CMD_2015.txt
handleCmd: 2C_CMD_2015.txt
handleRetransmit: RETRANSMIT_2015.txt
OH NOE
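As a variation on the handler table, the keywords could be combined into one regular expression and the handler looked up in a dict. This is a sketch under stated assumptions: the handlers return strings instead of printing so the result can be checked, and handle_cmd, handle_comm, handle_retransmit, dispatch and the file name "unknown_2015.txt" are hypothetical names chosen for illustration.

```python
import re


def handle_cmd(fn):
    return "handle_cmd: " + fn


def handle_comm(fn):
    return "handle_comm: " + fn


def handle_retransmit(fn):
    return "handle_retransmit: " + fn


# Keyword -> handler. "RETRANSMIT_" has no leading underscore, as in the answer.
handlers = {
    "_CMD_": handle_cmd,
    "_COMM_": handle_comm,
    "RETRANSMIT_": handle_retransmit,
}

# re.escape keeps each keyword literal; "|" joins them into one alternation.
pattern = re.compile("|".join(re.escape(kw) for kw in handlers))


def dispatch(fn):
    """Call the handler for the leftmost keyword in fn, or return None."""
    match = pattern.search(fn)
    if match is None:
        return None
    return handlers[match.group(0)](fn)


for fn in ["2B_CMD_2015.txt", "RETRANSMIT_2015.txt", "unknown_2015.txt"]:
    print(dispatch(fn) or "no keyword in " + fn)
```

One difference from the for..else loop: the regex picks the keyword that appears leftmost in the file name, whereas the loop picks the first keyword in list order, so the two can disagree when a name contains several keywords.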