Check whether a string contains an item from a list

Asked: 2015-03-20 21:47:49

Tags: python

I have the following script to check whether a string contains an item from a list:

word = ['one',
        'two',
        'three']
string = 'my favorite number is two'
if any(word_item in string.split() for word_item in word):
    print 'string contains a word from the word list: %s' % (word_item)

This works, but I'm trying to print the list item that the string contains. What am I doing wrong?

4 answers:

Answer 0 (score: 3)

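The problem is that you're using an if statement instead of a for statement, so your print runs at most once (if at least one word matches). By that point any has already finished iterating, and the loop variable word_item is local to the generator expression, so it is not available to print.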

This is the simplest way to do what you want:

words = ['one',
         'two',
         'three']
string = 'my favorite number is two'
for word in words:
    if word in string.split():
        print('string contains a word from the word list: %s' % (word))

If you want this in a functional style for some reason, you could do it like this:

for word in filter(string.split().__contains__, words):
    print('string contains a word from the word list: %s' % (word))

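Since someone is bound to post a performance-related answer even though this question has nothing to do with performance: it would be more efficient to split the string once, and, depending on how many words you want to check, converting the result to a set may also help.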
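As a minimal sketch of that point, the string can be split once into a set and each word checked against it. The names tokens and matches are illustrative and not part of the question's code.

```python
words = ['one', 'two', 'three']
string = 'my favorite number is two'

# Split once, up front; a set gives average constant-time membership tests.
tokens = set(string.split())

# Collect the matches in the order they appear in the word list.
matches = [word for word in words if word in tokens]
for word in matches:
    print('string contains a word from the word list: %s' % word)
```

For a three-word list the difference is negligible; it only starts to matter when the word list or the text is large.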


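Regarding the question in the comments: if you want multi-word "words", there are two easy options: add spaces and then search for the phrase in the full string, or use a regular expression with word boundaries.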

The simplest way is to add a space character before and after the text being searched, and then search for ' ' + word + ' ':

phrases = ['one',
           'two',
           'two words']
text = "this has two words in it"

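# Pad the text so that phrases at the very start or end still match.
padded_text = " %s " % text
for phrase in phrases:
    if " %s " % phrase in padded_text:
        print("text '%s' contains phrase '%s'" % (text, phrase))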

For the regular expression, just use the \b word boundary:

import re

for phrase in phrases:
    if re.search(r"\b%s\b" % re.escape(phrase), text):
        print("text '%s' contains phrase '%s'" % (text, phrase))

Which one is "better" is hard to say, but the regular expression is likely to be significantly less efficient (if that matters to you).
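One practical difference between the two options is how they treat punctuation. The sketch below runs both on a made-up sentence (the text value and the names padded_hits and regex_hits are assumptions for illustration): space padding misses a phrase that is followed by a comma, while \b still matches it.

```python
import re

phrases = ['one', 'two', 'two words']
text = "it has two words, not three"

# Option 1: pad both the text and the phrase with spaces.
padded_text = " %s " % text
padded_hits = [p for p in phrases if " %s " % p in padded_text]

# Option 2: regular expression with \b word boundaries.
regex_hits = [p for p in phrases
              if re.search(r"\b%s\b" % re.escape(p), text)]

print(padded_hits)  # 'two words' is missed because of the trailing comma
print(regex_hits)   # the word boundary matches before the comma
```

If the text is known to be separated by whitespace only, the padding approach is enough; otherwise the regular expression is the safer of the two.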


If you don't care about word boundaries, you can just do this:

phrases = ['one',
           'two',
           'two words']
text = "the word 'tone' will be matched, but so will 'two words'"

for phrase in phrases:
    if phrase in text:
        print("text '%s' contains phrase '%s'" % (text, phrase))

Answer 1 (score: 1)

set(word).intersection(string.split())
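For reference, a small sketch of what this expression evaluates to with the question's data. The result is a set, so the order of multiple matches is not guaranteed; sorted() is used here only to get a stable printout. Note that the intersection method accepts any iterable, whereas the & operator needs a set on both sides. The names common and operator_failed are illustrative.

```python
word = ['one', 'two', 'three']
string = 'my favorite number is two'

# intersection() accepts any iterable, so the split list works directly.
common = set(word).intersection(string.split())
for word_item in sorted(common):
    print('string contains a word from the word list: %s' % word_item)

# The & operator, by contrast, rejects a plain list on the right-hand side.
try:
    set(word) & string.split()
    operator_failed = False
except TypeError:
    operator_failed = True
```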

Answer 2 (score: 1)

If you have a multi-word entry like 'ninety five', you can split it and check that all of its words intersect with the set of words in the string:

words = ['one',
         'two',
         'three', "fifty ninety"]
string = set('my favorite number is two fifty five'.split())

for word in words:
    spl = word.split()
    if len(spl) > 1:
        if all(string.intersection([w]) for w in spl):
            print(word)
    elif string.intersection([word]):
        print(word)

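This ignores word order and adjacency: a phrase such as 'ninety five' is reported whenever both of its words appear anywhere in the string, so you need to decide whether that is acceptable. For single words, though, intersection is efficient. Make sure to wrap the string in a list or tuple, otherwise "foo" will be treated as {"f", "o"}.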
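The wrapping pitfall is easy to see in isolation. In this sketch (the sample sentence and the names tokens, by_chars and by_word are made up for illustration), passing a bare string makes intersection compare individual characters instead of the whole word:

```python
tokens = set('my favorite foo is two'.split())

# A bare string is iterated character by character: 'f', 'o', 'o'.
by_chars = tokens.intersection("foo")

# Wrapping it in a list tests the whole word instead.
by_word = tokens.intersection(["foo"])

print(by_chars)  # empty, because no token is a single character
print(by_word)   # contains 'foo'
```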

You can also use set.issuperset instead of all:

for word in words:
    spl = word.split()
    if len(spl) > 1:
        if string.issuperset(spl):
            print(word)
    elif string.intersection([word]):
        print(word)

Answer 3 (score: 0)

You can use set intersection:

word = ['one', 'two', 'three']
string = 'my favorite number is two'
co_occuring_words = set(word) & set(string.split())
for word_item in co_occuring_words:
    print('string contains a word from the word list: %s' % (word_item))