Python, in a string

Time: 2015-10-12 12:01:21

Tags: python text filter

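I just want to ask how to find words from an array in a string. I need to build a filter that finds the words I have saved in my array within text that the user types into a text window.

I need to have more than 30 words in an array or list. The user then types text into a text box, and the script should find all of those words.

Something like a spam filter.

Thanks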

3 answers:

Answer 0: (score: 0)

import re

words = ['word1', 'word2', 'word4']
s = 'Word1 qwerty word2, word3 word44'
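# One alternation wrapped in word boundaries; re.escape keeps any regex
# metacharacters in the listed words literal.
r = re.compile(r'\b(?:%s)\b' % '|'.join(re.escape(w) for w in words), flags=re.I)
print(r.findall(s))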

>> ['Word1', 'word2']
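
With 30 or more words it may help to wrap this kind of pattern in a small helper and count the hits per word, which is closer to what a spam filter does. A minimal sketch, assuming case-insensitive whole-word matching is what is wanted and that the listed words are stored in lower case; `count_listed_words` is a hypothetical name, not part of the answer above:

```python
import re
from collections import Counter


def count_listed_words(text, words):
    """Count whole-word, case-insensitive occurrences of each listed word."""
    # re.escape keeps regex metacharacters in a word (e.g. '.' or '+') literal.
    pattern = re.compile(
        r'\b(?:%s)\b' % '|'.join(re.escape(w) for w in words),
        flags=re.I,
    )
    # Lower-case the hits so 'Word1' and 'word1' are counted together.
    return Counter(m.lower() for m in pattern.findall(text))


words = ['word1', 'word2', 'word4']
s = 'Word1 qwerty word2, word3 word44 WORD1'
counts = count_listed_words(s, words)
print(counts['word1'], counts['word2'], counts['word4'])
```

A `Counter` returns 0 for words that never appear, so the result can be checked against a threshold without testing for missing keys first.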

Answer 1: (score: 0)

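Try

words = ['word1', 'word2', 'word4']
s = 'word1 qwerty word2, word3 word44'
count = 0
for token in s.split():
    # Drop surrounding punctuation so that 'word2,' matches 'word2'.
    token = token.strip('.,;:!?')
    if token in words:
        print(token)
        count += 1

# The count must be converted to str before concatenation.
print("count is " + str(count))

Output

word1
word2

count is 2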
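
Splitting on spaces leaves punctuation attached to tokens (for example 'word2,'), and a membership test against a list scans the whole list for every token. A rough sketch of one way around both, assuming a "word" is a run of letters, digits and underscores; the names `listed`, `tokens` and `found` are only illustrative:

```python
import re

words = ['word1', 'word2', 'word4']
s = 'word1 qwerty word2, word3 word44'

listed = set(words)  # set membership is O(1) on average
# \w+ pulls out runs of word characters, so 'word2,' yields 'word2'.
tokens = re.findall(r'\w+', s)
found = [t for t in tokens if t in listed]

print(found)
print("count is " + str(len(found)))
```

This comparison is case-sensitive; lower-casing both the listed words and the tokens first would make it case-insensitive.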

Answer 2: (score: 0)

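Solution 1 uses a regular-expression approach, which returns every instance of each keyword found in the data. Solution 2 returns the indexes of every instance of each keyword found in the data.

import re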

dataString = '''Life morning don't were in multiply yielding multiply gathered from it. She'd of evening kind creature lesser years us every, without Abundantly fly land there there sixth creature it. All form every for a signs without very grass. Behold our bring can't one So itself fill bring together their rule from, let, given winged our. Creepeth Sixth earth saying also unto to his kind midst of. Living male without for fruitful earth open fruit for. Lesser beast replenish evening gathering.
Behold own, don't place, winged. After said without of divide female signs blessed subdue wherein all were meat shall that living his tree morning cattle divide cattle creeping rule morning. Light he which he sea from fill. Of shall shall. Creature blessed.
Our. Days under form stars so over shall which seed doesn't lesser rule waters. Saying whose. Seasons, place may brought over. All she'd thing male Stars their won't firmament above make earth to blessed set man shall two it abundantly in bring living green creepeth all air make stars under for let a great divided Void Wherein night light image fish one. Fowl, thing. Moved fruit i fill saw likeness seas Tree won't Don't moving days seed darkness.
'''

keyWords = ['Life', 'stars', 'seed', 'rule']

#---------------------- SOLUTION 1

print('Solution 1 output:')
for keyWord in keyWords:
    # findall is case-sensitive and matches substrings, not only whole words.
    print(re.findall(keyWord, dataString))

#---------------------- SOLUTION 2

print('\nSolution 2 output:')

for keyWord in keyWords:
    indexes = []
    indexFound = dataString.find(keyWord)

    # str.find returns -1 once there are no further occurrences.
    while indexFound != -1:
        indexes.append(indexFound)
        # Resume the search just past the previous hit.
        indexFound = dataString.find(keyWord, indexFound + 1)

    print(indexes)

Output:

Solution 1 output:
['Life']
['stars', 'stars']
['seed', 'seed']
['rule', 'rule', 'rule']

Solution 2 output:
[0]
[765, 1024]
[791, 1180]
[295, 663, 811]
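
Both solutions match substrings, so 'rule' would also be found inside a longer word such as 'ruler'. If whole words together with their positions are wanted, `re.finditer` can give both in one pass. A sketch under that assumption, using a short made-up sample string rather than the text above; `find_positions` is a hypothetical helper:

```python
import re


def find_positions(text, keywords):
    """Map each keyword to the start indexes of its whole-word matches."""
    positions = {}
    for keyword in keywords:
        # \b on both sides rejects matches inside longer words.
        pattern = re.compile(r'\b%s\b' % re.escape(keyword))
        positions[keyword] = [m.start() for m in pattern.finditer(text)]
    return positions


sample = 'rule the ruler, then rule again'
result = find_positions(sample, ['rule', 'seed'])
print(result)
```

Here 'rule' is reported at indexes 0 and 21 only; the occurrence inside 'ruler' is skipped, and 'seed' maps to an empty list.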