Finding a subsequence of one string within another string

Date: 2010-09-09 02:48:45

Tags: python

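I want to write a function that checks whether one string occurs inside another. However, the substring being checked may be interrupted in the main string by other letters.

For example: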

a = 'abcde'
b = 'ace'
c = 'acb'

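The function in question should report b as being in a, but not c.

I have already tried set(a).intersection(set(b)); my problem with that is that it also reports c as being in a.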
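A short sketch of why the set-based check cannot tell the two cases apart: sets discard order, so the test only asks whether the letters are present. The variable names `b_in_a` and `c_in_a` are made up for this illustration.

```python
a = 'abcde'
b = 'ace'
c = 'acb'

# A set has no ordering, so this only checks that every letter occurs in `a`.
b_in_a = set(a).intersection(set(b)) == set(b)
c_in_a = set(a).intersection(set(c)) == set(c)

print(b_in_a)  # True
print(c_in_a)  # True, even though 'acb' is out of order in 'abcde'
```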

3 answers:

Answer 0: (score: 11)

You can convert the expected sequence into a regular expression:

import re

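def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    # Escape each character so regex metacharacters in s1 match literally.
    pat = ".*".join(map(re.escape, s1))
    # re.DOTALL lets ".*" skip over newlines in s2 as well.
    if re.search(pat, s2, re.DOTALL):
        return True
    return False

# or, more compactly:
def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    return bool(re.search(".*".join(map(re.escape, s1)), s2, re.DOTALL))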

a = 'abcde' 
b = 'ace' 
c = 'acb'

assert sequence_in(b, a)
assert not sequence_in(c, a)

"ace" becomes the regular expression "a.*c.*e", which finds those three characters in order, possibly with other characters in between.
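To make the pattern construction concrete, here is a minimal sketch that exposes the generated pattern. The helper names `build_pattern` and `in_order` are hypothetical, and two details are assumptions beyond the explanation above: each character is passed through `re.escape` so that characters such as `.` or `*` in the subsequence are matched literally, and `re.DOTALL` is used so that `.*` can also span newlines.

```python
import re

def build_pattern(subseq):
    """Join the escaped characters of `subseq` with '.*' (hypothetical helper)."""
    return ".*".join(re.escape(ch) for ch in subseq)

def in_order(subseq, text):
    """Return True if the characters of `subseq` occur in `text` in order."""
    return re.search(build_pattern(subseq), text, re.DOTALL) is not None

print(build_pattern("ace"))        # a.*c.*e
print(in_order("ace", "abcde"))    # True
print(in_order("acb", "abcde"))    # False
print(in_order("a.e", "abcde"))    # False: the '.' must match a literal dot
```

Without the escaping step, the last call would report a match, because an unescaped `.` matches any character.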

Answer 1: (score: 5)

Something like this...

def issubstr(substr, mystr, start_index=0):
    try:
        for letter in substr:
            # Resume just past the previous match; index() raises ValueError if absent.
            start_index = mystr.index(letter, start_index) + 1
        return True
    except ValueError:
        return False

...or

def issubstr(substr, mystr, start_index=0):
    for letter in substr:
        # find() returns -1 when the letter is missing, so the sum becomes 0.
        start_index = mystr.find(letter, start_index) + 1
        if start_index == 0:
            return False
    return True

Answer 2: (score: 3)

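def issubstr(s1, s2):
    # Keep only the characters of s2 that occur in s1, then compare with s1.
    # Caveat: gives a false negative when s2 repeats a character of s1,
    # e.g. issubstr('ace', 'aabcde') returns False.
    return "".join(x for x in s2 if x in s1) == s1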

>>> issubstr('ace', 'abcde')
True

>>> issubstr('acb', 'abcde')
False
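For main strings that may contain repeated characters, one possible variant walks a single iterator over the main string instead of filtering it. This is a sketch rather than part of the answer above, and the name `is_subsequence` is hypothetical.

```python
def is_subsequence(sub, main):
    """Return True if the characters of `sub` appear in `main` in order."""
    it = iter(main)
    # `ch in it` consumes the iterator up to and including the first match,
    # so each search resumes where the previous one stopped.
    return all(ch in it for ch in sub)

print(is_subsequence('ace', 'abcde'))   # True
print(is_subsequence('acb', 'abcde'))   # False
print(is_subsequence('ace', 'aabcde'))  # True
```

Because the iterator is shared across all the membership tests, the main string is scanned at most once.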