Splitting a string in Python while keeping spaces inside some substrings

Time: 2017-02-12 17:23:56

Tags: python

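I have a string that I want to split into a list of certain types. For example, I want to split Starter Main Course Dessert into [Starter, Main Course, Dessert].

I can't use split(), because it would break the Main Course type into two pieces. How can I do the split? Do I need a regular expression?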

2 answers:

Answer 0 (score: 3)

If you have a list of acceptable words, you can build a regex alternation (union) from them:

import re

acceptable_words = ['Starter', 'Main Course', 'Dessert', 'Coffee', 'Aperitif']
pattern = re.compile("("+"|".join(acceptable_words)+")", re.IGNORECASE)
# "(Starter|Main Course|Dessert|Coffee|Aperitif)"

menu = "Starter Main Course NotInTheList dessert"
print(pattern.findall(menu))
# ['Starter', 'Main Course', 'dessert']
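
One caveat with the alternation approach: the regex engine tries alternatives from left to right, so if one acceptable entry is a prefix of another (say 'Main' and 'Main Course'), the shorter one can win. A minimal sketch of one way around this, assuming the entries are literal text rather than regex fragments, is to escape each entry and sort the longest first. The helper name build_pattern is hypothetical and not part of the original answer.

```python
import re

def build_pattern(phrases):
    # Hypothetical helper: longest entries first, so 'Main Course' is tried before 'Main'.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape treats each entry as literal text, not as a regex.
    return re.compile("|".join(re.escape(p) for p in ordered), re.IGNORECASE)

pattern = build_pattern(['Main', 'Main Course', 'Dessert'])
print(pattern.findall("Main Course Dessert"))
# ['Main Course', 'Dessert']
```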

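If you only want to specify which special substrings should be matched as a unit, and treat every other word as its own token, you can use: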

acceptable_words = ['Main Course', r'\w+']  # raw string so \w reaches the regex engine unchanged
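
Put together, the fallback variant might look like the sketch below. It assumes the special phrases are literal text and that every other token is a run of word characters; the names tokenize and special_phrases are illustrative only.

```python
import re

def tokenize(text, special_phrases):
    # Hypothetical helper: try the multi-word phrases first, then fall back to any single word.
    alternatives = [re.escape(p) for p in special_phrases] + [r"\w+"]
    pattern = re.compile("|".join(alternatives))
    # No capturing groups, so findall returns the full text of each match.
    return pattern.findall(text)

tokens = tokenize("Starter Main Course Dessert", ["Main Course"])
print(tokens)
# ['Starter', 'Main Course', 'Dessert']
```

Note that the phrase alternatives are not anchored on word boundaries here, so a word such as "Courses" would be split after "Course"; wrapping each phrase in \b is one possible refinement.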

Answer 1 (score: 0)

I think it is more practical to specify only the 'special' two-word tokens.

special_words = ['Main Course', 'Something Special']
sentence = 'Starter Main Course Dessert Something Special Date'

words = sentence.split(' ')
for i in range(len(words) - 1):
    try:
        idx = special_words.index(str(words[i]) + ' ' + words[i+1])
        words[i] = special_words[idx]
        words[i+1] = None
    except ValueError:
        pass

words = list(filter(lambda x: x is not None, words))
print(words)
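
The loop above only handles two-word tokens. If phrases of three or more words are needed, the same idea can be generalized by checking the longest candidate window first. This is a rough sketch under that assumption, not the answerer's code; merge_phrases is a hypothetical name.

```python
def merge_phrases(sentence, special_phrases):
    """Split on whitespace, keeping each listed multi-word phrase as one token."""
    phrases = {tuple(p.split()) for p in special_phrases}
    max_len = max((len(p) for p in phrases), default=1)
    words = sentence.split()
    result = []
    i = 0
    while i < len(words):
        # Try the longest window first so longer phrases win over shorter ones.
        for n in range(min(max_len, len(words) - i), 1, -1):
            if tuple(words[i:i + n]) in phrases:
                result.append(' '.join(words[i:i + n]))
                i += n
                break
        else:
            # No special phrase starts at this position; keep the single word.
            result.append(words[i])
            i += 1
    return result

print(merge_phrases('Starter Main Course Dessert Something Special Date',
                    ['Main Course', 'Something Special']))
# ['Starter', 'Main Course', 'Dessert', 'Something Special', 'Date']
```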