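Removing specific items from a list of strings

Time: 2015-07-24 19:45:35

Tags: python string list

I have a list of strings, and I want to remove specific characters from each string in it. Here is what I have so far: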

s = [ "Four score and seven years ago, our fathers brought forth on",
      "this continent a new nation, conceived in liberty and dedicated"]

result = []
for item in s:
    words = item.split()
    for item in words:
        result.append(item)

print(result,'\n')

for item in result:
    g = item.find(',.:;')
    item.replace(item[g],'')
print(result)

The output is:

['Four', 'score', 'and', 'seven', 'years', 'ago,', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation,', 'conceived', 'in', 'liberty', 'and', 'dedicated']

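In this case, I want the new list to contain all the words, but it should not contain any punctuation other than quotation marks and apostrophes.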

 ['Four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'and', 'dedicated']

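Even with the find function, the result appears to be the same. How can I correct this so that it prints without the punctuation? How can the code be improved?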

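4 Answers:

Answer 0 (score: 2)

You can use re.split with a regular expression that specifies what to split on, in this case any character that is not a letter or a digit.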

import re
result = []
for item in s:
    # Split the current string (item), not the whole list s
    words = re.split("[^A-Za-z0-9]", item)
    result.extend(x for x in words if x)  # keep only non-empty elements
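
Note that the pattern [^A-Za-z0-9] also splits on apostrophes and quotation marks, which the question wanted to keep. A minimal sketch of one possible variant, assuming re.findall with a character class that additionally allows ' and " (the function name extract_words is hypothetical, not part of the answer):

```python
import re

def extract_words(lines):
    """Return all words in lines, keeping letters, digits, ' and "."""
    words = []
    for line in lines:
        # findall returns every maximal run of allowed characters,
        # so there are no empty strings to filter out
        words.extend(re.findall("[A-Za-z0-9'\"]+", line))
    return words

s = ["Four score and seven years ago, our fathers brought forth on",
     "this continent a new nation, conceived in liberty and dedicated"]
print(extract_words(s))
```

Matching the allowed characters instead of splitting on the disallowed ones avoids the empty-string filter, at the cost of having to list every character that counts as part of a word.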

Answer 1 (score: 2)

After splitting the strings, you can strip off any characters you want to remove:

result = []
for item in s:
    words = item.split()
    for word in words:  # separate name so the outer loop variable is not shadowed
        result.append(word.strip(",."))  # note the addition of .strip(...)

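You can put any characters you want removed into the string argument of .strip(), all in one string; they are removed from the beginning and end of each word. The example above strips commas and periods.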
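
Keep in mind that str.strip only removes characters from the two ends of a string, never from the middle. A short sketch of that behavior, assuming the standard string.punctuation constant with the quote characters taken out (the names STRIP_CHARS and strip_words are made up for illustration):

```python
import string

# All ASCII punctuation except the apostrophe and the double quote,
# which the question wants to keep
STRIP_CHARS = "".join(c for c in string.punctuation if c not in "'\"")

def strip_words(lines):
    """Split each line on whitespace and strip punctuation from word ends."""
    return [word.strip(STRIP_CHARS) for line in lines for word in line.split()]

print(strip_words(["ago, our fathers;"]))   # ['ago', 'our', 'fathers']
print("a,b,".strip(","))                    # a,b  (the inner comma stays)
```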

Answer 2 (score: 1)

s = [ "Four score and seven years ago, our fathers brought forth on", "this continent a new nation, conceived in liberty and dedicated"]

# Remove the characters and split into words
# (Python 3: str.translate takes a table built with str.maketrans)
result = [x.translate(str.maketrans('', '', ',.:;')).split() for x in s]

# Make a list of words instead of a list of lists of words (see http://stackoverflow.com/a/716761/1477364)
result = [inner for outer in result for inner in outer] 

print(result)

Output:

['Four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this', 'continent', 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'and', 'dedicated']
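
Unlike stripping, str.translate removes the listed characters wherever they occur, not only at the ends of a word. A hedged sketch for Python 3 that builds the table from string.punctuation while keeping quotes and apostrophes; the helper name remove_punctuation and its keep parameter are assumptions, not part of the answer:

```python
import string

def remove_punctuation(lines, keep="'\""):
    """Delete ASCII punctuation (except chars in keep) and split into words."""
    to_delete = "".join(c for c in string.punctuation if c not in keep)
    # The third argument of str.maketrans lists characters to delete
    table = str.maketrans("", "", to_delete)
    return [word for line in lines for word in line.translate(table).split()]

s = ["Four score and seven years ago, our fathers brought forth on",
     "this continent a new nation, conceived in liberty and dedicated"]
print(remove_punctuation(s))
```

Because the deletion happens before the split, a word such as a.b would become ab here, which may or may not be what you want.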

Answer 3 (score: 1)

Alternatively, you can add a loop inside your existing

for item in result:
    g = item.find(',.:;')
    item.replace(item[g],'')

and handle each of ,.:; separately. Just add a list of punctuation marks, such as

punc = [',','.',':',';']

Then iterate over it inside for item in result:, like

for p in punc:
    item = item.replace(p, '')  # replace() returns a new string, so rebind item

So the complete loop is

punc = [',', '.', ':', ';']
for i, item in enumerate(result):
    for p in punc:
        item = item.replace(p, '')  # replace() returns a new string
    result[i] = item  # write the cleaned word back into the list

I tested this and it does work.
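
Two details of str decide whether a loop like this has any effect: str.find looks for its argument as one whole substring and returns -1 when it is absent, and str.replace returns a new string rather than changing the original. A small sketch of both, using only built-in methods (the variable names are illustrative):

```python
word = "ago,"

# find() searches for the whole substring ',.:;', which is not present
print(word.find(",.:;"))      # -1
print(word.find(","))         # 3

# Strings are immutable: replace() returns a new string
word.replace(",", "")
print(word)                   # ago,  (unchanged)

cleaned = word.replace(",", "")
print(cleaned)                # ago

# To update a list, write the new strings back, e.g. with a comprehension
result = ["ago,", "nation,", "our"]
result = [w.replace(",", "") for w in result]
print(result)                 # ['ago', 'nation', 'our']
```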