Remove the first occurrence of a word in a string

Time: 2021-07-21 19:48:28

Tags: python duplicates remove

test = 'User Key Account Department Account Start Date'

I want to remove duplicate words from a string. The solution from this question works well...

def unique_list(l):
    ulist = []
    for x in l:
        if x not in ulist:
            ulist.append(x)
    return ulist

test = ' '.join(unique_list(test.split()))

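But it keeps the first occurrence of each word and drops the later duplicates. I want to remove the first occurrence instead, so that the test string reads 'User Key Department Account Start Date'.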
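To make the difference concrete, here is a small check of what the snippet above produces versus the result being asked for. The names `current` and `wanted` are only illustrative and do not come from the original post.

```python
def unique_list(l):
    ulist = []
    for x in l:
        if x not in ulist:
            ulist.append(x)
    return ulist

test = 'User Key Account Department Account Start Date'

# what the existing approach returns: the FIRST "Account" survives
current = ' '.join(unique_list(test.split()))
# what is wanted: the LAST "Account" survives
wanted = 'User Key Department Account Start Date'

print(current)  # User Key Account Department Start Date
print(wanted)   # User Key Department Account Start Date
```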

6 answers:

Answer 0 (score: 3)

This should work:

test = 'User Key Account Department Account Start Date'

words = test.split()

# keep a word only if it does not appear again later in the list
test = ' '.join(word for i, word in enumerate(words) if word not in words[i + 1:])

print(test)  # User Key Department Account Start Date
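The slice `words[i+1:]` is rescanned for every word, so this approach is quadratic in the number of words. For long inputs, one possible alternative is to count the occurrences up front and keep a word only when no later copy remains. This is a sketch of a different technique (`collections.Counter`), not the answer's own method, and the function name `keep_last_occurrences` is made up for illustration.

```python
from collections import Counter

def keep_last_occurrences(text):
    words = text.split()
    remaining = Counter(words)  # how many copies of each word are still ahead
    kept = []
    for word in words:
        remaining[word] -= 1
        if remaining[word] == 0:  # no later copy exists, so this is the last one
            kept.append(word)
    return ' '.join(kept)

print(keep_last_occurrences('User Key Account Department Account Start Date'))
# User Key Department Account Start Date
```

Each word is visited once, so the running time grows linearly with the length of the input.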

Answer 1 (score: 1)

If you only want to keep the last occurrence of each word, just start from the end and work your way toward the front.

tokens = test.split()
final = []

# walk the words from last to first, so the last occurrence is seen first
for word in tokens[::-1]:
    if word not in final:
        final.append(word)

print(" ".join(final[::-1]))
# User Key Department Account Start Date
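The same back-to-front idea can be written more compactly with `dict.fromkeys`, which keeps the first time each key is seen and preserves insertion order (guaranteed since Python 3.7). This is a minimal sketch under that assumption, and `keep_last_with_dict` is a hypothetical name.

```python
def keep_last_with_dict(text):
    words = text.split()
    # de-duplicate the reversed list: the first hit in reverse is the last occurrence
    deduped_reversed = list(dict.fromkeys(reversed(words)))
    # restore the original reading order
    return ' '.join(reversed(deduped_reversed))

print(keep_last_with_dict('User Key Account Department Account Start Date'))
# User Key Department Account Start Date
```

Because dictionary lookups are constant time on average, this avoids the repeated `word in final` list scans.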

Answer 2 (score: 1)

Here is one way:

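words = test.split()
# words that occur more than once (count whole words in the list, not substrings of the string)
repeated = {w for w in words if words.count(w) > 1}

# list.remove drops only the first occurrence of each repeated word
for w in repeated:
    words.remove(w)

res = ' '.join(words)

print(res)  # User Key Department Account Start Date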
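Note that removing the first occurrence and keeping only the last occurrence give the same result only when a word appears at most twice. The sketch below contrasts the two behaviours on a word that appears three times. Both function names are illustrative, and the first one counts words in the list rather than in the raw string.

```python
def remove_first_occurrence(text):
    words = text.split()
    repeated = {w for w in words if words.count(w) > 1}
    for w in repeated:
        words.remove(w)  # removes only the first copy
    return ' '.join(words)

def keep_last_occurrence(text):
    words = text.split()
    # drop every copy that is followed by another copy later on
    return ' '.join(w for i, w in enumerate(words) if w not in words[i + 1:])

print(remove_first_occurrence('a b a c a'))  # b a c a
print(keep_last_occurrence('a b a c a'))     # b c a
```

Which one is appropriate depends on whether repeated words beyond the first should survive.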

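Answer 3 (score: 1)

You can split the source string into a list, reverse the list before applying the unique_list function, and then reverse it again before joining it back into a string.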

def unique_list(l):
    ulist = []
    for x in l:
        if x not in ulist:
            ulist.append(x)
    return ulist


orig = "User Key Account Department Account Start Date"
orig_list = orig.split()
orig_list.reverse()

uniq_rev = unique_list(orig_list)
uniq_rev.reverse()

print(orig)
print(' '.join(uniq_rev))

Example:

$ python rev.py 
User Key Account Department Account Start Date
User Key Department Account Start Date

Answer 4 (score: 0)

If you like it functional:

from functools import reduce
from collections import Counter

import re


if __name__ == '__main__':
    sentence = 'User Key Account Department Account Start Date'

    result = reduce(
        # escape the word and match it only as a whole whitespace-delimited token
        lambda sentence, word: re.sub(rf'(?<!\S){re.escape(word)}(?!\S)\s*', '', sentence, count=1),
        map(
            lambda item: item[0],
            filter(
                lambda item: item[1] > 1,
                Counter(sentence.split()).items()
            )
        ),
        sentence
    )

    print(result)
    # User Key Department Account Start Date

Answer 5 (score: -1)

Put all the elements into a set.

Tokenize your sentence into strings and insert them into the set.

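#include <set>
#include <string>

int main() {
    // note: std::set keeps its elements sorted, not in insertion order
    std::set<std::string> s;

    s.insert("aa");
    s.insert("bb");
    s.insert("cc");
    s.insert("cc");  // duplicate: ignored, the set still holds a single "cc"
    s.insert("dd");
}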