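Removing duplicates from a list of strings

Time: 2019-01-03 20:03:40

Tags: python

OK, so I have a list like this, and I need to remove the duplicated values so that I end up with: Joe Blow, Don Williams, Clark Gordon ... I'm trying this code, which doesn't seem to work. I also tried converting the list to a set, but that didn't work either.

Any ideas? Thanks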

7 answers:

Answer 0 (score: 5)

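Split each string into a list of words, convert that list to a set, then rejoin the words with ' '. Because a set does not keep order, restore it by sorting on each word's first index in the split list (sorting on the raw string's index can misorder a word that also appears inside a longer word).

for s in dupes:
    words = s.split()
    print(' '.join(sorted(set(words), key=words.index)))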

Output:

Joe Blow
Don Williams
Clark Gordon
Albert Riddle

Edit: if you want to modify the list in place,

def remove_duplicates(dupes):
    for i, s in enumerate(dupes):
        words = s.split()
        dupes[i] = ' '.join(sorted(set(words), key=words.index))
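
As an alternative sketch that skips the sort entirely: assuming Python 3.7 or later, where plain dicts keep insertion order, dict.fromkeys retains only the first occurrence of each word. The helper name dedupe_words is made up for this illustration.

```python
def dedupe_words(s):
    # dict.fromkeys keeps the first occurrence of each key, and dicts
    # preserve insertion order in Python 3.7+, so no sort is needed.
    return ' '.join(dict.fromkeys(s.split()))

dupes = ["Joe Joe Joe Blow", "Don Don Williams", "Clark Clark Gordon", "Albert Riddle"]
result = [dedupe_words(s) for s in dupes]
print(result)
```

This also handles repeats that are not next to each other, such as "Joe Blow Joe".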

Answer 1 (score: 1)

The long but stable way:

dupes = ["Joe Joe Joe Blow","Don Don Williams", "Clark Clark Gordon", "Albert Riddle"]

rv = [[]]
for d in dupes:
    seen = set()
    for e in d.split():         # split each string into its name, add the name to the 
        if e not in seen:       # last list in rv and to the set 'seen' that remembers
            rv[-1].append(e)    # the seen ones.
            seen.add(e)
    rv[-1] = ' '.join(rv[-1])   # done with one name, replace the list with joined values
    rv.append([])               # and append an empty, new list for the next name

dupes = [k for k in rv if k]    # remove the empty list at the end and overwrite dupes

print(dupes)

Output:

['Joe Blow', 'Don Williams', 'Clark Gordon', 'Albert Riddle']
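
The same seen-set idea can be packaged as a small reusable generator. This is only a sketch; unique_in_order is a hypothetical helper name, and it assumes the items are hashable.

```python
def unique_in_order(items):
    """Yield each item the first time it appears, skipping later repeats."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item

dupes = ["Joe Joe Joe Blow", "Don Don Williams", "Clark Clark Gordon", "Albert Riddle"]
cleaned = [' '.join(unique_in_order(d.split())) for d in dupes]
print(cleaned)
```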

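Answer 2 (score: 1)

You can use re.sub to replace consecutive repetitions of a word with a single occurrence of that word:

import re
def remove_duplicates(string):
    # \1\b requires each repeat to be a whole word, so "Joe Joey" is left alone
    return re.sub(r'\b(\w+)\b(?:\s+\1\b)+', r'\1', string)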

So that:

[remove_duplicates(dupe) for dupe in dupes]

returns:

['Joe Blow', 'Don Williams', 'Clark Gordon', 'Albert Riddle']
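
Note that a backreference pattern of this kind only collapses repeats that sit next to each other. The sketch below illustrates that behaviour; it uses a pattern with a word boundary after the backreference so that a longer word starting with the same letters is not treated as a repeat. The name collapse_adjacent is made up for the example.

```python
import re

# A repeat must be a whole word: the \b after \1 stops "Joe Joey"
# from being treated as "Joe" followed by a repeat of "Joe".
PATTERN = re.compile(r'\b(\w+)\b(?:\s+\1\b)+')

def collapse_adjacent(text):
    return PATTERN.sub(r'\1', text)

print(collapse_adjacent("Joe Joe Joe Blow"))  # adjacent repeats are collapsed
print(collapse_adjacent("Joe Blow Joe"))      # non-adjacent repeat is kept
print(collapse_adjacent("Joe Joey"))          # different words are kept
```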

Answer 3 (score: 1)

You can use itertools.groupby, which groups consecutive equal words:

from itertools import groupby
def remove_duplicates(string):
    return ' '.join(k for k, _ in groupby(string.split()))

So that:

[remove_duplicates(dupe) for dupe in dupes]

returns:

['Joe Blow', 'Don Williams', 'Clark Gordon', 'Albert Riddle']
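
groupby also accepts a key function, so the grouping can be made case-insensitive if that is wanted. The following is a sketch under that assumption; collapse_adjacent_ignore_case is a hypothetical name, and like the version above it only merges neighbouring words.

```python
from itertools import groupby

def collapse_adjacent_ignore_case(text):
    # Group neighbouring words that compare equal after lowercasing and
    # keep the first spelling from each group.
    return ' '.join(next(group) for _, group in groupby(text.split(), key=str.lower))

print(collapse_adjacent_ignore_case("joe Joe JOE Blow"))
print(collapse_adjacent_ignore_case("Joe Blow Joe"))
```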

Answer 4 (score: 0)

When order matters, collections.OrderedDict comes in handy.
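
A minimal sketch of how OrderedDict could be applied here, assuming the same dupes list used in the other answers. It is an illustration of the idea and not necessarily the answerer's exact code.

```python
from collections import OrderedDict

dupes = ["Joe Joe Joe Blow", "Don Don Williams", "Clark Clark Gordon", "Albert Riddle"]

# OrderedDict.fromkeys keeps one key per distinct word, in first-seen order.
result = [' '.join(OrderedDict.fromkeys(d.split())) for d in dupes]
print(result)
```

Unlike a plain dict, OrderedDict guarantees this ordering on Python versions older than 3.7 as well.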


Answer 5 (score: 0)

There are already plenty of good answers; you could also try Counter (its keys keep first-seen order on Python 3.7+):

from collections import Counter

counters = [Counter(d.split()) for d in dupes]
final = [' '.join(c.keys()) for c in counters]

# ['Joe Blow', 'Don Williams', 'Clark Gordon', 'Albert Riddle']
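
One thing Counter offers beyond deduplication is the count of each word, which can be used to report what was repeated. A small sketch, assuming Python 3.7+ so that the keys come back in first-seen order; the variable name report is made up for the example.

```python
from collections import Counter

dupes = ["Joe Joe Joe Blow", "Don Don Williams", "Clark Clark Gordon", "Albert Riddle"]

report = []
for d in dupes:
    counts = Counter(d.split())
    # Words that occurred more than once, with their counts.
    repeated = {word: n for word, n in counts.items() if n > 1}
    report.append((' '.join(counts), repeated))

for cleaned, repeated in report:
    print(cleaned, repeated)
```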

Answer 6 (score: -2)

Please use a set:

  list(set(l)) 
  # where l is your str
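
For context, here is a short sketch of what set actually does at each level, assuming the dupes list from the other answers. It shows why applying set to the whole list, or to a raw string, does not produce the output the question asks for.

```python
dupes = ["Joe Joe Joe Blow", "Don Don Williams", "Clark Clark Gordon", "Albert Riddle"]

# The four strings are all different from each other, so a set of the
# list removes nothing.
print(len(set(dupes)))

# A set built from a single string contains its characters, not its words.
chars = set("Joe Joe")
print(sorted(chars))

# Only a set of the split words removes the repeated names, and even then
# the original word order is not guaranteed.
words = set("Joe Joe Joe Blow".split())
print(words == {"Joe", "Blow"})
```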