In a Python list, I want to remove every element that occurs fewer than 'k' times. For example, if k == 3 and our list is:
l = [a,b,c,c,c,a,d,e,e,d,d]
then the output must be:
[c,c,c,d,d,d]
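What is a fast way to do this (my data is large)? Are there any good Pythonic suggestions?
Here is the code I wrote, but I don't think it is the fastest or most Pythonic way: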
from collections import Counter
l = ['a', 'b', 'c', 'c', 'c', 'a', 'd', 'e', 'e', 'd', 'd']
counted = Counter(l)
temp = []
for i in counted:
    if counted[i] < 3:
        temp.append(i)
new_l = []
for i in l:
    if i not in temp:
        new_l.append(i)
print(new_l)
Answer 0 (score: 5)
You can use collections.Counter to construct a dictionary mapping values to counts. Then use a list comprehension to keep only the elements whose count is at least the specified value.
from collections import Counter
L = list('abcccadeedd')
c = Counter(L)
res = [x for x in L if c[x] >=3]
# ['c', 'c', 'c', 'd', 'd', 'd']
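Since the question asks about a general k, the same idea can be wrapped in a small function. This is only a sketch: the name `keep_frequent` is made up for illustration, and it assumes the elements are hashable (which Counter requires).

```python
from collections import Counter

def keep_frequent(items, k):
    """Return the elements of items that occur at least k times, in original order."""
    counts = Counter(items)                      # one pass to count occurrences
    return [x for x in items if counts[x] >= k]  # one pass to filter

print(keep_frequent(list('abcccadeedd'), 3))
# ['c', 'c', 'c', 'd', 'd', 'd']
```

Both passes are linear in the length of the list, which should matter for large data.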
Answer 1 (score: 1)
I would use a Counter from collections:
from collections import Counter
l = ['a', 'b', 'c', 'c', 'c', 'a', 'd', 'e', 'e', 'd', 'd']  # the list from the question
count_dict = Counter(l)
[el for el in l if count_dict[el] > 2]
Answer 2 (score: 1)
A brute-force option would be to get the number of occurrences per item, then filter on that count. The collections.Counter object works nicely here:
from collections import Counter
l = ['a', 'b', 'c', 'c', 'c', 'a', 'd', 'e', 'e', 'd', 'd']
c = Counter(l)
# Counter looks like {'a': 2, 'b': 1, 'c': 3...}
l = [item for item in l if c[item] >= 3]
Under the hood, Counter acts as a dictionary, which you can build yourself like so:
c = {}
for item in l:
    # This will check if item is in the dictionary
    # if it is, add to current count, if it is not, start at 0
    # and add 1
    c[item] = c.get(item, 0) + 1
# And the rest of the syntax follows from here
l = [item for item in l if c[item] >= 3]
Answer 3 (score: 0)
Does this option have any drawbacks?
l = ['a','b','c','c','c','a','d','e','e','d','d']
res = [ e for e in l if l.count(e) >= 3]
#=> ['c', 'c', 'c', 'd', 'd', 'd']
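One likely drawback is speed: `list.count` scans the whole list each time it is called, so calling it once per element does roughly len(l)**2 comparisons. A rough sketch of the comparison with a count-once approach follows. The names `slow`, `fast`, and `counts` are just illustrative, and the Counter variant assumes hashable elements.

```python
from collections import Counter

l = ['a', 'b', 'c', 'c', 'c', 'a', 'd', 'e', 'e', 'd', 'd']

# l.count(e) rescans the entire list for every element e.
slow = [e for e in l if l.count(e) >= 3]

# Counting once up front needs one pass to count and one pass to filter.
counts = Counter(l)
fast = [e for e in l if counts[e] >= 3]

assert slow == fast  # same result, different amount of work
```

For a short list the difference is negligible. On the other hand, the `l.count` version also works when the elements are not hashable (for example, lists of lists), which Counter cannot handle.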