Return the duplicates in a sequence

Asked: 2011-11-08 20:23:38

Tags: clojure duplicates sequence

The best I could come up with is:

(defn dups [seq]
  (map (fn [[id freq]] id) 
       (filter (fn [[id freq]] (> freq 1))
               (frequencies seq))))

Is there a more concise way?

4 answers:

Answer 0 (score: 17)

Use a list comprehension:

(defn dups [seq]
  (for [[id freq] (frequencies seq)  ;; get the frequencies, destructure
        :when (> freq 1)]            ;; this is the filter condition
   id))                              ;; just need the id, not the frequency

Answer 1 (score: 13)

(map key (remove (comp #{1} val) 
                 (frequencies seq)))
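The expression above uses `seq` as a free name, so it is presumably meant to sit inside a function body like the one in the question. A minimal sketch, assuming a wrapper named `dups` and a parameter named `coll` (both names chosen here for illustration):

```clojure
(defn dups
  "Returns the items that occur more than once in coll."
  [coll]
  ;; (comp #{1} val) is truthy only for entries whose count is exactly 1,
  ;; so remove keeps the entries whose count is greater than 1.
  (map key (remove (comp #{1} val)
                   (frequencies coll))))

;; The order of the result follows the frequencies map,
;; so it is compared as a set here.
(set (dups [1 2 2 3 3 3 4])) ;; => #{2 3}
```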

Answer 2 (score: 5)

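If you want to find duplicates based on some property of the items in the list (for example, when it is a list of maps or a list of records/Java objects):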

(defn dups-with-function
  [seq f]
  (->> seq
       (group-by f)
       ;; drop the groups that contain only one item
       (remove #(= 1 (count (val %))))))

(let [seq [{:attribute    :one
            :other-things :bob}
           {:attribute    :one
            :other-things :smith}
           {:attribute    :two
            :other-things :blah}]]
  (dups-with-function seq :attribute))

Output:

 ([:one
   [{:attribute :one, :other-things :bob}
    {:attribute :one, :other-things :smith}]])

If you have a list of Java objects and want to find all the objects that share a first name:

(dups-with-function my-list #(.getFirstName %))
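Because `dups-with-function` returns map entries (key plus grouped items), a caller may want only one half of each entry. A small sketch, assuming the parameter is renamed to `coll` and using a hypothetical sample vector named `people`:

```clojure
(defn dups-with-function
  [coll f]
  (->> coll
       (group-by f)
       ;; drop the groups that contain only one item
       (remove #(= 1 (count (val %))))))

;; Hypothetical sample data, mirroring the map example in this answer.
(def people
  [{:attribute :one :other-things :bob}
   {:attribute :one :other-things :smith}
   {:attribute :two :other-things :blah}])

;; Only the property values that occur more than once
(map key (dups-with-function people :attribute))
;; => (:one)

;; Only the items that share a property value, flattened into one sequence
(mapcat val (dups-with-function people :attribute))
;; => ({:attribute :one, :other-things :bob} {:attribute :one, :other-things :smith})
```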

Answer 3 (score: 1)

A minimal one-liner with filter and frequencies that gets the job done:

(filter #(< 1 ((frequencies col) %)) col)

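However, it performs poorly on large data, because (frequencies col) is recomputed for every element. Compute the frequencies once and reuse them: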

(let [freqs (frequencies col)]
  (filter #(< 1 (freqs %)) col))
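Note that filtering the original collection keeps every occurrence of a repeated item, whereas the answers that walk the `frequencies` map yield each duplicate once. A small sketch, assuming a hypothetical function name `dup-occurrences`, with `distinct` added for callers who want each duplicate reported only once:

```clojure
(defn dup-occurrences
  "Returns every element of col whose value occurs more than once."
  [col]
  (let [freqs (frequencies col)]   ; computed once, reused by the predicate
    (filter #(< 1 (freqs %)) col)))

(dup-occurrences [1 2 2 3 3 3 4])
;; => (2 2 3 3 3)

(distinct (dup-occurrences [1 2 2 3 3 3 4]))
;; => (2 3)
```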