unordered_set iterator random

Time: 2015-09-18 04:50:43

Tags: c++ data-structures hash hashtable unordered-set

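I read about a Google interview question on designing a class that supports fast insert, erase, and erase of a random element. I was thinking of unordered_set in C++, where insert and erase already exist. For removing a random element, I thought that unordered_set's begin() method points to a random element, so I could grab its value and erase it from the set. Does this always work for removing a random value from the set? Thanks!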

Edit: Feel free to comment if you can think of some other data structure; it does not have to be unordered_set.

1 answer:

Answer 0: (score: 1)

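I don't think taking the value of begin() is random. The iteration order of an unordered_set is unspecified, not random: for a given container state, begin() always returns the same element. It is probably better to do some randomization yourself. One approach is to pick a random bucket from the hash table and take that bucket's begin() value: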

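#include <cstddef>
#include <ctime>
#include <random>
#include <unordered_set>

// Assume that T is some arbitrary type for which std::hash<T> is defined
std::unordered_set<T> myset;

// put some elements into the set

// random number generator (seeded with the current time)
std::mt19937 rng(static_cast<unsigned>(std::time(nullptr)));

if (!myset.empty()) { // with an empty set there is no bucket to pick from
    std::size_t bcount = myset.bucket_count(); // get the number of buckets

    // produces a number in [0, bcount - 1]
    std::uniform_int_distribution<std::size_t> d(0, bcount - 1);

    // pick random bucket indices until a non-empty bucket is found
    std::size_t rbucket = d(rng);
    while (myset.bucket_size(rbucket) == 0)
        rbucket = d(rng);

    // begin(rbucket) returns a local_iterator, which erase() does not accept,
    // so copy the first element of the selected bucket and erase by value
    T value = *myset.begin(rbucket);
    myset.erase(value); // removes the selected element
}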

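This is certainly more random than taking the value of begin(), but it is still not uniform, because the first element of each bucket is favored. If you want to guarantee a uniform distribution over the whole container, you can simply take a random value r in [0, myset.size() - 1] and iterate through the set to reach that element: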

#include <cstddef>
#include <ctime>
#include <iterator>
#include <random>
#include <unordered_set>

// Assume that T is some arbitrary type for which std::hash<T> is defined
std::unordered_set<T> myset;

// put some elements into the set

// random number generator (seeded with the current time)
std::mt19937 rng(static_cast<unsigned>(std::time(nullptr)));

if (!myset.empty()) { // size() - 1 would wrap around on an empty set
    // produces a random number in [0, myset.size() - 1]
    std::uniform_int_distribution<std::size_t> d(0, myset.size() - 1);
    std::size_t r = d(rng);

    // advances through the container to the r-th element
    auto it = std::next(myset.begin(), r);
    myset.erase(it); // erases the selected element
}

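This removes an element with (pseudo-)uniform probability, but it is not efficient, since it has to iterate through the container, which takes O(n) time. I don't think you can do better with std::unordered_set alone.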
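
Since the question is open to other data structures: one common way to get insert, erase, and erase-random all in expected O(1) is to keep the elements in a std::vector and store each element's position in a std::unordered_map. Erasing swaps the target with the last vector element and pops the back, and a random element is just a random index. The following is only a minimal sketch under those assumptions; the class name RandomizedSet and its member names are hypothetical, and T must be copyable and hashable with std::hash<T>.

```cpp
#include <cstddef>
#include <random>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical class for illustration; not part of the standard library.
template <typename T>
class RandomizedSet {
public:
    // Inserts value; returns false if it was already present.
    bool insert(const T& value) {
        if (index_.count(value) != 0) return false;
        index_[value] = items_.size(); // position the value will occupy
        items_.push_back(value);
        return true;
    }

    // Erases value; returns false if it was not present.
    bool erase(const T& value) {
        auto it = index_.find(value);
        if (it == index_.end()) return false;
        std::size_t pos = it->second;
        index_.erase(it);
        if (pos != items_.size() - 1) {
            // Move the last element into the hole and record its new position.
            items_[pos] = std::move(items_.back());
            index_[items_[pos]] = pos;
        }
        items_.pop_back();
        return true;
    }

    // Removes and returns a uniformly chosen element.
    // Precondition: the set is not empty.
    T erase_random() {
        std::uniform_int_distribution<std::size_t> pick(0, items_.size() - 1);
        T value = items_[pick(rng_)]; // copy before the slot is overwritten
        erase(value);
        return value;
    }

    bool contains(const T& value) const { return index_.count(value) != 0; }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;                     // elements in arbitrary order
    std::unordered_map<T, std::size_t> index_; // value -> position in items_
    std::mt19937 rng_{std::random_device{}()}; // seeded once per object
};
```

The trade-off is that every element is stored twice (once in the vector, once as a map key), and the order of the elements in the vector changes on every erase, so this only fits when the order does not matter.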