Question

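I am referring to MSDN (https://msdn.microsoft.com/en-us/library/bb293080(v=vs.110).aspx):

Remarks:

If the collection represented by the other parameter is a HashSet<T> collection with the same equality comparer as the current HashSet<T> object, this method is an O(n) operation. Otherwise, this method is an O(n + m) operation, where n is Count and m is the number of elements in other.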

I am trying to understand what role the equality comparer plays.

If other is also a HashSet, the intersection could work like this:

T[] array = this.ToArray(); // O(1)
foreach (T item in array) // iterate through this => O(n)
    if (!other.Contains(item)) // other is a HashSet => O(1)
        this.Remove(item); // this is a HashSet => O(1)

In total O(n), as stated on MSDN. But as I understand it, this should always be O(n) if other is a HashSet, no matter which equality comparer it has!

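If other is not a HashSet, other.Contains in the snippet above would have a greater complexity (e.g. O(log m) for SortedSet or O(m) for List). Because the operations are nested, we have to multiply these numbers (O(n*log m) for SortedSet or O(n*m) for List) to get the total complexity, which is worse than the specified O(n+m). So the approach for the case where other is not a HashSet seems to be different.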

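Maybe it is done like this:

HashSet<T> intersectionSet = new HashSet<T>(this.Comparer); // O(1)
foreach (T item in other) // iterate through other => O(m)
    if (this.Contains(item)) // this is a HashSet => O(1)
        intersectionSet.Add(item); // intersectionSet is a HashSet => O(1)
this.Clear(); // O(n)
foreach (T item in intersectionSet) // O(m) in the worst case, because intersectionSet can have at most m elements
    this.Add(item); // O(1)

So we get O(n+m), as stated on MSDN. Again, I cannot see what role the equality comparer plays in the complexity.

Since Microsoft put a lot of thought and manpower into designing and implementing this, I trust that their version (in which the equality comparer plays a role in the time complexity) is the best one. So I assume I have made some mistake in my reasoning. Could you point it out to me?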

Answer 1

If other is also a HashSet, the intersection could work like this:

T[] array = this.ToArray(); // O(1)
foreach (T item in array) // iterate through this => O(n)
    if (!other.Contains(item)) // other is a HashSet => O(1)  A
        this.Remove(item); // this is a HashSet => O(1)       B

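If the two hash sets use different equality comparers, this would be an incorrect implementation. The line I marked A uses other's equality comparer, and the line I marked B uses this's equality comparer. So the line other.Contains(item) checks the wrong thing: it checks whether other thinks it contains item. What should be checked is whether this thinks other contains item.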
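To make the mismatch concrete, here is a minimal sketch that runs the snippet from the question against two string sets with different comparers. The class and method names (ComparerMismatchDemo, NaiveIntersect) and the sample data are made up for illustration; only HashSet<T>, StringComparer and IntersectWith come from the framework.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerMismatchDemo
{
    // The approach from the question: ask `other` whether it contains each item.
    public static void NaiveIntersect(HashSet<string> self, HashSet<string> other)
    {
        string[] snapshot = new string[self.Count];
        self.CopyTo(snapshot);              // O(n) copy, so we can remove while iterating
        foreach (string item in snapshot)
            if (!other.Contains(item))      // line A: uses other's comparer
                self.Remove(item);          // line B: uses self's comparer
    }

    public static void Main()
    {
        // `other` compares case-sensitively.
        var other = new HashSet<string>(new[] { "a" }, StringComparer.Ordinal);

        // The sets being intersected compare case-insensitively.
        var naive = new HashSet<string>(new[] { "A", "b" }, StringComparer.OrdinalIgnoreCase);
        NaiveIntersect(naive, other);
        Console.WriteLine(naive.Count);     // 0: other does not think it contains "A"

        var builtIn = new HashSet<string>(new[] { "A", "b" }, StringComparer.OrdinalIgnoreCase);
        builtIn.IntersectWith(other);
        Console.WriteLine(builtIn.Count);   // 1: this thinks "a" and "A" are equal
    }
}
```

The two results differ only because the comparers differ; with identical comparers both approaches keep the same elements.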

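But apart from the array creation (which is not O(1), and which Microsoft can avoid by using HashSet's private fields), what you came up with is pretty much what Microsoft actually does in the case where the equality comparers match; see the reference source.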
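The two cases can be sketched with public APIs only. This is not Microsoft's code: IntersectSketch is a hypothetical helper, and the general path below uses a temporary set where the real implementation may work differently. It only illustrates why the fast O(n) path is limited to a HashSet with an equal comparer, while everything else costs O(n + m).

```csharp
using System;
using System.Collections.Generic;

public static class IntersectSketchDemo
{
    // Hypothetical helper, not the framework implementation.
    public static void IntersectSketch<T>(HashSet<T> self, IEnumerable<T> other)
    {
        // Fast path: other is a HashSet whose comparer equals ours,
        // so other.Contains answers exactly the question we need. O(n).
        if (other is HashSet<T> otherSet && self.Comparer.Equals(otherSet.Comparer))
        {
            self.RemoveWhere(item => !otherSet.Contains(item));
            return;
        }

        // General path: every lookup goes through self's comparer.
        var keep = new HashSet<T>(self.Comparer);
        foreach (T item in other)                        // O(m)
            if (self.Contains(item))
                keep.Add(item);
        self.RemoveWhere(item => !keep.Contains(item));  // O(n)
    }

    public static void Main()
    {
        var numbers = new HashSet<int> { 1, 2, 3 };
        IntersectSketch(numbers, new List<int> { 3, 3, 5 });
        Console.WriteLine(string.Join(",", numbers));
    }
}
```

Elements that survive are the instances already stored in the set, because the general path only removes from it and never re-adds items taken from other.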

Why does the time complexity of HashSet<T>.IntersectWith depend on the equality comparer?
