Check whether all items in an array exist in a Dictionary

Asked: 2018-06-06 22:43:41

Tags: vb.net linq compare

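I have some legacy code that includes:

A Dictionary (dictParts) that is populated at startup with approx. 350,000 items and does not change at runtime. dictParts is a System.Collections.Generic.Dictionary(Of String, System.Data.DataRow).

Each item in dictParts is a System.Collections.Generic.KeyValuePair(Of String, System.Data.DataRow).

An Array (arrOut) that has items added and removed frequently (usually between 2 and 6 items in the array). arrOut is a System.Array containing only strings.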

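Every time the array changes, I need to see whether:

  • All of the items in the array exist in the index
  • Some of the items in the array exist in the index

I imagine that looping through the 350,000-item index every time the array changes would be a huge performance hit, and I hoped LINQ could help.

I have tried the following: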

Private Sub btnTest_Click(sender As System.Object, e As System.EventArgs) Handles btnTest.Click

    Dim dictParts = New Dictionary(Of Integer, String) _
                    From {{1, "AA-10-100"}, _
                          {2, "BB-20-100"}, _
                          {3, "CC-30-100"}, _
                          {4, "DD-40-100"}, _
                          {5, "EE-50-100"}}


    Dim arrOut() As String = {"AA-10-100", "BB-20-100", "CC-30-100"}

    'Tried
    Dim allPartsExist As IEnumerable(Of String) = arrOut.ToString.All(dictParts)
    'And this
    Dim allOfArrayInIndex As Object = arrOut.ToString.Intersect(dictParts).Count() = arrOut.ToString.Count

End Sub

I keep getting the error: Unable to cast object of type 'System.Collections.Generic.Dictionary`2[System.Int32,System.String]' to type 'System.Collections.Generic.IEnumerable`1[System.Char]'.

Could someone please tell me where I am going wrong?

2 Answers:

Answer 0: (score: 1)

In order to learn something, I tried out the HashSet suggested by @emsimpson92. Maybe it can work for you.

Imports System.Text
Public Class HashSets
    Private shortList As New HashSet(Of String)
    Private longList As New HashSet(Of String)

    Private Sub HashSets_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        shortList.Add("AA-10-100")
        shortList.Add("BB-20-100")
        shortList.Add("DD-40-101")
        Dim dictParts As New Dictionary(Of Integer, String) _
        From {{1, "AA-10-100"},
                          {2, "BB-20-100"},
                          {3, "CC-30-100"},
                          {4, "DD-40-100"},
                          {5, "EE-50-100"}}
        For Each kv As KeyValuePair(Of Integer, String) In dictParts
            longList.Add(kv.Value)
        Next
        'Two alternative ways to fill the hashset
        '1. Remove the New from the declaration, then:
        'longList = New HashSet(Of String)(dictParts.Values)
        '2. Use Enumerable.ToHashSet(Of TSource)(IEnumerable(Of TSource)),
        '   added in .NET Framework 4.7.2:
        'longList = dictParts.Values.ToHashSet()
    End Sub

    Private Sub CompareHashSets()
        Debug.Print($"The short list has {shortList.Count} elements")
        DisplaySet(shortList)
        Debug.Print($"The long list has {longList.Count}")
        'ExceptWith modifies shortList in place, leaving only the items not found in longList
        shortList.ExceptWith(longList)
        Debug.Print($"The items missing from the longList {shortList.Count}")
        DisplaySet(shortList)
        'Immediate Window Results
        'The short list has 3 elements
        '{ AA-10-100
        ' BB-20-100
        ' DD-40-101
        '}
        'The long list has 5
        'The items missing from the longList 1
        '{ DD-40-101
        '}
    End Sub

    Private Shared Sub DisplaySet(ByVal coll As HashSet(Of String))
        Dim sb As New StringBuilder()
        sb.Append("{")
        For Each s As String In coll
            sb.AppendLine($" {s}")
        Next
        sb.Append("}")
        Debug.Print(sb.ToString)
    End Sub

    Private Sub btnCompare_Click(sender As Object, e As EventArgs) Handles btnCompare.Click
        CompareHashSets()
    End Sub
End Class

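Note: if the dictionary contains duplicate values (not duplicate keys, duplicate values), the hash set filled from it will hold each distinct value only once, because elements in a hash set must be unique. HashSet.Add returns False for a duplicate rather than throwing, so membership tests still work, but longList.Count can be smaller than the number of dictionary entries.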
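To see what actually happens with duplicate values, here is a minimal sketch. The module name DuplicateValuesDemo and the helper DistinctValues are made up for illustration and are not part of the code above; the sample dictionary is also invented.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module DuplicateValuesDemo
    ' Hypothetical helper: builds a set from the dictionary's values.
    ' The HashSet constructor keeps one copy of each distinct value.
    Public Function DistinctValues(dict As Dictionary(Of Integer, String)) As HashSet(Of String)
        Return New HashSet(Of String)(dict.Values)
    End Function

    Public Sub Main()
        ' Invented sample data: keys 1 and 2 share the same value.
        Dim dict As New Dictionary(Of Integer, String) From {
            {1, "AA-10-100"},
            {2, "AA-10-100"},
            {3, "BB-20-100"}}

        Dim hs = DistinctValues(dict)
        Console.WriteLine(hs.Count)            ' 2, not 3
        Console.WriteLine(hs.Add("AA-10-100")) ' False: already present, nothing is thrown
    End Sub
End Module
```

For a pure "is this part known?" check the collapsed duplicates do not matter; they only matter if you rely on the set's Count matching the dictionary's.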

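Answer 1: (score: 1)

Running a test of a HashSet against the original Dictionary containing 350,000 values, with the matching items added to the Dictionary last, the HashSet is over 15,000 times faster.

To test against the original Dictionary: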

Dim AllInDict = arrOut.All(Function(a) dictParts.ContainsValue(a))
Dim SomeInDict = arrOut.Any(Function(a) dictParts.ContainsValue(a))

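Creating the HashSet does cost about as much time as four Dictionary searches, so it is not worth it if you change the Dictionary more often than once every four searches.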

Dim hs = New HashSet(Of String)(dictParts.Values)
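The break-even figure depends on the machine and the data, so it may be worth measuring on your own setup. The following is only a rough sketch, assuming synthetic part numbers of the form "PART-n"; the module name TimingSketch and the helper BuildParts are hypothetical, and a single Stopwatch run like this is noisy.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Linq

Public Module TimingSketch
    ' Hypothetical helper: builds a dictionary of n synthetic part numbers.
    Public Function BuildParts(n As Integer) As Dictionary(Of Integer, String)
        Dim d As New Dictionary(Of Integer, String)(n)
        For i As Integer = 1 To n
            d.Add(i, "PART-" & i.ToString())
        Next
        Return d
    End Function

    Public Sub Main()
        Dim dictParts = BuildParts(350000)
        ' Items added last; in practice ContainsValue reaches these last.
        Dim arrOut() As String = {"PART-349999", "PART-350000"}

        ' Linear search through the dictionary's values.
        Dim sw = Stopwatch.StartNew()
        Dim allSlow = arrOut.All(Function(a) dictParts.ContainsValue(a))
        sw.Stop()
        Console.WriteLine($"ContainsValue: {sw.Elapsed.TotalMilliseconds} ms, all = {allSlow}")

        ' One-off cost of building the set.
        sw.Restart()
        Dim hs = New HashSet(Of String)(dictParts.Values)
        sw.Stop()
        Console.WriteLine($"HashSet build: {sw.Elapsed.TotalMilliseconds} ms")

        ' Hash lookups against the set.
        sw.Restart()
        Dim allFast = arrOut.All(Function(a) hs.Contains(a))
        sw.Stop()
        Console.WriteLine($"HashSet lookup: {sw.Elapsed.TotalMilliseconds} ms, all = {allFast}")
    End Sub
End Module
```

Since the question states that the Dictionary does not change at runtime, the build cost would be paid only once at startup.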

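You can then use the HashSet to test for membership, which is at least 14,000 times faster than searching the entire Dictionary (of course, on average, it will be about 50% faster).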

Dim AllInDict2 = arrOut.All(Function(a) hs.Contains(a))
Dim SomeInDict2 = arrOut.Any(Function(a) hs.Contains(a))
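HashSet(Of T) also has IsSupersetOf and Overlaps methods, which answer the "all" and "some" questions directly without a lambda. A minimal sketch, assuming the part numbers are already in a set; the module name PartLookup and the function names AllExist and AnyExist are made up for illustration:

```vbnet
Imports System.Collections.Generic

Public Module PartLookup
    ' Hypothetical helper: True when every item in arrOut is in the set.
    Public Function AllExist(parts As HashSet(Of String), arrOut As String()) As Boolean
        Return parts.IsSupersetOf(arrOut)
    End Function

    ' Hypothetical helper: True when at least one item in arrOut is in the set.
    Public Function AnyExist(parts As HashSet(Of String), arrOut As String()) As Boolean
        Return parts.Overlaps(arrOut)
    End Function
End Module
```

With the five sample part numbers from the question in the set and an array of {"AA-10-100", "DD-40-101"}, AllExist returns False and AnyExist returns True. For an empty array the results match the LINQ versions: AllExist is True and AnyExist is False.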