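Merging dictionaries in C#

Date: 2008-11-16 17:39:56

Tags: c# dictionary merge

What is the best way to merge two or more dictionaries (Dictionary<T1,T2>) in C#? (3.0 features like LINQ are fine.)

I'm thinking of a method signature along the lines of one of these: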

public static Dictionary<TKey,TValue>
                 Merge<TKey,TValue>(Dictionary<TKey,TValue>[] dictionaries);

public static Dictionary<TKey,TValue>
                 Merge<TKey,TValue>(IEnumerable<Dictionary<TKey,TValue>> dictionaries);

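EDIT: I got a cool solution from JaredPar and Jon Skeet, but I was thinking of something that handles duplicate keys. In case of a collision, it doesn't matter which value is saved to the dict, as long as it's consistent.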

27 answers:

Answer 0 (score: 276)

This partly depends on what you want to happen when you run into duplicates. For instance, you could do:

var result = dictionaries.SelectMany(dict => dict)
                         .ToDictionary(pair => pair.Key, pair => pair.Value);

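That will throw an exception if you get any duplicate keys.

EDIT: If you use ToLookup, you get a lookup, which can hold multiple values per key. You can then convert that to a dictionary: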

var result = dictionaries.SelectMany(dict => dict)
                         .ToLookup(pair => pair.Key, pair => pair.Value)
                         .ToDictionary(group => group.Key, group => group.First());

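It's a bit ugly, and inefficient, but it's the quickest way to do it in terms of code. (I haven't tested it, admittedly.)

You could of course write your own ToDictionary2 extension method (with a better name, but I don't have time to think of one right now). It's not terribly hard to do: it just overwrites (or ignores) duplicate keys. The important bit (to my mind) is using SelectMany, and realising that a dictionary supports iteration over its key/value pairs.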
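The ToDictionary2 idea could be sketched roughly as follows. The names ToDictionaryOverwrite and OverwritingDictionaryExtensions are placeholders chosen for this illustration (nothing in LINQ is called that), and the sketch assumes that "later value wins" is the desired duplicate policy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverwritingDictionaryExtensions
{
    // Hypothetical helper (not part of LINQ): like ToDictionary, but an element
    // whose key was already seen overwrites the earlier value instead of throwing.
    public static Dictionary<TKey, TValue> ToDictionaryOverwrite<TSource, TKey, TValue>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        Func<TSource, TValue> valueSelector)
    {
        if (source == null) throw new ArgumentNullException("source");

        var result = new Dictionary<TKey, TValue>();
        foreach (var item in source)
        {
            // The indexer adds or replaces, so duplicate keys never throw.
            result[keySelector(item)] = valueSelector(item);
        }
        return result;
    }
}
```

With such a helper, the first snippet would become dictionaries.SelectMany(dict => dict).ToDictionaryOverwrite(pair => pair.Key, pair => pair.Value), with no intermediate lookup.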

Answer 1 (score: 231)

I would do it like this:

dictionaryFrom.ToList().ForEach(x => dictionaryTo.Add(x.Key, x.Value));

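Simple and easy. According to this blog post, it is even faster than most loops, because its underlying implementation accesses elements by index rather than through an enumerator.

Of course it will throw an exception if there are duplicates, so you'll have to check before merging.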
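One way to do that check is sketched below. The helper name AddAllIfNoConflicts and the choice to return the list of colliding keys are assumptions made for illustration; the point is only that conflicts are collected before anything is added, so the target is never left half-merged.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CheckedMerge
{
    // Hypothetical helper: copies 'source' into 'target' only if no key collides.
    // Returns the colliding keys; the list is empty when the merge was performed.
    public static List<TKey> AddAllIfNoConflicts<TKey, TValue>(
        Dictionary<TKey, TValue> target, Dictionary<TKey, TValue> source)
    {
        // Collect every conflict first, before mutating 'target'.
        List<TKey> conflicts = source.Keys.Where(key => target.ContainsKey(key)).ToList();

        if (conflicts.Count == 0)
        {
            foreach (KeyValuePair<TKey, TValue> pair in source)
                target.Add(pair.Key, pair.Value);
        }
        return conflicts;
    }
}
```

The caller can then decide what to do with a non-empty result: report the keys, pick a winner, or abort.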

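Answer 2 (score: 93)

OK, I'm late to the party, but here is what I use. It doesn't explode if there are duplicate keys ("righter" keys replace "lefter" keys), it can merge any number of dictionaries (if desired), and it preserves the type (with the restriction that the type needs a meaningful public default constructor):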

public static class DictionaryExtensions
{
    // Works in C#3/VS2008:
    // Returns a new dictionary of this ... others merged leftward.
    // Keeps the type of 'this', which must be default-instantiable.
    // Example: 
    //   result = map.MergeLeft(other1, other2, ...)
    public static T MergeLeft<T,K,V>(this T me, params IDictionary<K,V>[] others)
        where T : IDictionary<K,V>, new()
    {
        T newMap = new T();
        foreach (IDictionary<K,V> src in
            (new List<IDictionary<K,V>> { me }).Concat(others)) {
            // ^-- echk. Not quite there type-system.
            foreach (KeyValuePair<K,V> p in src) {
                newMap[p.Key] = p.Value;
            }
        }
        return newMap;
    }

}

Answer 3 (score: 44)

The trivial solution would be:

using System.Collections.Generic;
...
public static Dictionary<TKey, TValue>
    Merge<TKey,TValue>(IEnumerable<Dictionary<TKey, TValue>> dictionaries)
{
    var result = new Dictionary<TKey, TValue>();
    foreach (var dict in dictionaries)
        foreach (var x in dict)
            result[x.Key] = x.Value;
    return result;
}

Answer 4 (score: 20)

Try the following:

static Dictionary<TKey, TValue>
    Merge<TKey, TValue>(this IEnumerable<Dictionary<TKey, TValue>> enumerable)
{
    return enumerable.SelectMany(x => x).ToDictionary(x => x.Key, y => y.Value);
}

Answer 5 (score: 17)

// Union removes only pairs whose key AND value both match;
// the same key with two different values still makes ToDictionary throw.
Dictionary<String, String> allTables =
    tables1.Union(tables2).ToDictionary(pair => pair.Key, pair => pair.Value);

Answer 6 (score: 14)

The following works for me. If there are duplicates, it uses dictA's value.

public static IDictionary<TKey, TValue> Merge<TKey, TValue>(this IDictionary<TKey, TValue> dictA, IDictionary<TKey, TValue> dictB)
    where TValue : class
{
    return dictA.Keys.Union(dictB.Keys).ToDictionary(k => k, k => dictA.ContainsKey(k) ? dictA[k] : dictB[k]);
}

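Answer 7 (score: 10)

I'm very late to the party and perhaps missing something, but if either there are no duplicate keys or, as the OP says, "In case of a collision, it doesn't matter which value is saved to the dict, as long as it's consistent," what's wrong with this one (merging D2 into D1)?

foreach (KeyValuePair<string,int> item in D2)
{
    D1[item.Key] = item.Value;
}

It seems simple enough, maybe too simple, so I wonder whether I'm missing something. This is what I'm using in some code where I know there are no duplicate keys. I'm still testing, though, so I'd rather find out now if I'm overlooking something than later.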

Answer 8 (score: 8)

Here is a helper function I use:

using System.Collections.Generic;
namespace HelperMethods
{
    public static class MergeDictionaries
    {
        public static void Merge<TKey, TValue>(this IDictionary<TKey, TValue> first, IDictionary<TKey, TValue> second)
        {
            if (second == null || first == null) return;
            foreach (var item in second) 
                if (!first.ContainsKey(item.Key)) 
                    first.Add(item.Key, item.Value);
        }
    }
}

Answer 9 (score: 6)

How about adding a params overload?

Also, you should type the parameters as IDictionary for maximum flexibility.

public static IDictionary<TKey, TValue> Merge<TKey, TValue>(IEnumerable<IDictionary<TKey, TValue>> dictionaries)
{
    // ...
}

public static IDictionary<TKey, TValue> Merge<TKey, TValue>(params IDictionary<TKey, TValue>[] dictionaries)
{
    return Merge((IEnumerable<IDictionary<TKey, TValue>>) dictionaries);
}

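Answer 10 (score: 5)

Considering the performance of dictionary key lookups and deletes (they are hash operations), and considering that the question asks for the best way, I think the approach below is perfectly valid, and the others are a bit over-complicated, IMHO.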

    public static void MergeOverwrite<T1, T2>(this IDictionary<T1, T2> dictionary, IDictionary<T1, T2> newElements)
    {
        if (newElements == null) return;

        foreach (var e in newElements)
        {
            dictionary.Remove(e.Key); // or, if you don't want to overwrite, guard with: if (!dictionary.ContainsKey(e.Key))
            dictionary.Add(e);
        }
    }

Or, if you're working in a multithreaded application and your dictionary needs to be thread-safe anyway, you should do this:

    public static void MergeOverwrite<T1, T2>(this ConcurrentDictionary<T1, T2> dictionary, IDictionary<T1, T2> newElements)
    {
        if (newElements == null || newElements.Count == 0) return;

        foreach (var ne in newElements)
        {
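            // The update delegate receives the key and the value already stored;
            // return the incoming value so that existing entries are overwritten.
            dictionary.AddOrUpdate(ne.Key, ne.Value, (key, existing) => ne.Value);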
        }
    }

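You can then wrap this to make it handle an enumeration of dictionaries. Regardless, you're looking at about O(3n) (all conditions being perfect), since .Add() does an additional, unnecessary but practically free Contains() behind the scenes. I don't think it gets much better.

If you want to limit extra operations on large collections, sum up the Count of each dictionary you're about to merge and set the capacity of the target dictionary to that value, which avoids the later cost of resizing. So the end product is something like this...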

    public static IDictionary<T1, T2> MergeAllOverwrite<T1, T2>(IList<IDictionary<T1, T2>> allDictionaries)
    {
        var initSize = allDictionaries.Sum(d => d.Count); // requires using System.Linq;
        var resultDictionary = new Dictionary<T1, T2>(initSize);
        // IList<T> has no ForEach method (only List<T> does), so use a plain loop.
        foreach (var d in allDictionaries)
            resultDictionary.MergeOverwrite(d);
        return resultDictionary;
    }

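Note that I take an IList<T> for this method, mostly because if you take an IEnumerable<T>, you open yourself up to multiple enumerations of the same set, which can be very costly if the collection of dictionaries comes from a deferred LINQ statement.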
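If accepting IEnumerable<T> is still preferred, one possible alternative is to materialize the sequence exactly once at the top of the method. The sketch below assumes that approach; the class name PresizedMerge is made up for the example, and the loop uses the indexer rather than the Remove/Add pair, which gives the same "later dictionary wins" result.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PresizedMerge
{
    // Sketch: accept any IEnumerable, but snapshot it once so that a deferred
    // LINQ query is not re-run for the capacity calculation and again for the copy.
    public static Dictionary<TKey, TValue> MergeAllOverwrite<TKey, TValue>(
        IEnumerable<IDictionary<TKey, TValue>> allDictionaries)
    {
        List<IDictionary<TKey, TValue>> snapshot = allDictionaries.ToList();

        // Upper bound on the final size; exact only when no key repeats.
        int capacity = snapshot.Sum(d => d.Count);
        var result = new Dictionary<TKey, TValue>(capacity);

        foreach (IDictionary<TKey, TValue> dictionary in snapshot)
            foreach (KeyValuePair<TKey, TValue> pair in dictionary)
                result[pair.Key] = pair.Value; // later dictionaries win

        return result;
    }
}
```

The trade-off is one extra list allocation per call in exchange for a more flexible parameter type.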

Answer 11 (score: 3)

Based on the answers above, but adding a Func parameter to let the caller handle the duplicates:

public static Dictionary<TKey, TValue> Merge<TKey, TValue>(this IEnumerable<Dictionary<TKey, TValue>> dicts, 
                                                           Func<IGrouping<TKey, TValue>, TValue> resolveDuplicates)
{
    if (resolveDuplicates == null)
        resolveDuplicates = new Func<IGrouping<TKey, TValue>, TValue>(group => group.First());

    return dicts.SelectMany<Dictionary<TKey, TValue>, KeyValuePair<TKey, TValue>>(dict => dict)
                .ToLookup(pair => pair.Key, pair => pair.Value)
                .ToDictionary(group => group.Key, group => resolveDuplicates(group));
}
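As an illustration of what a caller might pass for resolveDuplicates, the sketch below sums the values that share a key. The Merge method is repeated (with inferred type arguments) only so that the example compiles on its own; the warehouse data and the names ResolveDuplicatesDemo and TotalStock are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ResolveDuplicatesDemo
{
    // Same shape as the Merge method above, repeated so the sketch is self-contained.
    public static Dictionary<TKey, TValue> Merge<TKey, TValue>(
        this IEnumerable<Dictionary<TKey, TValue>> dicts,
        Func<IGrouping<TKey, TValue>, TValue> resolveDuplicates)
    {
        if (resolveDuplicates == null)
            resolveDuplicates = group => group.First();

        return dicts.SelectMany(dict => dict)
                    .ToLookup(pair => pair.Key, pair => pair.Value)
                    .ToDictionary(group => group.Key, group => resolveDuplicates(group));
    }

    // Illustrative data only: stock counts from two warehouses.
    public static Dictionary<string, int> TotalStock()
    {
        var north = new Dictionary<string, int> { { "bolts", 10 }, { "nuts", 5 } };
        var south = new Dictionary<string, int> { { "bolts", 7 }, { "washers", 3 } };

        // Each group holds every value seen for one key, so Sum adds them up.
        return new[] { north, south }.Merge(group => group.Sum());
    }
}
```

Other resolvers follow the same pattern, for example group => group.Last() to let the last dictionary win, or group => group.Max() to keep the largest value.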

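Answer 12 (score: 3)

The party's pretty much dead by now, but here's an "improved" version of user166390's answer that made its way into my extension library. Apart from some details, I added a delegate to calculate the merged value.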

/// <summary>
/// Merges a dictionary against an array of other dictionaries.
/// </summary>
/// <typeparam name="TResult">The type of the resulting dictionary.</typeparam>
/// <typeparam name="TKey">The type of the key in the resulting dictionary.</typeparam>
/// <typeparam name="TValue">The type of the value in the resulting dictionary.</typeparam>
/// <param name="source">The source dictionary.</param>
/// <param name="mergeBehavior">A delegate returning the merged value. (Parameters in order: The current key, The current value, The previous value)</param>
/// <param name="mergers">Dictionaries to merge against.</param>
/// <returns>The merged dictionary.</returns>
public static TResult MergeLeft<TResult, TKey, TValue>(
    this TResult source,
    Func<TKey, TValue, TValue, TValue> mergeBehavior,
    params IDictionary<TKey, TValue>[] mergers)
    where TResult : IDictionary<TKey, TValue>, new()
{
    var result = new TResult();
    var sources = new List<IDictionary<TKey, TValue>> { source }
        .Concat(mergers);

    foreach (var kv in sources.SelectMany(src => src))
    {
        TValue previousValue;
        result.TryGetValue(kv.Key, out previousValue);
        result[kv.Key] = mergeBehavior(kv.Key, kv.Value, previousValue);
    }

    return result;
}
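One detail of this method is worth spelling out: mergeBehavior is also invoked the first time a key is seen, and in that case the "previous value" argument is default(TValue), because TryGetValue leaves the out variable at its default when the key is missing. The sketch below assumes integer values, where that default of 0 makes summing work without a special case; the method is copied without its doc comments so the example compiles alone, and the word-count data is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeLeftDemo
{
    // Copy of the MergeLeft method above, repeated so the sketch is self-contained.
    public static TResult MergeLeft<TResult, TKey, TValue>(
        this TResult source,
        Func<TKey, TValue, TValue, TValue> mergeBehavior,
        params IDictionary<TKey, TValue>[] mergers)
        where TResult : IDictionary<TKey, TValue>, new()
    {
        var result = new TResult();
        var sources = new List<IDictionary<TKey, TValue>> { source }.Concat(mergers);

        foreach (var kv in sources.SelectMany(src => src))
        {
            TValue previousValue;
            // Leaves previousValue at default(TValue) when the key is new.
            result.TryGetValue(kv.Key, out previousValue);
            result[kv.Key] = mergeBehavior(kv.Key, kv.Value, previousValue);
        }
        return result;
    }

    // Illustrative data: word counts from two documents, added together.
    public static Dictionary<string, int> CombinedCounts()
    {
        var first = new Dictionary<string, int> { { "the", 4 }, { "cat", 1 } };
        var second = new Dictionary<string, int> { { "the", 2 }, { "dog", 1 } };

        // 'previous' is 0 on a key's first occurrence, so plain addition is enough.
        return first.MergeLeft(
            (string key, int current, int previous) => current + previous,
            second);
    }
}
```

For reference types the same delegate would receive null on the first occurrence, so a string-concatenating mergeBehavior, for instance, would need to handle that case itself.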

Answer 13 (score: 2)

@Tim: This should be a comment, but comments don't allow code editing.

Dictionary<string, string> t1 = new Dictionary<string, string>();
t1.Add("a", "aaa");
Dictionary<string, string> t2 = new Dictionary<string, string>();
t2.Add("b", "bee");
Dictionary<string, string> t3 = new Dictionary<string, string>();
t3.Add("c", "cee");
t3.Add("d", "dee");
t3.Add("b", "bee");
Dictionary<string, string> merged = t1.MergeLeft(t2, t2, t3);

Note: @Andrew Orsich applied @ANeves's modification to the solution, so MergeLeft now looks like this:

public static Dictionary<K, V> MergeLeft<K, V>(this Dictionary<K, V> me, params IDictionary<K, V>[] others)
{
    var newMap = new Dictionary<K, V>(me, me.Comparer);
    foreach (IDictionary<K, V> src in
        (new List<IDictionary<K, V>> { me }).Concat(others))
    {
        // ^-- echk. Not quite there type-system.
        foreach (KeyValuePair<K, V> p in src)
        {
            newMap[p.Key] = p.Value;
        }
    }
    return newMap;
}

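Answer 14 (score: 2)

I know this is an old question, but since we now have LINQ, you can do it in a single line like this:

Dictionary<T1,T2> merged;
Dictionary<T1,T2> mergee;
mergee.ToList().ForEach(kvp => merged.Add(kvp.Key, kvp.Value));

// Or, to overwrite duplicate keys instead of throwing, use the indexer.
// (merged.Append(kvp) would not work: Enumerable.Append returns a new
// sequence and leaves merged unchanged.)
mergee.ToList().ForEach(kvp => merged[kvp.Key] = kvp.Value);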

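Answer 15 (score: 2)

Note that if you define an extension method called 'Add', you can use collection initializers to combine as many dictionaries as needed, like this: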

public static void Add<K, V>(this Dictionary<K, V> d, Dictionary<K, V> other) {
  foreach (var kvp in other)
  {
    if (!d.ContainsKey(kvp.Key))
    {
      d.Add(kvp.Key, kvp.Value);
    }
  }
}


var s0 = new Dictionary<string, string> {
  { "A", "X"}
};
var s1 = new Dictionary<string, string> {
  { "A", "X" },
  { "B", "Y" }
};
// Combine as many dictionaries and key pairs as needed
var a = new Dictionary<string, string> {
  s0, s1, s0, s1, s1, { "C", "Z" }
};

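Answer 16 (score: 2)

I'm new to C# and was scared by the complicated answers.

Here are some simple ones.
They merge the dictionaries d1, d2, and so on, and handle any overlapping keys ("b" in the examples below):

Example 1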

{
    // 2 dictionaries,  "b" key is common with different values

    var d1 = new Dictionary<string, int>() { { "a", 10 }, { "b", 21 } };
    var d2 = new Dictionary<string, int>() { { "c", 30 }, { "b", 22 } };

    var result1 = d1.Concat(d2).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.First().Value);
    // result1 is  a=10, b=21, c=30    That is, took the "b" value of the first dictionary

    var result2 = d1.Concat(d2).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.Last().Value);
    // result2 is  a=10, b=22, c=30    That is, took the "b" value of the last dictionary
}

Example 2

{
    // 3 dictionaries,  "b" key is common with different values

    var d1 = new Dictionary<string, int>() { { "a", 10 }, { "b", 21 } };
    var d2 = new Dictionary<string, int>() { { "c", 30 }, { "b", 22 } };
    var d3 = new Dictionary<string, int>() { { "d", 40 }, { "b", 23 } };

    var result1 = d1.Concat(d2).Concat(d3).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.First().Value);
    // result1 is  a=10, b=21, c=30, d=40    That is, took the "b" value of the first dictionary

    var result2 = d1.Concat(d2).Concat(d3).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.Last().Value);
    // result2 is  a=10, b=23, c=30, d=40    That is, took the "b" value of the last dictionary
}

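For more complex scenarios, see the other answers.
Hope that helps.

Answer 17 (score: 2)

Option 1: This depends on what you want to happen. If you are sure there are no duplicate keys across the two dictionaries, then you can do:

var result = dictionary1.Union(dictionary2).ToDictionary(k => k.Key, v => v.Value);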

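Note: This throws an exception if the dictionaries share a key with different values (Union removes only pairs whose key and value are both identical).

Option 2: If duplicate keys are possible, you have to handle them with a Where clause.

var result = dictionary1.Union(dictionary2.Where(k => !dictionary1.ContainsKey(k.Key))).ToDictionary(k => k.Key, v => v.Value);

Note: The result will not contain duplicate keys. If a key is duplicated, the entry from dictionary1 is kept.

Option 3: If you use ToLookup, you get a lookup, which can hold multiple values per key. You can then convert that lookup to a dictionary: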

var result = dictionaries.SelectMany(dict => dict)
                         .ToLookup(pair => pair.Key, pair => pair.Value)
                         .ToDictionary(group => group.Key, group => group.First());
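To make the behaviour of Option 1 concrete: Union compares whole KeyValuePair values, so a pair that is identical in both dictionaries is collapsed, while the same key with two different values survives as two pairs and makes ToDictionary throw an ArgumentException. The sketch below wraps Option 1 in a method to show both cases; the class and method names are invented for the example, and string/int is an arbitrary choice of types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionMergeDemo
{
    // Option 1 as a method: Union de-duplicates on the whole key/value pair.
    public static Dictionary<string, int> UnionMerge(
        Dictionary<string, int> dictionary1, Dictionary<string, int> dictionary2)
    {
        return dictionary1.Union(dictionary2).ToDictionary(k => k.Key, v => v.Value);
    }

    // Returns true when UnionMerge throws because a key maps to two different values.
    public static bool ThrowsOnConflict(
        Dictionary<string, int> dictionary1, Dictionary<string, int> dictionary2)
    {
        try
        {
            UnionMerge(dictionary1, dictionary2);
            return false;
        }
        catch (ArgumentException)
        {
            // ToDictionary rejects the second pair that carries an existing key.
            return true;
        }
    }
}
```

So Option 1 is safe both when keys never overlap and when overlapping keys are guaranteed to carry equal values; Option 2 or Option 3 is needed otherwise.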

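Answer 18 (score: 1)

Merge using an extension method. It doesn't throw an exception when there are duplicate keys; instead, it replaces their values with the values from the second dictionary.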

internal static class DictionaryExtensions
{
    public static Dictionary<T1, T2> Merge<T1, T2>(this Dictionary<T1, T2> first, Dictionary<T1, T2> second)
    {
        if (first == null) throw new ArgumentNullException("first");
        if (second == null) throw new ArgumentNullException("second");

        var merged = new Dictionary<T1, T2>();
        first.ToList().ForEach(kv => merged[kv.Key] = kv.Value);
        second.ToList().ForEach(kv => merged[kv.Key] = kv.Value);

        return merged;
    }
}

Usage:

Dictionary<string, string> merged = first.Merge(second);

Answer 19 (score: 1)

using System.Collections.Generic;
using System.Linq;

public static class DictionaryExtensions
{
    public enum MergeKind { SkipDuplicates, OverwriteDuplicates }
    public static void Merge<K, V>(this IDictionary<K, V> target, IDictionary<K, V> source, MergeKind kind = MergeKind.SkipDuplicates) =>
        source.ToList().ForEach(_ => { if (kind == MergeKind.OverwriteDuplicates || !target.ContainsKey(_.Key)) target[_.Key] = _.Value; });
}

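You can either skip/ignore (the default) or overwrite the duplicates, and Bob's your uncle, provided you are not overly fussy about LINQ performance and, like me, prefer concise, maintainable code. In that case you can also remove the default MergeKind.SkipDuplicates to force the caller to make a choice and make the developer aware of what the result will be!

Answer 20 (score: 1)

Simplified in use compared with my earlier answer: a bool replaces the enum, defaulting to a non-destructive merge when a key already exists and overwriting entirely when true. It still suits my own needs without requiring any fancy code: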

using System.Collections.Generic;
using System.Linq;

public static partial class Extensions
{
    public static void Merge<K, V>(this IDictionary<K, V> target, IDictionary<K, V> source, bool overwrite = false)
    {
        source.ToList().ForEach(_ => {
            if ((!target.ContainsKey(_.Key)) || overwrite)
                target[_.Key] = _.Value;
        });
    }
}

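Answer 21 (score: 0)

I'd split @orip's simple and non-garbage-creating solution to provide an in-place AddAll() in addition to Merge(), to handle the simple case of adding one dictionary to another.

using System.Collections.Generic;
...
// Declared as a void extension method: it mutates dest in place, and
// Merge() below calls it as result.AddAll(dict).
public static void
    AddAll<TKey,TValue>(this Dictionary<TKey, TValue> dest, Dictionary<TKey, TValue> source)
{
    foreach (var x in source)
        dest[x.Key] = x.Value;
}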

public static Dictionary<TKey, TValue>
    Merge<TKey,TValue>(IEnumerable<Dictionary<TKey, TValue>> dictionaries)
{
    var result = new Dictionary<TKey, TValue>();
    foreach (var dict in dictionaries)
        result.AddAll(dict);
    return result;
}

Answer 22 (score: 0)

fromDic.ToList().ForEach(x =>
{
    if (toDic.ContainsKey(x.Key))
        toDic.Remove(x.Key);
    // Add(KeyValuePair) is available when toDic is typed as IDictionary<TKey, TValue>.
    toDic.Add(x);
});

Answer 23 (score: 0)

A version of @user166390's answer with an added IEqualityComparer parameter, to allow case-insensitive key comparison.

    public static T MergeLeft<T, K, V>(this T me, params Dictionary<K, V>[] others)
        where T : Dictionary<K, V>, new()
    {
        return me.MergeLeft(me.Comparer, others);
    }

    public static T MergeLeft<T, K, V>(this T me, IEqualityComparer<K> comparer, params Dictionary<K, V>[] others)
        where T : Dictionary<K, V>, new()
    {
        T newMap = Activator.CreateInstance(typeof(T), new object[] { comparer }) as T;

        foreach (Dictionary<K, V> src in 
            (new List<Dictionary<K, V>> { me }).Concat(others))
        {
            // ^-- echk. Not quite there type-system.
            foreach (KeyValuePair<K, V> p in src)
            {
                newMap[p.Key] = p.Value;
            }
        }
        return newMap;
    }

Answer 24 (score: 0)

public static IDictionary<K, V> AddRange<K, V>(this IDictionary<K, V> one, IDictionary<K, V> two)
{
    foreach (var kvp in two)
    {
        if (one.ContainsKey(kvp.Key))
            one[kvp.Key] = two[kvp.Key];
        else
            one.Add(kvp.Key, kvp.Value);
    }
    return one;
}

Answer 25 (score: 0)

Or:

public static IDictionary<TKey, TValue> Merge<TKey, TValue>( IDictionary<TKey, TValue> x, IDictionary<TKey, TValue> y)
    {
        return x
            .Except(x.Join(y, z => z.Key, z => z.Key, (a, b) => a))
            .Concat(y)
            .ToDictionary(z => z.Key, z => z.Value);
    }

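The result is a union in which, for duplicate entries, "y" wins.

Answer 26 (score: 0)

Merge using an EqualityComparer that maps the items being compared to a different value/type. Here we map from KeyValuePair (the item type when enumerating a dictionary) to Key.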

public class MappedEqualityComparer<T,U> : EqualityComparer<T>
{
    Func<T,U> _map;

    public MappedEqualityComparer(Func<T,U> map)
    {
        _map = map;
    }

    public override bool Equals(T x, T y)
    {
        return EqualityComparer<U>.Default.Equals(_map(x), _map(y));
    }

    public override int GetHashCode(T obj)
    {
        return _map(obj).GetHashCode();
    }
}

Usage:

// if dictA and dictB are of type Dictionary<int,string>
var dict = dictA.Concat(dictB)
                .Distinct(new MappedEqualityComparer<KeyValuePair<int,string>,int>(item => item.Key))
                .ToDictionary(item => item.Key, item=> item.Value);