Filtering duplicates out of an IEnumerable

Time: 2009-11-04 08:38:38

Tags: c#

I have this code:

class MyObj {
    public int Id;
    public string Name;
    public string Location;
}

IEnumerable<MyObj> list;

I want to convert the list to a dictionary like this:

list.ToDictionary(x => x.Name);

But it tells me I have duplicate keys. How can I keep only the first item for each key?

6 Answers:

Answer 0: (score: 19)

I suppose the simplest way is to group by key and take the first element of each group:

list.GroupBy(x => x.Name).Select(g => g.First()).ToDictionary(x => x.Name);
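
A self-contained sketch of the grouping approach, for illustration only. The sample data and the GroupByDemo and FirstPerName names are made up here and are not part of the answer. GroupBy yields groups in the order their keys first appear and keeps the elements of each group in source order, so g.First() is the first item with that name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObj
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

public static class GroupByDemo
{
    // Keeps the first element seen for each Name, then keys the dictionary by Name.
    public static Dictionary<string, MyObj> FirstPerName(IEnumerable<MyObj> list)
    {
        return list
            .GroupBy(x => x.Name)
            .Select(g => g.First())
            .ToDictionary(x => x.Name);
    }

    public static void Main()
    {
        // Hypothetical sample data with one duplicated name.
        var list = new List<MyObj>
        {
            new MyObj { Id = 1, Name = "Ann", Location = "Oslo" },
            new MyObj { Id = 2, Name = "Bob", Location = "Rome" },
            new MyObj { Id = 3, Name = "Ann", Location = "Lima" },
        };

        var byName = FirstPerName(list);
        Console.WriteLine(byName.Count);      // 2
        Console.WriteLine(byName["Ann"].Id);  // 1 (the first "Ann" is kept)
    }
}
```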

Or, if your objects implement IEquatable<T> so that they compare to each other by key, you can use Distinct:

// I'll just randomly call your object Person for this example.
class Person : IEquatable<Person> 
{
    public string Name { get; set; }

    public bool Equals(Person other)
    {
        if (other == null)
            return false;

        return Name == other.Name;
    }

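    public override bool Equals(object obj)
    {
        // Delegate to the typed overload; base.Equals would compare references.
        return Equals(obj as Person);
    }

    public override int GetHashCode()
    {
        // Must agree with Equals: hash on Name only (0 when Name is null).
        return Name == null ? 0 : Name.GetHashCode();
    }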
}

...

list.Distinct().ToDictionary(x => x.Name);

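Or, if you don't want to do that (perhaps because you normally want to compare for equality in a different way, so Equals is already in use), you can write a custom implementation of IEqualityComparer<T> just for this case and pass it to Distinct.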
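
As a minimal sketch of such a comparer (not the answerer's original code; PersonNameComparer, ComparerDemo and the City property are made-up names for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public string City { get; set; }
}

// Hypothetical comparer: two Person objects are equal when their Names match.
public class PersonNameComparer : IEqualityComparer<Person>
{
    public bool Equals(Person a, Person b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        return a.Name == b.Name;
    }

    public int GetHashCode(Person p)
    {
        // Hash on the same field that Equals compares; 0 for null.
        return p?.Name?.GetHashCode() ?? 0;
    }
}

public static class ComparerDemo
{
    public static Dictionary<string, Person> ByName(IEnumerable<Person> people)
    {
        // Distinct drops later elements that the comparer treats as duplicates.
        return people.Distinct(new PersonNameComparer()).ToDictionary(p => p.Name);
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Ann", City = "Oslo" },
            new Person { Name = "Bob", City = "Rome" },
            new Person { Name = "Ann", City = "Lima" },
        };

        Console.WriteLine(ByName(people).Count);  // 2
    }
}
```

This leaves Person's own Equals untouched, so the by-name comparison applies only where the comparer is passed in. In practice Distinct yields the first occurrence of each duplicate, but the documentation describes its result as unordered, so treat "first wins" here as an assumption to verify.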

Answer 1: (score: 4)

You can also create your own overload of the Distinct extension method that takes a Func<> to select the key to be distinct on:

using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationExtensions
{
    // Overload of Distinct that compares elements by a selected key.
    public static IEnumerable<TSource> Distinct<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var comparer = new KeyComparer<TSource, TKey>(keySelector);

        return source.Distinct(comparer);
    }

    private class KeyComparer<TSource, TKey> : IEqualityComparer<TSource>
    {
        private readonly Func<TSource, TKey> keySelector;

        public KeyComparer(Func<TSource, TKey> keySelector)
        {
            this.keySelector = keySelector;
        }

        public bool Equals(TSource a, TSource b)
        {
            if (a == null && b == null) return true;
            if (a == null || b == null) return false;

            // == is not defined for an unconstrained TKey, so use the default comparer.
            return EqualityComparer<TKey>.Default.Equals(keySelector(a), keySelector(b));
        }

        public int GetHashCode(TSource obj)
        {
            TKey key = keySelector(obj);
            return key == null ? 0 : key.GetHashCode();
        }
    }
}

Apologies for any poor code formatting; I wanted to reduce the size of the code on the page. Anyway, you can then use ToDictionary:

 var dictionary = list.Distinct(x => x.Name).ToDictionary(x => x.Name);

Answer 2: (score: 3)

list.Distinct().ToDictionary(x => x.Name);

Answer 3: (score: 2)

Maybe write your own? For example:

public static class Extensions
{
    public static IDictionary<TKey, TValue> ToDictionary2<TKey, TValue>(
        this IEnumerable<TValue> subjects, Func<TValue, TKey> keySelector)
    {
        var dictionary = new Dictionary<TKey, TValue>();
        foreach(var subject in subjects)
        {
            var key = keySelector(subject);
            if(!dictionary.ContainsKey(key))
                dictionary.Add(key, subject);
        }
        return dictionary;
    }
}

var dictionary = list.ToDictionary2(x => x.Name);

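I haven't tested it, but it should work. (It should probably have a better name than ToDictionary2 :p)

Alternatively, you could implement a DistinctBy method, for example: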

public static IEnumerable<TSubject> DistinctBy<TSubject, TValue>(this IEnumerable<TSubject> subjects, Func<TSubject, TValue> valueSelector)
{
    var set = new HashSet<TValue>();
    foreach(var subject in subjects)
        if(set.Add(valueSelector(subject)))
            yield return subject;
}

var dictionary = list.DistinctBy(x => x.Name).ToDictionary(x => x.Name);
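
On newer frameworks a hand-written helper may not be needed: .NET 6 added Enumerable.DistinctBy to System.Linq with the same shape. A minimal sketch, assuming .NET 6 or later (the sample data and the BuiltInDistinctByDemo name are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObj
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

public static class BuiltInDistinctByDemo
{
    // Assumes .NET 6 or later, where Enumerable.DistinctBy is available.
    public static Dictionary<string, MyObj> ToDictionaryByName(IEnumerable<MyObj> list)
    {
        return list.DistinctBy(x => x.Name).ToDictionary(x => x.Name);
    }

    public static void Main()
    {
        var list = new List<MyObj>
        {
            new MyObj { Id = 1, Name = "Ann", Location = "Oslo" },
            new MyObj { Id = 2, Name = "Bob", Location = "Rome" },
            new MyObj { Id = 3, Name = "Ann", Location = "Lima" },
        };

        Console.WriteLine(ToDictionaryByName(list).Count);  // 2
    }
}
```

If a custom DistinctBy extension with the same signature is also in scope, the call can become ambiguous, so one of the two would need to be removed or renamed.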

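Answer 4: (score: 1)

The problem here is that the ToDictionary extension method does not support multiple values with the same key. One solution is to write a version that does, and use that instead.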

public static Dictionary<TKey,TValue> ToDictionaryAllowDuplicateKeys<TKey,TValue>(
  this IEnumerable<TValue> values,
  Func<TValue,TKey> keyFunc) {
  var map = new Dictionary<TKey,TValue>();
  foreach ( var cur in values ) {
    var key = keyFunc(cur);
    // The indexer overwrites any existing entry, so the last item for each key is kept.
    map[key] = cur;
  }
  return map;
}

Converting to a dictionary is now straightforward:

var map = list.ToDictionaryAllowDuplicateKeys(x => x.Name);
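
Note that assigning through the indexer keeps the last item for each key, whereas the question asks for the first. A small sketch contrasting the two policies (the DuplicateKeyPolicies class and its method names are hypothetical, for illustration only):

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateKeyPolicies
{
    // Later items overwrite earlier ones with the same key.
    public static Dictionary<TKey, TValue> LastWins<TKey, TValue>(
        IEnumerable<TValue> values, Func<TValue, TKey> keyFunc)
    {
        var map = new Dictionary<TKey, TValue>();
        foreach (var cur in values)
        {
            map[keyFunc(cur)] = cur;
        }
        return map;
    }

    // Only the first item seen for each key is stored.
    public static Dictionary<TKey, TValue> FirstWins<TKey, TValue>(
        IEnumerable<TValue> values, Func<TValue, TKey> keyFunc)
    {
        var map = new Dictionary<TKey, TValue>();
        foreach (var cur in values)
        {
            var key = keyFunc(cur);
            if (!map.ContainsKey(key))
            {
                map.Add(key, cur);
            }
        }
        return map;
    }

    public static void Main()
    {
        // Keyed by first letter, so "apple" and "avocado" collide.
        var words = new[] { "apple", "avocado", "banana" };

        Console.WriteLine(LastWins(words, w => w[0])['a']);   // avocado
        Console.WriteLine(FirstWins(words, w => w[0])['a']);  // apple
    }
}
```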

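Answer 5: (score: 0)

If you have different instances of MyObj with the same value for the Name property, the following will work. It yields the first instance found for each duplicate (sorry for the obj / obj2 notation; it's just sample code):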

list.SelectMany(obj => new MyObj[] {list.Where(obj2 => obj2.Name == obj.Name).First()}).Distinct();

Edit: Joren's solution is better, because it doesn't create unnecessary arrays in the process.