Python 3.x: How to compare two lists containing dictionaries where order doesn't matter

a = {'color': 'red'}
b = {'shape': 'triangle'}
c = {'children': [{'color': 'red'}, {'age': 8},]}

test_a = [a, b, c] 
test_b = [b, c, a]

print(test_a == test_b)  # False
print(set(test_a) == set(test_b))  # TypeError: unhashable type: 'dict'


def areEqual(a, b):
    if len(a) != len(b):
        return False

    for d in a:
        if d not in b:
            return False
    return True

canonicalize(test_a) == canonicalize(test_b)


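import collections.abc


def canonicalize(x):
    if isinstance(x, dict):
        # Turn the dict into a sorted list of (key, value) pairs.
        x = sorted((canonicalize(k), canonicalize(v)) for k, v in x.items())
    elif isinstance(x, collections.abc.Iterable) and not isinstance(x, str):
        # Any other non-string iterable becomes a sorted list.
        x = sorted(map(canonicalize, x))
    else:
        try:
            bool(x < x)  # test for unorderable types like complex
        except TypeError:
            x = repr(x)  # replace with something orderable
    return x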


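This code may do the wrong thing for non-iterable, unorderable objects whose repr does not describe all the data that makes up their value (i.e. whatever == tests). I picked repr because it works on any kind of object and might get it right (it works for complex, for example). It should also work as desired for classes whose repr looks like a constructor call. For classes that have inherited object.__repr__ and so have repr output like <Foo object at 0xXXXXXXXX>, it at least won't crash, though the objects will be compared by identity rather than by value. I don't think there's any truly universal solution; you can add special cases for classes you expect to find in your data if they don't work with repr.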
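To make the three repr cases above concrete, here is a minimal sketch. It repeats canonicalize so that it runs on its own; the classes Point and Opaque are hypothetical and exist only for this illustration.

```python
import collections.abc


def canonicalize(x):
    """Recursively sort containers; fall back to repr for unorderable values."""
    if isinstance(x, dict):
        x = sorted((canonicalize(k), canonicalize(v)) for k, v in x.items())
    elif isinstance(x, collections.abc.Iterable) and not isinstance(x, str):
        x = sorted(map(canonicalize, x))
    else:
        try:
            bool(x < x)  # raises TypeError for unorderable types
        except TypeError:
            x = repr(x)
    return x


class Point:
    """Hypothetical class whose repr looks like a constructor call."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return f"Point({self.x}, {self.y})"


class Opaque:
    """Hypothetical class that keeps the default object.__repr__."""

    def __init__(self, value):
        self.value = value


# complex numbers are unorderable, but their repr captures their value.
complex_ok = canonicalize([1j, 2j]) == canonicalize([2j, 1j])

# A constructor-style repr also captures the value.
point_ok = (canonicalize([Point(1, 2), Point(3, 4)])
            == canonicalize([Point(3, 4), Point(1, 2)]))

# The default repr contains the memory address, so two distinct objects
# holding the same data are treated as different (identity comparison).
o1, o2 = Opaque(1), Opaque(1)
opaque_ok = canonicalize([o1]) == canonicalize([o2])

print(complex_ok, point_ok, opaque_ok)
```

Both o1 and o2 are kept alive on purpose: if the first object were garbage-collected before the second was created, the two could end up at the same address and compare equal by accident.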

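In this case they are the very same dict objects, so you can compare their ids. Note that if you introduce a new dict with identical contents, it will still be treated as different, i.e. d = {'color': 'red'} would be treated as not equal to a.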

sorted(map(id, test_a)) == sorted(map(id, test_b))

As @jsbueno points out, you can also do this with the key keyword argument.

sorted(test_a, key=id) == sorted(test_b, key=id)
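A small sketch of the caveat mentioned for the id-based comparison: a dict that is equal by value but is a different object does not match. The variable names below are only for illustration.

```python
a = {'color': 'red'}
b = {'shape': 'triangle'}
d = {'color': 'red'}  # equal to a by value, but a distinct object

# The same objects in a different order: the sorted id lists match.
same_objects = sorted(map(id, [a, b])) == sorted(map(id, [b, a]))

# d == a is True, yet id(d) != id(a), so the id lists differ.
equal_values = sorted(map(id, [a, b])) == sorted(map(id, [b, d]))

print(same_objects, equal_values)
```

Also note that the key=id variant only orders the elements by identity and then compares them with ==, so for lists holding equal but distinct dicts its result can depend on where the objects happen to live in memory.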

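If the elements in both lists are shallow, the idea of sorting them and then comparing for equality can work. The problem with @Alex's solution is that he only uses "id". If, instead of id, you use a function that sorts dictionaries properly, things should just work: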

def sortkey(element):
    if isinstance(element, dict):
        # Sort the items so the key does not depend on insertion order.
        element = sorted(element.items())
    return repr(element)

sorted(test_a, key=sortkey) == sorted(test_b, key=sortkey)

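(I wrap the key in repr because it converts every element to a string before comparison, which avoids a TypeError when different elements are of unorderable types. That would almost certainly happen in Python 3.x.)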
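The "shallow" condition matters here. The sketch below repeats sortkey so it runs on its own and uses made-up sample data: the sort key only decides the order of the top-level elements, and the final == still compares nested lists position by position.

```python
def sortkey(element):
    if isinstance(element, dict):
        element = sorted(element.items())
    return repr(element)


# Shallow dicts: order of the list and of the dict keys does not matter.
shallow_a = [{'x': 1, 'y': 2}, {'z': 3}]
shallow_b = [{'z': 3}, {'y': 2, 'x': 1}]
shallow_equal = sorted(shallow_a, key=sortkey) == sorted(shallow_b, key=sortkey)

# Nested lists: the inner order is still significant, so this is not equal.
nested_a = [{'children': [{'color': 'red'}, {'age': 8}]}]
nested_b = [{'children': [{'age': 8}, {'color': 'red'}]}]
nested_equal = sorted(nested_a, key=sortkey) == sorted(nested_b, key=sortkey)

# Mixed types sort without a TypeError because the keys are all strings.
mixed_ok = sorted([1, 'a'], key=sortkey) == sorted(['a', 1], key=sortkey)

print(shallow_equal, nested_equal, mixed_ok)
```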


class QuasiUnorderedList(list):
    def __eq__(self, other):
        """This method isn't as inefficient as you think! It runs in O(1 + 2 + 3 + ... + n) time,
        possibly better than recursively freezing/checking all the elements."""
        # Lists of different lengths can never hold the same elements.
        if len(self) != len(other):
            return False
        for item in self:
            for otheritem in other:
                if otheritem == item:
                    break
            else:
                # no break was reached, item not found.
                return False
        return True

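This runs in O(1 + 2 + 3 + ... + n) time, which is quadratic in the length of the list. It is slow for dictionaries of low depth, but can be faster for dictionaries of high depth.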
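One thing a pairwise membership check does not track is how often an element occurs: [a, a, b] and [a, b, b] contain the same distinct elements. A minimal sketch of a variant that consumes matches, so duplicates are counted, is shown below. The function name unordered_equal is an assumption made for this example, and the approach is still quadratic.

```python
def unordered_equal(first, second):
    """Compare two lists ignoring order but respecting multiplicity."""
    remaining = list(second)  # work on a copy so the input is not modified
    for item in first:
        try:
            # list.remove compares with ==, so the items need not be hashable.
            remaining.remove(item)
        except ValueError:
            return False  # item has no unmatched partner in second
    return not remaining  # leftovers mean second had extra elements


a = {'color': 'red'}
b = {'shape': 'triangle'}
c = {'children': [{'color': 'red'}, {'age': 8}]}

print(unordered_equal([a, b, c], [b, c, a]))
print(unordered_equal([a, a, b], [a, b, b]))
```

Like the class above, this compares nested lists with plain ==, so the order inside c['children'] still matters.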


import collections.abc


class FrozenDict(collections.abc.Mapping, collections.abc.Hashable):  # Hashable base = portability
    """Adapted from"""

    def __init__(self, *args, **kwargs):
        self._d = dict(*args, **kwargs)
        self._hash = None

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def __getitem__(self, key):
        return self._d[key]

    def __hash__(self):
        # It would have been simpler and maybe more obvious to
        # use hash(tuple(sorted(self._d.items()))) from this discussion
        # so far, but this solution is O(n). I don't know what kind of
        # n we are going to run into, but sometimes it's hard to resist the
        # urge to optimize when it will gain improved algorithmic performance.
        # Now thread safe by CrazyPython
        if self._hash is None:
            _hash = 0
            for pair in self._d.items():
                _hash ^= hash(pair)
            # Publish the finished value in a single assignment.
            self._hash = _hash
        return self._hash

def freeze(obj):
    if type(obj) in (str, int, ...):  # replace ... with the other immutable atoms you store in your data structure
        return obj
    elif issubclass(type(obj), list):  # ugly but needed
        # frozenset rather than set: the result has to be hashable itself
        return frozenset(freeze(item) for item in obj)
    elif issubclass(type(obj), dict):  # for defaultdict, etc.
        return FrozenDict({key: freeze(value) for key, value in obj.items()})
    else:
        raise NotImplementedError("freeze() doesn't know how to freeze " + type(obj).__name__ + " objects!")

class FreezableList(list, collections.abc.Hashable):  # requires "import collections.abc"
    _stored_freeze = None  # cached result of freeze(self)
    _hashed_self = None    # shallow snapshot of the contents the cache was built from

    def _frozen(self):
        # Reuse the cache only while the contents still match the snapshot.
        # list.__eq__ is called directly so this does not recurse into __eq__ below.
        # The snapshot is shallow: mutating a nested dict in place is not detected.
        if self._stored_freeze is None or not list.__eq__(self._hashed_self, self):
            self._stored_freeze = freeze(self)
            self._hashed_self = list(self)
        return self._stored_freeze

    def __eq__(self, other):
        return self._frozen() == freeze(other)

    def __hash__(self):
        return hash(self._frozen())

class UncachedFreezableList(list, collections.abc.Hashable):  # requires "import collections.abc"
    def __eq__(self, other):
        """No caching version of __eq__. May be faster.
        Don't forget to get rid of the declarations at the top of the class!
        Considerably more elegant."""
        return freeze(self) == freeze(other)

    def __hash__(self):
        """No caching version of __hash__. See the notes in the docstring of __eq__."""
        return hash(freeze(self))
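Freezing a list into a set-like object ignores how many times an element occurs. As a hedged alternative sketch (not the code above, and using only the standard library), the same "make it hashable, then compare" idea can be written with tagged tuples and collections.Counter so that duplicates are preserved. The name freeze_counted is hypothetical, and anything that is not a dict, list or tuple is assumed to be an immutable, hashable atom.

```python
from collections import Counter


def freeze_counted(obj):
    """Return a hashable, order-insensitive form of obj that keeps duplicates."""
    if isinstance(obj, dict):
        # Tag the result so a dict can never collide with a list.
        return ('dict', frozenset((key, freeze_counted(value))
                                  for key, value in obj.items()))
    if isinstance(obj, (list, tuple)):
        # Count the frozen items: (item, count) pairs record multiplicity.
        counts = Counter(freeze_counted(item) for item in obj)
        return ('list', frozenset(counts.items()))
    return obj  # assumption: an immutable, hashable atom such as str or int


a = {'color': 'red'}
b = {'shape': 'triangle'}
c = {'children': [{'color': 'red'}, {'age': 8}]}
c_reordered = {'children': [{'age': 8}, {'color': 'red'}]}

# Order is ignored at every nesting level.
print(freeze_counted([a, b, c]) == freeze_counted([c_reordered, a, b]))
# Multiplicity is still respected.
print(freeze_counted([a, a, b]) == freeze_counted([a, b, b]))
```

Because the frozen form is hashable, it can also be used as a dictionary key or stored in a set, for example to deduplicate many such lists.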
