Filter a dictionary by a specific value

Asked: 2012-07-11 16:18:43

Tags: python dictionary filtering

I have a dictionary that looks like this:

db = {
'ObjectID': ['-1', '6', '10', '13', '13', '13', '-1', '-1', '-1', '-1', '-1', '-1'], 
'Test_value': ['25', '0,28999999', '100,00000000', 'Geometry', '126641,847400000000', '473106,185600000030', ' ', ' ', ' ', ' ', ' ', ' '], 
'Has_error': ['true', 'true', 'true', 'true', 'true', 'true', 'false', 'false', 'false', 'false', 'false', 'false'], 
'Message': ['Table row counts are different', 'ObjectID 6 is different for Field DIKTE_BRUGDEK', 'ObjectID 10 is different for Field RICHTING_1', 'ObjectID 13 is different for Field GEOMETRIE', 'ObjectID 13 is different for Field X', 'ObjectID 13 is different for Field Y', 'Shape types are the same', 'Feature types are the same', 'Feature class extents are the same', 'GeometryDefs are the same', 'Field properties are the same', 'Spatial references are the same'], 
'Identifier': ['Table', 'FeatureClass', 'FeatureClass', 'FeatureClass', 'FeatureClass', 'FeatureClass', 'FeatureClass', 'FeatureClass', 'FeatureClass', 'GeometryDef', 'Field', 'SpatialReference'], 
'Base_value': ['23', '0,19000000', '394,00000000', 'Geometry', '126530,700000000000', '473095,700000000010', ' ', ' ', ' ', ' ', ' ', ' ']}

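I would like to reduce it to a smaller subset based on the entries in the 'ObjectID' list, specifically by filtering on the value -1. My first attempt was to build an index of the matching positions: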

filter_ind = []
for k, v in db.iteritems():
    for i in xrange(len(v)):
        if (k == 'ObjectID') and (int(v[i]) != -1):
            filter_ind.append(i)

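Then I tried to build a new dict, using filter_ind as a sort of filter:

dict((k,v[i]) for i in filter_ind for k, v in db.iteritems())

All I get is the last match, because v is no longer a list:

{'ObjectID':'13','Test_value':'473106,185600000030','Has_error':'true', 'Message':'ObjectID 13 is different for Field Y', 'Identifier':'FeatureClass','Base_value': '473095,700000000010'}

Question: is there another way to filter a dictionary based on a specific value inside it? And if this is considered a reasonably straightforward approach, what would be a smart way to use the index as a filter to create a new dict? Thanks.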

6 answers:

Answer 0 (score: 4)

I think you're overcomplicating this a bit. First, there is no need for nested loops. You can get the indices you need this way:

oids = db['ObjectID']
filter_ind = []
for i, id in enumerate(oids):
    if id != '-1':  # the IDs are stored as strings, so compare with '-1'
        filter_ind.append(i)

Or, more concisely,

filter_ind = [i for i, id in enumerate(oids) if id != '-1']

Then you can use the indices to filter the individual lists:

dict((key, [val[i] for i in filter_ind]) for key, val in db.iteritems())
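To see the two steps working end to end, here is a minimal sketch on a cut-down version of the data. The three-row `db` below is a made-up sample rather than the asker's full dictionary, and `.items()` is used instead of `.iteritems()` on the assumption that the code may need to run on Python 3.

```python
# Hypothetical cut-down sample; the real db has more keys and rows.
db = {
    'ObjectID': ['-1', '6', '13'],
    'Has_error': ['true', 'true', 'false'],
    'Message': ['row counts differ', 'field differs', 'same'],
}

# Step 1: positions whose ObjectID is not the string '-1'.
filter_ind = [i for i, oid in enumerate(db['ObjectID']) if oid != '-1']

# Step 2: keep only those positions in every list.
filtered = dict((key, [val[i] for i in filter_ind]) for key, val in db.items())

print(filter_ind)            # [1, 2]
print(filtered['ObjectID'])  # ['6', '13']
```

The inner list comprehension is the piece the original attempt was missing: it rebuilds a whole list per key, whereas emitting one `(k, v[i])` pair per index lets each later pair overwrite the earlier one for the same key.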

Answer 1 (score: 2)

Here is what I came up with:

new_db = db.copy()
# Boolean mask: True where the row should be kept (ObjectID is not '-1').
# Not strictly necessary, but it makes the code a little more readable.
fltr = [x != '-1' for x in new_db['ObjectID']]

for k, v in new_db.items():
    # Replace each old list with its filtered version.
    new_db[k] = [x for i, x in enumerate(v) if fltr[i]]

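This is very similar to the answer posted by senderle (I think). I use a list of booleans, whereas the other answer uses indices. Mine is probably less efficient, but I find it easier to understand.

Answer 2 (score: 2)

Here is another option: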

from operator import itemgetter

iget = itemgetter(*(i for i, id in enumerate(db['ObjectID']) if int(id) != -1))
result = dict((k, list(iget(v))) for k, v in db.items())
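One caveat with `itemgetter` is worth checking before relying on it: its return type depends on how many indices it is given. With two or more indices it returns a tuple, with exactly one it returns the bare item, and calling `itemgetter()` with no indices raises a `TypeError`. The sketch below illustrates this on a made-up list; `pick` is a hypothetical helper, not part of the answer, that behaves the same for any number of matches.

```python
from operator import itemgetter

values = ['a', 'b', 'c']  # made-up sample data

# Two or more indices: itemgetter returns a tuple of items.
many = itemgetter(0, 2)(values)

# Exactly one index: itemgetter returns the bare item, not a 1-tuple,
# so wrapping it in list(...) would split a string into characters.
one = itemgetter(1)(values)

def pick(seq, indices):
    """Hypothetical helper: always return a list, whatever the match count."""
    return [seq[i] for i in indices]

print(many)               # ('a', 'c')
print(one)                # b
print(pick(values, [1]))  # ['b']
print(pick(values, []))   # []
```

For the data in the question several rows match, so the `itemgetter` version works as posted; the difference only matters when zero or one row passes the filter.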

Answer 3 (score: 1)

If you are using Python 2.7:

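from itertools import compress
# The IDs are strings, so compare with '-1' (x != -1 would always be True).
indexes = [(x != '-1') for x in db['ObjectID']]
# compress() returns an iterator; wrap it in list() to get real lists back.
result = dict((k, list(compress(v, indexes))) for k, v in db.iteritems())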

Answer 4 (score: 1)

This is actually one of the rare occasions where you can use itertools.compress:

from itertools import compress

sels = [x != '-1' for x in db['ObjectID']]
comp = {key: list(compress(vals, sels)) for key, vals in db.items()}
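As a small illustration of what `compress` does here, the sketch below applies one selector list to a single made-up column. The sample lists are assumptions for demonstration only. It also shows why the `list(...)` call matters: `compress` returns a one-shot iterator.

```python
from itertools import compress

# Made-up sample columns, not the asker's data.
object_ids = ['-1', '6', '13', '-1']
messages = ['m0', 'm1', 'm2', 'm3']

# One boolean per row: True means "keep this row".
sels = [x != '-1' for x in object_ids]

lazy = compress(messages, sels)  # an iterator, nothing is computed yet
kept = list(lazy)                # consume it once to get a real list

print(sels)        # [False, True, True, False]
print(kept)        # ['m1', 'm2']
print(list(lazy))  # [] because the iterator is already exhausted
```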

Answer 5 (score: 0)

I like this one:

[dict(zip(db.keys(),e)) for e in zip(*db.values()) if e[0]!='-1']

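It returns a list of dicts, one per row, excluding the rows whose ObjectID is -1. Note that e[0] is the ObjectID only if 'ObjectID' happens to be the first key, which plain dicts did not guarantee before Python 3.7.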
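A variant that does not depend on key order is to build each row dict first and then test it by key. This is a sketch under the assumption that all lists have the same length; the two-column `db` is a made-up sample in which 'ObjectID' is deliberately not the first key.

```python
# Made-up sample; 'ObjectID' is intentionally not the first key.
db = {
    'Has_error': ['true', 'true', 'false'],
    'ObjectID': ['-1', '6', '13'],
}

keys = list(db)

# Transpose the columns into one dict per row, pairing keys and values
# in the same order.
rows = [dict(zip(keys, row)) for row in zip(*(db[k] for k in keys))]

# Filter by key rather than by position.
kept = [row for row in rows if row['ObjectID'] != '-1']

print(kept)
# [{'Has_error': 'true', 'ObjectID': '6'}, {'Has_error': 'false', 'ObjectID': '13'}]
```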