Question

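I have two files: one contains only keys, and the other contains keys and values. I am trying either to append the corresponding values to the keys file or to create a new output file that holds each key with its corresponding values. I can read the keys and the values individually without any problem, but I am having trouble merging the two. The code below prints only the final values. As far as I can tell, the first for loop runs to completion before the second loop starts, which is why I only get the last item from the keys file and from the values file. How can I solve this in a simple way?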

from collections import defaultdict

with open('input1', 'r') as classified_data:
    with open('input2', 'r') as edge_data:
        with open('output', 'w') as outfile:
            for row in classified_data:
                col = row.strip().split()
                key = col[0], col[1]
                #print(key)
            for row in edge_data:
                col = row.strip().split()
                value = col[2], col[3], col[4]
                #print(value)
            # Only the last key and the last value survive both loops.
            print({key: value})

Input 1:

Input 2:

4945 3545 57 250848.0 4400.84210526 
3584 292 5 1645.0 329.0 
4824 2283 5 16867.0 3373.4 
1715 55 1 681.0 681.0 
5409 2822 2 3221.0 1610.5 
4955 656 6 3348.0 558.0 
4157 487 1 201.0 201.0 
2628 309 2 2466.0 1233.0 
3929 300 2 1742.0 871.0 
3730 489 12 10706.0 892.166666667 
5474 2336 2 1533.0 766.5 
3877 716 10 45028.0 4502.8 
3058 3045 12 17328.0 1444.0

Answer 1

How about collecting the keys and the values in two lists of tuples and zipping them into a dictionary at the end?

keys = []
values = []

# Collect the two key columns from every line of the first file.
with open('input1', 'r') as classified_data:
    for row in classified_data:
        col = row.strip().split()
        keys.append((col[0], col[1]))

# Collect the three value columns from every line of the second file.
with open('input2', 'r') as edge_data:
    for row in edge_data:
        col = row.strip().split()
        values.append((col[2], col[3], col[4]))

# Pair the i-th key with the i-th value.
print(dict(zip(keys, values)))
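
One thing worth keeping in mind with this approach: zip pairs items purely by position, so it only gives the right answer if both files list their rows in the same order. Here is a minimal sketch of that behaviour using in-memory stand-ins for the two files. The key rows are an assumption (the question does not show the contents of input1); the edge rows are copied from the sample data.

```python
import io

# Hypothetical stand-ins for 'input1' and 'input2'. The key rows are assumed;
# the edge rows come from the sample data in the question.
classified_data = io.StringIO("4945 3545\n3584 292\n")
edge_data = io.StringIO(
    "4945 3545 57 250848.0 4400.84210526\n"
    "3584 292 5 1645.0 329.0\n"
)

keys = [tuple(row.split()[:2]) for row in classified_data]
values = [tuple(row.split()[2:5]) for row in edge_data]

# zip matches by position only; it never compares the key columns,
# so a different row order in either file would pair the wrong values.
merged = dict(zip(keys, values))
print(merged[("3584", "292")])
```

If the two files may be ordered differently, or one has rows the other lacks, a lookup keyed on the first two columns is the safer choice.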

Answer 2

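It sounds like you want to perform multiple lookups against the data in the second file, using keys taken from the first file. If that is the case, and assuming the second file fits in memory, I would suggest reading the second file into a dictionary indexed by the keys you will use for the lookups: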

edges = {}
with open('input2', 'r') as edge_data:    
    for row in edge_data:
        col = row.strip().split()
        edges[col[0], col[1]] = col[2], col[3], col[4]

Then perform the lookups by reading the first file and printing the matches:

with open('input1', 'r') as classified_data:
    for row in classified_data:
        key = tuple(row.strip().split())
        print(key, edges.get(key))
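
Since the question also asks for an output file, here is a rough sketch of how the same lookup could write its results instead of printing them. The function name merge_files, the space-separated output format, and the choice to skip keys that have no match are all my assumptions, not something stated in the question.

```python
def merge_files(keys_path, edges_path, out_path):
    """Write each key from keys_path together with its values from edges_path.

    Hypothetical helper: returns the number of lines written.
    """
    edges = {}
    with open(edges_path) as edge_data:
        for row in edge_data:
            col = row.split()
            if len(col) < 5:
                continue  # skip blank or incomplete lines
            edges[col[0], col[1]] = tuple(col[2:5])

    written = 0
    with open(keys_path) as classified_data, open(out_path, "w") as outfile:
        for row in classified_data:
            key = tuple(row.split())
            values = edges.get(key)
            if values is None:
                continue  # assumption: keys without a match are left out
            outfile.write(" ".join(key + values) + "\n")
            written += 1
    return written
```

Calling merge_files('input1', 'input2', 'output') would then produce one line per matched key, with the key columns followed by the three value columns.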

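If you want the keys to match regardless of the order of the two numbers, you can modify the lookup loop to explicitly try both combinations: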

with open('input1', 'r') as classified_data:
    for row in classified_data:
        a, b = row.strip().split()
        print(a, b, edges.get((a, b), edges.get((b, a))))
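
A possible variation on trying both orders is to normalize each pair once when building the dictionary, so that a single lookup is enough. This is only a sketch of a different technique, not what the code above does: the normalize helper is hypothetical, and the rows are taken from the sample data in the question.

```python
def normalize(a, b):
    """Return the pair in a fixed order so (a, b) and (b, a) share one key."""
    return (a, b) if a <= b else (b, a)

# Two rows copied from the sample edge data in the question.
rows = [
    "4945 3545 57 250848.0 4400.84210526",
    "3584 292 5 1645.0 329.0",
]

edges = {}
for row in rows:
    col = row.split()
    edges[normalize(col[0], col[1])] = tuple(col[2:5])

# The pair is stored once, so either order finds it.
print(edges.get(normalize("292", "3584")))
print(edges.get(normalize("3584", "292")))
```

Note that if the edge file contains both (a, b) and (b, a) with different values, this version keeps only whichever row comes last, whereas trying both orders at lookup time prefers the exact match.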

Append to or create a new dictionary
