How to collapse nodes using Python

Time: 2015-10-14 14:11:04

Tags: python

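I have the following code, but there seems to be a bug somewhere. I get output (b) but need output (c); see below. Can anyone see where I am going wrong? All files are tab-delimited.

Code: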

import sys

outfile_name = sys.argv[-1]
filename1 = sys.argv[-2]
filename2 = sys.argv[-3]
fileIn1 = open(filename1, "r")
fileIn2 = open(filename2, "r")
fileOut = open(outfile_name, "w")

dict = {}

a = open(filename1)
b = open(filename2)

for line in a:
    words = line.split("\t")
    if len(words) != 1:
        target = words[0]
        for word in words[1:]:
            dict[word] = target

for line in b:
    words = line.split("\t")
    if words[0] in dict.keys() and words[1] in dict.keys():
        fileOut.write(dict[words[0]] + "\t" + dict[words[1]] + "\n")
    elif words[0] in dict.keys() and words[1] not in dict.keys():
        fileOut.write(dict[words[0]] + "\t" + words[1] + "\n")
    elif words[0] not in dict.keys() and words[1] in dict.keys():
        fileOut.write(words[0] + "\t" + dict[words[1]] + "\n")
    elif words[0] not in dict.keys() and words[1] not in dict.keys():
        fileOut.write(words[0] + "\t" + words[1] + "\n")


fileOut.close()

File 1:

Area_1 Area_2
A   B
A   C
A   D
D   B
D   C
L   B
L   C
L   A
D   L
K   A
K   B
K   C
K   D
K   L
D   P
D   R
L   P
L   R
K   P
K   R
A   H
D   H
L   H
K   H
B   P
B   R
R   P
A   I
D   I
I   L
I   K
C   H
I   H
C   H
J   K
J   X
J   Y
J   Z
K   X
K   Y
Y   Z
K   Z
X   Y
X   Z
M   G
N   T
O   S
S   Q

File 2:

Incident_00000001       A       D       L       K
Incident_00000002       B       P       R
Incident_00000003       C       F       W
Incident_00000004       J       I
M
N
O
Incident_00000005       Q       S
X
Y
Z
G
T

Output (b) - the incorrect output I get:

Area_1  Area_2

Incident_00000001   B

Incident_00000001   C

Incident_00000001   D

Incident_00000001   B

Incident_00000001   C

Incident_00000001   B

Incident_00000001   C

Incident_00000001   A

Incident_00000001   L

K   A

K   B

K   C

K   D

K   L

Incident_00000001   P

Incident_00000001   Incident_00000002
Incident_00000001   P

Incident_00000001   Incident_00000002
K   P

K   Incident_00000002
Incident_00000001   H

Incident_00000001   H

Incident_00000001   H

K   H

Incident_00000002   P

Incident_00000002   Incident_00000002
R   P

Incident_00000001   Incident_00000003
Incident_00000001   Incident_00000003
I   L

I   Incident_00000004
Incident_00000003   H

I   H

Incident_00000003   H

Incident_00000004   Incident_00000004
Incident_00000004   X

Incident_00000004   Y

Incident_00000004   Z

K   X

K   Y

Y   Z

K   Z

X   Y

X   Z

M   G

N   T

O   S

Incident_00000005   Incident_00000005

The result I expect (output (c)) is:

Area_1  Area_2
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000003
Incident_00000001   Incident_00000001
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000003
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000003
Incident_00000001   Incident_00000001
Incident_00000001   Incident_00000001
Incident_00000001   Incident_00000001
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000003
Incident_00000001   Incident_00000001
Incident_00000001   Incident_00000001
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000002
Incident_00000001   Incident_00000002
Incident_00000001   H
Incident_00000001   H
Incident_00000001   H
Incident_00000001   H
Incident_00000002   Incident_00000002
Incident_00000002   Incident_00000002
Incident_00000002   Incident_00000002
Incident_00000001   Incident_00000004
Incident_00000001   Incident_00000004
Incident_00000004   Incident_00000001
Incident_00000004   Incident_00000001
Incident_00000003   H
Incident_00000004   H
Incident_00000003   H
Incident_00000004   Incident_00000001
Incident_00000004   X
Incident_00000004   Y
Incident_00000004   Z
Incident_00000001   X
Incident_00000001   Y
Y   Z
Incident_00000001   Z
X   Y
X   Z
M   G
N   T
O   Incident_00000005
Incident_00000005   Incident_00000005

1 answer:

Answer 0 (score: 1)

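import csv

# Placeholder paths: filename1 is the edge list (Area_1/Area_2 pairs),
# filename2 is the incident file (an incident ID followed by its nodes).
filename1 = 'filename1'
filename2 = 'filename2'

# Map every node to the incident that contains it.
graph = {}
with open(filename2, newline='') as infile:
    for row in csv.reader(infile, delimiter='\t'):
        if len(row) < 2: continue  # blank line, or a node that belongs to no incident
        incident, *rest = row
        for node in rest:
            graph[node] = incident

# Rewrite each edge, replacing any node that belongs to an incident with that incident.
with open(filename1, newline='') as infile, open('path/to/output', 'w', newline='') as outfile:
    writer = csv.writer(outfile, delimiter='\t', lineterminator='\n')
    for source, dest in csv.reader(infile, delimiter='\t'):
        if source in graph: source = graph[source]
        if dest in graph: dest = graph[dest]
        writer.writerow([source, dest])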
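
As for why the script in the question produces output (b): judging from the blank lines and the unmapped second column, a likely cause is that line.split("\t") leaves the trailing newline attached to the last field. The dictionary then holds keys such as "K\n" and "R\n", and the second column is looked up as "B\n" rather than "B". The sketch below reproduces this on a few lines copied from the question and shows that stripping the newline first is enough. The helper names build_mapping and collapse and the strip_newline flag are hypothetical, and the in-memory lists stand in for the real files.

```python
def build_mapping(incident_lines, strip_newline=True):
    """Map each node to its incident ID; single-field lines are skipped."""
    mapping = {}
    for line in incident_lines:
        if strip_newline:
            line = line.rstrip("\n")
        words = line.split("\t")
        if len(words) != 1:
            target = words[0]
            for word in words[1:]:
                mapping[word] = target
    return mapping


def collapse(edge_lines, mapping, strip_newline=True):
    """Replace each endpoint of an edge with its incident when one is known."""
    out = []
    for line in edge_lines:
        if strip_newline:
            line = line.rstrip("\n")
        source, dest = line.split("\t")
        out.append(mapping.get(source, source) + "\t" + mapping.get(dest, dest) + "\n")
    return out


# A few tab-delimited lines taken from the question's two files.
incidents = ["Incident_00000001\tA\tD\tL\tK\n", "Incident_00000002\tB\tP\tR\n", "M\n"]
edges = ["A\tB\n", "K\tA\n", "D\tR\n", "A\tH\n"]

# Without stripping, the last node of each incident is stored with its newline.
buggy = collapse(edges, build_mapping(incidents, strip_newline=False), strip_newline=False)
# With stripping, every node matches and each output line ends in a single newline.
fixed = collapse(edges, build_mapping(incidents), strip_newline=True)

print(repr(buggy))
print(repr(fixed))
```

In the original script the equivalent change would be to call line.rstrip("\n").split("\t") in both loops; the rest of the logic can stay as it is.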