I'm trying to print sequences but get a KeyError in Python

Time: 2016-10-17 14:57:47

Tags: python python-2.7 python-3.x bioinformatics

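I only get correct output for the first input file. The first file contains chains A, B, C and D, and its output is correct. The second input file has no C and D chains, yet the code copies the C and D data from the first file into the second file's output. I also cannot extend this list of IDs: chainIDs = ['A','B','C','D']

I want to search chain IDs A through Z, but replacing chainIDs = ['A','B','C','D'] with chainIDs = ['A','B','C','D','E','F','G','H'] raises the error shown below.

If a chain ID is not present in the input file, the code should simply skip it and continue with the other chains.

Error shown: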

f.write(d[chainID][atomIDs[i+j]]+ '\n')  
KeyError: 'E'

Script:

import os


d = {}
atomIDs = ['C4B', 'O4B', 'C1B', 'C2B', 'C3B', 'C4B', 'O4B', 'C1B']
chainIDs = ['A', 'B', 'C', 'D', 'E', 'F']
with open('filename.txt') as pdbline:
    for line in pdbline:
        filenames=line[:8]        
        with open(filenames) as pdbfile:
            for line in map(str.rstrip, pdbfile):                
                if line[:6] != "HETATM":
                    continue
                chainID = line[21:22]
                atomID = line[13:16].strip()
                if chainID not in chainIDs:
                    continue
                if atomID not in atomIDs:
                    continue
                try:
                    d[chainID][atomID] = line
                except KeyError:
                    d[chainID] = {atomID: line}
        n = 4
        for chainID in chainIDs:
            for i in range(len(atomIDs)-n+1):
                for j in range(n):
                    f = open(filenames+'out.pdb', 'a')
                    f.write(d[chainID][atomIDs[i+j]]+ '\n')  
                    f.close()

First input file:

HETATM15207  C4B NAD A 501      47.266 101.038   7.214  1.00 11.48           C  
HETATM15208  O4B NAD A 501      46.466 100.713   8.371  1.00 11.48           O  
HETATM15209  C3B NAD A 501      47.659  99.689   6.567  1.00 11.48           C  
HETATM15211  C2B NAD A 501      46.447  98.835   6.988  1.00 11.48           C  
HETATM15213  C1B NAD A 501      46.221  99.300   8.426  1.00 11.48           C  
HETATM15252  C4B NAD B 501      36.455 115.053  36.671  1.00 11.25           C  
HETATM15253  O4B NAD B 501      35.930 114.469  35.492  1.00 11.25           O  
HETATM15254  C3B NAD B 501      35.307 115.837  37.367  1.00 11.25           C  
HETATM15256  C2B NAD B 501      34.172 114.876  37.039  1.00 11.25           C  
HETATM15258  C1B NAD B 501      34.524 114.613  35.551  1.00 11.25           C  
HETATM15297  C4B NAD C 501      98.229 130.106  18.332  1.00 12.28           C  
HETATM15298  O4B NAD C 501      98.083 131.545  18.199  1.00 12.28           O  
HETATM15299  C3B NAD C 501      99.346 129.675  17.343  1.00 12.28           C  
HETATM15301  C2B NAD C 501     100.220 130.922  17.375  1.00 12.28           C  
HETATM15303  C1B NAD C 501      99.125 132.008  17.317  1.00 12.28           C  
HETATM15342  C4B NAD D 501      77.335 156.939  25.788  1.00 11.99           C  
HETATM15343  O4B NAD D 501      78.705 156.544  25.901  1.00 11.99           O  
HETATM15344  C3B NAD D 501      77.106 158.059  26.824  1.00 11.99           C  
HETATM15346  C2B NAD D 501      78.536 158.632  26.878  1.00 11.99           C  
HETATM15348  C1B NAD D 501      79.351 157.345  26.900  1.00 11.99           C  

Second input file:

HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C  
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O  
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C  
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C  
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C  

Output obtained for the second file:

HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C
HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C
HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C
HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C
HETATM15252  C4B NAD B 501      36.455 115.053  36.671  1.00 11.25           C
HETATM15253  O4B NAD B 501      35.930 114.469  35.492  1.00 11.25           O
HETATM15258  C1B NAD B 501      34.524 114.613  35.551  1.00 11.25           C
HETATM15256  C2B NAD B 501      34.172 114.876  37.039  1.00 11.25           C
HETATM15253  O4B NAD B 501      35.930 114.469  35.492  1.00 11.25           O
HETATM15258  C1B NAD B 501      34.524 114.613  35.551  1.00 11.25           C
HETATM15256  C2B NAD B 501      34.172 114.876  37.039  1.00 11.25           C
HETATM15254  C3B NAD B 501      35.307 115.837  37.367  1.00 11.25           C
HETATM15258  C1B NAD B 501      34.524 114.613  35.551  1.00 11.25           C
HETATM15256  C2B NAD B 501      34.172 114.876  37.039  1.00 11.25           C
HETATM15254  C3B NAD B 501      35.307 115.837  37.367  1.00 11.25           C
HETATM15252  C4B NAD B 501      36.455 115.053  36.671  1.00 11.25           C
HETATM15256  C2B NAD B 501      34.172 114.876  37.039  1.00 11.25           C
HETATM15254  C3B NAD B 501      35.307 115.837  37.367  1.00 11.25           C
HETATM15252  C4B NAD B 501      36.455 115.053  36.671  1.00 11.25           C
HETATM15253  O4B NAD B 501      35.930 114.469  35.492  1.00 11.25           O
HETATM15254  C3B NAD B 501      35.307 115.837  37.367  1.00 11.25           C
HETATM15252  C4B NAD B 501      36.455 115.053  36.671  1.00 11.25           C
HETATM15253  O4B NAD B 501      35.930 114.469  35.492  1.00 11.25           O
HETATM15258  C1B NAD B 501      34.524 114.613  35.551  1.00 11.25           C
HETATM15297  C4B NAD C 501      98.229 130.106  18.332  1.00 12.28           C
HETATM15298  O4B NAD C 501      98.083 131.545  18.199  1.00 12.28           O
HETATM15303  C1B NAD C 501      99.125 132.008  17.317  1.00 12.28           C
HETATM15301  C2B NAD C 501     100.220 130.922  17.375  1.00 12.28           C
HETATM15298  O4B NAD C 501      98.083 131.545  18.199  1.00 12.28           O
HETATM15303  C1B NAD C 501      99.125 132.008  17.317  1.00 12.28           C
HETATM15301  C2B NAD C 501     100.220 130.922  17.375  1.00 12.28           C
HETATM15299  C3B NAD C 501      99.346 129.675  17.343  1.00 12.28           C
HETATM15303  C1B NAD C 501      99.125 132.008  17.317  1.00 12.28           C
HETATM15301  C2B NAD C 501     100.220 130.922  17.375  1.00 12.28           C
HETATM15299  C3B NAD C 501      99.346 129.675  17.343  1.00 12.28           C
HETATM15297  C4B NAD C 501      98.229 130.106  18.332  1.00 12.28           C
HETATM15301  C2B NAD C 501     100.220 130.922  17.375  1.00 12.28           C
HETATM15299  C3B NAD C 501      99.346 129.675  17.343  1.00 12.28           C
HETATM15297  C4B NAD C 501      98.229 130.106  18.332  1.00 12.28           C
HETATM15298  O4B NAD C 501      98.083 131.545  18.199  1.00 12.28           O
HETATM15299  C3B NAD C 501      99.346 129.675  17.343  1.00 12.28           C
HETATM15297  C4B NAD C 501      98.229 130.106  18.332  1.00 12.28           C
HETATM15298  O4B NAD C 501      98.083 131.545  18.199  1.00 12.28           O
HETATM15303  C1B NAD C 501      99.125 132.008  17.317  1.00 12.28           C
HETATM15342  C4B NAD D 501      77.335 156.939  25.788  1.00 11.99           C
HETATM15343  O4B NAD D 501      78.705 156.544  25.901  1.00 11.99           O
HETATM15348  C1B NAD D 501      79.351 157.345  26.900  1.00 11.99           C
HETATM15346  C2B NAD D 501      78.536 158.632  26.878  1.00 11.99           C
HETATM15343  O4B NAD D 501      78.705 156.544  25.901  1.00 11.99           O
HETATM15348  C1B NAD D 501      79.351 157.345  26.900  1.00 11.99           C
HETATM15346  C2B NAD D 501      78.536 158.632  26.878  1.00 11.99           C
HETATM15344  C3B NAD D 501      77.106 158.059  26.824  1.00 11.99           C
HETATM15348  C1B NAD D 501      79.351 157.345  26.900  1.00 11.99           C
HETATM15346  C2B NAD D 501      78.536 158.632  26.878  1.00 11.99           C
HETATM15344  C3B NAD D 501      77.106 158.059  26.824  1.00 11.99           C
HETATM15342  C4B NAD D 501      77.335 156.939  25.788  1.00 11.99           C
HETATM15346  C2B NAD D 501      78.536 158.632  26.878  1.00 11.99           C
HETATM15344  C3B NAD D 501      77.106 158.059  26.824  1.00 11.99           C
HETATM15342  C4B NAD D 501      77.335 156.939  25.788  1.00 11.99           C
HETATM15343  O4B NAD D 501      78.705 156.544  25.901  1.00 11.99           O
HETATM15344  C3B NAD D 501      77.106 158.059  26.824  1.00 11.99           C
HETATM15342  C4B NAD D 501      77.335 156.939  25.788  1.00 11.99           C
HETATM15343  O4B NAD D 501      78.705 156.544  25.901  1.00 11.99           O
HETATM15348  C1B NAD D 501      79.351 157.345  26.900  1.00 11.99           C

Expected output: I want to print, in the following order, only the chain IDs that are present in the current input file:

HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C
HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C
HETATM 2475  C2B NAD A 352      91.477  22.027  50.635  1.00 48.07           C
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C
HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O
HETATM 2473  C3B NAD A 352      90.818  23.341  50.501  1.00 49.46           C
HETATM 2471  C4B NAD A 352      91.432  24.158  51.658  1.00 51.58           C
HETATM 2472  O4B NAD A 352      92.697  23.519  52.005  1.00 47.28           O
HETATM 2477  C1B NAD A 352      92.868  22.416  51.075  1.00 49.66           C

1 answer:

Answer 0 (score: 0)

Just change the output loop to:

# Skip chain IDs that were not found in the input, which avoids the KeyError
for chainID in chainIDs:
    if chainID in d:
        for atom_id in d[chainID]:
            with open(filenames+'out.pdb', 'a') as f:
                f.write(d[chainID][atom_id] + '\n')
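
Note that this loop writes each stored atom once per chain, so it does not reproduce the overlapping four-atom windows shown in the expected output. Also, because d is created once, outside the loop over files, chains collected from an earlier file can still appear in a later file's output. Below is a minimal sketch of one way to handle both points. It assumes each line of filename.txt holds one PDB file name. The helper names collect_atoms, windowed_lines and process are hypothetical, and the output file is opened in 'w' mode rather than 'a', which is also an assumption.

```python
ATOM_IDS = ['C4B', 'O4B', 'C1B', 'C2B', 'C3B', 'C4B', 'O4B', 'C1B']
CHAIN_IDS = list('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
WINDOW = 4  # number of consecutive atoms written per group


def collect_atoms(lines):
    """Return {chain ID: {atom name: HETATM line}} for ONE file's lines."""
    atoms = {}
    for line in lines:
        line = line.rstrip()
        if line[:6] != 'HETATM':
            continue
        chain_id = line[21:22]          # same slices as the question's script
        atom_id = line[13:16].strip()
        if chain_id in CHAIN_IDS and atom_id in ATOM_IDS:
            atoms.setdefault(chain_id, {})[atom_id] = line
    return atoms


def windowed_lines(atoms):
    """List the stored lines in sliding windows of WINDOW atoms per chain."""
    out = []
    for chain_id in CHAIN_IDS:
        chain = atoms.get(chain_id)
        if chain is None:
            continue                    # chain absent from this file: skip it
        for i in range(len(ATOM_IDS) - WINDOW + 1):
            for name in ATOM_IDS[i:i + WINDOW]:
                if name in chain:       # also tolerate a missing atom
                    out.append(chain[name])
    return out


def process(list_path='filename.txt'):
    """Write <name>out.pdb for every file name listed in list_path."""
    with open(list_path) as listing:
        for entry in listing:
            name = entry.strip()        # assumption: one file name per line
            if not name:
                continue
            with open(name) as pdbfile:
                atoms = collect_atoms(pdbfile)   # fresh dict for each file
            with open(name + 'out.pdb', 'w') as out:
                out.writelines(l + '\n' for l in windowed_lines(atoms))
```

Calling process() would then write one output file per listed input, containing only the chains found in that input. Building a new dictionary inside the file loop is what keeps the first file's B, C and D chains out of the second file's output.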