Python: string formatting where the line length can vary

Time: 2013-03-06 12:01:52

Tags: python string

So I have the following script:

def single_to_tripple(res):
    aa = {'R':'ARG','H':'HIS','K':'LYS','D':'ASP','E':'GLU','S':'SER','T':'THR','N':'ASN','Q':'GLN','C':'CYS','U':'SEC','G':'GLY','P':'PRO','A':'ALA','I':'ILE','L':'LEU','M':'MET','F':'PHE','W':'TRP','Y':'TYR','V':'VAL'}
    return(aa[res])
seq = 'ASALKDYYAIMGVKPTDDLKTIKTAYRRLARKYHPDVSKEPDAEARFKEVAEAWEVLSDEQRRAEYDQMWQHRNDPQFNRQFHHGDGQSFNAEDFDDIFSSIFGQHARQSRQRPATRGHDIEIEVAVFLEETLTEHKRTISYNLPVYNAFGMIEQEIPKTLNVKIPAGVGNGQRIRLKGQGTPGENGGPNGDLWLVIHIAPHPLFDIVGQDLEIVVPVSPWEAALGAKVTVPTLKESILLTIPPGSQAGQRLRVKGKGLVSKKQTGDLYAVLKIVMPPKPDENTAALWQQLADAQSSFDPRKDWGKA'
length = len(seq)

for i,v in enumerate(xrange(0,len(seq),13)):
    line = seq[v:v+13]
    out_line = ('{:<3} '*13).format(single_to_tripple(line[0]),single_to_tripple(line[1]),single_to_tripple(line[2]),single_to_tripple(line[3]),single_to_tripple(line[4]),single_to_tripple(line[5]),single_to_tripple(line[6]),single_to_tripple(line[7]),single_to_tripple(line[8]),single_to_tripple(line[9]),single_to_tripple(line[10]),single_to_tripple(line[11]),single_to_tripple(line[12]))
    print out_line

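I use the script to slice the seq string every 13 elements, and then convert each element of the slice from its single-letter code to the three-letter code via single_to_tripple. My output needs to contain 13 space-separated columns. The problem occurs at the last slice when it does not contain 13 elements. How can I catch this case and still format the string as usual?

I use enumerate in the for loop because I need to add line numbers later.

My current code outputs: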

ALA SER ALA LEU LYS ASP TYR TYR ALA ILE MET GLY VAL 
LYS PRO THR ASP ASP LEU LYS THR ILE LYS THR ALA TYR 
ARG ARG LEU ALA ARG LYS TYR HIS PRO ASP VAL SER LYS 
GLU PRO ASP ALA GLU ALA ARG PHE LYS GLU VAL ALA GLU 
ALA TRP GLU VAL LEU SER ASP GLU GLN ARG ARG ALA GLU 
TYR ASP GLN MET TRP GLN HIS ARG ASN ASP PRO GLN PHE 
ASN ARG GLN PHE HIS HIS GLY ASP GLY GLN SER PHE ASN 
ALA GLU ASP PHE ASP ASP ILE PHE SER SER ILE PHE GLY 
GLN HIS ALA ARG GLN SER ARG GLN ARG PRO ALA THR ARG 
GLY HIS ASP ILE GLU ILE GLU VAL ALA VAL PHE LEU GLU 
GLU THR LEU THR GLU HIS LYS ARG THR ILE SER TYR ASN 
LEU PRO VAL TYR ASN ALA PHE GLY MET ILE GLU GLN GLU 
ILE PRO LYS THR LEU ASN VAL LYS ILE PRO ALA GLY VAL 
GLY ASN GLY GLN ARG ILE ARG LEU LYS GLY GLN GLY THR 
PRO GLY GLU ASN GLY GLY PRO ASN GLY ASP LEU TRP LEU 
VAL ILE HIS ILE ALA PRO HIS PRO LEU PHE ASP ILE VAL 
GLY GLN ASP LEU GLU ILE VAL VAL PRO VAL SER PRO TRP 
GLU ALA ALA LEU GLY ALA LYS VAL THR VAL PRO THR LEU 
LYS GLU SER ILE LEU LEU THR ILE PRO PRO GLY SER GLN 
ALA GLY GLN ARG LEU ARG VAL LYS GLY LYS GLY LEU VAL 
SER LYS LYS GLN THR GLY ASP LEU TYR ALA VAL LEU LYS 
ILE VAL MET PRO PRO LYS PRO ASP GLU ASN THR ALA ALA 
LEU TRP GLN GLN LEU ALA ASP ALA GLN SER SER PHE ASP 
Traceback (most recent call last):
  File "make_seq_res.py", line 10, in <module>
    out_line = ('{:<3} '*13).format(single_to_tripple(line[0]),single_to_tripple(line[1]),single_to_tripple(line[2]),single_to_tripple(line[3]),single_to_tripple(line[4]),single_to_tripple(line[5]),single_to_tripple(line[6]),single_to_tripple(line[7]),single_to_tripple(line[8]),single_to_tripple(line[9]),single_to_tripple(line[10]),single_to_tripple(line[11]),single_to_tripple(line[12]))
IndexError: string index out of range

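3 Answers:

Answer 0 (score: 3)

The fact that you have to type out that many arguments by hand should be a hint that you are doing more work than necessary to produce the output.

Without changing much of your original code, you can do this: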

for i,v in enumerate(xrange(0,len(seq),13)):
    line = seq[v:v+13]
    out_line = ' '.join('{:<3}'.format(single_to_tripple(part)) for part in line)
    print out_line

As Martijn pointed out, the three-letter codes are always three characters long, so you can actually skip the formatting:

out_line = ' '.join(single_to_tripple(part) for part in line)
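
For readers on Python 3, the same idea can be sketched roughly as follows. This is an illustration, not the answerer's code: range and print() replace xrange and the print statement, the mapping is moved to a module-level constant, and the helper name format_rows and the short test sequence are assumptions made for the example.

```python
# Python 3 sketch. AA, format_rows and the short sample sequence are
# illustrative names and data, not part of the original script.
AA = {'R': 'ARG', 'H': 'HIS', 'K': 'LYS', 'D': 'ASP', 'E': 'GLU', 'S': 'SER',
      'T': 'THR', 'N': 'ASN', 'Q': 'GLN', 'C': 'CYS', 'U': 'SEC', 'G': 'GLY',
      'P': 'PRO', 'A': 'ALA', 'I': 'ILE', 'L': 'LEU', 'M': 'MET', 'F': 'PHE',
      'W': 'TRP', 'Y': 'TYR', 'V': 'VAL'}


def single_to_tripple(res):
    """Map a one-letter residue code to its three-letter code."""
    return AA[res]


def format_rows(seq, width=13):
    """Return one space-separated row of three-letter codes per `width` residues."""
    rows = []
    for start in range(0, len(seq), width):
        chunk = seq[start:start + width]  # the last chunk may be shorter
        rows.append(' '.join(single_to_tripple(res) for res in chunk))
    return rows


rows = format_rows('ASALKDYYAIMGVKPTDDLK')  # 20 residues: 13 + 7
for row in rows:
    print(row)
```

Because the join iterates over whatever the slice contains, a short final chunk simply yields a shorter row and never triggers the IndexError.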

Answer 1 (score: 2)

You only need to join the strings together; no formatting is required:

for i,v in enumerate(xrange(0,len(seq),13)):
    line = seq[v:v+13]
    print ' '.join([single_to_tripple(part) for part in line])

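There's no need to overcomplicate things here. :-)

Note that with str.join() it is better to pass a list comprehension than a generator expression (hence the [...]): .join() converts its argument to a list anyway, so the list comprehension is faster.

Result (last 3 lines):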

ILE VAL MET PRO PRO LYS PRO ASP GLU ASN THR ALA ALA
LEU TRP GLN GLN LEU ALA ASP ALA GLN SER SER PHE ASP
PRO ARG LYS ASP TRP GLY LYS ALA
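
As a small illustration of the note on str.join(), the sketch below (Python 3, with a reduced mapping chosen just for the example) shows that the list comprehension and the generator expression produce the same string. Any speed difference is a CPython implementation detail and is likely to be negligible for 13-element lines.

```python
# Reduced, illustrative mapping; the full table is not needed here.
codes = {'A': 'ALA', 'S': 'SER', 'L': 'LEU'}
line = 'ASAL'

# List comprehension: join() receives a ready-made list.
from_list = ' '.join([codes[c] for c in line])

# Generator expression: join() has to materialise the items first.
from_gen = ' '.join(codes[c] for c in line)

print(from_list)
print(from_list == from_gen)
```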

You can also use an itertools-based grouper to simplify the loop:

from itertools import izip_longest

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return izip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

aa = {'R':'ARG','H':'HIS','K':'LYS','D':'ASP','E':'GLU','S':'SER','T':'THR','N':'ASN','Q':'GLN','C':'CYS','U':'SEC','G':'GLY','P':'PRO','A':'ALA','I':'ILE','L':'LEU','M':'MET','F':'PHE','W':'TRP','Y':'TYR','V':'VAL', None: ''}
def single_to_tripple(res):
    return(aa[res])

for line in grouper(13, seq):
    print ' '.join([single_to_tripple(part) for part in line])

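I improved the single_to_tripple() function by moving the mapping out of the function (there is no need to define it each and every time you call it), and added a None key (grouper pads the last group with None values).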
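
On Python 3, izip_longest is named zip_longest. The sketch below is a variation rather than the answer's exact approach: instead of mapping None to an empty string (which leaves trailing spaces on the padded last row once joined), it filters the padding out. The reduced mapping and the helper name format_rows are assumptions made for the example.

```python
from itertools import zip_longest


def grouper(n, iterable, padvalue=None):
    """grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"""
    return zip_longest(*[iter(iterable)] * n, fillvalue=padvalue)


# Reduced, illustrative mapping covering only the sample sequence.
AA = {'A': 'ALA', 'S': 'SER', 'L': 'LEU', 'K': 'LYS', 'D': 'ASP'}


def format_rows(seq, width):
    """Join each group of `width` residues, dropping grouper's None padding."""
    rows = []
    for group in grouper(width, seq):
        rows.append(' '.join([AA[res] for res in group if res is not None]))
    return rows


groups = list(grouper(3, 'ASALK'))  # the last group is padded with None
rows = format_rows('ASALK', 3)
print(groups)
print(rows)
```

Filtering on `res is not None` keeps the mapping free of a sentinel key; which variant is preferable depends on whether trailing whitespace matters for the output file.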

Answer 2 (score: 0)

You can save the length of each line and then use it to build the format string:

def single_to_tripple(res):
    aa = {'R':'ARG','H':'HIS','K':'LYS','D':'ASP','E':'GLU','S':'SER','T':'THR','N':'ASN','Q':'GLN','C':'CYS','U':'SEC','G':'GLY','P':'PRO','A':'ALA','I':'ILE','L':'LEU','M':'MET','F':'PHE','W':'TRP','Y':'TYR','V':'VAL'}
    return(aa[res])

seq = 'ASALKDYYAIMGVKPTDDLKTIKTAYRRLARKYHPDVSKEPDAEARFKEVAEAWEVLSDEQRRAEYDQMWQHRNDPQFNRQFHHGDGQSFNAEDFDDIFSSIFGQHARQSRQRPATRGHDIEIEVAVFLEETLTEHKRTISYNLPVYNAFGMIEQEIPKTLNVKIPAGVGNGQRIRLKGQGTPGENGGPNGDLWLVIHIAPHPLFDIVGQDLEIVVPVSPWEAALGAKVTVPTLKESILLTIPPGSQAGQRLRVKGKGLVSKKQTGDLYAVLKIVMPPKPDENTAALWQQLADAQSSFDPRKDWGKA'
length = len(seq)

for i,v in enumerate(xrange(0,len(seq),13)):
    line = seq[v:v+13]
    length = len(line)
    out_line = ('{:<3} '*length).format(*[single_to_tripple(a) for a in line])
    print out_line
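
Since the question mentions adding line numbers later, here is a hedged Python 3 sketch that combines the length-based format string with enumerate. The helper name numbered_rows, the width of the number column ('{:>3}') and the rstrip() of the trailing space are assumptions for illustration, not requirements stated in the question.

```python
AA = {'R': 'ARG', 'H': 'HIS', 'K': 'LYS', 'D': 'ASP', 'E': 'GLU', 'S': 'SER',
      'T': 'THR', 'N': 'ASN', 'Q': 'GLN', 'C': 'CYS', 'U': 'SEC', 'G': 'GLY',
      'P': 'PRO', 'A': 'ALA', 'I': 'ILE', 'L': 'LEU', 'M': 'MET', 'F': 'PHE',
      'W': 'TRP', 'Y': 'TYR', 'V': 'VAL'}


def numbered_rows(seq, width=13):
    """Return rows of three-letter codes, each prefixed with a 1-based line number."""
    rows = []
    for i, start in enumerate(range(0, len(seq), width), start=1):
        line = seq[start:start + width]
        # Build exactly as many '{:<3} ' fields as there are residues in this line.
        body = ('{:<3} ' * len(line)).format(*[AA[a] for a in line])
        rows.append('{:>3} {}'.format(i, body.rstrip()))
    return rows


rows = numbered_rows('ASALKDYYAIMGVKPT')  # 16 residues: 13 + 3
for row in rows:
    print(row)
```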