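How do I append to a string after an occurrence of a pattern? I know that strings are immutable, but is there a way to do this?
For example, given this input: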
condor t airline airline
eight n 0 flightnumber
nine n 0 flightnumber
five n 0 flightnumber
hallo t 0 sentence
Expected output:
<s> <callsign> <airline> condor </airline>
<flightnumber> eight nine five </flightnumber>
</callsign> hallo </s>
Program:
import re
import string
import csv

out = ''
with open('input.txt', 'r') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        # row is a list of columns, so this comparison with a string is never true
        if (row == "\n"):
            out += "\n"
        # row[0] is the word; the label sits in one of the later columns
        if 'airline' in row:
            print('<callsign> <airline>' + row[0] + '</airline></callsign>')
        if 'sentence' in row:
            print('<s>' + row[0] + '</s>')
        if 'flightnumber' in row:
            print('<flightnumber>' + row[0] + '</flightnumber>')
Output:
<callsign> <airline>condor</airline></callsign>
<flightnumber>eight</flightnumber>
<flightnumber>nine</flightnumber>
<flightnumber>five</flightnumber>
<s>hallo</s>
Is there a way to turn this ^ into the expected output shown above?
Answer 0: (score: 0)
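You create a new string in which the pattern is replaced by itself followed by whatever you want to add, and then replace the original string with the new one.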
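A minimal sketch of that idea using `re.sub`. The helper name `append_after` and the sample strings are made up for illustration; they are not from the question.

```python
import re


def append_after(text, pattern, addition):
    """Return a new string with `addition` placed after every match of `pattern`."""
    # The replacement function returns the matched text followed by the
    # addition, so the match itself is preserved. Using a function (rather
    # than a replacement string) avoids having to escape backslashes in
    # `addition`.
    return re.sub(pattern, lambda m: m.group(0) + addition, text)


line = "<flightnumber>eight"
# Strings are immutable, so the name is simply rebound to the new string.
line = append_after(line, r"eight", "</flightnumber>")
print(line)
```

The original string object is never modified; the variable just ends up pointing at the newly built string.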
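However, judging from your example, you need more than simple replacement: you need to collect the rows tagged flightnumber so that their contents can be merged into a single tag.
I think you need to give more detail about the rules you want to follow in order to get a more detailed answer.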
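As a rough sketch of what collecting the rows could look like, the code below uses `itertools.groupby` to merge consecutive rows that share the same label. The rules are guesses based on the single example in the question: the label is taken from the last column, `airline` and `flightnumber` groups are assumed to belong inside `<callsign>`, and everything else is emitted as bare words. The names `CALLSIGN_LABELS`, `merge_runs` and `to_markup` are hypothetical, and the result is built as one line rather than the three lines shown in the question.

```python
from itertools import groupby

# Assumption: these labels are the ones that go inside <callsign>.
CALLSIGN_LABELS = {"airline", "flightnumber"}


def merge_runs(rows):
    """Join the first column of consecutive rows that share the same last column."""
    merged = []
    for label, group in groupby(rows, key=lambda row: row[-1]):
        merged.append((label, " ".join(row[0] for row in group)))
    return merged


def to_markup(rows):
    """Build a single-line markup string from the labelled rows."""
    parts = []
    in_callsign = False
    for label, words in merge_runs(rows):
        if label in CALLSIGN_LABELS:
            if not in_callsign:
                parts.append("<callsign>")
                in_callsign = True
            parts.append("<{0}> {1} </{0}>".format(label, words))
        else:
            if in_callsign:
                parts.append("</callsign>")
                in_callsign = False
            parts.append(words)
    if in_callsign:
        parts.append("</callsign>")
    return "<s> " + " ".join(parts) + " </s>"


rows = [
    ["condor", "t", "airline", "airline"],
    ["eight", "n", "0", "flightnumber"],
    ["nine", "n", "0", "flightnumber"],
    ["five", "n", "0", "flightnumber"],
    ["hallo", "t", "0", "sentence"],
]
print(to_markup(rows))
```

Note that `groupby` only merges adjacent rows, so two separate runs of flightnumber rows would produce two separate tags.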
Answer 1: (score: 0)
You can do this with string formatting and zip:
txt='''\
condor t airline airline
eight n 0 flightnumber
nine n 0 flightnumber
five n 0 flightnumber
hallo t 0 sentence'''
template='''\
<s> <callsign> <airline> {} </airline>
<flightnumber> {} </flightnumber>
</callsign> {} </s>'''
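# zip(*rows) transposes the rows; the first tuple is the first column.
# list() is needed on Python 3, where zip returns an iterator.
col=list(zip(*[line.split() for line in txt.splitlines()]))[0]
print(template.format(col[0], ' '.join(col[1:4]), col[4]))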
Prints:
<s> <callsign> <airline> condor </airline>
<flightnumber> eight nine five </flightnumber>
</callsign> hallo </s>
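The slice `col[1:4]` fixes the flight number at exactly three words. A possible variation, sketched below, slices relative to both ends instead. It assumes the first row is always the airline and the last row is always the sentence word, which the question does not confirm; the function name `render` is made up for this example.

```python
TEMPLATE = """\
<s> <callsign> <airline> {} </airline>
<flightnumber> {} </flightnumber>
</callsign> {} </s>"""


def render(text):
    """Fill TEMPLATE from whitespace-separated rows of text."""
    # Transpose the rows so that the first tuple holds the first column.
    words = list(zip(*[line.split() for line in text.splitlines()]))[0]
    # Assumption: first word is the airline, last word is the sentence,
    # and every word in between belongs to the flight number.
    return TEMPLATE.format(words[0], " ".join(words[1:-1]), words[-1])


sample = """\
condor t airline airline
seven n 0 flightnumber
two n 0 flightnumber
hallo t 0 sentence"""
print(render(sample))
```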