Python regex: find and replace commas between quotes

Asked: 2013-08-26 08:22:51

Tags: python regex

I have a string,

line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'

I want to find the commas between quotes and replace them with ":". The result I am looking for is

line = '12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,'

So far, I have been able to match this pattern with

import re
re.findall(r'("\d),(.+?")', line)

However, I think I should be using something like

re.compile(...something..., line)
re.sub(':', line)

Does anyone know how to do this? Thanks, labjunky

3 answers:

Answer 0: (score: 7)

>>> import re
>>> line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'
>>> re.sub(r'"(\d+),(\d+)"', r'\1:\2', line)
'12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,'

\1 and \2 refer to the matched groups: the digits before and after the comma inside each pair of quotes.
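
Since the question mentions re.compile, here is a minimal sketch of the same substitution with a precompiled pattern. The name quoted_pair is illustrative and not from the original post, and the sketch assumes every quoted field holds exactly two runs of digits separated by a single comma.

```python
import re

line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'

# Precompile the pattern; the two groups capture the digits on each side of the comma.
quoted_pair = re.compile(r'"(\d+),(\d+)"')

# findall returns one tuple of group values per match.
pairs = quoted_pair.findall(line)

# \1 and \2 in the replacement insert groups 1 and 2; the quotes are outside
# the groups, so they are dropped from the result.
result = quoted_pair.sub(r'\1:\2', line)

print(pairs)
print(result)
```

Compiling mostly helps readability when the same pattern is applied to many lines; re.sub called with the pattern string directly gives the same result.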


Non-regex solution:

>>> ''.join(x if i % 2 == 0 else x.replace(',', ':')
            for i, x in enumerate(line.split('"')))
'12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,'
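
The one-liner relies on the fact that splitting on the quote character puts text outside quotes at even indexes and text inside quotes at odd indexes. A spelled-out sketch of the same idea, assuming the quotes are balanced and never escaped (the variable names are illustrative):

```python
line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'

# Splitting on '"' alternates between unquoted and quoted pieces.
parts = line.split('"')

fixed = []
for index, part in enumerate(parts):
    if index % 2 == 1:
        # Odd index: this piece was between a pair of quotes.
        part = part.replace(',', ':')
    fixed.append(part)

# Joining with '' drops the quotes themselves.
result = ''.join(fixed)
print(result)
```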

Answer 1: (score: 0)

import re
line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'
r = ""
for t in re.split(r'("[^"]*")', line):
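    if t.startswith('"'):  # quoted token; also safe when t is an empty string
        t = t.replace(",", ":")[1:-1]  # swap the commas, then strip the quotes
    r += t
print(r)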

Prints:

12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,
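
A note on how this approach behaves: because the pattern passed to re.split is wrapped in a capturing group, the quoted fields are kept in the result list, and an empty string appears when the input starts with a quoted field. Testing the token with startswith rather than indexing its first character handles that case. A small sketch, with colonize_quoted as a hypothetical helper name:

```python
import re

def colonize_quoted(text):
    """Replace commas inside double-quoted fields and drop the quotes."""
    out = []
    # The capturing group makes re.split keep the quoted fields as tokens.
    for token in re.split(r'("[^"]*")', text):
        if token.startswith('"'):
            token = token[1:-1].replace(',', ':')
        out.append(token)
    return ''.join(out)

# A line that begins with a quoted field yields a leading empty token.
tokens = re.split(r'("[^"]*")', '"1,2",3')
print(tokens)                      # ['', '"1,2"', ',3']
print(colonize_quoted('"1,2",3'))  # 1:2,3
```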

Answer 2: (score: 0)

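There is also a generic regex solution for replacing any kind of fixed (or non-fixed) pattern between double (or single) quotes: match the double- or single-quoted substrings with the corresponding pattern, and pass a callable as the replacement argument to re.sub, where you can manipulate the match: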

  1. Replace commas with colons between double quotes and remove the double quotes (the OP's current scenario):

    re.sub(r'"([^"]*)"', lambda x: x.group(1).replace(',', ':'), line)
    # => 12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,

  2. Replace commas with colons between double quotes and keep the double quotes:

    re.sub(r'"[^"]*"', lambda x: x.group(0).replace(',', ':'), line)
    # => 12/08/2013,3,"9:25",42:51,"3:08","12:9","13:9",159,170,"3:19",437,

  3. Replace commas with colons between both single and double quotes, keeping the quotes:

    re.sub(r''''[^']*'|"[^"]*"''', lambda x: x.group(0).replace(',', ':'), '''0,1,"2,3",'4,5',''')
    # => 0,1,"2:3",'4:5',

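Also, if you need to handle escaped single and double quotes, consider using r"'[^\\']*(?:\\.[^\\']*)*'" (for single-quoted substrings), r'"[^\\"]*(?:\\.[^\\"]*)*"' (for double-quoted substrings), or, to handle both at once, r''''[^\\']*(?:\\.[^\\']*)*'|"[^\\"]*(?:\\.[^\\"]*)*"''' instead of the patterns above.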
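
To illustrate the difference the escape-aware pattern makes, here is a rough sketch comparing it with the simple double-quote pattern on a string that contains backslash-escaped quotes. The sample text and the names ESCAPED_DQ, NAIVE_DQ and commas_to_colons are made up for the illustration.

```python
import re

# Escape-aware pattern for double-quoted substrings (from the text above).
ESCAPED_DQ = re.compile(r'"[^\\"]*(?:\\.[^\\"]*)*"')
# Simple pattern that stops at the first double quote it sees.
NAIVE_DQ = re.compile(r'"[^"]*"')

def commas_to_colons(match):
    """Replacement callable: keep the quotes, swap commas for colons."""
    return match.group(0).replace(',', ':')

# One quoted field containing two escaped quotes: "b \"c,d\" e"
text = r'a,"b \"c,d\" e",f'

escaped_result = ESCAPED_DQ.sub(commas_to_colons, text)
naive_result = NAIVE_DQ.sub(commas_to_colons, text)

print(escaped_result)  # a,"b \"c:d\" e",f
print(naive_result)    # a,"b \"c,d\" e",f  (comma inside the field is missed)
```

The simple pattern treats each escaped quote as a field boundary, so the comma ends up outside every match and is left untouched.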
