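Replacing all matches using re.findall()

Time: 2015-09-19 16:12:07

Tags: python regex

Using re.findall() I managed to return multiple matches of a regex in a string. However, the object I get back is a list of the matches found in the string. That is not what I want.

What I want is to replace every match with something else. I tried to do this with syntax similar to what you would use with re.sub: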

import json
import re

regex = re.compile('([a-zA-Z]\"[a-zA-Z])', re.S)

filepath = "C:\\Python27\\Customer Stuff\\Austin Tweets.txt"

f = open(filepath, 'r')
myfile = re.findall(regex, '([a-zA-Z]\%[a-zA-Z])', f.read())
print myfile

However, this produces the following error:

Traceback (most recent call last):
  File "C:/Python27/Customer Stuff/Austin's Script.py", line 9, in <module>
    myfile = re.findall(regex, '([a-zA-Z]\%[a-zA-Z])', f.read())
  File "C:\Python27\lib\re.py", line 177, in findall
    return _compile(pattern, flags).findall(string)
  File "C:\Python27\lib\re.py", line 229, in _compile
    bypass_cache = flags & DEBUG
TypeError: unsupported operand type(s) for &: 'str' and 'int'

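Can anyone help me with this last bit of syntax? I need to replace all matches with something else in the original Python object.

Edit

Based on the comments and answers I received, here is my attempt to substitute one regex with another: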

import json
import re

regex = re.compile('([a-zA-Z]\"[a-zA-Z])', re.S)
regex2 = re.compile('([a-zA-Z]%[a-zA-Z])', re.S)

filepath = "C:\\Python27\\Customer Stuff\\Austin Tweets.txt"

f = open(filepath, 'r')
myfile = f.read()
myfile2 = re.sub(regex, regex2, myfile)
print myfile

This now produces the following error:

Traceback (most recent call last):
  File "C:/Python27/Customer Stuff/Austin's Script.py", line 11, in <module>
    myfile2 = re.sub(regex, regex2, myfile)
  File "C:\Python27\lib\re.py", line 151, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "C:\Python27\lib\re.py", line 273, in _subx
    template = _compile_repl(template, pattern)
  File "C:\Python27\lib\re.py", line 258, in _compile_repl
    p = sre_parse.parse_template(repl, pattern)
  File "C:\Python27\lib\sre_parse.py", line 706, in parse_template
    s = Tokenizer(source)
  File "C:\Python27\lib\sre_parse.py", line 181, in __init__
    self.__next()
  File "C:\Python27\lib\sre_parse.py", line 183, in __next
    if self.index >= len(self.string):
TypeError: object of type '_sre.SRE_Pattern' has no len()

3 answers:

Answer 0 (score: 11)

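import re

# Match a double quote that sits directly between two letters, e.g. o"s
regex = re.compile('([a-zA-Z]\"[a-zA-Z])', re.S)
myfile =  'foo"s bar'
# For each match, swap its quote for a percent sign and substitute the result back in
myfile2 = regex.sub(lambda m: m.group().replace('"',"%",1), myfile)
print(myfile2)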
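Since the question reads its text from a file, the same substitution can be applied to file contents. This is only a minimal sketch: the helper name replace_quotes_in_file is hypothetical, and the path is whatever file you want to process.

```python
import re

# Same pattern as above: a double quote directly between two letters
QUOTE_BETWEEN_LETTERS = re.compile(r'[a-zA-Z]"[a-zA-Z]')

def replace_quotes_in_file(path):
    """Return the file's text with each letter-quote-letter quote changed to %.

    Hypothetical helper for illustration; it does not modify the file itself.
    """
    with open(path, 'r') as f:
        text = f.read()
    # The function passed to sub() receives each match object in turn
    return QUOTE_BETWEEN_LETTERS.sub(
        lambda m: m.group().replace('"', '%', 1), text)
```

Calling replace_quotes_in_file(filepath) with the path from the question would return the new text, which can then be printed or written back out.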

Answer 1 (score: 2)

As suggested in the comments, use re.sub():

myfile = re.sub(regex, replacement, f.read())

where replacement is the string your matches will be replaced with.
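A minimal sketch of the replacement-string form, assuming the goal is to turn the quote between two letters into a percent sign. Note that the pattern here is regrouped into two capture groups, which differs from the single group in the question, so the replacement string can refer back to the surrounding letters with \1 and \2. The sample text is made up for illustration.

```python
import re

# Two capture groups (an assumption; the question's pattern uses one group)
regex = re.compile(r'([a-zA-Z])"([a-zA-Z])')

# \1 and \2 re-insert the letters captured on either side of the quote
replacement = r'\1%\2'

sample = 'foo"s bar and bar"s foo'
result = regex.sub(replacement, sample)
print(result)
```

The replacement has to be a string (or a function), not a compiled pattern, which is why passing regex2 to re.sub in the question's edit fails.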

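Answer 2 (score: 1)

I find it clearer to use a named function for this kind of replacement rather than a lambda. It makes it easy to apply any number of transformations to the matched text before it is substituted: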

import re

def replace_double_quote(match):
    text = match.group()
    return text.replace('"', '%')

regex = re.compile('([a-zA-Z]\"[a-zA-Z])')
myfile = 'foo"s bar and bar"s foo'
regex.sub(replace_double_quote, myfile)

This returns foo%s bar and bar%s foo. Note that it replaces all matches.
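If replacing every match is not what you want, sub() accepts a count argument, and subn() also reports how many replacements were made. A short sketch reusing the same function and sample text (the variable names first_only, replaced and n are just for illustration):

```python
import re

def replace_double_quote(match):
    # Swap the quote inside the matched text for a percent sign
    return match.group().replace('"', '%')

regex = re.compile('([a-zA-Z]"[a-zA-Z])')
text = 'foo"s bar and bar"s foo'

# count=1 stops after the first match
first_only = regex.sub(replace_double_quote, text, count=1)

# subn() returns a (new_string, number_of_substitutions) tuple
replaced, n = regex.subn(replace_double_quote, text)

print(first_only)
print(replaced, n)
```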