Question

So, I have the following txt files:

test1.txt (both groups are on the same line):

(hello)(bye)

test2.txt (the groups are on two separate lines):

(This actually works)
(Amazing!)

I have the following regular expression:

\((.*?)\)

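which captures the text inside each pair of parentheses.

What I want to do is replace the words inside the () in test1.txt with the words inside the () in test2.txt, leaving test1.txt as: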

(This actually works)(Amazing!)

I tried the following code, but it does not seem to work. What am I doing wrong?

import re

pattern = re.compile("\((.*?)\)")

for line in enumerate(open("test1.txt")):
    match = re.finditer(pattern, line)

for line in enumerate(open("test2.txt")):
    pattern.sub(match, line)

I suspect I am making a big mistake; this is one of my first programs in Python.

Answer 1

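OK, there are a few problems:

The finditer method returns match objects, not strings. findall returns a list of the matched group strings.
You have the files the wrong way round. You want to replace the data in test1 with the data from test2, don't you?
enumerate yields tuples, so the variable line is not a line but a (line_number, line_string_content) pair. I use it in the last code block.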
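
To make the first and third points concrete, here is a small sketch that works on an in-memory string instead of the files. The sample string and the variable names are illustrative only.

```python
import re

pattern = re.compile(r"\((.*?)\)")
text = "(hello)(bye)"  # same content as test1.txt, held in a string for illustration

# finditer yields match objects; call .group(1) to get the captured text
found_with_finditer = [m.group(1) for m in pattern.finditer(text)]

# findall returns the captured strings directly
found_with_findall = pattern.findall(text)

# enumerate yields (index, item) pairs, so unpack them in the loop header
numbered_lines = [(number, line) for number, line in enumerate(["first\n", "second\n"])]

print(found_with_finditer)  # ['hello', 'bye']
print(found_with_findall)   # ['hello', 'bye']
print(numbered_lines)       # [(0, 'first\n'), (1, 'second\n')]
```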

So you can start by grabbing the contents:

import re

pattern = re.compile(r"\((.*?)\)")
for line in open("test2.txt"):
    match = pattern.findall(line)
# match now holds ['Amazing!'], from the last line of test2.txt only,
# because the variable is overwritten on every line of the file

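Note: once a pattern is compiled, you can call the re methods directly on the pattern object.

If you want to do it line by line (for a large file?), collect the matches from every line. The other option is to load the whole file and build a multiline regex.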
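
As a quick illustration of that note, the two calls below should give the same result. The sample string is made up for the example.

```python
import re

pattern = re.compile(r"\((.*?)\)")
sample = "(This actually works)"

# Module-level function, passing the compiled pattern as the first argument
via_module = re.findall(pattern, sample)

# Method called on the compiled pattern object itself
via_method = pattern.findall(sample)

print(via_module == via_method)  # True
```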

matches = []
for line in open("test2.txt"):
    matches.extend(pattern.findall(line))
#matches contains the list ['This actually works','Amazing!']
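
For the whole-file alternative mentioned above, a minimal sketch might look like this. The helper name `collect_groups` is hypothetical, and the file content is simulated with a string so the snippet runs on its own. Since every group here sits on a single line, no flag is needed; `re.DOTALL` would only matter if a group could span several lines.

```python
import re

pattern = re.compile(r"\((.*?)\)")

def collect_groups(text):
    """Return every parenthesized group found anywhere in text."""
    return pattern.findall(text)

# With the real file this would be: collect_groups(open("test2.txt").read())
whole_file = "(This actually works)\n(Amazing!)\n"
matches = collect_groups(whole_file)
print(matches)  # ['This actually works', 'Amazing!']
```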

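Then replace the contents of the parentheses with the matches:

for line in open("test1.txt"):
    for i, match in enumerate(pattern.findall(line)):
        # keep the result (strings are immutable); str.replace also avoids
        # treating the old text as a regular expression
        line = line.replace("(" + match + ")", "(" + matches[i] + ")", 1)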

Note: doing this raises an exception (IndexError) if test1.txt contains more (string in parenthesis) groups than test2.txt.

If you want to write an output file, you should do:

with open("fileout.txt", "w") as outfile:
    for line in open("test1.txt"):
        # apply every replacement to the line before writing it out
        for i, match in enumerate(pattern.findall(line)):
            line = line.replace("(" + match + ")", "(" + matches[i] + ")", 1)
        outfile.write(line)
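
One possible way to sidestep the exception noted above is to pair each match with a replacement using zip, which stops at the shorter of the two sequences. The sketch below is an assumption about how that could be done, not part of the original answer; the function name `replace_available` and the sample strings are hypothetical. It rebuilds the line from match positions, so text that has already been replaced is never scanned again.

```python
import re

pattern = re.compile(r"\((.*?)\)")

def replace_available(line, replacements):
    """Swap the content of each group for the next replacement, in order.

    Groups without a corresponding replacement are left unchanged.
    """
    pieces = []
    last_end = 0
    # zip stops as soon as either the matches or the replacements run out
    for m, new in zip(pattern.finditer(line), replacements):
        pieces.append(line[last_end:m.start(1)])  # text up to the group content
        pieces.append(new)                        # the new group content
        last_end = m.end(1)
    pieces.append(line[last_end:])                # everything after the last swap
    return "".join(pieces)

result = replace_available("(hello)(bye)(extra)", ["This actually works", "Amazing!"])
print(result)  # (This actually works)(Amazing!)(extra)
```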

Answer 2

You can use the fact that re.sub() accepts a callable as the replacement, and create an in-place lambda function that steps through your matches to get your result, e.g.:

import re

# slightly changed to use lookahead and lookbehind groups for a proper match/substitution
pattern = re.compile(r"(?<=\()(.*?)(?=\))")
# you can also use r"(\(.*?\))" if you're preserving the brackets

with open("test2.txt", "r") as f:  # open test2.txt for reading
    words = pattern.findall(f.read())  # grabs all the found words in test2.txt

with open("test1.txt", "r+") as f:  # open test1.txt for reading and writing
    # read the content of test1.txt and replace each match with the next `words` list value
    content = pattern.sub(lambda x: words.pop(0) if words else x.group(), f.read())
    f.seek(0)  # rewind the file to the beginning
    f.write(content)  # write the new, 'updated' content
    f.truncate()  # truncate the rest of the file (if any)

So if test1.txt contains:

(hello)(bye)

and test2.txt contains:

(This actually works)
(Amazing!)

running the above script will change test1.txt to:

(This actually works)(Amazing!)

It also accounts for mismatches between the files by replacing only as many matches as were found in test2.txt; any further parenthesized groups in test1.txt are left unchanged.
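
The mismatch behaviour can be checked without touching any files. In this sketch the two strings stand in for the file contents, and the target string with three groups is a hypothetical example chosen to have one group more than the source.

```python
import re

pattern = re.compile(r"(?<=\()(.*?)(?=\))")

source_text = "(This actually works)\n(Amazing!)\n"  # stands in for test2.txt
target_text = "(first)(second)(third)"                # hypothetical content with one extra group

words = pattern.findall(source_text)
# Each match takes the next word; once words is empty, the match is returned unchanged
result = pattern.sub(lambda x: words.pop(0) if words else x.group(), target_text)
print(result)  # (This actually works)(Amazing!)(third)
```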

python regex - How do I replace groups in a txt file with groups from another txt file?
