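So, I have the following txt files:
test1.txt (everything is on the same line.)
(hello)(bye)
test2.txt (it is split across two lines.)
(This actually works)
(Amazing!)
And I have the following regular expression:
\((.*?)\)
which selects the text inside each pair of parentheses.
What I want to do is replace the text inside the () in test1.txt with the text inside the () from test2.txt, leaving test1.txt as:
(This actually works)(Amazing!)
I tried the following code, but it doesn't seem to work. What am I doing wrong?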
import re
pattern = re.compile("\((.*?)\)")
for line in enumerate(open("test1.txt")):
    match = re.finditer(pattern, line)
for line in enumerate(open("test2.txt")):
    pattern.sub(match, line)
I think I'm making a big mistake somewhere; this is one of my first programs in Python.
Answer 0 (score: 2)
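OK, there are a few problems:
The finditer method returns match objects, not strings. findall returns a list of the matched group strings.
In for line in enumerate(open("test1.txt")), line is not a line but a (line_number, line_string_content) tuple. I use enumerate in the last code block.
So you can try grabbing the content first: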
pattern = re.compile(r"\((.*?)\)")
for line in open("test2.txt"):
    match = pattern.findall(line)
#match contains the list ['Amazing!'] from the last line of test2, your variable match is overwritten on each line of the file...
Note: if you compile the pattern, you can call the re methods directly on the compiled object.
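To make the points above concrete, here is a small sketch run on an in-memory string instead of the files. The sample string is taken from the question; the variable names are only illustrative.

```python
import re

pattern = re.compile(r"\((.*?)\)")
line = "(hello)(bye)"

# finditer yields match objects; call .group(1) to get the captured text
match_objects = list(pattern.finditer(line))
captured = [m.group(1) for m in match_objects]

# findall returns the captured strings directly
found = pattern.findall(line)

# enumerate yields (index, item) tuples, not bare lines
numbered = list(enumerate(["first line\n", "second line\n"]))

# the methods of a compiled pattern mirror the module-level functions
same = re.findall(pattern, line) == pattern.findall(line)

print(captured)
print(found)
print(numbered[0])
print(same)
```

This is why passing the loop variable of enumerate(open(...)) straight to a regex method fails: it is a tuple, not a string.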
Another option would be to load the whole file and build a multi-line regex. If you want to do it line by line (large file?):
matches = []
for line in open("test2.txt"):
    matches.extend(pattern.findall(line))
#matches contains the list ['This actually works','Amazing!']
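The whole-file alternative could look roughly like the sketch below. It uses a string as a stand-in for open("test2.txt").read(), and the re.DOTALL variant is my assumption about what a "multi-line" pattern would mean here; it only matters if a parenthesized string itself contains a newline.

```python
import re

pattern = re.compile(r"\((.*?)\)")

# stand-in for open("test2.txt").read(); content taken from the question
content = "(This actually works)\n(Amazing!)\n"

# findall scans the whole string, so groups on every line come back in one call
matches = pattern.findall(content)
print(matches)

# without re.DOTALL, "." does not match a newline, so a group split across lines is missed
split_text = "(split\nacross)"
plain_result = pattern.findall(split_text)
dotall_result = re.compile(r"\((.*?)\)", re.DOTALL).findall(split_text)
print(plain_result, dotall_result)
```

For the files in the question the plain pattern is enough, since every parenthesized string sits on a single line.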
Then replace the contents of the parentheses with the matches:
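for line in open("test1.txt"):
    for i, match in enumerate(pattern.findall(line)):
        # re.sub returns a new string, so keep the result; escape the old text so it is matched literally
        line = re.sub(re.escape(match), matches[i], line, count=1)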
Note: if test1.txt contains more (string in parenthesis) groups than test2.txt, doing this will raise an exception...
If you want to write an output file, you should do:
with open('fileout.txt', 'w') as outfile:
    for line in open("test1.txt"):
        # same replacement as above, writing each updated line to the output file
        for i, match in enumerate(pattern.findall(line)):
            line = re.sub(re.escape(match), matches[i], line, count=1)
        outfile.write(line)
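The caveat about test1.txt having more parenthesized strings than test2.txt can be reproduced with in-memory data. The sketch below is only an illustration: the sample values are made up, and pairing with zip is one possible guard, not something the answer prescribes.

```python
import re

pattern = re.compile(r"\((.*?)\)")
matches = ["This actually works"]  # only one replacement available
line = "(hello)(bye)"              # but two groups to replace

found = pattern.findall(line)

# indexing matches by position fails as soon as the replacements run out
try:
    replacements = [matches[i] for i, _ in enumerate(found)]
except IndexError:
    replacements = None

# zip stops at the shorter sequence, so extra groups are simply left unpaired
safe_pairs = list(zip(found, matches))

print(replacements)
print(safe_pairs)
```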
Answer 1 (score: 0)
You can use the ability of re.sub() to accept a callable as the replacement, and create an on-the-spot lambda function that goes through your matches, to get the result you want, e.g.:
import re
# slightly changed to use lookahead and lookbehind groups for a proper match/substitution
pattern = re.compile(r"(?<=\()(.*?)(?=\))")
# you can also use r"(\(.*?\))" if you're preserving the brackets
with open("test2.txt", "r") as f:  # open test2.txt for reading
    words = pattern.findall(f.read())  # grabs all the found words in test2.txt
with open("test1.txt", "r+") as f:  # open test1.txt for reading and writing
    # read the content of test1.txt and replace each match with the next `words` list value
    content = pattern.sub(lambda x: words.pop(0) if words else x.group(), f.read())
    f.seek(0)  # rewind the file to the beginning
    f.write(content)  # write the new, 'updated' content
    f.truncate()  # truncate the rest of the file (if any)
So if test1.txt contains:
(hello)(bye)
and test2.txt contains:
(This actually works) (Amazing!)
executing the above script will change test1.txt to:
(This actually works)(Amazing!)
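A side note on the seek / write / truncate sequence in the script: the truncate call matters when the new content is shorter than the old one. The sketch below shows this on a temporary file; rewrite_in_place and demo are hypothetical helpers written for this illustration, and the replacement text "(a)(b)" is made up.

```python
import os
import tempfile

def rewrite_in_place(path, new_content, truncate):
    """Hypothetical helper mirroring the read / seek(0) / write sequence of the script."""
    with open(path, "r+") as f:
        f.read()              # read the old content
        f.seek(0)             # rewind to the beginning
        f.write(new_content)  # overwrite from the start
        if truncate:
            f.truncate()      # drop whatever is left of the old content

def demo(truncate):
    # create a temporary file holding the test1.txt content from the question
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("(hello)(bye)")
        rewrite_in_place(path, "(a)(b)", truncate)
        with open(path) as f:
            return f.read()
    finally:
        os.remove(path)

without_truncate = demo(truncate=False)
with_truncate = demo(truncate=True)
print(repr(without_truncate))
print(repr(with_truncate))
```

Without truncate, the tail of the old content stays in the file after the shorter new text.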
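It also accounts for a mismatch between the files by replacing only as many matches as were found in test2.txt (e.g. if test1.txt contains more parenthesized strings than test2.txt, the extra ones are left unchanged).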
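The same idea can be tried without touching any files. This is a minimal sketch, assuming the pattern from the script; replace_groups is a hypothetical helper name and the second sample input is made up to show the mismatch case.

```python
import re

pattern = re.compile(r"(?<=\()(.*?)(?=\))")

def replace_groups(text, words):
    """Hypothetical helper: replace each parenthesized string in text with the next item of words."""
    queue = list(words)  # copy so the caller's list is not consumed
    # the callable is invoked once per match; when the queue is empty the match is kept as-is
    return pattern.sub(lambda m: queue.pop(0) if queue else m.group(), text)

full = replace_groups("(hello)(bye)", ["This actually works", "Amazing!"])
partial = replace_groups("(a)(b)(c)", ["x"])
print(full)
print(partial)
```

Copying the list inside the helper is a design choice of this sketch; the script above pops from words directly, which empties it after one run.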