I have a text file that basically looks like this:
Game #16406772158 starts.\n#Game No : 16406772158\n
....
wins $0.75 USD\n\n\n_
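There are many runs of \n(new text)\n(new text), each followed by \n\n\n. I want to find every instance of this in my text file. My code works when it looks like this, but only for the first instance: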
gameRegex = re.compile(r"""Game #(.+\n)*""")
game = gameRegex.search(totalContent)
When I switch to the findall method, the resulting game variable looks like this:
['Yl9Ui1OhAPyGV0JlCPLRrg wins $0.75 USD\n',
'G72AzGPQLTOWfYoNST1K/g wins $10 USD\n',
'4bSQFjpEWTIcsil7GJkkVA wins $39.99 USD from the main pot with three of a kind, Kings.\n',
'U3xFxCVFfFBt50sL9VgLgQ wins $1.45 USD\n', ..., ]
I am very new to programming and don't know what to do. I want the result to be a list in which each item contains the text of one game, up to the \n\n\n:
game = ['Game #16406772158 starts.\n#Game No : 16406772158\n***** Hand
History for Game 16406772158 *****\n$50 USD NL Texas Hold'em - Wednesday,
July 01, 00:00:01 EDT 2009 ... Yl9Ui1OhAPyGV0JlCPLRrg wins $0.75 USD\n',
'Game #16406772158 starts.\n#Game No : 16406772158\n***** Hand History for
Game 16406772158 *****\n$50 USD NL Texas Hold'em - Wednesday, July 01,
00:00:01 EDT 2009 ... Yl9Ui1OhAPyGV0JlCPLRrg wins $0.75 USD\n']
Answer 0 (score: 1)
I think the pattern you are looking for is this:
(?:(?!\\n\\n\\n).)+\\n\\n\\n
To drop the two extra \n at the end of each list item, use this regex instead:
(?:(?!\\n\\n\\n).)+\\n(?=\\n\\n)
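As a rough illustration of how the two patterns differ, here is a minimal sketch on a shortened, made-up input. The sample string and all variable names are assumptions, not part of the question. Like the patterns above, it assumes the text contains a literal backslash followed by n rather than real newline characters. A lookahead does not consume what it checks, so the second pattern leaves the two trailing separators in place and they become the start of the next match. The third pattern is one possible alternative that consumes the separators but captures only the part before them.

```python
import re

# Hypothetical sample: a raw string, so every \n is a literal backslash + "n".
sample = r"a\nb\n\n\nc\n\n\n"

# First pattern from the answer: the separator is part of each match.
consuming = r"(?:(?!\\n\\n\\n).)+\\n\\n\\n"
# Second pattern from the answer: the last two separators are only looked at.
lookahead = r"(?:(?!\\n\\n\\n).)+\\n(?=\\n\\n)"
# Assumed alternative: capture the body, consume the last two separators.
capturing = r"((?:(?!\\n\\n\\n).)+\\n)\\n\\n"

with_separator = re.findall(consuming, sample)
leftover_prefix = re.findall(lookahead, sample)
body_only = re.findall(capturing, sample)  # findall returns group 1 here

print(with_separator)
print(leftover_prefix)
print(body_only)
```

With this input, the second item of leftover_prefix begins with the two separators that the lookahead left unconsumed, while body_only contains only the record text and its single trailing separator.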
import re
regex = r"(?:(?!\\n\\n\\n).)+\\n(?=\\n\\n)"
test_str = ("Game #16406772158 starts.\\n#Game No : 16406772158\\n\n"
"Yl9Ui1OhAPyGV0JlCPLRrg wins $0.75 USD\\nG72AzGPQLTOWfYoNST1K/g wins $10 USD\\n'4bSQFjpEWTIcsil7GJkkVA wins $39.99 USD from the main pot with three of a kind, Kings.\\n'U3xFxCVFfFBt50sL9VgLgQ wins $1.45 USD\\nwins $0.75 USD\\n\\n\\nGame #16406772158 starts.\\n#Game No : 16406772158\\n....\n"
"wins $0.75 USD\\n\\n\\n\n"
"Game #16406772158 starts.\\n#Game No : 16406772158\\n\n"
"....\n"
"wins $0.75 USD\\n\\n\\n")
result = []
matches = re.finditer(regex, test_str, re.DOTALL)
for match in matches:
    # print("Match was found at {start}-{end}: {match}".format(start=match.start(), end=match.end(), match=match.group()))
    result.append(match.group())
print(result)
Output:
["Game #16406772158 starts.\\n#Game No : 16406772158\\n\nYl9Ui1OhAPyGV0JlCPLRrg wins $0.75 USD\\nG72AzGPQLTOWfYoNST1K/g wins $10 USD\\n'4bSQFjpEWTIcsil7GJkkVA wins $39.99 USD from the main pot with three of a kind, Kings.\\n'U3xFxCVFfFBt50sL9VgLgQ wins $1.45 USD\\nwins $0.75 USD\\n", '\\n\\nGame #16406772158 starts.\\n#Game No : 16406772158\\n....\nwins $0.75 USD\\n', '\\n\\n\nGame #16406772158 starts.\\n#Game No : 16406772158\\n\n....\nwins $0.75 USD\\n']
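If the file contains real newline characters rather than a literal backslash followed by n (an assumption, since the question does not say which), the original pattern may need only a small change. When a pattern has a capturing group, re.findall returns the group's contents instead of the whole match, and a repeated group keeps only its last repetition. That would explain why only the final "wins" line of each game came back. A minimal sketch with a non-capturing group, on made-up sample data with hypothetical variable names:

```python
import re

# Hypothetical sample with real newline characters; each game ends with \n\n\n.
total_content = (
    "Game #1 starts.\n#Game No : 1\nplayerA wins $0.75 USD\n\n\n"
    "Game #2 starts.\n#Game No : 2\nplayerB wins $10 USD\n\n\n"
)

# Pattern from the question: findall returns only the last repetition of group 1.
capturing = re.compile(r"Game #(.+\n)*")
# Same pattern with a non-capturing group: findall returns the whole match.
non_capturing = re.compile(r"Game #(?:.+\n)*")

last_lines = capturing.findall(total_content)
games = non_capturing.findall(total_content)

print(last_lines)
print(games)
```

Each item in games ends with a single \n, because .+ needs at least one character and so cannot match the empty lines that follow a game.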