Python - Match the last word and remove it from the previous line

Time: 2017-06-14 15:56:08

Tags: python match

I am trying to match the last word of a line and remove that word from the previous line.

Input

import re

# regex to select words with _-
line = re.compile('s/^(\w+(?:[-_]\d+)?)\n(?=.*\1\b)//gm;')

here_text = '''befall_fallen-fell
               closing-round
               Line - Eddying, dizzying, closing-round
               laughter-laugh_laugh
               Line - With soft and drunken laughter-laugh_laugh
               laughter-laugh_laugh
               befall_fallen-fell
               Line - Veiling all that may befall_fallen-fell'''

Output - attempt

befall_fallen-fell
closing-round
Line - Eddying, dizzying, closing-round
laughter-laugh_laugh
Line - With soft and drunken laughter-laugh_laugh
laughter-laugh_laugh
befall_fallen-fell
Line - Veiling all that may befall_fallen-fell

I am not sure how to start.

1 answer:

Answer 0 (score: 0)

The following PCRE regex should work:

match \b(\S+)\b(.*\n.*\b\1)$
replace by \2
flags : [m]ulti-line and [g]lobal

Or, in Python:

re.sub(r'\b(\S+)\b(.*\n.*\b\1)$', r'\2', here_text, flags=re.M)

You can try it out on regex101.
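As a minimal sketch of how that call behaves, the snippet below wraps it in a helper and runs it on a shortened, unindented version of the sample text. The names `PATTERN`, `strip_repeated_word`, and `sample` are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: a word, the rest of its line, a newline,
# then a next line that ends with the same word.
PATTERN = re.compile(r'\b(\S+)\b(.*\n.*\b\1)$', flags=re.M)


def strip_repeated_word(text):
    """Remove from a line the word that ends the following line.

    Hypothetical helper: group 1 (the word) is dropped and
    group 2 (everything after it, including the next line) is kept.
    """
    return PATTERN.sub(r'\2', text)


# Shortened sample without the leading indentation of here_text.
sample = (
    "closing-round\n"
    "Line - Eddying, dizzying, closing-round\n"
    "laughter-laugh_laugh\n"
    "Line - With soft and drunken laughter-laugh_laugh"
)

result = strip_repeated_word(sample)
print(result)
```

The lines that held only the repeated word are left empty rather than deleted, because the replacement keeps the newline captured in group 2.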

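Note that the match consumes the following line as well, so a line whose last word has just been removed from the line before it will not be matched again as a "previous line" itself: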

a
b a
b

will be replaced by

 
b a
b

instead of

 
a
b
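One possible way around this limitation, offered as a sketch and not as part of the original answer, is to move the check on the following line into a lookahead. The lookahead does not consume the next line, so that line can still be matched as a "previous line" in its own right. The names `LOOKAHEAD` and `strip_repeated_word_overlapping` are hypothetical.

```python
import re

# Variant of the answer's pattern: only the word itself is consumed;
# the "next line ends with this word" condition is a lookahead.
LOOKAHEAD = re.compile(r'\b(\S+)\b(?=.*\n.*\b\1$)', flags=re.M)


def strip_repeated_word_overlapping(text):
    """Remove every word that also ends the following line.

    Because the next line is only inspected, not consumed,
    consecutive lines can each lose a word in a single pass.
    """
    return LOOKAHEAD.sub('', text)


overlapping = strip_repeated_word_overlapping("a\nb a\nb")
print(repr(overlapping))
```

With this variant the second line also loses its `b`, although the space that followed it is left behind, so stripping each line afterwards may be needed.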