Question

I have a list like this in Notepad++:

V - Visitors  2009 - S01e11-12.torrent
V - Visitors (2009) S02e04.torrent
V - Visitors (2009) S01e01-12.torrent
V S02e02.torrent
V S02e05.torrent
Valentina S01e01-13.torrent
Valeria Medico Legale S01-02e01-16.torrent
Veep - Season 1 BDMux.torrent
Veep - Season 2 BDMux.torrent
Veep - Season 3.torrent
Veep - Season 4.torrent
Vegas S01e01-21.torrent
Velvet S01e13.torrent
Velvet S01e15.torrent
Vikings.S03E03.torrent
Vikings.S03E04.torrent
Vikings.S03E05.torrent
Velvet_S03e02.torrent
Velvet_S03e03.torrent
Velvet_S03e04.torrent

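I want a regular expression that removes the lines whose first and second words are repeated (Veep - Veep), so that I end up with a list like this: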

V - Visitors  2009 - S01e11-12.torrent
V S02e02.torrent
Valentina S01e01-13.torrent
Valeria Medico Legale S01-02e01-16.torrent
Veep - Season 1 BDMux.torrent
Vegas S01e01-21.torrent
Velvet S01e13.torrent

So, if I have

Veep - Season 1 BDMux.torrent
Veep - Season 2 BDMux.torrent

I only want the first line

Veep - Season 1 BDMux.torrent

Answer 1

A regex find/replace like this:

Open the Replace dialog
Find what: ^([^ _.-]+[ _.-]+([^ _.-]++)?)(.*?\R)(\1.*?\R)+
Replace with: \1\3
Check Regular expression
Click Replace or Replace All

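Explanation

Prerequisite: the file must be sorted, so that similar lines are adjacent
The first part, ^([^ _.-]+[ _.-]+([^ _.-]++)?), matches the first word followed by one or more separators: " ", "_", "." or "-".
- The first word is anything that is not a separator
- The second word (([^ _.-]++)?) is optional, to accommodate the Velvet example
- Because of the parentheses, the first word, the separators and the optional second word are captured into \1; the rest of the line, up to and including the line break, is captured into \3 for later reuse
(.*?\R) captures the rest of the line up to and including the line break (\R)
The last part, (\1.*?\R)+, matches every following line that starts with \1
The match spans all of these lines; they are replaced with \1\3, which rebuilds only the first line and thereby removes the lines that follow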
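
For readers who want to check the logic outside Notepad++, here is a minimal Python sketch of the same idea. It is a line-by-line emulation, not the pattern itself: it builds the prefixes that \1 could end up holding (the two-word prefix first, then the first word with a shrinking separator run) and drops the consecutive lines that start with the first prefix that fits. The names candidate_prefixes and drop_similar are made up for this illustration, and the sketch assumes case-sensitive matching and an input that is already sorted.

```python
import re

# First word, separator run, optional second word (the pieces of group \1).
KEY_RE = re.compile(r"([^ _.-]+)([ _.-]+)([^ _.-]+)?")


def candidate_prefixes(line):
    """Prefixes that \\1 could hold for this line, in the order they are tried."""
    m = KEY_RE.match(line)
    if m is None:
        return []
    word, seps, second = m.group(1), m.group(2), m.group(3)
    prefixes = []
    if second:
        # Preferred: first word + separators + whole second word.
        prefixes.append(word + seps + second)
    # Fallbacks: first word + progressively shorter separator runs.
    for n in range(len(seps), 0, -1):
        prefixes.append(word + seps[:n])
    return prefixes


def drop_similar(lines):
    """Keep the first line of each run of consecutive lines sharing a prefix."""
    kept = []
    i = 0
    while i < len(lines):
        kept.append(lines[i])
        j = i + 1
        for prefix in candidate_prefixes(lines[i]):
            if j < len(lines) and lines[j].startswith(prefix):
                # Skip every following line that starts with this prefix.
                while j < len(lines) and lines[j].startswith(prefix):
                    j += 1
                break
        i = j
    return kept


sample = [
    "Veep - Season 1 BDMux.torrent",
    "Veep - Season 2 BDMux.torrent",
    "Veep - Season 3.torrent",
    "Vegas S01e01-21.torrent",
    "Velvet S01e13.torrent",
    "Velvet S01e15.torrent",
]
for name in drop_similar(sample):
    print(name)
```

On this sample the sketch keeps the Veep Season 1 line, the Vegas line and the first Velvet line. The Velvet pair shows why the second word is optional: "Velvet S01e13" is not a prefix of the next line, so the shorter prefix "Velvet " is used instead.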
