Fixing double line breaks with Notepad++

Time: 2015-05-18 23:52:01

Tags: regex replace notepad++ line-breaks

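I'm preparing some WhatsApp chat logs to render statistics and word clouds. However, my data occasionally contains double line breaks, which mess up the format of the log, and I'd like to know how to fix this automatically.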

13 Mar 18:51 - nicola: mainly he's crap
13 Mar 18:52 - Sebastian K: ... you didn't really dress it up
13 Mar 18:52 - nicola: and he has no natural grace like most cats 

well no i didn't lol
13 Mar 18:52 - nicola: you saw the last video
13 Mar 18:53 - Sebastian K: Stilton jumped onto that wall effortlessly while Ched almost killed himself yea...

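Searching for the empty lines and removing them is an easy fix. However, I am still left with lines that break the date-and-time format: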

13 Mar 18:51 - nicola: mainly he's crap
13 Mar 18:52 - Sebastian K: ... you didn't really dress it up
13 Mar 18:52 - nicola: and he has no natural grace like most cats 
well no i didn't lol
13 Mar 18:52 - nicola: you saw the last video
13 Mar 18:53 - Sebastian K: Stilton jumped onto that wall effortlessly while Ched almost killed himself yea...

Target format:

13 Mar 18:51 - nicola: mainly he's crap
13 Mar 18:52 - Sebastian K: ... you didn't really dress it up
13 Mar 18:52 - nicola: and he has no natural grace like most cats well no i didn't lol
13 Mar 18:52 - nicola: you saw the last video
13 Mar 18:53 - Sebastian K: Stilton jumped onto that wall effortlessly while Ched almost killed himself yea...

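Perhaps the solution lies in exploiting this rule: the line breaks I need to keep follow the pattern:

TEXT *linebreak*
NUMBER (beginning of the date column)

The troublesome ones follow the pattern: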

TEXT *linebreak*
TEXT

How can I fix this using Notepad++?

1 Answer:

Answer 0: (score: 1)

In the search and replace dialog, you can search for this pattern:

\r\n(?!\d)

Enable regular expressions and replace with nothing.
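To check the effect of the pattern \r\n(?!\d) outside Notepad++, the same replacement can be sketched in Python. This is only an illustration: it assumes the log uses CRLF line endings, and the variable names `log` and `fixed` are made up for the example.

```python
import re

# Two lines from the question's sample, with CRLF line endings assumed.
log = (
    "13 Mar 18:52 - nicola: and he has no natural grace like most cats \r\n"
    "well no i didn't lol\r\n"
    "13 Mar 18:52 - nicola: you saw the last video"
)

# Delete every CRLF that is NOT followed by a digit, i.e. every line break
# that does not precede the start of a new timestamp.
fixed = re.sub(r"\r\n(?!\d)", "", log)

print(fixed)
```

The stray continuation line is joined to the message before it, while the break in front of the next timestamp is kept. The trailing space after "cats" in the original data is what separates the two joined parts.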

\r\n searches for a line break consisting of CR and LF. Enable the display of control characters in Notepad++ to see which kind of line breaks you have.
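If you would rather check the line endings programmatically, a small Python sketch like the following counts each kind. The helper name `count_line_endings` and the sample bytes are hypothetical, chosen only for this illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF contains one LF and one CR, so subtract them
    # to get the bare occurrences.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# Made-up sample mixing Windows (CRLF) and Unix (LF) line endings.
sample = b"line one\r\nline two\nline three\r\n"
counts = count_line_endings(sample)
print(counts)
```

If the file turns out to use bare LF endings, the \r in the search pattern has to be dropped for the pattern to match.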

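(?!\d) is a negative lookahead assertion that is true when no digit follows. This works for your example but may fail in some corner cases; you can extend the lookahead to a stricter pattern, for example one that requires two digits when the day is always two digits.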
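One such corner case is a continuation line that itself starts with a digit. The Python sketch below compares the simple lookahead with a stricter one that looks for a whole timestamp prefix. The stricter pattern is an assumption based on the "13 Mar 18:51 - " format in the question, not something taken from the answer, and the chat text is invented for the example.

```python
import re

# Simple version: keep a line break only if a digit follows.
SIMPLE = re.compile(r"\r\n(?!\d)")

# Stricter version (assumed format): keep a line break only if a full
# "D[D] Mon HH:MM - " timestamp prefix follows.
STRICT = re.compile(r"\r\n(?!\d{1,2} [A-Za-z]{3} \d{2}:\d{2} - )")

# Invented sample: the continuation line starts with a digit.
log = (
    "13 Mar 18:52 - nicola: he has no natural grace \r\n"
    "2 cats are too many\r\n"
    "13 Mar 18:53 - Sebastian K: agreed"
)

simple_fixed = SIMPLE.sub("", log)   # leaves the stray break in place
strict_fixed = STRICT.sub("", log)   # joins the continuation line

print(simple_fixed == log)
print(strict_fixed)
```

With the simple pattern the text is unchanged, because "2" looks like the start of a date. The stricter lookahead joins "2 cats are too many" to the previous message and still keeps the break before the real timestamp.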