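Create line breaks based on a regular expression in R

Time: 2015-11-15 02:55:57

Tags: r split lines

I am new to R. I extracted some text from the web and pasted it into a text file. The lines look like this: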

    c("HR name as meena in malad west branch first source ltd called me for interview as openings in llyods chat process as banking process she told me 3 rounds of interview and other hr vl ask me these questions.As she said there r openings but when other hr taken my interview she told there r no...", 
"", "", "Sir with due respect from 7 nov 2015, i dont receive my sms alerts from my registered mobile number as 9596159288 . ", 
"Account name Tariq Ahmad Mir", "Branch: WATRIGAM", "Contact: 1954-235307", 
"", "IFSC Code: SBIN0004591 ", "", "", "MICR Code: 193002321...")

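Each of these reviews ends with "...", which acts as the delimiter between reviews. I am trying to join each review into a single line. I tried the following code: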

a <- readLines("banking1.txt", warn = FALSE)
a <- a[sapply(a, nchar) > 0]
a <- paste(a, collapse = ",")

which gave me the following output:

"HR name as meena in malad west branch first source ltd called me for interview as openings in llyods chat process as banking process she told me 3 rounds of interview and other hr vl ask me these questions.As she said there r openings but when other hr taken my interview she told there r no...,Sir with due respect from 7 nov 2015, i dont receive my sms alerts from my registered mobile number as 9596159288 . ,Account name Tariq Ahmad Mir,Branch: WATRIGAM,Contact: 1954-235307,IFSC Code: SBIN0004591 ,MICR Code: 193002321..."

I then tried to split the string on the "..." delimiter:

a <- strsplit(a, "...,")
a <- strsplit(a, "...,")[[1]]
a <- noquote(strsplit(a, "...,")[[1]]) 

and many other similar variations, but the output is not what I expected. What I need is:

HR name as meena in malad west branch first source ltd called me for interview as openings in llyods chat process as banking process she told me 3 rounds of interview and other hr vl ask me these questions.As she said there r openings but when other hr taken my interview she told there r no...
Sir with due respect from 7 nov 2015, i dont receive my sms alerts from my registered mobile number as 9512139288 . Account name Tariq Ahmad Mir Branch: MAGRITAW Contact: 1954-235307 IFSC Code: AVCN0001234 MICR Code: 19300321...

Can anyone help?

1 Answer:

Answer 0: (score: 1)

You can use a negative lookbehind.
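
To illustrate the idea before applying it to the real data, here is a minimal sketch on a toy string. The input `toy` and the names `joined` and `reviews` are made up for illustration and are not from the question. A negative lookbehind `(?<!X)` matches only at positions that are not immediately preceded by `X`, and it requires `perl = TRUE` in base R.

```r
# Toy input (hypothetical): two reviews, each ending in "..."
toy <- "first part\nsecond part...\nnext review\ncontinues..."

# (?<!\\.{3}) is a negative lookbehind: the newline is replaced only when
# the three characters immediately before it are NOT "..."
joined <- gsub("(?<!\\.{3})\\n", " ", toy, perl = TRUE)

# Newlines that follow "..." survive, so they can serve as split points
reviews <- strsplit(joined, "\n", fixed = TRUE)[[1]]
print(reviews)
```

The code below applies the same pattern to the data from the question.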

x <- c("HR name as meena in malad west branch first source ltd called me for interview as openings in llyods chat process as banking process she told me 3 rounds of interview and other hr vl ask me these questions.As she said there r openings but when other hr taken my interview she told there r no...", 
  "", "", "Sir with due respect from 7 nov 2015, i dont receive my sms alerts from my registered mobile number as 9596159288 . ", 
  "Account name Tariq Ahmad Mir", "Branch: WATRIGAM", "Contact: 1954-235307", 
  "", "IFSC Code: SBIN0004591 ", "", "", "MICR Code: 193002321...")
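# Join all elements with newlines, so blank entries become runs of "\n"
y <- paste(x, collapse="\n")
# Replace newlines with a space unless they directly follow "..."
z <- gsub("(?<!\\.{3})\\n+", " ", y, perl=TRUE)
# Split on the newlines that remain (the ones right after "...")
z <- strsplit(z, "\n")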
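
As a possible follow-up, the result of the steps above is a list, and the joined reviews can carry stray whitespace (a leading space after the kept newline, or double spaces where an element already ended in a space). The sketch below, which uses a shortened stand-in for the question's data and a hypothetical variable name `reviews`, shows one way to tidy this up. It is an assumption about the desired final form, not part of the original answer.

```r
# Shortened stand-in for the question's data (hypothetical text, same structure)
x <- c("first review ends here...", "", "",
       "second review starts ", "Branch: WATRIGAM", "",
       "MICR Code: 193002321...")

y <- paste(x, collapse = "\n")
z <- gsub("(?<!\\.{3})\\n+", " ", y, perl = TRUE)

# strsplit() returns a list with one element per input string; take the first
reviews <- strsplit(z, "\n", fixed = TRUE)[[1]]

# Collapse runs of spaces, then strip leading and trailing whitespace
reviews <- trimws(gsub(" +", " ", reviews))

# Print one review per line, without quotes or index prefixes
writeLines(reviews)
```

As a side note, the attempts in the question pass `"...,"` to `strsplit()` as a regular expression, in which each `.` matches any character. To match literal dots, the dots need escaping (`"\\.{3},"`) or the call needs `fixed = TRUE`.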
