Remove square brackets and their contents

Time: 2019-03-18 20:18:41

Tags: r regex gsub

There are a few posts covering similar problems:

Remove square brackets from a string vector

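...but regex is hard enough that I can't seem to get anything close to what I want.

I've copied and pasted a large table from HTML, and it is nicely structured. One column has some trailing artifacts.

Here is some sample data: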

df <- structure(list(From = c("3 February 2015[N 4]", "23 February 2017[N 3]", 
                    "17 March 2010[N 1]", "22 July 2016[N 2]", "14 May 1986", "22 February 1995", 
                    "8 June 1995", "12 August 1996"), Until = c("4 November 2015", 
                                                                "17 October 2017", "9 May 2010", "3 January 2017", "21 February 1995", 
                                                                "8 June 1995", "12 August 1996", "13 September 1996")), class = c("spec_tbl_df", 
                                                                                                                                  "tbl_df", "tbl", "data.frame"), row.names = c(NA, -8L), spec = structure(list(
                                                                                                                                    cols = list(Name = structure(list(), class = c("collector_character", 
                                                                                                                                                                                   "collector")), Nat. = structure(list(), class = c("collector_logical", 
                                                                                                                                                                                                                                     "collector")), Club = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                                                                       "collector")), From = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                                                                                                                         "collector")), Until = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                                                                                                                                                                            "collector")), `Duration
                                                                                                                                                (days)` = structure(list(), class = c("collector_double", 
                                                                                                                                                                                      "collector")), `Years in
                                                                                                                                                League` = structure(list(), class = c("collector_character", 
                                                                                                                                                                                      "collector")), Ref. = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                        "collector"))), default = structure(list(), class = c("collector_guess", 
                                                                                                                                                                                                                                                                                              "collector")), skip = 1), class = "col_spec"))

The artifacts take the form of square brackets with a letter and a number inside, e.g. [N1]

When I parse the date column Until, it works fine:

library(dplyr)      # provides %>% and mutate()
library(lubridate)  # provides dmy()
df %>%
  mutate(Until = dmy(Until))

But the column From, which contains the stray artifacts, fails to parse for those entries:

df %>%
  mutate(From = dmy(From))

I first tried a plain-text gsub, even going through the artifacts one at a time:

gsub("[N1]", "", df$From)

...but text elsewhere in the column, outside the artifact entries, got mangled. I assume that's because of the square brackets.
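A minimal sketch of what is probably going on, using two made-up values in the same shape as the From column: in a regular expression, "[N1]" is a character class that matches any single "N" or "1", so every such character is removed. Passing fixed = TRUE makes gsub treat the pattern as literal text instead. Note that the sample data has a space inside the brackets ("[N 1]"), which the literal pattern has to match exactly.

```r
# Illustrative values only, shaped like the From column
x <- c("17 March 2010[N 1]", "12 November 1996")

# "[N1]" is a character class: every single N or 1 is deleted
mangled <- gsub("[N1]", "", x)
print(mangled)
# "7 March 200[ ]" "2 ovember 996"

# fixed = TRUE matches the pattern as plain text, brackets included
literal <- gsub("[N 1]", "", x, fixed = TRUE)
print(literal)
# "17 March 2010" "12 November 1996"
```

The literal approach only removes one specific artifact per call, so it would need one call per distinct footnote marker.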

Then I tried regular expressions, but couldn't get them to work:

gsub("\\[.*?\]/", "", df$From)

gsub("\\[N\d\\]", "", df$From)

Both give the same thing: Error: '\]' is an unrecognized escape in character string starting
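The error comes from R's string parser rather than from the regex engine: inside an R string literal every backslash meant for the regex has to be doubled, so "\]" and "\d" must be written "\\]" and "\\d". The first attempt also carries a trailing "/" that would be matched literally. A hedged sketch of a corrected pattern, assuming every artifact looks like "[N", an optional space, one digit, "]" (the values below are copied from the sample data):

```r
# A few values from the sample From column
from <- c("3 February 2015[N 4]", "23 February 2017[N 3]", "14 May 1986")

# \\[ and \\] are literal brackets, " ?" allows the optional space,
# \\d matches a single digit
cleaned <- gsub("\\[N ?\\d\\]", "", from)
print(cleaned)
# "3 February 2015" "23 February 2017" "14 May 1986"
```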

I really don't mind whether the solution uses gsub or str_replace_all from the tidyverse; I just need to remove/replace [N1], [N2] and so on.
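One possible approach, sketched in base R so it runs without extra packages: strip any bracketed chunk, whatever it contains, together with whitespace in front of it. The pattern name `bracket_pattern` and the small data frame `demo` are placeholders for illustration, not part of the original data.

```r
# Placeholder data frame with the same column names as the sample
demo <- data.frame(
  From  = c("3 February 2015[N 4]", "17 March 2010[N 1]", "14 May 1986"),
  Until = c("4 November 2015", "9 May 2010", "21 February 1995"),
  stringsAsFactors = FALSE
)

# \\s*  optional whitespace before the bracket
# \\[   literal opening bracket
# .*?   as few characters as possible (non-greedy)
# \\]   literal closing bracket
bracket_pattern <- "\\s*\\[.*?\\]"

demo$From <- gsub(bracket_pattern, "", demo$From, perl = TRUE)
print(demo$From)
# "3 February 2015" "17 March 2010" "14 May 1986"
```

The same pattern should also work as the `pattern` argument of stringr's str_replace_all(From, bracket_pattern, ""), and the cleaned column can then be passed to dmy() inside mutate() as was done for Until.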

0 answers:

No answers