R regex: remove all punctuation except apostrophes

Time: 2013-03-06 18:58:51

Tags: regex r

I am trying to remove all punctuation from a string except apostrophes. Here is my example:

str2 <- "this doesn't not have an apostrophe,.!@#$%^&*()"
gsub("[[:punct:,^\\']]"," ", str2 )
# [1] "this doesn't not have an apostrophe,.!@#$%^&*()"

What am I doing wrong?

3 answers:

Answer 0 (score: 16)

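A negative lookahead assertion can take apostrophes out of consideration before they are ever tested as punctuation characters. It needs perl = TRUE, because lookaheads are a PCRE feature that R's default regex engine does not support.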

gsub("(?!')[[:punct:]]", "", str2, perl=TRUE)
# [1] "this doesn't not have an apostrophe"
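As background on the question's attempt: a POSIX bracket expression has no syntax for subtracting one character from a class, and `^` only negates when it is the first character inside the brackets, which is why the lookahead is used instead. The following is a minimal sketch of that approach wrapped in a function; the helper name `strip_punct_keep_apostrophe` and the second test vector are illustrative assumptions, not part of the original answer.

```r
# Hypothetical helper; the name is illustrative and not part of the answer.
strip_punct_keep_apostrophe <- function(x) {
  # (?!') fails wherever the next character is an apostrophe, so
  # [[:punct:]] is only ever tried against the remaining characters.
  # Lookaheads need the PCRE engine, hence perl = TRUE.
  gsub("(?!')[[:punct:]]", "", x, perl = TRUE)
}

str2 <- "this doesn't not have an apostrophe,.!@#$%^&*()"
cleaned <- strip_punct_keep_apostrophe(str2)

# gsub() is vectorised, so a character vector works as well.
cleaned_vec <- strip_punct_keep_apostrophe(c("it's, fine!", "no-punct"))

print(cleaned)
print(cleaned_vec)
```

Because the lookahead consumes no characters, the replacement string can stay empty and the apostrophe in "doesn't" is left untouched.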

Answer 1 (score: 1)

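I'm not sure you can specify "all punctuation except '" directly in a regex. I would instead use a negated character class that keeps letters, ' and spaces and removes everything else (the class below keeps lowercase letters only, which is enough for this example):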

gsub("[^'[:lower:] ]", "", str2) # per Joshua's comment
# [1] "this doesn't not have an apostrophe"
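The pattern above keeps only lowercase letters, apostrophes and spaces, so uppercase letters and digits would be removed as well. A minimal sketch, assuming the input may contain those, swaps `[:lower:]` for `[:alnum:]`; the `mixed` string is a made-up example and not from the question.

```r
# Hypothetical input with uppercase letters and digits (not from the question).
mixed <- "Don't stop at 42, OK?"

# Original class: anything that is not an apostrophe, a lowercase letter
# or a space is removed, so "D", "42", "OK" and the punctuation all go.
lower_only <- gsub("[^'[:lower:] ]", "", mixed)

# Variant: [:alnum:] keeps letters of either case and digits, so only
# the punctuation other than the apostrophe is removed.
alnum_kept <- gsub("[^'[:alnum:] ]", "", mixed)

print(lower_only)
print(alnum_kept)
```

This variant works with R's default regex engine, so perl = TRUE is not required.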

Answer 2 (score: 1)

You can use the strip function from the qdap package:

str2 <- "this doesn't not have an apostrophe,.!@#$%^&*()"

library(qdap)
strip(str2, apostrophe.remove = FALSE, lower.case = FALSE)