Non-alphanumeric characters in R

Time: 2014-08-25 03:33:17

Tags: r

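For uppercase letters, lowercase letters, and the 10 digits, I can generate vectors containing all of the letters or all 10 digits, as follows: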

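A <- LETTERS   # built-in constant: "A" to "Z" (R indexes from 1; index 0 is silently dropped)
B <- letters   # built-in constant: "a" to "z"
C <- 0:9       # the ten digits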

I would like to know whether there is a similar facility for the non-alphanumeric characters:

~!@#$%^&*_-+=`|\(){}[]:;"'<>,.?/

I tried

D <- c("~","!","@","#","$","%","^", "&","*","_","-","+","=","`","|","\","(",")","{","}","[","]",":",";",""","'","<",">",",",".","?","/")

but this does not work, apparently because of the \ and " characters.
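The attempt above most likely fails to parse because the backslash and the double quote are not escaped. A minimal sketch of the same vector with both characters escaped (the escaping rules are standard R string syntax; the line layout is only for readability):

```r
# Inside a double-quoted R string, a literal backslash is written "\\"
# and a literal double quote is written "\"".
D <- c("~", "!", "@", "#", "$", "%", "^", "&", "*", "_", "-", "+", "=", "`",
       "|", "\\", "(", ")", "{", "}", "[", "]", ":", ";", "\"", "'", "<", ">",
       ",", ".", "?", "/")

length(D)  # 32
```

Printing D shows the escaped forms "\\" and "\"", but each element is still a single character, which nchar(D) confirms.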

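4 answers:

Answer 0: (score: 3)

Here is another option. Generate all of the ASCII characters, then use a regular expression to filter out everything that is not punctuation.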

ascii <- rawToChar(as.raw(0:127), multiple=TRUE)
ascii[grepl('[[:punct:]]', ascii)]

# [1] "!"  "\"" "#"  "$"  "%"  "&"  "'"  "("  ")"  "*"  "+"  ","  "-"  "."  "/"  ":"  ";"  "<"  "="  ">"  "?"  "@" 
# [23] "["  "\\" "]"  "^"  "_"  "`"  "{"  "|"  "}"  "~" 

Answer 1: (score: 1)

This may be useful. The ASCII character set is arranged in ranges of similar character types (letters, digits, and so on).

http://datadebrief.blogspot.com/2011/03/ascii-code-table-in-r.html
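Because the characters are grouped in ranges, the punctuation can be built directly from its character codes. A minimal sketch, assuming the standard ASCII layout in which printable punctuation occupies codes 33-47, 58-64, 91-96 and 123-126 (the variable names are illustrative):

```r
# The four ASCII code ranges that hold printable punctuation.
punct_codes <- c(33:47, 58:64, 91:96, 123:126)

# Convert each code to a one-character string.
punct <- rawToChar(as.raw(punct_codes), multiple = TRUE)

length(punct)  # 32
```

This avoids regular expressions entirely, at the cost of hard-coding the ranges.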

Answer 2: (score: 1)

It's a bit drawn out, and there is probably a better website (and a better way to get the same result), but

library(XML); library(RCurl)
doc <- htmlParse(getURL("https://wci.llnl.gov/codes/basis/manual/node161.html"))
xp <- xpathSApply(doc, "//tr/td", xmlValue, trim = TRUE) 
xp[nzchar(xp) & nchar(xp) == 1]
#  [1] "!" "[" "%" "," "]" "&" "-" "|" "'" "." "=" "~" "("
# [14] "/" ")" "*" "=" "{" "?" "`" "}" "@" ":" ";" "^" " "

Also, using the website from the other answer produces a more complete result

> URL <- "http://datadebrief.blogspot.com/2011/03/ascii-code-table-in-r.html"
> r <- readLines(URL, warn = FALSE)[780:874]
> s <- sapply(strsplit(r, "\\s+"), "[", 1) 
> s[!s %in% c(letters, LETTERS, 0:9)]
#  [1] ""     "!"    "\""   "#"    "$"    "%"    "&"    "'"    "("   
# [10] ")"    "*"    "+"    ","    "-"    "."    "/"    ":"    ";"   
# [19] "<"    "="    ">"    "?"    "@"    "["    "\\\\" "]"    "^"   
# [28] "_"    "`"    "{"    "|"    "}"    "~" 

...or yeah, just use rawToChar(as.raw(...)) like MrFlick said :-)
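The same %in% filter can also be applied without fetching a web page. A minimal sketch, assuming the printable, non-space ASCII characters (codes 33 to 126) are the candidates of interest:

```r
# Printable, non-space ASCII characters (codes 33 to 126).
printable <- rawToChar(as.raw(33:126), multiple = TRUE)

# Drop letters and digits; whatever is left is non-alphanumeric.
D <- printable[!printable %in% c(letters, LETTERS, 0:9)]

length(D)  # 32
```

Starting at code 33 keeps control characters and the space out of the result, so no empty strings need to be removed afterwards.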

Answer 3: (score: 1)

This answer is just for fun: list the characters you want and use strsplit to produce your vector.

> D <- strsplit('!"#$%&\'()*+,-./\\:;<=>?@[]^_`{|}~', '(?=.)', perl=TRUE)[[1]]
##  [1] "!"  "\"" "#"  "$"  "%"  "&"  "'"  "("  ")"  "*"  "+"  ","  "-"  "."  "/" 
## [16] "\\" ":"  ";"  "<"  "="  ">"  "?"  "@"  "["  "]"  "^"  "_"  "`"  "{"  "|" 
## [31] "}"  "~" 
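A simpler variant of the same idea, as a sketch: splitting on the empty string breaks a string into its individual characters, so no lookahead pattern is needed. The string below is the list from the question, with the backslash and the double quote escaped (the name chars is illustrative):

```r
# The characters from the question, written as one R string literal.
chars <- "~!@#$%^&*_-+=`|\\(){}[]:;\"'<>,.?/"

# Splitting on "" returns one element per character.
D <- strsplit(chars, "")[[1]]

length(D)  # 32
```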

Or filter for the desired characters.

> D <- gsub('[^\\pP\\pS]', '', rawToChar(as.raw(1:127), multiple=TRUE), perl=TRUE)
> D[D != ""]
##  [1] "!"  "\"" "#"  "$"  "%"  "&"  "'"  "("  ")"  "*"  "+"  ","  "-"  "."  "/" 
## [16] ":"  ";"  "<"  "="  ">"  "?"  "@"  "["  "\\" "]"  "^"  "_"  "`"  "{"  "|" 
## [31] "}"  "~" 