Select certain words from a vector

Time: 2014-08-28 06:40:19

Tags: regex r

I am trying to extract some words from a vector:
#[1] "crossWord"       "stackedBARgraph" "crossBOW"        "topHat"         
#[5] "BowtinG"         "softH"

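From the list above, I need:

 "crossWord", "topHat", "softH"

The rule: the word must not start with an uppercase letter, and it must contain one or more lowercase letters followed by a single uppercase letter. That uppercase letter is either the last character of the word or, if it is not at the end, is followed by lowercase letters.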

2 answers:

Answer 0: (score: 1)

Here is one way:

  #data
  str1 <- c("crossWord", "stackedBARgraph", "crossBOW", "topHat", "BowtinG", "softH")

  # value=TRUE returns the matching elements rather than their indices
  grep("^[a-z]+[A-Z]([a-z]+|\\b)", str1, value=TRUE)
  #[1] "crossWord" "topHat"    "softH"    

Explanation

`^` beginning of string
`[a-z]+` one or more lowercase characters, followed by
`[A-Z]` one uppercase character, followed by
`([a-z]+|\\b)` either one or more lowercase characters or a word boundary (`\b`, written as `\\b` inside an R string)
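
As a rough illustration of the breakdown above, the sketch below checks each word individually with `grepl`. The names `words`, `pattern`, `hits`, and `extra` are introduced here for illustration, and `"fooBarBAZ"` is a made-up input that is not part of the question. Because the pattern has no `$` anchor, anything after the first matching prefix is not checked, so that made-up word is expected to match as well.

```r
# Sketch only: `words`, `pattern`, `hits` and `extra` are illustrative names.
words <- c("crossWord", "stackedBARgraph", "crossBOW", "topHat", "BowtinG", "softH")
pattern <- "^[a-z]+[A-Z]([a-z]+|\\b)"

# grepl() returns one logical per element, so each word can be inspected on its own
hits <- setNames(grepl(pattern, words), words)
print(hits)

# No `$` anchor: text after the matched prefix ("fooBar") is ignored.
# "fooBarBAZ" is a hypothetical input used only to show this behaviour.
extra <- grepl(pattern, "fooBarBAZ")
print(extra)
```

If trailing uppercase runs should be rejected, anchoring the pattern at the end of the string with `$` is one option.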

Answer 1: (score: 1)

You can use this regular expression:

v <- c("crossWord","stackedBARgraph","crossBOW","topHat","BowtinG","softH")

validIdxs <- grep("^[a-z]+(([A-Z][a-z]+)|([A-Z]))$",v)
v[validIdxs]
# [1] "crossWord" "topHat"    "softH" 

Online regex test: http://regex101.com/r/vW2pQ7/1
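
The alternation in this pattern ("one uppercase letter followed by one or more lowercase letters, or one uppercase letter alone") could arguably be collapsed into "one uppercase letter followed by zero or more lowercase letters". The sketch below compares the two on the vector from the question. The names `original`, `simpler`, `res_original`, and `res_simpler` are introduced here and are not part of the answer.

```r
# Sketch only: compares the answer's pattern with a shorter form that should be equivalent.
v <- c("crossWord", "stackedBARgraph", "crossBOW", "topHat", "BowtinG", "softH")

original <- "^[a-z]+(([A-Z][a-z]+)|([A-Z]))$"
simpler  <- "^[a-z]+[A-Z][a-z]*$"   # [a-z]* covers both "followed by lowercase" and "at the end"

# value = TRUE returns the matching strings directly instead of their indices
res_original <- grep(original, v, value = TRUE)
res_simpler  <- grep(simpler, v, value = TRUE)
print(res_simpler)

# Stop with an error if the two patterns disagree on this input
stopifnot(identical(res_original, res_simpler))
```

Both forms are anchored with `^` and `$`, so the whole word has to fit the rule, not just a prefix of it.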