What is the R equivalent of SQL's LIKE 'description%' statement?

Asked: 2010-08-22 02:17:04

Tags: sql r

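I'm not sure how else to phrase this question, but I want to search for a term within several string elements. Here is what my code looks like (it is wrong, though):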

inplay = vector(length=nrow(des))
for (ii in 1:nrow(des)) {
 if (des[ii] = 'In play%')
  inplay[ii] = 1
 else inplay[ii] = 0
}

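des is a vector that stores strings such as "Swinging Strike", "In play (run(s))", "In play (out(s) recorded)", and so on. What I want to store in inplay is a vector of 1s and 0s corresponding to the des vector: a 1 means the des value matches 'In play%', and a 0 means it does not.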

I believe the third line is incorrect, because all this does is return a vector of 0s with a 1 as the last element.

Thanks in advance!

3 Answers:

Answer 0: (score: 17)

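The data.table package has a syntax that is often similar to SQL. The package includes %like%, which is a "convenience function for calling regexpr". Here is an example taken from its help file: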

## Load the package (must be installed first):
library(data.table)

## Create the data.table:
DT = data.table(Name=c("Mary","George","Martha"), Salary=c(2,3,4))

## Subset the DT table where the Name column is like "Mar%":
DT[Name %like% "^Mar"]
##      Name Salary
## 1:   Mary      2
## 2: Martha      4

Answer 1: (score: 15)

The R analog of SQL's LIKE is just R's ordinary indexing syntax.

The 'LIKE' operator selects rows from a table by matching the string values in a specified column against a user-supplied pattern.

> # create a data frame having a character column
> clrs = c("blue", "black", "brown", "beige", "berry", "bronze", "blue-green", "blueberry")
> dfx = data.frame(Velocity=sample(100, 8), Colors=clrs)
> dfx
  Velocity     Colors
1       90       blue
2       94      black
3       71      brown
4       36      beige
5       75      berry
6        2     bronze
7       89 blue-green
8       93  blueberry

> # create a pattern to use (the same as you would do when using the LIKE operator)
> ptn = '^be.*?'  # gets beige and berry but not blueberry
> # execute a pattern-matching function on your data to create an index vector
> ndx = grep(ptn, dfx$Colors, perl=TRUE)
> # use this index vector to extract the rows you want from the data frame:
> selected_rows = dfx[ndx,]
> selected_rows
  Velocity Colors
4       36  beige
5       75  berry

In SQL, that would be:

SELECT * FROM dfx WHERE Colors LIKE 'be%';
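
SQL's LIKE wildcards are not regular expressions, so a LIKE pattern has to be translated before it is handed to grep or grepl. The following is only a sketch: like_to_regex is a hypothetical helper name (not part of base R or any package), and it assumes that % (any run of characters) and _ (exactly one character) are the only wildcards, ignoring SQL ESCAPE clauses and database-specific case rules.

```r
# Hypothetical helper: translate a SQL LIKE pattern into an anchored regex.
# Assumption: % and _ are the only wildcards; ESCAPE clauses are not handled.
like_to_regex <- function(pattern) {
  # Split the pattern into single characters
  chars <- strsplit(pattern, "")[[1]]
  pieces <- vapply(chars, function(ch) {
    if (ch == "%") {
      ".*"                      # % matches any run of characters
    } else if (ch == "_") {
      "."                       # _ matches exactly one character
    } else if (grepl(ch, ".\\|()[]{}^$*+?", fixed = TRUE)) {
      paste0("\\", ch)          # escape regex metacharacters
    } else {
      ch                        # ordinary character, kept as-is
    }
  }, character(1), USE.NAMES = FALSE)
  # LIKE matches the whole string, so anchor both ends
  paste0("^", paste(pieces, collapse = ""), "$")
}

clrs <- c("blue", "black", "brown", "beige", "berry", "bronze",
          "blue-green", "blueberry")
rx <- like_to_regex("be%")
rx
# [1] "^be.*$"
clrs[grepl(rx, clrs)]
# [1] "beige" "berry"
```

Anchoring both ends matters: without the trailing $, a pattern such as 'be_' would also match longer strings like "beige", which LIKE would reject.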

Answer 2: (score: 2)

Something like regexpr:

> d <- c("Swinging Strike", "In play (run(s))", "In play (out(s) recorded)")
> regexpr('In play', d)
[1] -1  1  1
attr(,"match.length")
[1] -1  7  7
> 

Or grep:

> grep('In play', d)
[1] 2 3
>
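
Applied to the original question, grepl gives the 0/1 vector directly, with no loop. This is a minimal sketch that assumes des is a plain character vector (the sample values are the ones from the question) and that 'In play%' is meant as "starts with In play", hence the ^ anchor. Comparing with == would only test exact equality, so the % would be treated as a literal character. startsWith is a base R alternative (available since R 3.3.0) that needs no regular expression.

```r
# Sample values from the question; in practice des is the real vector
des <- c("Swinging Strike", "In play (run(s))", "In play (out(s) recorded)")

# grepl returns TRUE/FALSE per element; ^ anchors the match at the start,
# mirroring LIKE 'In play%'
inplay <- as.integer(grepl("^In play", des))
inplay
# [1] 0 1 1

# Same result without a regular expression
inplay2 <- as.integer(startsWith(des, "In play"))
identical(inplay, inplay2)
# [1] TRUE
```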