Scoring text based on the value found

Time: 2014-03-15 19:42:33

Tags: r dataframe

I would like to know how to score the rows of my data frame based on the values found by grep(). Say I have a data frame containing this:

age=c("France","Mars","Jupitor","Moon","Sun","Afrika","Texas","Michigan","Washington","Kiev","Amsterdam","Norway")
height=c("Paris","Planet","Planet","COLD","HOT!","LIONS","Austin","Lansing","WashingtonDC","Ukrain","Holland","Oslo")
village=data.frame(age=age,height=height)

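I use grep('Moon', village$age, ignore.case=TRUE) to find the row the value is in. How can I add a column before age that scores that row with the number 1 in this example, and with the number 2 for the row found by grep('FRANCE', village$age, ignore.case=TRUE)?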

1 answer:

Answer 0: (score: 1)

You didn't specify what the "score" should be when nothing is found, so the following just uses NAs:

age <- c("France","Mars","Jupitor","Moon","Sun","Afrika",
         "Texas","Michigan","Washington","Kiev","Amsterdam","Norway")

height <- c("Paris","Planet","Planet","COLD","HOT!","LIONS",
            "Austin","Lansing","WashingtonDC","Ukrain","Holland","Oslo")

village <- data.frame(score=NA, age=age, height=height)

print(village)

##    score        age       height
## 1     NA     France        Paris
## 2     NA       Mars       Planet
## 3     NA    Jupitor       Planet
## 4     NA       Moon         COLD
## 5     NA        Sun         HOT!
## 6     NA     Afrika        LIONS
## 7     NA      Texas       Austin
## 8     NA   Michigan      Lansing
## 9     NA Washington WashingtonDC
## 10    NA       Kiev       Ukrain
## 11    NA  Amsterdam      Holland
## 12    NA     Norway         Oslo
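
For context, grep() returns the integer positions of the matching elements, which is why its result can be used as a row index. A minimal sketch, reusing the age vector from the question (the names idx_moon and idx_none, and the 'pluto' pattern, are only for illustration):

```r
age <- c("France","Mars","Jupitor","Moon","Sun","Afrika",
         "Texas","Michigan","Washington","Kiev","Amsterdam","Norway")

# grep() returns the positions of the elements that match the pattern
idx_moon <- grep('moon', age, ignore.case=TRUE)
print(idx_moon)
## [1] 4

# with no match, the result is an empty integer vector
idx_none <- grep('pluto', age, ignore.case=TRUE)
print(length(idx_none))
## [1] 0
```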

village[grep('moon', village$age, ignore.case=TRUE),]$score <- 1
village[grep('france', village$age, ignore.case=TRUE),]$score <- 2

print(village)

##    score        age       height
## 1      2     France        Paris
## 2     NA       Mars       Planet
## 3     NA    Jupitor       Planet
## 4      1       Moon         COLD
## 5     NA        Sun         HOT!
## 6     NA     Afrika        LIONS
## 7     NA      Texas       Austin
## 8     NA   Michigan      Lansing
## 9     NA Washington WashingtonDC
## 10    NA       Kiev       Ukrain
## 11    NA  Amsterdam      Holland
## 12    NA     Norway         Oslo
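
If there are more than two patterns, the same idea can be driven from a lookup table. The sketch below is one possible way to do it, not the only one. It assumes a hypothetical named vector, scores, that maps each search pattern to its score ('pluto' is an invented pattern that matches nothing), and it assigns through village$score[idx], which simply does nothing when a pattern matches no rows:

```r
age <- c("France","Mars","Jupitor","Moon","Sun","Afrika",
         "Texas","Michigan","Washington","Kiev","Amsterdam","Norway")

height <- c("Paris","Planet","Planet","COLD","HOT!","LIONS",
            "Austin","Lansing","WashingtonDC","Ukrain","Holland","Oslo")

village <- data.frame(score=NA, age=age, height=height)

# hypothetical lookup: names are the patterns, values are the scores
scores <- c(moon=1, france=2, pluto=3)

for (pattern in names(scores)) {
  # positions of the rows whose age matches the current pattern
  idx <- grep(pattern, village$age, ignore.case=TRUE)
  # indexing the column directly leaves it unchanged when idx is empty
  village$score[idx] <- scores[[pattern]]
}

# show only the rows that received a score
print(village[!is.na(village$score), ])

##   score    age height
## 1     2 France  Paris
## 4     1   Moon   COLD
```

If a row matches more than one pattern, the pattern that comes last in scores wins, so the order of the lookup vector matters.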