Recoding problem in R

Time: 2013-09-04 11:42:59

Tags: r

Using some example data:

df <- structure(list(DWFRSS1 = c("Always", "Sometimes", "Never", "Always", 
"Sometimes", "Sometimes", "Always", "Sometimes", "Never", "Often", 
"Always", "Sometimes", "Sometimes", "Always", "Always"), DWFRSS2 = c("Always", 
"Never", "Often", "Always", "Always", "Never", "Always", "Rarely", 
"Never", "Often", "Always", "Rarely", "Often", "Never", "Always"
), DWFRSS3 = c("Always", "Always", "Often", "Always", "Always", 
"Always", "Always", "Sometimes", "Rarely", "Often", "Always", 
"Often", "Always", "Always", "Always"), DWFRSS4 = c("Always", 
"Always", "Often", "Always", "Always", "Always", "Always", "Never", 
"Often", "Always", "Always", "Sometimes", "Often", "Sometimes", 
"Sometimes"), DWFYSS1 = c("Often", "Often", "Always", "Always", 
"Always", "Often", "Often", "Rarely", "Sometimes", "Often", "Never ", 
"Sometimes", "Sometimes", "Always", "Always"), DWFYSS2 = c("Often", 
"Always", "Always", "Always", "Always", "Always", "Sometimes", 
"Rarely", "Rarely", "Always", "Always", "Often", "Often", "Always", 
"Always"), DWFYSS3 = c("Often", "Often", "Always", "Always", 
"Always", "Often", "Never ", "Rarely", "Never ", "Always", "Always", 
"Often", "Often", "Always", "Always"), DWFYSS4 = c("Always", 
"Always", "Always", "Always", "Always", "Always", "Always", "Sometimes", 
"Often", "Always", "Always", "Often", "Always", "Always", "Always"
)), .Names = c("DWFRSS1", "DWFRSS2", "DWFRSS3", "DWFRSS4", "DWFYSS1", 
"DWFYSS2", "DWFYSS3", "DWFYSS4"), class = "data.frame", row.names = c(NA, 
15L))

I am trying to recode the variables using the code below:

library(car)
cols <- c("DWFRSS1","DWFRSS2","DWFRSS3","DWFRSS4",
       "DWFYSS1","DWFYSS2","DWFYSS3","DWFYSS4")
df[,cols]  <- sapply(df[, cols], FUN = function(x){
   recode(x, "'Never' =1; 'Rarely' =2; 'Sometimes' =3; 'Often' =4; 'Always' =5",
   as.numeric.result=TRUE)})

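However, as you can see from the resulting data frame, 'Never' is sometimes not recoded. Looking at the text, this seems to be because some values have a trailing space ("Never "). How can I get R to remove these spaces, where present, before running the recode line?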

1 Answer:

Answer 0: (score: 2)

Some of your values are "Never ", not "Never". The trailing space prevents the match.
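A quick way to confirm which values carry stray whitespace is sketched below. It uses a made-up vector `x` (not part of the original data) and assumes R 3.2.0 or later, where base `trimws` is available.

```r
# Hypothetical vector mimicking one column, including the padded "Never ".
x <- c("Often", "Never ", "Always", "Never")

# unique() keeps "Never " and "Never" as two distinct values.
unique(x)

# Values whose length changes after trimming contain leading or trailing whitespace.
padded <- x[nchar(x) != nchar(trimws(x))]
padded
```

Running the same check on each column of `df` (for example with `lapply`) should show which columns need trimming.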

You can use str_trim from the stringr package to remove the whitespace.

Complete solution, as suggested by Ananda:

library(car)      # provides recode()
library(stringr)  # provides str_trim()
as.data.frame(
  lapply(
    df, 
    function(x) 
    {
      recode(
        str_trim(x), 
        "'Never'=1; 'Rarely'=2; 'Sometimes'=3; 'Often'=4; 'Always'=5", 
        as.numeric.result = TRUE
      )
    }
  )
)
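If you would rather avoid extra packages, the same idea can be sketched in base R. This is an alternative, not the approach from the answer above: it assumes R 3.2.0 or later for `trimws`, and `df_small` and `scores` are hypothetical names introduced only for illustration.

```r
# Hypothetical two-column subset of the question's data, with a padded "Never ".
df_small <- data.frame(
  DWFYSS1 = c("Often", "Never ", "Always"),
  DWFYSS3 = c("Never ", "Rarely", "Sometimes"),
  stringsAsFactors = FALSE
)

# Named lookup vector: label -> score, mirroring the recode specification.
scores <- c(Never = 1, Rarely = 2, Sometimes = 3, Often = 4, Always = 5)

# Trim whitespace, then index the lookup vector by label.
# unname() drops the labels so each column is a plain numeric vector.
df_small[] <- lapply(df_small, function(x) unname(scores[trimws(x)]))

df_small
```

Any label that is not in `scores` becomes `NA`, which makes unexpected values easy to spot afterwards.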