Converting a character vector with special characters to a factor

Date: 2014-08-20 01:43:26

Tags: r

I have a raw Stata dta data file that contains a character vector. After importing it into R with the foreign package, my data looks like this:

# dput(dat[1:3, 218])
# c("", "I want very much\xc9will do whatever it takes", "I want very much\xc9will do my fair share")

For this example, I'll create an object called test:

test <- c("", "I want very much\xc9will do whatever it takes", "I want very much\xc9will do my fair share")

I want to convert test to a factor, but I only get NAs. I tried two approaches:

factor(test,
       levels=c("I want very much\\xc9will do whatever it takes",
                "I want very much\\xc9will do my fair share"),
       labels=c(1, 2))
# [1] <NA> <NA> <NA>
# Levels: 1 2

factor(test,
       levels=c("I want very much…will do whatever it takes",
                "I want very much…will do my fair share"),
       labels=c(1, 2))
# [1] <NA> <NA> <NA>
# Levels: 1 2

I know I could edit the dta file, but I'd rather not touch the raw data. What else can I try?

In the end, I want the following:

#[1] <NA> 1    2   
#Levels: 1 2

3 answers:

Answer 0: (score: 1)

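Don't use \\ to escape your special character. A doubled backslash produces a literal backslash, so the levels no longer match the data. This works: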

factor(test,
       levels=c("I want very much\xc9will do whatever it takes",
                "I want very much\xc9will do my fair share"),
       labels=c(1, 2))

#[1] <NA> 1    2   
#Levels: 1 2
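
To see why the doubled backslash breaks the match, it may help to look at the bytes R stores for each spelling. This is only a small illustrative sketch; the names one_byte and four_bytes are made up for the example.

```r
# "\xc9" is a hex escape: R stores the single byte 0xC9.
one_byte <- charToRaw("\xc9")

# "\\xc9" is an escaped backslash followed by the letters x, c and 9,
# so R stores four separate bytes.
four_bytes <- charToRaw("\\xc9")

print(one_byte)    # the single byte c9
print(four_bytes)  # the four bytes 5c 78 63 39
```

Because the two spellings are different byte sequences, a level written with \\xc9 can never equal a data value that contains the raw byte.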

Answer 1: (score: 0)

test <- c(NA, "I want very much\xc9will do my fair share", "I want very much\xc9will do whatever it takes")

ana <- as.factor(test)

library(plyr)

bob <- revalue(ana, c("I want very much\xc9will do my fair share" = "1",
                  "I want very much\xc9will do whatever it takes" = "2"))
bob

Does this work for you?

Answer 2: (score: 0)

Judging from your expected output, perhaps:

 factor(as.vector(setNames(1:2,unique(test[test!='']))[test]))
 #[1] <NA> 1    2   
 #Levels: 1 2

As @thelatemail's answer notes, the levels and the test strings did not match. For example:

 test1 <- c("", "I want very much\\xc9will do whatever it takes", "I want very much\\xc9will do my fair share")  #using `\\`
 factor(test1, levels= unique(test1[test1!='']), labels=1:2)
 #[1] <NA> 1    2   
 #Levels: 1 2

Whereas if you do:

 factor(test1, levels= unique(test[test!='']), labels=1:2)
 #[1] <NA> <NA> <NA>
 #Levels: 1 2
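
Another possible route, sketched here under an assumption, is to replace the stray byte with plain ASCII before building the factor, so the levels can be typed without any escapes. The assumption is that the byte 0xC9 stands for an ellipsis, as the question's second attempt suggests; the replacement text "..." and the name clean are illustrative choices, not part of the original answers.

```r
test <- c("",
          "I want very much\xc9will do whatever it takes",
          "I want very much\xc9will do my fair share")

# Assumption: the byte 0xC9 stands for an ellipsis, so swap it for "...".
# fixed = TRUE and useBytes = TRUE make gsub() compare raw bytes, which
# avoids problems with strings that are not valid in the current locale.
clean <- gsub("\xc9", "...", test, fixed = TRUE, useBytes = TRUE)

# The empty string is not among the levels, so it becomes NA.
result <- factor(clean,
                 levels = c("I want very much...will do whatever it takes",
                            "I want very much...will do my fair share"),
                 labels = c(1, 2))
print(result)
```

This changes only the copy held in R, so the raw dta file stays untouched.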