Defining a factor variable with unobserved levels

Time: 2012-10-11 15:15:41

Tags: r

I am working with a dataset that has the following structure...

grades <- c("7A", "8B", "6C", "6B+")

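...but some of the possible levels are not currently observed in my dataset. I do not want the factor to be defined automatically, so I read the data with read.csv(..., stringsAsFactors = FALSE). Instead, I want to define the levels and their labels explicitly and convert the imported strings to an ordered factor, so that every grade is represented, with a count of zero if it was not observed.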

real.grades  <- ordered(x = character(), 
                        levels = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17),                       
                        labels = c("6A", "6A+", "6B", "6B+", "6C", "6C+", "7A", "7A+", "7B", "7B+", "7C", "7C+", "8A", "8A+", "8B", "8B+", "8C"))

...but I am struggling to get this to work.

Thanks in advance for your suggestions and pointers.

1 answer:

Answer 0 (score: 2)

I think this is what you are after:

grades <- c("7A", "8B", "6C", "6B+")

real.grades  <- factor(grades, levels = c("6A", "6A+", "6B", "6B+", "6C", 
    "6C+", "7A", "7A+", "7B", "7B+", "7C", "7C+", "8A", "8A+", "8B", 
    "8B+", "8C"))   

This yields:

> real.grades 
[1] 7A  8B  6C  6B+
Levels: 6A 6A+ 6B 6B+ 6C 6C+ 7A 7A+ 7B 7B+ 7C 7C+ 8A 8A+ 8B 8B+ 8C
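
Since the question asks for unobserved grades to show up with a count of zero, here is a minimal sketch of how that could be checked with table(). The names all.levels and counts are only illustrative and do not come from the original post.

```r
grades <- c("7A", "8B", "6C", "6B+")

# Full set of possible grades, in ascending order (illustrative name)
all.levels <- c("6A", "6A+", "6B", "6B+", "6C", "6C+", "7A", "7A+", "7B",
                "7B+", "7C", "7C+", "8A", "8A+", "8B", "8B+", "8C")

real.grades <- factor(grades, levels = all.levels)

# table() tabulates every level of a factor, including unobserved ones
counts <- table(real.grades)
print(counts)

counts[["6A"]]  # 0, because "6A" is a level but was never observed
counts[["7A"]]  # 1
```

Because the levels are fixed up front, the table always has 17 entries regardless of which grades happen to appear in the imported data.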

For the numeric representation, use:

as.numeric(real.grades)
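
The question asks for an ordered factor, whereas factor() as called above returns an unordered one. A possible variant, sketched here under the assumption that the levels are listed from easiest to hardest, passes ordered = TRUE so that comparisons between grades work. The names all.levels and ordered.grades are illustrative only.

```r
grades <- c("7A", "8B", "6C", "6B+")

# Assumed to be listed from lowest to highest grade (illustrative name)
all.levels <- c("6A", "6A+", "6B", "6B+", "6C", "6C+", "7A", "7A+", "7B",
                "7B+", "7C", "7C+", "8A", "8A+", "8B", "8B+", "8C")

# ordered = TRUE makes the level order meaningful for comparisons
ordered.grades <- factor(grades, levels = all.levels, ordered = TRUE)

# Integer codes are the positions of each grade within all.levels
as.integer(ordered.grades)   # 7 15 5 4

# Comparisons and max() follow the level order, not alphabetical order
ordered.grades >= "7A"       # TRUE TRUE FALSE FALSE
max(ordered.grades)          # 8B
```

The integer codes depend entirely on the order of the levels vector, so they are only meaningful as ranks if that vector is sorted as intended.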