Reordering factor levels numerically in a data frame

Asked: 2013-03-27 17:32:37

Tags: r sorting

I have a factor whose levels run from 0 to 39. Here is how they are currently ordered:

> levels(items$label)
 [1] "0"  "1"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
[13] "2"  "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "3" 
[25] "30" "31" "32" "33" "34" "35" "36" "37" "38" "39" "4"  "5" 
[37] "6"  "7"  "8"  "9"

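How can I reorder them numerically for display purposes? I do not want to change the meaning of the data frame.

Update: How do I update the original data frame items with the sorted factor labels? This should not materially change the data frame; I just want the factor levels to appear in the correct order in subsequent operations.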

3 Answers:

Answer 0: (score: 3)

sorted_labels <- paste(sort(as.integer(levels(items$label))))

which gives:

 [1] "0"  "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11"
[13] "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23"
[25] "24" "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35"
[37] "36" "37" "38" "39"

Or (as described in https://stackoverflow.com/a/15665655/109618):

sorted_labels <- order(levels(items$label)) - 1
# order() returns 1-based positions;
# subtracting 1 gives 0-based values, which match the labels only when every integer from 0 to n - 1 is present

Per the updated question, this updates the data frame:

items$label <- factor(items$label, levels = sorted_labels)
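As a rough end-to-end sketch of the first approach, the snippet below runs it on a small stand-in data frame and checks that only the level order changes, not the values. The toy data and the `before` variable are assumptions for illustration; the question's real `items` is not shown.

```r
# Toy stand-in for the question's data (hypothetical values).
items <- data.frame(label = factor(as.character(c(10, 2, 0, 39, 2, 1))))
levels(items$label)                           # lexicographic: "0" "1" "10" "2" "39"

before <- as.character(items$label)           # remember the values row by row

# Sort the levels as integers, then turn them back into strings.
sorted_labels <- as.character(sort(as.integer(levels(items$label))))
items$label <- factor(items$label, levels = sorted_labels)

levels(items$label)                           # numeric order: "0" "1" "2" "10" "39"
identical(before, as.character(items$label))  # TRUE: only the level order changed
```

Because factor() matches the existing values against the new levels by their text, the rows keep their labels and only the internal ordering is rewritten.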

Answer 1: (score: 2)

If all the integers are present, you can simply use order:

  order(levels(items$label)) - 1   # the minus 1 makes the result start from 0

If not all the integers are present, you will have to use as.numeric, as you already do.
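To make the distinction concrete, here is a small sketch with a made-up factor whose levels have gaps. The names `f`, `by_position`, and `by_value` are hypothetical and only serve the illustration.

```r
# Hypothetical factor whose levels have gaps (3 and 4 are missing, for example).
f <- factor(c("0", "1", "2", "5", "10"))
levels(f)                                   # "0" "1" "10" "2" "5"

# order() of the already-sorted levels is just 1..n, so "- 1" yields 0..(n - 1).
by_position <- order(levels(f)) - 1         # 0 1 2 3 4: not the real labels
by_value    <- sort(as.integer(levels(f)))  # 0 1 2 5 10: the real labels, in numeric order

# Using the positional version as levels turns "5" and "10" into NA.
anyNA(factor(f, levels = by_position))      # TRUE
anyNA(factor(f, levels = by_value))         # FALSE
```

In other words, the order() shortcut only reproduces the labels by coincidence when they are exactly 0 to n - 1; converting the levels to numbers works in both cases.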

Answer 2: (score: 0)

If you are looking for a tidyverse / forcats solution:

Generate the data:

items <- data.frame(label = as.character(0:39), stringsAsFactors = FALSE)
# if stringsAsFactors = TRUE (the default before R 4.0.0), items$label must be converted to character before casting to integer!

factor(items$label)
#>  [1] 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22
#> [24] 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
#> 40 Levels: 0 1 10 11 12 13 14 15 16 17 18 19 2 20 21 22 23 24 25 26 ... 9
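The warning in the data-generation comment above can be seen in a short sketch: calling as.integer() directly on a factor returns its internal level codes rather than the printed numbers. The variable `label_fct` is a hypothetical factor version of the column.

```r
# Hypothetical factor version of the label column.
label_fct <- factor(as.character(0:39))

# as.integer() on a factor returns the internal level codes, not the labels.
head(as.integer(label_fct), 4)                # 1 2 13 24 (positions in the lexicographic levels)

# Going through character first recovers the numbers that are printed.
head(as.integer(as.character(label_fct)), 4)  # 0 1 2 3
```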

Using fct_relevel:

library(forcats)

fct_relevel(items$label,function(x){as.character(sort(as.integer(x)))})
#>  [1] 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22
#> [24] 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
#> 40 Levels: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 ... 39

It also works with fct_reorder:

fct_reorder(items$label,as.integer(items$label))
#>  [1] 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22
#> [24] 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
#> 40 Levels: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 ... 39

If your vector also contains characters (e.g. 1 egg, 2 eggs, and so on), this opens up some nice possibilities:

items$label2 <- paste(items$label,"eggs")

factor(items$label2)
#>  [1] 0 eggs  1 eggs  2 eggs  3 eggs  4 eggs  5 eggs  6 eggs  7 eggs 
#>  [9] 8 eggs  9 eggs  10 eggs 11 eggs 12 eggs 13 eggs 14 eggs 15 eggs
#> [17] 16 eggs 17 eggs 18 eggs 19 eggs 20 eggs 21 eggs 22 eggs 23 eggs
#> [25] 24 eggs 25 eggs 26 eggs 27 eggs 28 eggs 29 eggs 30 eggs 31 eggs
#> [33] 32 eggs 33 eggs 34 eggs 35 eggs 36 eggs 37 eggs 38 eggs 39 eggs
#> 40 Levels: 0 eggs 1 eggs 10 eggs 11 eggs 12 eggs 13 eggs ... 9 eggs


library(readr)

fct_reorder(items$label2,parse_number(items$label2))
#>  [1] 0 eggs  1 eggs  2 eggs  3 eggs  4 eggs  5 eggs  6 eggs  7 eggs 
#>  [9] 8 eggs  9 eggs  10 eggs 11 eggs 12 eggs 13 eggs 14 eggs 15 eggs
#> [17] 16 eggs 17 eggs 18 eggs 19 eggs 20 eggs 21 eggs 22 eggs 23 eggs
#> [25] 24 eggs 25 eggs 26 eggs 27 eggs 28 eggs 29 eggs 30 eggs 31 eggs
#> [33] 32 eggs 33 eggs 34 eggs 35 eggs 36 eggs 37 eggs 38 eggs 39 eggs
#> 40 Levels: 0 eggs 1 eggs 2 eggs 3 eggs 4 eggs 5 eggs 6 eggs ... 39 eggs

Created on 2019-06-26 by the reprex package (v0.3.0)

All output of the fct_* functions can be written back to the original data. For example:
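For readers without readr, a rough base R approximation of the same idea is sketched below. It is not what parse_number() does internally; it simply assumes every label starts with a run of digits and extracts that run with a regular expression. The names `label2`, `num`, and `f` are hypothetical.

```r
# Labels of the form "<number> eggs", as in the example above.
label2 <- paste(0:39, "eggs")

# Assumption: each label begins with digits; keep only that leading run.
num <- as.numeric(sub("^([0-9]+).*$", "\\1", label2))

# Order the distinct labels by their numeric part and use them as levels.
f <- factor(label2, levels = unique(label2[order(num)]))

head(levels(f), 4)   # "0 eggs" "1 eggs" "2 eggs" "3 eggs"
tail(levels(f), 1)   # "39 eggs"
```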

items$data <- fct_reorder(items$label,as.integer(items$label))
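Once the releveled factor is stored in the data frame, later operations pick up the level order. The sketch below shows this with base R only (factor() rather than forcats) on made-up data; the column name `label_fct` and the values are assumptions for illustration.

```r
# Made-up data with the labels stored as character.
items <- data.frame(label = as.character(c(10, 2, 0, 2)), stringsAsFactors = FALSE)

# Build a factor whose levels are in numeric order and write it back.
num <- as.integer(items$label)
items$label_fct <- factor(items$label, levels = as.character(sort(unique(num))))

# Tabulation and sorting now follow the numeric level order.
names(table(items$label_fct))           # "0" "2" "10"
items[order(items$label_fct), "label"]  # "0" "2" "2" "10"
```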