Aggregating columns by multiple conditions

Time: 2017-02-28 17:41:37

Tags: r dplyr

I want to summarize this data frame, in which each Family Size has six Hours Worked categories:

families <- structure(list(
    `Family Size` = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 13L, 13L, 13L),
    HoursLess20 = c("1,014", "1,041", "11", "3", "1", "2", "1", "0", "0", "0"),
    Hours2024 = c(7L, 298L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
    Hours2529 = c(1L, 34L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
    Hours3034 = c(6L, 44L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
    Hours3539 = c(4L, 46L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
    Hours40plus = c(9L, 128L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)),
  .Names = c("Family Size", "HoursLess20", "Hours2024", "Hours2529",
             "Hours3034", "Hours3539", "Hours40plus"),
  row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1977L, 1978L, 1979L),
  class = "data.frame")

1 answer:

Answer 0 (score: 1)

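First, the values in HoursLess20 are currently stored as character strings (because of the thousands-separator commas). To do any kind of numeric aggregation, you need to remove the commas and convert the column to numeric.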

families$HoursLess20 = as.numeric(gsub(",", "", families$HoursLess20))
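
If more than one column might contain comma-formatted numbers, the same conversion can be applied to every character column at once. The following is only a sketch: the data frame `df` and the helper `to_number` are hypothetical names invented for illustration, and it assumes every character column really holds numbers.

```r
# Hypothetical stand-in data with two comma-formatted character columns
df <- data.frame(
  size    = c(2L, 2L, 13L),
  hours_a = c("1,014", "11", "0"),
  hours_b = c("2,500", "7", "3"),
  stringsAsFactors = FALSE
)

# Helper (illustrative name): strip commas, then convert to numeric
to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))

# Find the character columns and convert each of them in place
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], to_number)

str(df)
```

Any value that is not a number after the commas are removed becomes NA (with a warning), so it may be worth checking `anyNA(df)` afterwards.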

Once that is done, you can simply use the aggregate function to compute whatever aggregation you want.

## Sum
aggregate(families[,-1], list(families[,1]), sum)
  Group.1 HoursLess20 Hours2024 Hours2529 Hours3034 Hours3539 Hours40plus
1       2        2073       306        35        51        50         138
2      13           0         0         0         0         0           0

## Average
aggregate(families[,-1], list(families[,1]), mean)
  Group.1 HoursLess20 Hours2024 Hours2529 Hours3034 Hours3539 Hours40plus
1       2    296.1429  43.71429         5  7.285714  7.142857    19.71429
2      13      0.0000   0.00000         0  0.000000  0.000000     0.00000
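
Since the question is also tagged dplyr, here is a rough dplyr equivalent of the two aggregate calls. This is a sketch rather than part of the original answer: it assumes dplyr 1.0.0 or later (for `across()`), and `families_small` is a cut-down, hypothetical stand-in whose HoursLess20 column has already been converted to numeric.

```r
library(dplyr)

# Cut-down stand-in for the families data (illustrative values only)
families_small <- data.frame(
  `Family Size` = c(2L, 2L, 13L),
  HoursLess20   = c(1014, 1041, 0),
  Hours2024     = c(7L, 298L, 0L),
  check.names   = FALSE  # keep the space in "Family Size"
)

# Sum of every non-grouping column per family size
totals <- families_small %>%
  group_by(`Family Size`) %>%
  summarise(across(everything(), sum))

# Mean of every non-grouping column per family size
averages <- families_small %>%
  group_by(`Family Size`) %>%
  summarise(across(everything(), mean))

print(totals)
print(averages)
```

Inside `summarise()`, `everything()` skips the grouping column, so only the hours columns are aggregated, much like `families[,-1]` in the base R version.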