dplyr group_by: looping over different columns

Time: 2019-03-13 23:05:16

Tags: r dplyr apply lapply

I have a data frame, df, whose columns include Sex, AgeGrouping, Type, and Number (the original post showed it as an image).

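I want to create three different data frames using the dplyr functions group_by and summarise. These would be df_Sex, df_AgeGroup, and df_Type. For each of these columns, I want to run the following: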

 df_Sex <- df %>% group_by(Sex) %>% summarise(Total = sum(Number))

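Is there a way to use apply or lapply to pass the name of each of these three columns (Sex, AgeGrouping, and Type) to the same pipeline and get the three data frames?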

2 answers:

Answer 0 (score: 2)

This will work, but it returns a list of data frames rather than three separate objects.

### Create your data first
library(dplyr)

df <- data.frame(ID = rep(10250, 6), Sex = c(rep("Female", 3), rep("Male", 3)),
                 Population = c(rep(3499, 3), rep(1163, 3)), AgeGrouping = c(rep("0-14", 3), rep("15-25", 3)),
                 Type = c("Type1", "Type1", "Type2", "Type1", "Type1", "Type2"), Number = c(260, 100, 0, 122, 56, 0))

# Names of the columns to group by
gr <- list("Sex", "AgeGrouping", "Type")

# One summarised data frame per grouping column, returned as a list
df_list <- lapply(gr, function(i) group_by(df, .dots = i) %>% summarise(Total = sum(Number)))
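
The `.dots` argument of group_by has been deprecated since dplyr 1.0.0, so on a recent dplyr the call above should emit a deprecation warning. A minimal sketch of the same idea with `across(all_of(...))`, assuming dplyr 1.0.0 or later; the `df_` list names are only an illustration of how to get something close to the df_Sex, df_AgeGroup, df_Type objects the question asks for:

```r
library(dplyr)

# Same example data as in the answer above
df <- data.frame(ID = rep(10250, 6),
                 Sex = c(rep("Female", 3), rep("Male", 3)),
                 Population = c(rep(3499, 3), rep(1163, 3)),
                 AgeGrouping = c(rep("0-14", 3), rep("15-25", 3)),
                 Type = c("Type1", "Type1", "Type2", "Type1", "Type1", "Type2"),
                 Number = c(260, 100, 0, 122, 56, 0))

gr <- c("Sex", "AgeGrouping", "Type")

# across(all_of(col)) selects the grouping column by the name held in the string col
df_list <- lapply(gr, function(col) {
  df %>%
    group_by(across(all_of(col))) %>%
    summarise(Total = sum(Number))
})

# Name the list elements (illustrative names) so each result can be pulled out by name
names(df_list) <- paste0("df_", gr)

df_list$df_Sex
```

Keeping the results in a named list is usually easier to work with than three free-standing objects; if separate objects are really needed, `list2env(df_list, envir = globalenv())` would create them from the list.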

Answer 1 (score: 1)

Here is one way:

f <- function(x) {
     df %>% 
         group_by(!!x) %>% 
         summarize(Total = sum(Number))
}

lapply(c(quo(Sex), quo(AgeGrouping), quo(Type)), f)
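
If the column names arrive as strings, as in the question, the same function f can probably be reused by converting the strings to symbols first. This is only a sketch: `rlang::syms()` is a real rlang function, but using it here, and the names `cols` and `res`, are my additions rather than part of the answer.

```r
library(dplyr)

# Same example data as in the first answer
df <- data.frame(ID = rep(10250, 6),
                 Sex = c(rep("Female", 3), rep("Male", 3)),
                 Population = c(rep(3499, 3), rep(1163, 3)),
                 AgeGrouping = c(rep("0-14", 3), rep("15-25", 3)),
                 Type = c("Type1", "Type1", "Type2", "Type1", "Type1", "Type2"),
                 Number = c(260, 100, 0, 122, 56, 0))

# f as defined in the answer: x is unquoted with !! inside group_by
f <- function(x) {
  df %>%
    group_by(!!x) %>%
    summarize(Total = sum(Number))
}

cols <- c("Sex", "AgeGrouping", "Type")

# rlang::syms() turns each string into a symbol that !! can unquote
res <- lapply(rlang::syms(cols), f)
names(res) <- cols

res$Type
```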

There may be a better way to do this; I haven't paid much attention to tidyeval. Personally, I would prefer this:

library(data.table)
DT <- as.data.table(df)
lapply(c("Sex", "AgeGrouping", "Type"), 
       function(x) DT[, .(Total = sum(Number)), by = x])