Question

Suppose I have this data frame:

df <- data.frame(sex  = factor(c("boy", "boy", "girl", "girl")),
                 age  = c(52L, 58L, 40L, 62L),
                 b1   = c(25L, 23L, 30L, 26L),
                 b2   = c(187L, 220L, 190L, 204L),
                 b100 = c(180L, 120L, 155L, 124L))

I want to group by sex and then apply different functions to different columns inside summarise():

calculate the mean of the column "age" (only)

calculate the sd of every column whose name begins with "b": b1, b2, ...

I tried:

df %>%
  group_by(sex) %>%
  summarise_at(.vars = c("age", names(df)[substr(names(df), 1, 1) == "b"]),
               .funs = c(mean = "mean", sd = "sd"))

But this applies both "mean" and "sd" to every selected column, which is exactly what I want to avoid.
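To make the problem concrete, here is a minimal sketch of that attempt, assuming dplyr is installed and still provides summarise_at(). The helper name b_cols is introduced only for this illustration. Every selected column is paired with every function, so the result has a mean and an sd column for age and for each "b" column.

```r
library(dplyr)

df <- data.frame(sex  = factor(c("boy", "boy", "girl", "girl")),
                 age  = c(52L, 58L, 40L, 62L),
                 b1   = c(25L, 23L, 30L, 26L),
                 b2   = c(187L, 220L, 190L, 204L),
                 b100 = c(180L, 120L, 155L, 124L))

# Hypothetical helper: names of the columns that start with "b"
b_cols <- names(df)[startsWith(names(df), "b")]

# Both functions are applied to all four selected columns
attempt <- df %>%
  group_by(sex) %>%
  summarise_at(.vars = c("age", b_cols),
               .funs = c(mean = "mean", sd = "sd"))

# Inspect the generated column names (one per column/function pair)
names(attempt)
```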

The result I want has one column, mean_age, plus the columns sd_b1, sd_b2, ...

Is this possible with dplyr? Or do I have to do it in two steps:

df %>%
  group_by(sex) %>%
  summarise(mean_age = mean(age))

df %>%
  group_by(sex) %>%
  summarise_at(.vars = c(names(df)[substr(names(df), 1, 1) == "b"]),
               .funs = c(sd = "sd"))
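For comparison, here is a minimal single-step sketch. It assumes dplyr 1.0.0 or later, where across() is available; this is one possible approach, not necessarily the only one. The name res is hypothetical. The "sd_{.col}" pattern passed to .names is what yields sd_b1, sd_b2, and so on.

```r
library(dplyr)

df <- data.frame(sex  = factor(c("boy", "boy", "girl", "girl")),
                 age  = c(52L, 58L, 40L, 62L),
                 b1   = c(25L, 23L, 30L, 26L),
                 b2   = c(187L, 220L, 190L, 204L),
                 b100 = c(180L, 120L, 155L, 124L))

# One summarise() call:
#   - mean() for age only
#   - sd() for every column whose name starts with "b"
# Assumption: across() requires dplyr >= 1.0.0.
res <- df %>%
  group_by(sex) %>%
  summarise(
    mean_age = mean(age),
    across(starts_with("b"), sd, .names = "sd_{.col}")
  )

res
```

The grouping column sex is never passed to across(), and mean_age does not match starts_with("b"), so only b1, b2 and b100 receive sd().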

Thanks

group_by and summarise with different functions on several columns
