分别汇总多列的数据表

时间:2018-03-26 09:56:33

标签: r dataframe dplyr summarize

我试图尽可能自动地在多个列中汇总数据,而不是独立地为每个列编写代码。我想总结一下:

Patch    Size       Achmil  Aciarv   Aegpod Agrcap
  A       10            0       1      1      0
  B       2             1       0      0      0
  C       2             1       0      0      0
  D       2             1       0      0      0

进入这个

Species   Presence      MaxSize    MeanSize    Count
Achmil       0             10         10         1
Achmil       1             2           2         3
Aciarv       0             2           2         3
Aciarv       1             10          10        1

我知道我可以单独运行group_by并汇总每列

achmil<-group_by(LimitArea, Achmil) %>%
  summarise(SumA=mean(Size))

但使用某种循环是否无法针对每个列的每个列自动运行此操作?任何帮助表示赞赏。

2 个答案:

答案 0 :(得分:0)

也许我们需要gather进行长格式化,然后执行summarise

library(tidyverse)
gather(df1, Species, Presence, Achmil:Agrcap) %>% 
      group_by(Species, Presence)  %>% 
      summarise( MaxSize = max(Size), MeanSize = mean(Size), Count = n())
# A tibble: 7 x 5
# Groups: Species [?]
#  Species Presence MaxSize MeanSize Count
#  <chr>      <int>   <dbl>    <dbl> <int>
#1 Achmil         0   10.0     10.0      1
#2 Achmil         1    2.00     2.00     3
#3 Aciarv         0    2.00     2.00     3
#4 Aciarv         1   10.0     10.0      1
#5 Aegpod         0    2.00     2.00     3
#6 Aegpod         1   10.0     10.0      1
#7 Agrcap         0   10.0      4.00     4

答案 1 :(得分:0)

这里使用聚合(和reshape2::melt()

的另一种解决方案
library(reshape2)

df = melt(df[,2:ncol(df)], "Size")
aggregate(. ~ `variable`+`value`, data = df, 
          FUN = function(x) c(max = max(x), mean = mean(x), count = length(x)))

    variable value Size.max Size.mean Size.count
1   Achmil     0       10        10          1
2   Aciarv     0        2         2          3
3   Aegpod     0        2         2          3
4   Agrcap     0       10         4          4
5   Achmil     1        2         2          3
6   Aciarv     1       10        10          1
7   Aegpod     1       10        10          1