Calculating the mean of specific columns

Time: 2014-01-12 05:42:38

Tags: r

C1<-c(3,2,4,4,5)
C2<-c(3,7,3,4,5)
C3<-c(5,4,3,6,3)
DF<-data.frame(ID=c("A","B","C","D","E"),Type=c(1,2,1,2,2),C1=C1,C2=C2,C3=C3)

DF
  ID Type C1 C2 C3
1  A    1  3  3  5
2  B    2  2  7  4
3  C    1  4  3  3
4  D    2  4  4  6
5  E    2  5  5  3

How can I calculate the mean of each column grouped by Type, ignoring the ID column? That is:

Type    C1   C2   C3
   1  3.50 3.00 4.00
   2  3.67 5.33 4.33

Thanks!

2 answers:

Answer 0 (score: 2)

Create the data with the Type column:

DF <- read.table(header=TRUE, text='  ID Type C1 C2 C3
1  A    1  3  3  5
2  B    2  2  7  4
3  C    1  4  3  3
4  D    2  4  4  6
5  E    2  5  5  3')

Then, given that the ID column is in position 1, a simple application of aggregate gets you what you want:

aggregate(.~Type, data=DF[-1], FUN=mean)
  Type       C1       C2       C3
1    1 3.500000 3.000000 4.000000
2    2 3.666667 5.333333 4.333333
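
If you would rather not depend on the position of the ID column, one option is to name the value columns on the left-hand side of the formula. A minimal sketch, assuming the same data as above (the name res is only for illustration):

```r
# Same data as in the question, including the Type column
DF <- data.frame(
  ID   = c("A", "B", "C", "D", "E"),
  Type = c(1, 2, 1, 2, 2),
  C1   = c(3, 2, 4, 4, 5),
  C2   = c(3, 7, 3, 4, 5),
  C3   = c(5, 4, 3, 6, 3)
)

# cbind() on the left-hand side picks the columns to average by name,
# so ID is never passed to mean() regardless of where it sits in DF
res <- aggregate(cbind(C1, C2, C3) ~ Type, data = DF, FUN = mean)
res
```

This should give the same table as DF[-1] with the dot formula, but it keeps working if the columns are reordered.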

Answer 1 (score: 1)

Some other approaches:

### plyr was written with this type of problem in mind
library(plyr)
ddply(DF[-1], .(Type), colMeans)

### staying in base; these are more unwieldy than `aggregate`
t(sapply(split(DF[-c(1,2)], DF$Type), colMeans))
### `ave` was also written for similar problems; however, it replaces every element
### with its group mean, so `unique` is needed afterwards:
unique(with(DF, ave(C1, Type)))
with(DF,
     lapply(lapply(DF[-c(1,2)], ave, Type), unique)
     )

### faster and scales well on large datasets
library(data.table)
DFt <- as.data.table(DF)
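### .SDcols restricts .SD to the numeric columns, so ID is skipped and the names C1..C3 are kept
DFt[, lapply(.SD, mean), by=Type, .SDcols=c("C1","C2","C3")]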
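
To see why `unique` is needed after `ave`, it may help to look at what `ave` returns on its own. A small base R sketch, using the C1 and Type values from the question (the name per_row is only for illustration):

```r
C1   <- c(3, 2, 4, 4, 5)
Type <- c(1, 2, 1, 2, 2)

# ave() returns one value per input element: each element is replaced
# by the mean of its group (mean is the default FUN)
per_row <- ave(C1, Type)
per_row

# unique() collapses the repeats to one value per group,
# in order of first appearance (Type 1, then Type 2)
unique(per_row)
```

Note that the result of `unique` carries no group labels, so this approach relies on knowing the order in which the groups first appear.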