Summing based on two columns

Time: 2015-09-08 16:16:08

Tags: r

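I want to sum the values of one column, grouped by two other columns. I found out how to do this for one grouping column, but can't figure out how to do it for two. For example, if I have the following data frame: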

x=c("a","a", "b", "b","c", "c","a","a","b","b","c","c", "a", "a","b","b", "c", "c") 
y=c(1:18) 
q=c("M","M","M", "M","M","M","W","W","W","W","W","W","F","F","F","F","F","F")
df<-data.frame(x,y,q)

I want to sum the values in column y by x and q, so that I end up with a new data frame like this:

x=c("a","b","c","a","b","c","a","b","c")
y=c(3,7,11,15,19,23,27,31,35) 
q=c("M","M","M","W","W","W","F","F","F")
d<-data.frame(x,y,q)

1 answer:

Answer 0 (score: 4)

You have several options:

1: base R

aggregate(y~x+q, df, sum)
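
The data.table and dplyr options each come with a variant for summarising several columns at once. A base R counterpart could look like the sketch below. It assumes a hypothetical second numeric column `z` (not in the original question) and uses the names `df2` and `res2` only so the question's `df` is left untouched.

```r
# Example data from the question, plus a hypothetical second numeric column z.
x <- rep(rep(c("a", "b", "c"), each = 2), times = 3)
y <- 1:18
z <- y * 10  # hypothetical column, not part of the original question
q <- rep(c("M", "W", "F"), each = 6)
df2 <- data.frame(x, y, z, q)

# The dot on the left-hand side stands for every column not used on the
# right-hand side, so both y and z are summed within each x/q combination.
res2 <- aggregate(. ~ x + q, data = df2, FUN = sum)
print(res2)
```

If only some columns should be summed, `cbind(y, z) ~ x + q` on the left-hand side names them explicitly.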

2: data.table

library(data.table)
setDT(df)[, .(sumy=sum(y)), by = .(x,q)]

# when you want to summarise several columns:
setDT(df)[, lapply(.SD, sum), by = .(x,q)]

3: dplyr

library(dplyr)
df %>% group_by(x,q) %>% summarise(sumy = sum(y))

# when you want to summarise several columns
# (across() replaces summarise_each()/funs(), which are deprecated in current dplyr):
df %>% group_by(x,q) %>% summarise(across(everything(), sum))

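All of them should give you the same result (though not necessarily in the same row order). For example, the data.table output of the lapply(.SD, sum) version, which keeps the column name y, looks like this: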

   x q  y
1: a M  3
2: b M  7
3: c M 11
4: a W 15
5: b W 19
6: c W 23
7: a F 27
8: b F 31
9: c F 35
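
Because the row order differs between the approaches, it can help to sort a result before comparing it with another one. The following is a minimal base R sketch of that idea, using the question's data. The choice to sort by `q` and then `x` is just an assumption for illustration; any fixed order would do.

```r
# Rebuild the example data from the question.
x <- c("a","a","b","b","c","c","a","a","b","b","c","c","a","a","b","b","c","c")
y <- 1:18
q <- c("M","M","M","M","M","M","W","W","W","W","W","W","F","F","F","F","F","F")
df <- data.frame(x, y, q)

# Sum y within each combination of x and q (option 1 from the answer).
res <- aggregate(y ~ x + q, data = df, FUN = sum)

# Put the rows in a fixed order (by q, then x) so that results from
# different approaches can be compared row by row.
res <- res[order(res$q, res$x), ]
rownames(res) <- NULL
print(res)
```

Applying the same `order()` step to the data.table or dplyr result should make the three outputs line up row for row.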