Working with grouped data

Time: 2016-02-04 17:35:31

Tags: r time-series

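The post "calculation of anomalies on time-series" was very helpful, but in my case the data are grouped. I have a data frame with year, group, and value columns. Each group has a value for every year. What I want to compute is the yearly anomaly within each group, i.e. that year's value minus the mean of the group's values across all years. It would also be good to attach this anomaly to the data frame as a column. Thanks! Here is sample data: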

year <- c(2000, 2000, 2000, 2000, 2000,2001, 2001, 2001, 2001, 2001,2002, 2002, 2002, 2002, 2002,2003, 2003, 2003, 2003, 2003)
group <- c("A", "B", "C", "D", "A", "B", "C", "D","A", "B", "C", "D","A", "B", "C", "D","A", "B", "C", "D")
value <- runif(20, 0, 1)
df <- as.data.frame(year)
df$group <- group
df$value <- value

1 answer:

Answer 0 (score: 2)

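This is another instance where the ave function is useful. The FUN argument is not actually needed here, because mean is its default, but it is important to remember that FUN comes after the ellipsis in the argument list, so if it is used it must be passed as a named argument: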

df$grp.means <- with(df, ave(value, group, FUN = mean))
df$yr.anomaly <- df$value-df$grp.means
df
 year group      value grp.means   yr.anomaly
 2000     A 0.40778676 0.4135109 -0.005724164
 2000     B 0.02709893 0.2660400 -0.238941031
 2000     C 0.30375035 0.6461923 -0.342441950
 2000     D 0.46330590 0.4901705 -0.026864586
 2000     A 0.98482498 0.4135109  0.571314056
 2001     B 0.02279144 0.2660400 -0.243248519
 2001     C 0.64370031 0.6461923 -0.002491994
 2001     D 0.28803650 0.4901705 -0.202133986
 2001     A 0.40769648 0.4135109 -0.005814443
 2001     B 0.21896143 0.2660400 -0.047078526
 2002     C 0.83771796 0.6461923  0.191525655
 2002     D 0.61869987 0.4901705  0.128529384
 2002     A 0.06946549 0.4135109 -0.344045431
 2002     B 0.14443442 0.2660400 -0.121605537
 2002     C 0.95324165 0.6461923  0.307049349
 2003     D 0.60165466 0.4901705  0.111484174
 2003     A 0.19778091 0.4135109 -0.215730018
 2003     B 0.91691357 0.2660400  0.650873612
 2003     C 0.49255124 0.6461923 -0.153641061
 2003     D 0.47915550 0.4901705 -0.011014985
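
As a quick check of the remark about FUN, here is a minimal sketch on made-up values (hypothetical toy data, not the sample above). It relies only on the documented default FUN = mean of ave, so the call without FUN and the call with the named argument should agree:

```r
# Hypothetical toy data, chosen only for illustration
value <- c(1, 3, 10, 20)
group <- c("A", "A", "B", "B")

# FUN defaults to mean, so it can be omitted ...
m_default <- ave(value, group)
# ... but when it is supplied it has to be named, because it follows `...`
m_named <- ave(value, group, FUN = mean)

m_default                      # 2 2 15 15
identical(m_default, m_named)  # TRUE
```

Each element is replaced by the mean of its own group, which is why the result has the same length as the input and can be assigned straight to a data frame column.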

This can also be done in one step:

df$yr.anomaly <- with( df, ave(value, group, FUN=function(x) x- mean(x)))
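
To sanity-check the one-step version, the following sketch rebuilds a data frame with the same layout as the sample data and compares it with the two-step result. The seed and the use of rep() are assumptions added here for reproducibility and are not part of the original post:

```r
set.seed(1)  # arbitrary seed, assumed here so the example is reproducible
df <- data.frame(
  year  = rep(2000:2003, each = 5),
  group = rep(c("A", "B", "C", "D"), times = 5),
  value = runif(20, 0, 1)
)

# One-step anomaly: each value minus the mean of its group
df$yr.anomaly <- with(df, ave(value, group, FUN = function(x) x - mean(x)))

# Two-step version for comparison: subtract the group means afterwards
two_step <- df$value - with(df, ave(value, group))

# The two approaches should agree
all.equal(df$yr.anomaly, two_step)

# Within each group the anomalies should average to (numerically) zero
tapply(df$yr.anomaly, df$group, mean)
```

If the group means themselves are not needed as a column, the one-step form avoids the intermediate grp.means variable.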