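4-row moving average of a multi-column data.table

Time: 2018-07-19 01:27:41

Tags: r data.table moving-average

I have a data.table xSet with multiple columns. I need a new table in which each row holds, for every column, the moving average over 4 rows.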

[Image: example input and output table]

2 answers:

Answer 0 (score: 2)

We can use rollapplyr from zoo

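library(zoo)
library(dplyr)
# right-aligned 4-row rolling mean of every column; with partial = TRUE
# the first three rows are averaged over the rows available so far
df1 %>% 
  mutate(across(everything(),
                ~ rollapplyr(.x, width = 4, FUN = mean, partial = TRUE),
                .names = "{.col}_New"))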

A similar option with data.table

library(data.table)
setDT(df1)[, paste0("New", names(df1)) := lapply(.SD, 
     function(x) rollapplyr(x, FUN = mean, width = 4, partial = TRUE))]
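To make explicit what a right-aligned window of width 4 with partial = TRUE averages, here is a minimal base R sketch. The helper name right_rollmean is hypothetical and used only for illustration; it is not part of zoo or data.table.

```r
# Hypothetical helper: right-aligned rolling mean with partial windows.
# For row i the window covers rows max(1, i - width + 1) .. i.
right_rollmean <- function(x, width = 4L) {
  vapply(seq_along(x), function(i) {
    start <- max(1L, i - width + 1L)
    mean(x[start:i])
  }, numeric(1))
}

out <- right_rollmean(1:6)
print(out)
```

The first three values are means over 1, 2 and 3 rows respectively; from row 4 onwards each value is the mean of the current row and the three rows before it.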

Data

set.seed(24)
df1 <- as.data.frame(matrix(sample(0:9, 3 * 15, replace = TRUE),
       ncol = 3, dimnames = list(NULL, paste0("Col", 1:3))))
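If avoiding the zoo dependency matters, the same right-aligned partial window can probably be obtained with data.table's own frollmean(). The sketch below assumes a data.table version that provides frollmean() with the adaptive argument (1.12.0 or later), and uses a small illustrative table DT rather than df1.

```r
library(data.table)

# Illustrative input, not the data from the answer above
DT <- data.table(col1 = 1:15, col2 = 21:35, col3 = 41:55)

# Window length per row: 1, 2, 3, then 4 for all later rows.
# With adaptive = TRUE this emulates partial = TRUE for a right-aligned window.
win <- pmin(seq_len(nrow(DT)), 4L)

res <- DT[, lapply(.SD, frollmean, n = win, adaptive = TRUE)]
print(res)
```

Adaptive windows in frollmean() are right-aligned, so this covers the right-aligned case only.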

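Answer 1 (score: 0)

The answers by akrun and G. Grothendieck call the rollapplyr() function, which uses a right-aligned window by default.

However, this does not match the definition the OP shows in the image.

This can be shown by creating some suitable input data and using toString() instead of mean() as the aggregation function: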

library(data.table)
# create suitable input data
DT <- data.table(col1 = 1:15, col2 = 21:35, col3 = 41:55)

DT[, cbind(.SD, New = zoo::rollapplyr(.SD, 4, toString, partial = TRUE))]
    col1 col2 col3       New.col1       New.col2       New.col3
 1:    1   21   41              1             21             41
 2:    2   22   42           1, 2         21, 22         41, 42
 3:    3   23   43        1, 2, 3     21, 22, 23     41, 42, 43
 4:    4   24   44     1, 2, 3, 4 21, 22, 23, 24 41, 42, 43, 44
 5:    5   25   45     2, 3, 4, 5 22, 23, 24, 25 42, 43, 44, 45
 6:    6   26   46     3, 4, 5, 6 23, 24, 25, 26 43, 44, 45, 46
 7:    7   27   47     4, 5, 6, 7 24, 25, 26, 27 44, 45, 46, 47
 8:    8   28   48     5, 6, 7, 8 25, 26, 27, 28 45, 46, 47, 48
 9:    9   29   49     6, 7, 8, 9 26, 27, 28, 29 46, 47, 48, 49
10:   10   30   50    7, 8, 9, 10 27, 28, 29, 30 47, 48, 49, 50
11:   11   31   51   8, 9, 10, 11 28, 29, 30, 31 48, 49, 50, 51
12:   12   32   52  9, 10, 11, 12 29, 30, 31, 32 49, 50, 51, 52
13:   13   33   53 10, 11, 12, 13 30, 31, 32, 33 50, 51, 52, 53
14:   14   34   54 11, 12, 13, 14 31, 32, 33, 34 51, 52, 53, 54
15:   15   35   55 12, 13, 14, 15 32, 33, 34, 35 52, 53, 54, 55

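col1 equals the row number, so New.col1 shows the indices of the rows that enter each rollapplyr() computation.

Compared with the OP's image, only rows 1 and 2 match. Clearly, a right-aligned window does not fit the OP's definition.

We can compare the OP's requirement with the other alignment options for rolling windows: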

DT <- data.table(col1 = 1:15, col2 = 21:35, col3 = 41:55)
align_window <- c("center", "left", "right")
DT[, (align_window) := lapply(align_window, 
                              function(x) zoo::rollapply(
                                col1, 4, toString, partial = TRUE, align = x))]
# add OP's definition from image
DT[1:2, OP := right][3, OP := toString(2:4)][4:15, OP := center][]
    col1 col2 col3         center           left          right             OP
 1:    1   21   41        1, 2, 3     1, 2, 3, 4              1              1
 2:    2   22   42     1, 2, 3, 4     2, 3, 4, 5           1, 2           1, 2
 3:    3   23   43     2, 3, 4, 5     3, 4, 5, 6        1, 2, 3        2, 3, 4
 4:    4   24   44     3, 4, 5, 6     4, 5, 6, 7     1, 2, 3, 4     3, 4, 5, 6
 5:    5   25   45     4, 5, 6, 7     5, 6, 7, 8     2, 3, 4, 5     4, 5, 6, 7
 6:    6   26   46     5, 6, 7, 8     6, 7, 8, 9     3, 4, 5, 6     5, 6, 7, 8
 7:    7   27   47     6, 7, 8, 9    7, 8, 9, 10     4, 5, 6, 7     6, 7, 8, 9
 8:    8   28   48    7, 8, 9, 10   8, 9, 10, 11     5, 6, 7, 8    7, 8, 9, 10
 9:    9   29   49   8, 9, 10, 11  9, 10, 11, 12     6, 7, 8, 9   8, 9, 10, 11
10:   10   30   50  9, 10, 11, 12 10, 11, 12, 13    7, 8, 9, 10  9, 10, 11, 12
11:   11   31   51 10, 11, 12, 13 11, 12, 13, 14   8, 9, 10, 11 10, 11, 12, 13
12:   12   32   52 11, 12, 13, 14 12, 13, 14, 15  9, 10, 11, 12 11, 12, 13, 14
13:   13   33   53 12, 13, 14, 15     13, 14, 15 10, 11, 12, 13 12, 13, 14, 15
14:   14   34   54     13, 14, 15         14, 15 11, 12, 13, 14     13, 14, 15
15:   15   35   55         14, 15             15 12, 13, 14, 15         14, 15

None of the alignment options fully matches the OP's definition. Apart from the first three rows, "center" is the best match.
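Since no built-in alignment reproduces the OP column, one option is to build the windows by hand. The following base R sketch assumes that the OP column above captures the intended windows (rows 1 and 2 right-aligned, row 3 covering rows 2 to 4, rows 4 and later as in "center"). The helper names op_window and op_rollmean are hypothetical.

```r
# Hypothetical helper: row indices of the window for row i of n,
# following the OP column in the comparison table.
op_window <- function(i, n) {
  if (i <= 2L) return(seq_len(i))        # rows 1-2: right-aligned, partial
  if (i == 3L) return(2:min(4L, n))      # row 3: rows 2..4
  seq(i - 1L, min(i + 2L, n))            # rows 4+: i-1 .. i+2, cut off at the end
}

# Hypothetical helper: mean of x over the window of each row
op_rollmean <- function(x) {
  n <- length(x)
  vapply(seq_len(n), function(i) mean(x[op_window(i, n)]), numeric(1))
}

df <- data.frame(col1 = 1:15, col2 = 21:35, col3 = 41:55)
res <- as.data.frame(lapply(df, op_rollmean))
print(res)
```

On a data.table the same helper can be applied per column, for example with DT[, lapply(.SD, op_rollmean)].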