Summing values over time for a condensed time series

Time: 2019-03-21 12:39:58

Tags: r

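I have tried to describe my problem with the code below. I have a "condensed" time series stored in the data frame have. Each row holds the start date and end date of a period, plus the value that applies on every day of that period. I would like to expand the data to one row per day, as in the data frame want, and ultimately arrive at the data frame ultimately_want, which sums the values by date. Perhaps I don't need want at all and can go straight to ultimately_want? Thanks.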

library(dplyr)

start_date <- as.Date(c("2004-08-02", "2004-08-03"))
end_date <- as.Date(c("2004-08-04", "2004-08-05"))
value <- c(5, 6)
have <- data.frame(start_date, end_date, value)
have

date <- as.Date(c("2004-08-02", "2004-08-03", "2004-08-04", "2004-08-03", "2004-08-04", "2004-08-05"))
value <- c(5, 5, 5, 6, 6, 6)
want <- data.frame(date, value)
want

ultimately_want <- want %>%
    group_by(date) %>%
    summarise(total = sum(value))

ultimately_want

2 answers:

Answer 0: (score: 1)

Here is a data.table approach:

library(data.table)

# Expand each row of have to one row per day, then sum the values by date
setDT(have)[, .(value = value, date = seq(start_date, end_date, by = "day")),
            by = 1:nrow(have)][, .(total = sum(value)), by = date][]

#         date total
#1: 2004-08-02     5
#2: 2004-08-03    11
#3: 2004-08-04    11
#4: 2004-08-05     6
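
The chained call above does two things: it expands every period into one row per day, then sums the values by date. For readers who want to see those steps separately, here is a minimal base R sketch, assuming the have data frame from the question. The names days, expanded, and totals are illustrative and do not come from the original answer.

```r
# Rebuild the example data from the question
have <- data.frame(
  start_date = as.Date(c("2004-08-02", "2004-08-03")),
  end_date   = as.Date(c("2004-08-04", "2004-08-05")),
  value      = c(5, 6)
)

# Step 1: build one daily date sequence per row of `have`
days <- lapply(seq_len(nrow(have)), function(i) {
  seq(have$start_date[i], have$end_date[i], by = "day")
})

# Step 2: one row per period and day (this matches the question's `want`)
expanded <- data.frame(
  date  = do.call(c, days),
  value = rep(have$value, lengths(days))
)

# Step 3: sum the values that fall on the same date
totals <- aggregate(value ~ date, data = expanded, FUN = sum)
names(totals)[names(totals) == "value"] <- "total"
totals
```

This should give the same four dates and totals as the data.table output above. Keeping expanded as its own object is only for illustration; as the question suspects, it is not needed if only the totals matter.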

Answer 1: (score: -1)

We can do this:

library(dplyr)
library(tidyr)
mutate(have, rid = row_number()) %>%
  gather(key, date, -value, -rid) %>% select(-key) %>%
  group_by(rid) %>% complete(value, date = full_seq(date, 1)) %>%
  group_by(date) %>% summarise(total = sum(value))

# A tibble: 4 x 2
    date       total
   <date>     <dbl>
1 2004-08-02     5
2 2004-08-03    11
3 2004-08-04    11
4 2004-08-05     6
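
Since gather() is superseded in current tidyr releases, a variant that avoids the reshape step may be easier to follow. The following is only a sketch, assuming dplyr and a tidyr version of 1.0 or later; it is not part of the original answer, and the name totals_tidy is illustrative. It stores each period's daily dates in a list column and then unnests that column.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
have <- data.frame(
  start_date = as.Date(c("2004-08-02", "2004-08-03")),
  end_date   = as.Date(c("2004-08-04", "2004-08-05")),
  value      = c(5, 6)
)

totals_tidy <- have %>%
  rowwise() %>%
  # One list element per row, holding every day in that row's period
  mutate(date = list(seq(start_date, end_date, by = "day"))) %>%
  ungroup() %>%
  # Expand the list column to one row per period and day
  unnest(date) %>%
  # Sum the values that fall on the same date
  group_by(date) %>%
  summarise(total = sum(value))

totals_tidy
```

The result should match the tibble shown above, with one row per date from 2004-08-02 to 2004-08-05.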