生成时间序列数据中的条目

时间:2011-10-11 07:54:50

标签: r time-series

我想为以下数据框中的每个缺失月份(直到当前)生成一行(零金额)。你能帮我一把吗?谢谢!

   trans_date       ammount
1  2004-12-01        2968.91
2  2005-04-01         500.62
3  2005-05-01         434.30
4  2005-06-01         549.15
5  2005-07-01         276.77
6  2005-09-01         548.64
7  2005-10-01         761.69
8  2005-11-01         636.77
9  2005-12-01        1517.58
10 2006-03-01         719.09
11 2006-04-01        1231.88
12 2006-05-01         580.46
13 2006-07-01        1468.43
14 2006-10-01         692.22
15 2006-11-01         505.81
16 2006-12-01        1589.70
17 2007-03-01        1559.82
18 2007-06-01         764.98
19 2007-07-01         964.77
20 2007-09-01         405.18
21 2007-11-01         112.42
22 2007-12-01        1134.08
23 2008-02-01         269.72
24 2008-03-01         208.96
25 2008-04-01         353.58
26 2008-05-01         756.00
27 2008-06-01         747.85
28 2008-07-01         781.62
29 2008-09-01         195.36
30 2008-10-01         424.24
31 2008-12-01         166.23
32 2009-02-01         237.11
33 2009-04-01         110.94
34 2009-07-01         191.29
35 2009-11-01         153.42
36 2009-12-01         222.87
37 2010-09-01        1259.97
38 2010-11-01         375.61
39 2010-12-01         496.48
40 2011-02-01         360.07
41 2011-03-01         324.95
42 2011-04-01         566.93
43 2011-06-01         281.19
44 2011-08-01         428.04

'data.frame':   44 obs. of  2 variables:
 $ trans_date     : Date, format: "2004-12-01" "2005-04-01" "2005-05-01" "2005-06-01" ...
 $ ammount: num  2969 501 434 549 277 ...

2 个答案:

答案 0 :(得分:4)

您可以使用seq.Datemerge

> str(df)
'data.frame':   44 obs. of  2 variables:
 $ trans_date: Date, format: "2004-12-01" "2005-04-01" "2005-05-01" "2005-06-01" ...
 $ ammount   : num  2969 501 434 549 277 ...
> mns <- data.frame(trans_date = seq.Date(min(df$trans_date), max(df$trans_date), by = "month"))
> df2 <- merge(mns, df, all = TRUE)
> df2$ammount <- ifelse(is.na(df2$ammount), 0, df2$ammount)
> head(df2)
  trans_date ammount
1 2004-12-01 2968.91
2 2005-01-01    0.00
3 2005-02-01    0.00
4 2005-03-01    0.00
5 2005-04-01  500.62
6 2005-05-01  434.30

如果你需要几个月才能使用,请使用:

mns <- data.frame(trans_date = seq.Date(min(df$trans_date), Sys.Date(), by = "month"))

请注意,如果参数是Date类,只需调用seq而不是seq.Date即可。

答案 1 :(得分:1)

如果您使用的是xts对象,则可以使用timeBasedSeqmerge.xts。假设您的原始数据位于对象Data中:

# create xts object:
#   no comma on the first subset (Data['ammount']) keeps column name;
#   as.Date needs a vector, so use comma (Data[,'trans_date'])
x <- xts(Data['ammount'],as.Date(Data[,'trans_date']))
# create a time-based vector from 2004-12-01 to 2011-08-01.  The "m" denotes
# monthly time-steps.  By default this returns a yearmon class.  Use
# retclass="Date" to return a Date vector.
d <- timeBasedSeq(paste(start(x),end(x),"m",sep="/"), retclass="Date")
# merge x with an "empty" xts object, xts(,d), filling with zeros
y <- merge(x,xts(,d),fill=0)