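Creating a time series grouped by the levels of a variable

Time: 2019-03-20 04:00:20

Tags: time-series

I am trying to create a time series so that I can forecast scores grouped by team.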

TeamScores$Year <- as.Date(TeamScores$Year)
sample <-TeamScores[1:20,]
dput(sample)
structure(list(
    Team = c("Abl Christian", "Air Force", "Akron", "Alab A&M", "Alabama", "Alabama St", "Albany", "Alcorn State", "American", "App State", "AR Lit Rock", "Arizona", "Arizona St", "Ark Pine Bl", "Arkansas", "Arkansas St", "Army", "Auburn", "Austin Peay", "Ball State"), 
    Score = c(71.7, 67.4, 68.4, 60.6, 71.8, 65.6, 66.8, 60.3, 72, 77.3, 73.6, 70.9, 77.8, 65.3, 75.5, 72.8, 70.2,  78.9, 80.1, 74.1), 
    Year = structure(
        c(17532, 17532, 17532, 17532, 17532, 17532, 17532, 17532, 17532, 17532, 17532, 17532, 17532,     17532, 17532, 17532, 17532, 17532, 17532, 17532), 
        class = "Date")), 
    row.names = c(NA, -20L), 
    class = c("tbl_df", "tbl", "data.frame"))

I (think I) succeeded in creating a time series, but I cannot get the model fit to work.

time_ser<-ts(matrix(TeamScores$Team,nrow=3530),start=c(2009-01-01),frequency=1)  
class(time_ser)
#[1] "ts"

fit<- auto.arima(time_ser)
#Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean,  : 
  'x' must be numeric
In addition: Warning message:
In is.constant(x) : NAs introduced by coercion 

My x (Score) is numeric, so I am lost. I thought I needed to run the auto.arima function before I could run the forecast function.

1 answer:

Answer 0: (score: 0)

With the data structured in the format below, you can run ARIMA like this:

library(forecast)  # provides auto.arima()

# build the example structure
TeamScores <- structure(list(
  Team = c("Abl Christian", "Air Force", "Akron", "Abl Christian", "Air Force", "Akron","Abl Christian", "Air Force", "Akron","Abl Christian", "Air Force", "Akron","Abl Christian", "Air Force", "Akron","Abl Christian", "Air Force", "Akron","Abl Christian", "Air Force"), 
  Score = c(71.7, 67.4, 68.4, 60.6, 71.8, 65.6, 66.8, 60.3, 72, 77.3, 73.6, 70.9, 77.8, 65.3, 75.5, 72.8, 70.2,  78.9, 80.1, 74.1), 
  Year = structure(
    c(17532, 17533, 17534, 17535, 17536, 17537, 17538, 17539, 17540, 17541, 17542, 17543, 17544,     17545, 17546, 17547, 17548, 17549, 17550, 17551), 
    class = "Date")), 
  row.names = c(NA, -20L), 
  class = c("tbl_df", "tbl", "data.frame"))

# make a vector with team names:
teamnames <- c("Abl Christian", "Air Force", "Akron")

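# fit an ARIMA model for each team, with the date as an external regressor:
for (team in teamnames) {
  subdf <- subset(TeamScores, Team == team)
  # as.numeric() because xreg must be numeric and a Date column is not
  fit <- auto.arima(subdf$Score, xreg = as.numeric(subdf$Year))
  print(fit)
}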
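
Since the goal in the question is a forecast per team, here is a minimal sketch of that last step. It is an illustration under assumptions, not the code from the answer: the `toy` data frame and its values are made up, and it uses base R's `arima()` with a fixed random-walk order (0,1,0) in place of `auto.arima()`, so that it runs without the forecast package.

```r
# Hypothetical toy data: two teams, five seasons each (values are made up)
toy <- data.frame(
  Team  = rep(c("Akron", "Army"), each = 5),
  Score = c(68.4, 70.1, 69.3, 71.0, 72.2, 70.2, 69.8, 71.5, 70.9, 72.4),
  Year  = rep(2014:2018, times = 2),
  stringsAsFactors = FALSE
)

# Split the scores into one numeric vector per team
scores_by_team <- split(toy$Score, toy$Team)

# Fit a random-walk ARIMA(0,1,0) to each team and forecast two seasons ahead
forecasts <- lapply(scores_by_team, function(scores) {
  series <- ts(scores, start = 2014, frequency = 1)
  fit <- arima(series, order = c(0, 1, 0))
  predict(fit, n.ahead = 2)$pred
})

print(forecasts)
```

With the forecast package loaded, the same loop shape should work with `auto.arima()` and `forecast()` instead of the fixed-order fit, provided each team has enough observations.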

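P.S. I could not run ARIMA with your sample code/data, because in the sample every date is the same (2018-01-01) and each group appears only once. You cannot really build a time series from a single time point with one data point per group, so I made a similar structure for testing. I also skipped creating a ts object and ran ARIMA directly on the data frame columns.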
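
If a ts object is still wanted, the error in the question appears to come from building it from the Team column, which is character data, instead of Score. Note also that `start=c(2009-01-01)` is evaluated as arithmetic (2009 - 1 - 1 = 2007) and not as a date. The sketch below uses a made-up `toy` data frame, which is an assumption for illustration and not the asker's data, to show the difference and to check how many observations each team has.

```r
# Hypothetical toy data in the same shape as TeamScores (values are made up)
toy <- data.frame(
  Team  = rep(c("Akron", "Army"), each = 4),
  Score = c(68.4, 70.1, 69.3, 71.0, 70.2, 69.8, 71.5, 70.9),
  Year  = rep(2015:2018, times = 2),
  stringsAsFactors = FALSE
)

# Each team needs several observations; one row per team is not a series
obs_per_team <- table(toy$Team)
print(obs_per_team)

# A ts built from the Team column holds character data, which ARIMA rejects
bad_ts <- ts(toy$Team, start = 2015, frequency = 1)
print(is.numeric(bad_ts))

# One numeric series for a single team, built from the Score column
akron <- subset(toy, Team == "Akron")
good_ts <- ts(akron$Score, start = min(akron$Year), frequency = 1)
print(is.numeric(good_ts))
```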
