How to reduce the computation time of an R program

Time: 2018-02-02 09:11:49

Tags: r time prediction computation

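I am forecasting time series data, but I am struggling to reduce the computation time. Here is a code sample. The code predicts the temperature at different monitoring stations. For 134 stations it takes 10 minutes on my computer. I am wondering whether there is a way to reduce the overall computation time.

The sample data looks like this. There are 134 stations, and the observation period is 2 months.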

date               station1       station2       station3       station4
18/01/2017 0:00    36.8           36.25          27.4           25.75
19/01/2017 0:00    30.71428571    34.6           29.4           22.33333333
20/01/2017 0:00    38.75          40.33333333    30.16666667    29.33333333
21/01/2017 0:00    40.83333333    40.33333333    31.2           32.25

dat1 <-read.csv("smart.csv")
library(forecast)
attach(dat1)
library(forecastHybrid)
ptm <- proc.time()
result<-data.frame(auto=0,nnetar=0)
for(i in 2:135) {
   temp.ts <-ts(dat1[i])
   train = temp.ts[1:600]
   test = temp.ts[601:620]

   hm3 <- hybridModel(train, weights = "equal", errorMethod = "MASE", models = "an")
   accuracy(hm3,individual = TRUE)
   hForecast <- forecast(hm3, h = 1) 
   result<-rbind(result,data.frame(auto=hForecast$pointForecasts[1],
                 nnetar=hForecast$pointForecasts[2]))
   fit_accuracy <- accuracy(hForecast, test)
}

proc.time()-ptm
write.csv(result, file= "xyz.csv")

1 Answer:

Answer 0 (score: 0)

Given the sample, I assume your data frame looks something like this:

library(lubridate)  # provides ymd_hm()
date<-seq(ymd_hm("2016-01-01 00:00"),ymd_hm("2017-09-11 00:00"),by="day")
station1<-runif(620)
station2<-runif(620)
station3<-runif(620)
station4<-runif(620)
dat1=data.frame(date,station1,station2,station3,station4)

If that is the case, the code fails with this error:

Error in testaccuracy(f, x, test, d, D) : 
  Not enough forecasts. Check that forecasts and test data match.

This error is caused by the last line of the loop:

fit_accuracy <- accuracy(hForecast, test)

because hForecast contains a single forecast (h = 1) while test has length 20.
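The mismatch goes away when the forecast horizon equals the number of test observations. Below is a minimal sketch in base R, assuming a simulated series and using `stats::arima` with a fixed AR(1) order purely as a stand-in for the hybrid model; the names `series`, `fit`, `pred` and `mae` are illustrative and not from the original code. With the forecast package, the analogous change would be to request `h = length(test)` in the `forecast()` call.

```r
# Simulated stand-in for one station's series (assumption: 620 observations)
set.seed(42)
series <- as.numeric(arima.sim(model = list(ar = 0.6), n = 620)) + 30

train <- series[1:600]
test  <- series[601:620]

# Stand-in model: a fixed AR(1) from base R, not the hybrid model
fit <- arima(train, order = c(1, 0, 0))

# Forecast exactly as many steps as there are test observations
pred <- as.numeric(predict(fit, n.ahead = length(test))$pred)

# The lengths now match, so an error measure can be computed
mae <- mean(abs(pred - test))
```

If only a one-step-ahead forecast is wanted, the alternative is to compare it against the first test value only.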

So I wrote the code below, which runs fast enough:

# Fit auto.arima and nnetar to one station and return one-step-ahead forecasts
forecastStation<-function(data){
  temp=ts(data)            # data is a one-column data frame
  train = temp[1:600,]
  test = temp[601:620,]    # held out, not used below
  #hm3 <- hybridModel(train, weights = "equal", errorMethod = "MASE", models = "an")
  arimaModel <-auto.arima(train)
  netModel=nnetar(train)
  # In-sample accuracy; the values are computed but not stored
  accuracy(arimaModel,individual = TRUE);accuracy(netModel,individual = TRUE)
  arimaPredict <- forecast(arimaModel, 1)$mean[1]
  netPredict<- forecast(netModel, 1)$mean[1]
  return(data.frame(auto=arimaPredict,nnetar=netPredict))
}
# Apply the function to columns 2 to 5 (the four stations) and stack the rows
result<-do.call("rbind",lapply(2:5,function(x) forecastStation(dat1[x])))
result$Station=colnames(dat1)[2:5]

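The main difference from your code is that, instead of using the hybridModel function, I call auto.arima and nnetar separately.

The result is a data frame of the following form: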

> result
       auto    nnetar  Station
1 0.4995727 0.4906344 station1
2 0.4907216 0.5045967 station2
3 0.5300489 0.5413126 station3
4 0.5021821 0.4951382 station4

These are one-step-ahead forecasts. I'm not sure whether you need 1 step or 20 steps ahead. If it is the latter, change the function to:

# Same as above, but returns 20-step-ahead forecasts (one row per step)
forecastStation<-function(data){
  temp=ts(data)
  train = temp[1:600,]
  test = temp[601:620,]    # held out, not used below
  #hm3 <- hybridModel(train, weights = "equal", errorMethod = "MASE", models = "an")
  arimaModel <-auto.arima(train)
  netModel=nnetar(train)
  # In-sample accuracy; the values are computed but not stored
  accuracy(arimaModel,individual = TRUE);accuracy(netModel,individual = TRUE)
  arimaPredict <- forecast(arimaModel, 20)$mean[1:20]
  netPredict<- forecast(netModel, 20)$mean[1:20]
  return(data.frame(auto=arimaPredict,nnetar=netPredict))
}
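One thing to watch with the 20-step version: each call now returns 20 rows, so four station names no longer map one-to-one onto the rows of the stacked result. Here is a sketch of one way to label the rows, assuming a hypothetical `forecast_stub` that stands in for `forecastStation` and returns constant placeholder values rather than real model output:

```r
# Simulated data in the same shape as dat1 (assumption: two stations only)
set.seed(1)
dat1 <- data.frame(
  date     = seq(as.Date("2016-01-01"), by = "day", length.out = 620),
  station1 = runif(620),
  station2 = runif(620)
)

# Hypothetical stand-in for forecastStation(): returns h rows of placeholders
forecast_stub <- function(data, h = 20) {
  x <- data[[1]][1:600]
  data.frame(auto = rep(mean(x), h), nnetar = rep(median(x), h))
}

stations <- colnames(dat1)[-1]

# Attach the station name and the step index to each per-station block
pieces <- lapply(stations, function(s) {
  out <- forecast_stub(dat1[s])
  cbind(Station = s, step = seq_len(nrow(out)), out)
})

# Stack all blocks into one long data frame
result <- do.call(rbind, pieces)
```

Labelling inside the `lapply` call keeps each name next to its own forecasts, whatever the horizon.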

Hope this helps.