Looping a function through a dataset during optimization

Time: 2013-06-05 04:43:43

Tags: r function plyr mathematical-optimization

I have the following data:

data_ex <- structure(list(ID = c(493L, 493L, 493L, 493L, 493L, 493L, 493L, 
493L, 494L, 494L, 494L, 494L, 494L, 494L, 494L), value.y = c(1.403198175, 
1.403198175, 1.403198175, 1.403198175, 1.403198175, 1.403198175, 
1.403198175, 1.403198175, 1.540408028, 1.540408028, 1.540408028, 
1.540408028, 1.540408028, 1.540408028, 1.540408028), Sensor = structure(c(1L, 
2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L), .Label = c("Sat1", 
"Sat2"), class = "factor"), Date = structure(c(3L, 1L, 2L, 1L, 
2L, 4L, 3L, 4L, 4L, 3L, 5L, 2L, 1L, 1L, 5L), .Label = c("10-Jul", 
"2-Jul", "30-Jun", "4-Jul", "9-Jul"), class = "factor"), variable = structure(c(1L, 
2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L), .Label = c("A", 
"B"), class = "factor"), value.x = c(0.514018, 1.250407631, 1.349420084, 
0.629876797, 0.666055046, 1.434158327, 0.952216, 0.695925622, 
0.667056075, 0.964285, 1.173076, 1.265919252, 0.658852868, 1.329348307, 
0.60396)), .Names = c("ID", "value.y", "Sensor", "Date", "variable", 
"value.x"), row.names = c(1L, 3L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
12L, 13L, 14L, 15L, 16L, 17L), class = "data.frame")

I run the optimization with the following code:

##Function to optimize
TestCalc <- function(p, x, y) {
  x <- data_ex$value.x
  y <- data_ex$value.y
  sum((y - abs(log(1 - ((x - p[1]) / (p[2]  - p[1]))) / 0.5))^2)
}

## Set limits for optimization
p <- c(1,1)
lower <- -3*p
upper <-  6 + lower

## Optimize 
library(dfoptim)
opt = nmkb(p, TestCalc, lower=lower, upper=upper)

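I would prefer to loop over the whole dataset so that I can compare the optimization results across the different factors. Something like this: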

data_optimize <- ddply(data_ex, .(Sensor, Date, variable), summarize, opt = nmkb(p, TestCalc, lower=lower, upper=upper))

How can I move x and y out of the function so that they work properly inside ddply? Or is there a better way?

1 Answer:

Answer 0 (score: 0)

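Don't use summarize. You need to use ... to pass the arguments to a better-defined TestCalc, one that takes x and y as arguments instead of reading them from data_ex: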

TestCalc <- function(p, x, y) {
  sum((y - abs(log(1 - ((x - p[1]) / (p[2]  - p[1]))) / 0.5))^2)
}
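As a rough illustration of how arguments travel through ..., the sketch below uses base R's optim as a stand-in for nmkb (both forward extra named arguments to the objective function). The straight-line objective sse and the small x/y vectors are hypothetical and are not part of the original problem.

```r
# Hypothetical objective: sum of squared errors of a straight line.
# The parameters come first; the data arrive as ordinary arguments.
sse <- function(p, x, y) {
  sum((y - (p[1] + p[2] * x))^2)
}

# Hypothetical data, for illustration only.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# x = and y = are not arguments of optim itself, so they fall into ...
# and are handed to sse on every evaluation.
fit <- optim(c(0, 0), sse, x = x, y = y, method = "BFGS")
fit$par
```

Because the data are supplied at call time, the same objective can be reused on any subset of a data frame without editing the function body.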

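The next question is whether you want to keep the full result of nmkb or only the parameter values. The code below keeps only the parameters: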

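library(plyr)  # provides ddply

data_optimize <- ddply(data_ex, .(Sensor, Date, variable), function(d) {
  # d is the subset of data_ex for one Sensor/Date/variable combination;
  # x and y are forwarded through nmkb's ... to TestCalc
  opt <- nmkb(p, TestCalc, lower = lower, upper = upper,
              x = d$value.x, y = d$value.y)$par
  names(opt) <- c("p1", "p2")
  opt
})
## There were 14 warnings (use warnings() to see them)
## You need to think about these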
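The text of the warnings is not shown, so the following is only a guess at one possible source. The objective takes log(1 - (x - p[1]) / (p[2] - p[1])), which is undefined when that argument is negative or when p[1] equals p[2]. The sketch evaluates TestCalc by hand at two such points; the x and y values are made up for illustration.

```r
TestCalc <- function(p, x, y) {
  sum((y - abs(log(1 - ((x - p[1]) / (p[2] - p[1]))) / 0.5))^2)
}

# Case 1: x lies above p[2], so the argument of log is negative.
log_arg <- 1 - (1.25 - 0) / (1 - 0)
val_outside <- suppressWarnings(TestCalc(c(0, 1), x = 1.25, y = 1.4))

# Case 2: p[1] == p[2], as at the starting point p = c(1, 1),
# so the ratio divides by zero.
val_equal <- suppressWarnings(TestCalc(c(1, 1), x = 1.25, y = 1.4))

c(log_arg = log_arg, outside = val_outside, equal = val_equal)
```

Both calls return NaN. If the optimizer visits such points, a warning from log is to be expected. Note also that the 15 rows are split into 10 groups, so each fit of two parameters rests on only one or two observations.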

data_optimize
#    Sensor   Date variable          p1       p2
# 1    Sat1 30-Jun        A  1.55932192 2.587030
# 2    Sat1 30-Jun        B  0.76645625 1.134881
# 3    Sat1  9-Jul        A  1.76999701 2.774562
# 4    Sat1  9-Jul        B -0.93056697 2.985854
# 5    Sat2 10-Jul        A  0.18469300 1.067717
# 6    Sat2 10-Jul        B  0.03787765 2.442937
# 7    Sat2  2-Jul        A -1.17711009 2.478547
# 8    Sat2  2-Jul        B -0.53059895 2.999981
# 9    Sat2  4-Jul        A -1.84008670 2.999541
# 10   Sat2  4-Jul        B -0.08772934 2.931071
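If the full optimizer result is wanted rather than just the parameters, one option is to keep a list with one fit per group and extract fields afterwards. The sketch below is an assumption-laden illustration in base R: it uses split and lapply instead of plyr (plyr's dlply plays a similar role), optim instead of nmkb, and a hypothetical toy data frame and straight-line objective sse.

```r
# Hypothetical toy data: two groups, each lying exactly on its own line.
toy <- data.frame(
  g = rep(c("a", "b"), each = 4),
  x = rep(1:4, times = 2),
  y = c(3, 5, 7, 9, 0, 3, 6, 9)  # a: y = 1 + 2x, b: y = -3 + 3x
)

# Hypothetical objective with the data passed as arguments.
sse <- function(p, x, y) sum((y - (p[1] + p[2] * x))^2)

# Keep the complete optimizer result for every group.
fits <- lapply(split(toy, toy$g), function(d) {
  optim(c(0, 0), sse, x = d$x, y = d$y, method = "BFGS")
})

# Extract whichever fields are needed later.
summary_tbl <- data.frame(
  g = names(fits),
  p1 = sapply(fits, function(f) f$par[1]),
  p2 = sapply(fits, function(f) f$par[2]),
  sse = sapply(fits, function(f) f$value),
  convergence = sapply(fits, function(f) f$convergence)
)
summary_tbl
```

Keeping the list makes it possible to check the convergence code and the objective value for each group before trusting the parameters.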