MATLAB-style gradient for R's optimx: answers

【Question title】: MATLAB-style gradient for R's optimx
【Posted】: 2017-01-20 11:08:27
【Question】:

I use R's optimx to optimize my function. In MATLAB I always write:

function [f, g] = calculations(x, otherParameters)
% some calculations
f = something;
g = somethingOther;
% f and g are the function and gradient values that fmincon uses for optimization

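Both f and g are then passed to fmincon. With optimx, however, I have to supply the gradient as a separate function. The drawback is that many of the values I compute for f are also needed to evaluate g, so a separate gradient function forces me to compute some of them twice, which is computationally inefficient. Please help me understand the most efficient way to avoid this in R (using global variables, for example, does not seem like a good approach to me).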

【Comments】:

Tags: r performance optimization

【Answer 1】:

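The nloptr package lets the objective function return both the objective value and the gradient as a two-component list. See the example in the vignette: https://cran.r-project.org/web/packages/nloptr/vignettes/nloptr.pdf, repeated here: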

library(nloptr)

eval_f_list <- function(x) {
  common <- x[2] - x[1] * x[1]
  return( list(objective = 100 * common^2 + (1 - x[1])^2,
               gradient = c(-400 * x[1] * common - 2 * (1 - x[1]), 200 * common)))
}
x0 <- c( -1.2, 1 )
opts <- list("algorithm" = "NLOPT_LD_LBFGS", "xtol_rel" = 1.0e-8)

res <- nloptr( x0=x0, eval_f=eval_f_list, opts=opts)

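【Comments】:

【Answer 2】:

Take a look at the memoise package. To explain in more detail, consider a simple example of optimizing, in R, a function whose derivative is known: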

complex.function <- function(x){
  Sys.sleep(3)
}

f <- function(x){
  cat("f",x,"\n")
  complex.function(x)
  (x-1)^4+(x-1)^2+7
}

g <- function(x){
  cat("g",x,"\n")
  complex.function(x)
  4*(x-1)^3+2*(x-1)
}

system.time(optim(3.1, f, g,method="BFGS")) ##57.01sec
#f 3.1 
#g 3.1 
#f -38.144 
#f -5.1488 
#f 1.45024 
#g 1.45024 
#f 1.398015 
#g 1.398015 
#f 1.146116 
#g 1.146116 
#f 0.8414061 
#f 1.085174 
#g 1.085174 
#f 1.00532 
#g 1.00532 
#f 1.000081 
#g 1.000081 
#f 0.9999192 
#f 1.000048

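Because the method evaluates f and g at almost the same set of points, there is room for optimization.

Now, if you memoise() the function that contains the expensive computation, its output is cached, so you can do the following: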

library(memoise)

complex.function2 <- memoise(function(x){
  Sys.sleep(3)
  list(fun=(x-1)^4+(x-1)^2+7,deriv=4*(x-1)^3+2*(x-1))
})



f2 <- function(x){
  cat("f2",x,"\n")
  complex.function2(x)$fun
}

g2 <- function(x){
  cat("g2",x,"\n")
  complex.function2(x)$deriv
}

system.time(optim(3.1, f2, g2,method="BFGS")) ##36.02sec

This reduces the number of calls to the expensive function, so the run time on my machine drops from 57 seconds to 36 seconds.
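If adding a package dependency is undesirable, the same idea can be sketched in base R with a closure that remembers only the most recent evaluation. This is a minimal sketch, not something optim or optimx provides: `make_cached_fg`, `fg`, and the `value`/`gradient` list names are hypothetical, and it assumes the optimizer requests the gradient at the point where it last requested the function value, as the BFGS trace above suggests.

```r
# Hypothetical helper (not part of optim or optimx): wraps a combined
# function fg(x) that returns list(value = , gradient = ) and exposes
# separate fn/gr closures that share a one-entry cache.
make_cached_fg <- function(fg) {
  last_x   <- NULL   # parameter vector of the most recent evaluation
  last_res <- NULL   # result of fg() at last_x
  n_evals  <- 0L     # how many times fg() was actually run

  evaluate <- function(x) {
    # Recompute only when the point differs from the cached one.
    if (is.null(last_x) || !identical(x, last_x)) {
      last_res <<- fg(x)
      last_x   <<- x
      n_evals  <<- n_evals + 1L
    }
    last_res
  }

  list(
    fn    = function(x) evaluate(x)$value,
    gr    = function(x) evaluate(x)$gradient,
    evals = function() n_evals
  )
}

# Same toy objective as above; d stands in for the expensive shared work.
fg <- function(x) {
  d <- x - 1
  list(value = d^4 + d^2 + 7,
       gradient = 4 * d^3 + 2 * d)
}

cached <- make_cached_fg(fg)
res <- optim(3.1, cached$fn, cached$gr, method = "BFGS")

res$par          # estimated minimiser, close to 1
res$counts       # fn and gr calls reported by optim
cached$evals()   # number of real fg() evaluations
```

The cache lives in the closure's enclosing environment, so no global variables are involved. A one-entry cache is enough only when value and gradient requests arrive back to back at the same point; memoise() is the more general choice when that does not hold.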

Check the help file for optim to see whether the method you are interested in actually uses derivatives; if it does not, all of this is moot.
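One way to check this empirically, rather than only from the help page, is to count gradient calls under different methods. The sketch below uses the Rosenbrock function from the first answer; the names `rosen_f`, `rosen_g`, and `calls` are made up for this illustration.

```r
# Call counters, stored in an environment so the functions can update them.
calls <- new.env()
calls$f <- 0L
calls$g <- 0L

# Rosenbrock function and its analytic gradient, with counters added.
rosen_f <- function(x) {
  calls$f <- calls$f + 1L
  100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
}
rosen_g <- function(x) {
  calls$g <- calls$g + 1L
  c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
    200 * (x[2] - x[1]^2))
}

x0 <- c(-1.2, 1)

# Derivative-free method: the supplied gradient should go unused.
calls$g <- 0L
res_nm <- optim(x0, rosen_f, rosen_g, method = "Nelder-Mead")
g_calls_nm <- calls$g

# Quasi-Newton method: the supplied gradient is used.
calls$g <- 0L
res_bfgs <- optim(x0, rosen_f, rosen_g, method = "BFGS")
g_calls_bfgs <- calls$g

g_calls_nm     # gradient calls under Nelder-Mead
g_calls_bfgs   # gradient calls under BFGS
```

If the gradient counter stays at zero for the chosen method, sharing work between the function and its gradient brings no benefit there.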

【Comments】: