Start list of parameters for nls and rational function: answers

【Question title】: Start list of parameters for nls and rational function
【Posted】: 2018-11-17 23:12:36
【Question】:

I want to fit a rational function to my data:

Data:

 [1] 2.000000 3.000000 2.333333 1.750000 2.000000 1.833333 2.416667  1.916667
 [9] 1.750000 2.166667 2.116667 1.916667 1.944444 1.611111 1.722222 1.777778
[17] 1.877778 1.944444 1.958333 1.833333 2.041667 2.020833 1.908333 1.916667
[25] 1.733333 1.833333 1.800000 1.933333 1.893333 1.866667 1.888889 1.805556
[33] 1.833333 1.847222 1.822222 1.805556 1.833333 1.904762 1.880952 1.833333
[41] 1.804762 1.809524 1.708333 1.708333 1.750000 1.708333 1.683333 1.687500
[49] 1.611111 1.666667 1.648148 1.611111 1.611111 1.611111 1.650000 1.600000
[57] 1.650000 1.625000 1.630000 1.616667 1.469697 1.560606 1.590909 1.651515
[65] 1.651515 1.651515 1.513889 1.555556 1.625000 1.638889 1.647222 1.652778
[73] 1.679487 1.717949 1.705128 1.698718.

The model I want to fit is the following:

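Model <- function(t, a, b, c, d) {
  (a + b*t) / (1 + c*t + d*t^2)
}

I know that I first have to give nls a start list for a, b, c, ..., but I really do not know how to set the parameters. Since I am not an expert, I found a useful guide in this document: http://www.css.cornell.edu/faculty/dgr2/teach/R/R_rat.pdf. But at some point it says: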

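"Given a set of ordered pairs (ti, yi), where in general there are repeated measurements at each value of t, the parameters of a rational function can be fit by non-linear least-squares estimation, for example the nls method in R. Once we have the four parameters, we can compute the value of t at the maximum by taking the first derivative."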

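Although I have not shown it here, I also have another column that represents time (the integers 1:76, standing for years).

Can anyone help me?

Best regards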

【Comments on the question】:

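What do you mean by "nls"? Non-linear system? Do you mean you want to optimize the function to get good estimates of your parameters?
I will edit my question.
Please provide an MCVE: stackoverflow.com/help/mcve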

Tags: r nls

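【Solution 1】:

The model is not fully specified in the question, but assume the model in the code below and the reproducible data in Note 2 below. If we set c = d = 0, the model reduces to the linear model y = a + b*t, so we can use the coefficients from a linear model fit as starting values for a and b, with c and d starting at 0: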

fm1 <- lm(y ~ t)
st2 <- list(a = coef(fm1)[[1]], b = coef(fm1)[[2]], c = 0, d = 0)
fm2 <- nls(y ~ Model(t, a, b, c, d), start = st2)

giving:

> fm2
Nonlinear regression model
  model: y ~ Model(t, a, b, c, d)
   data: parent.frame()
        a         b         c         d 
2.5097712 0.6038808 0.3205409 0.0008663 
 residual sum-of-squares: 1.684

Number of iterations to convergence: 16 
Achieved convergence tolerance: 8.029e-06

To view the fit graphically:

# The fm2 model is shown in red. See Note 1 for the fm4 (blue) and fm5 (green) models.
plot(y ~ t)
lines(fitted(fm2) ~ t, col = "red")
lines(fitted(fm4) ~ t, col = "blue")
lines(fitted(fm5) ~ t, col = "green")
legend("topright", c("fm2", "fm4", "fm5"), col = c("red", "blue", "green"), lty = 1)

Note 1

The following is a different model that fits almost as well but uses only 3 parameters. See the blue line in the plot above.

fm3 <- lm(log(y) ~ log(t))
st4 <- list(a = coef(fm3)[[1]], b = 0, c = coef(fm3)[[2]])
fm4 <- nls(y ~ exp(a + b/t + c*log(t)), start = st4)

> fm4
Nonlinear regression model
  model: y ~ exp(a + b/t + c * log(t))
   data: parent.frame()
      a       b       c 
 0.9845 -0.1767 -0.1157 
 residual sum-of-squares: 1.685

Number of iterations to convergence: 4 
Achieved convergence tolerance: 2.625e-06

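This model is not bad either. It uses only two parameters, both of which enter linearly, and its residual sum of squares is 1.728837, compared with 1.684 for the fm2 model and 1.685 for the fm4 model. See the green line in the plot above.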

fm5 <- lm(y ~ log(t))

> fm5

Call:
lm(formula = y ~ log(t))

Coefficients:
(Intercept)       log(t)  
     2.4029      -0.1793  

> deviance(fm5)
[1] 1.728837
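The comparison of the three fits can be collected into one small table. The following is only a sketch: it refits fm2, fm4 and fm5 as in this answer, using the data from Note 2, and the names `fits` and `comparison` are introduced here purely for illustration.

```r
# Data and time index as given in Note 2 of this answer
y <- c(2, 3, 2.333333, 1.75, 2, 1.833333, 2.416667, 1.916667, 1.75,
2.166667, 2.116667, 1.916667, 1.944444, 1.611111, 1.722222, 1.777778,
1.877778, 1.944444, 1.958333, 1.833333, 2.041667, 2.020833, 1.908333,
1.916667, 1.733333, 1.833333, 1.8, 1.933333, 1.893333, 1.866667,
1.888889, 1.805556, 1.833333, 1.847222, 1.822222, 1.805556, 1.833333,
1.904762, 1.880952, 1.833333, 1.804762, 1.809524, 1.708333, 1.708333,
1.75, 1.708333, 1.683333, 1.6875, 1.611111, 1.666667, 1.648148,
1.611111, 1.611111, 1.611111, 1.65, 1.6, 1.65, 1.625, 1.63, 1.616667,
1.469697, 1.560606, 1.590909, 1.651515, 1.651515, 1.651515, 1.513889,
1.555556, 1.625, 1.638889, 1.647222, 1.652778, 1.679487, 1.717949,
1.705128, 1.698718)
t <- seq_along(y)

# Rational model from the question
Model <- function(t, a, b, c, d) (a + b*t) / (1 + c*t + d*t^2)

# Refit the three models with the same starting values as in the answer
fm1 <- lm(y ~ t)
fm2 <- nls(y ~ Model(t, a, b, c, d),
           start = list(a = coef(fm1)[[1]], b = coef(fm1)[[2]], c = 0, d = 0))
fm3 <- lm(log(y) ~ log(t))
fm4 <- nls(y ~ exp(a + b/t + c*log(t)),
           start = list(a = coef(fm3)[[1]], b = 0, c = coef(fm3)[[2]]))
fm5 <- lm(y ~ log(t))

# Hypothetical helper objects: number of coefficients and residual
# sum of squares for each fitted model
fits <- list(fm2 = fm2, fm4 = fm4, fm5 = fm5)
comparison <- data.frame(
  n_par = sapply(fits, function(m) length(coef(m))),
  rss   = sapply(fits, deviance)
)
print(comparison)
```

Here deviance() returns the residual sum of squares for both nls and lm fits, which is why the same call works for all three models.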

Note 2

y <- c(2, 3, 2.333333, 1.75, 2, 1.833333, 2.416667, 1.916667, 1.75, 
2.166667, 2.116667, 1.916667, 1.944444, 1.611111, 1.722222, 1.777778, 
1.877778, 1.944444, 1.958333, 1.833333, 2.041667, 2.020833, 1.908333, 
1.916667, 1.733333, 1.833333, 1.8, 1.933333, 1.893333, 1.866667, 
1.888889, 1.805556, 1.833333, 1.847222, 1.822222, 1.805556, 1.833333, 
1.904762, 1.880952, 1.833333, 1.804762, 1.809524, 1.708333, 1.708333, 
1.75, 1.708333, 1.683333, 1.6875, 1.611111, 1.666667, 1.648148, 
1.611111, 1.611111, 1.611111, 1.65, 1.6, 1.65, 1.625, 1.63, 1.616667, 
1.469697, 1.560606, 1.590909, 1.651515, 1.651515, 1.651515, 1.513889, 
1.555556, 1.625, 1.638889, 1.647222, 1.652778, 1.679487, 1.717949, 
1.705128, 1.698718)

t <- seq_along(y)

【Comments】:

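I had a similar idea. If b = c = d = 0, then using a = mean(y) would also make sense.
Thank you both for your valuable help. This is what I was looking for. @Grothendieck I don't know why, but after running your answer I get the following error after fm2: Error in data[, nm[i]] : incorrect number of dimensions. I don't know what went wrong. I used the same y vector and the same t that you used.
Oh my, I made a mistake: I confused the "model" I had defined myself with the "Model" function... a naive mistake. I apologize. Thanks again for your valuable help!
Note 1 has been added. The old note is now Note 2.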

【Solution 2】:

I believe you can do something like

nls(y ~ (a1 + b1*times) / (1 + c1*times + d1*times^2))

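where y is the data you provided above and times = 1:76. I appended 1 to the parameter names because nls did not recognize c as a parameter and instead took it to be the function c().

However, when I run this I get a singular gradient error, along with a suggestion to initialize the starting values to something other than 1 (the default). You can specify starting values with the argument start = list("a1"=0.1, "b1"=0.1, "c1"=0.1, "d1"=0.1), but that does not seem to help. Perhaps you have a better idea of what the starting values should be?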
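One possible source of starting values, sketched here as an assumption rather than as part of this answer, is the straight-line idea from Solution 1: with c1 = d1 = 0 the model is just a1 + b1*times, so the lm coefficients can seed a1 and b1. The names `lin`, `st` and `fit` are illustrative only.

```r
# Data from the question; times is the year index 1:76
y <- c(2, 3, 2.333333, 1.75, 2, 1.833333, 2.416667, 1.916667, 1.75,
2.166667, 2.116667, 1.916667, 1.944444, 1.611111, 1.722222, 1.777778,
1.877778, 1.944444, 1.958333, 1.833333, 2.041667, 2.020833, 1.908333,
1.916667, 1.733333, 1.833333, 1.8, 1.933333, 1.893333, 1.866667,
1.888889, 1.805556, 1.833333, 1.847222, 1.822222, 1.805556, 1.833333,
1.904762, 1.880952, 1.833333, 1.804762, 1.809524, 1.708333, 1.708333,
1.75, 1.708333, 1.683333, 1.6875, 1.611111, 1.666667, 1.648148,
1.611111, 1.611111, 1.611111, 1.65, 1.6, 1.65, 1.625, 1.63, 1.616667,
1.469697, 1.560606, 1.590909, 1.651515, 1.651515, 1.651515, 1.513889,
1.555556, 1.625, 1.638889, 1.647222, 1.652778, 1.679487, 1.717949,
1.705128, 1.698718)
times <- seq_along(y)

# Straight-line fit: this is the rational model with c1 = d1 = 0
lin <- lm(y ~ times)

# Seed a1 and b1 from the line, and start c1 and d1 at 0
st <- list(a1 = coef(lin)[[1]], b1 = coef(lin)[[2]], c1 = 0, d1 = 0)

fit <- nls(y ~ (a1 + b1*times) / (1 + c1*times + d1*times^2), start = st)
print(coef(fit))
```

This is the same model and the same starting point as fm2 in Solution 1, only written inline with the renamed parameters, so it should reproduce that fit.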

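【Comments】:

Note that there is a more stable version of nls in John Nash's package nlsr. Trying nls in base R can land you with the infamous "singular gradient" error, whereas nlsr::nlxb will give the same solution as Grothendieck's above for all reasonable starting values.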