有没有办法在 scipy.optimize.curve_fit 的参数之间添加依赖关系答案

【问题标题】：Is there a way to add a dependency between parameters for scipy.optimize.curve_fit有没有办法在 scipy.optimize.curve_fit 的参数之间添加依赖关系
【发布时间】：2018-06-07 18:50:23
【问题描述】：

我试图在海洋模型中描述 70 年代核试验引起的碳同位素扩散。大气信号是一个强烈的尖峰信号，它会随着洋流被带到深处（更深的水流要慢得多）。

我的目标是检测浓度上升的开始和不同深度水平的上升速度。

我假设海洋中碳同位素的浓度表现为具有 3 段的分段线性函数：

一个恒定的初始值 (b) 直到时间 (t_0)
浓度随时间 (t_0) 到 (t_1) 的线性增加，速率为 m1。
浓度随时间线性下降 (t_1)，速率为 m2

我在 python 中使用这段代码来表示函数：

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as sio

def piecewise_linear( t, t0, t1, b, m1, m2 ):
    condlist = [ t < t0,
                (t >= t0 ) & ( t < t1 ),
                t >= t1
               ]
    funclist = [lambda t: b,
                lambda t: b + m1 * ( t - t0 ),
                lambda t: b + m1 * ( t - t0 ) + m2 * ( t - t1 )
               ]
    return np.piecewise( t, condlist, funclist )

对于给定的时间数组t，我希望能够适应此函数的两种“类型”：

完整的 3 段线，代表上层海洋，信号在此快速传播，尖峰被完全捕获。
一种特殊情况，在时间序列结束时浓度尚未达到峰值（这表示深海中的信号，需要很长时间才能传播信号）

举例

t = np.arange( 0, 15, 0.1 )
y_full = piecewise_linear( t, 5, 10, 2, 2, -4 )
y_cut = piecewise_linear( t, 5, 15, 2, 2, -4 )
plt.plot( t, y_full )
plt.plot( t, y_cut )
plt.legend( [ 'surface', 'deep ocean' ] )

对于第一种情况，当我在添加一些随机噪声后尝试将函数拟合到信号时，我得到了很好的结果：

noise = np.random.normal( 0, 1, len( y_full ) ) * 1
y = y_full
yy = y_full + noise
bounds = ( [ 0, 0, 0, 0, -np.inf ], [ np.inf, np.inf, np.inf, np.inf, 0 ] )
fit,_ = sio.curve_fit( piecewise_linear, t, yy, bounds=bounds )
print( fit )
y_fit = piecewise_linear( t, *tuple( fit ) )
plt.plot( t, yy, color='0.5' )
plt.plot( t, y_fit, linewidth=3 )
plt.plot( t, y, linestyle='--', linewidth=3 )

结果

>>[  5.00001407  10.01945313   2.13055863   1.95208167  -3.95199719]

但是，当我尝试评估第二种情况（深海）时，我经常得到如下糟糕的结果：

noise = np.random.normal( 0, 1, len(y_full ) ) * 1#
y = y_cut
yy = y_cut+noise
bounds = ( [ 0, 0, 0, 0, -np.inf], [ np.inf, np.inf, np.inf, np.inf, 0 ] )
fit,_ = sio.curve_fit( piecewise_linear, t, yy, bounds=bounds )
print( fit )
y_fit = piecewise_linear( t, *tuple( fit ) )
plt.plot( t, yy, color='0.5' )
plt.plot( t, y_fit, linewidth=3 )
plt.plot( t, y, linestyle='--', linewidth=3 )
plt.legend( [ 'noisy data', 'fit', 'original' ] )

我明白了

>>[ 1.83838997  0.40000014  1.51810839  2.56982348 -1.0622842 ]

优化确定t_0大于t_1，这在这种情况下是没有意义的。

有没有办法将条件t_0 < t_1 构建到曲线拟合中？还是我必须测试，给出哪种类型的曲线，然后拟合两个不同的函数（3 段或 2 段分段线性函数）？

非常感谢任何帮助

【问题讨论】：

您可以快速测试一堵“砖墙”，如果找到给定条件，例如 t_0 > t_1，那么拟合函数会返回一个非常大的值 - 因此是一个非常大的错误。这种方法有些粗略，但通常作为一个实际问题起作用，并且具有易于编码和易于测试的优点。
... 或者，您可以更改参数化以便始终满足约束。

标签： python scipy curve-fitting

【解决方案1】：

您可以考虑为此使用 lmfit (https://lmfit.github.io/lmfit-py)。 Lmfit 为曲线拟合提供了更高级别的接口，并使拟合参数成为一流的 Python 对象。除此之外，这很容易允许修复一些参数，并以比scipy.optimize.curve_fit 使用的更 Pythonic 的方式设置参数的界限。特别是对于您的问题， lmfit 参数还支持使用数学表达式作为所有参数的约束表达式。

要将您的模型函数 piecewise_linear() 转换为使用 lmfit 进行曲线拟合的模型，您可以执行类似的操作

from lmfit import Model

# make a model
mymodel = Model(piecewise_linear)

# create parameters and set initial values
# note that parameters are *named* from the 
# names of arguments of your model function
params = mymodel.make_params(t0=0, t1=1, b=3, m1=2, m2=2)

# now, you can place bounds on parameters, maybe like
params['b'].min = 0
params['m1'].min = 0

# but what you want is an inequality constraint, so
#   1. add a new parameter 'tdiff'
#   2. constrain t1 = t0 + tdiff
#   3. set a minimum value of 0 for tdiff
params.add('tdiff', value=1, min=0)
params['t1'].expr = 't0 + tdiff'

# now perform the fit
result = mymodel.fit(yy, params, t=t)

# print out results
print(result.fit_report())

您可以阅读 lmfit 文档或其他 SO 问题，了解如何从拟合结果中提取其他信息。

【讨论】：

谁能评论为什么这被否决了？我认为它确实解决了这个问题，并想了解为什么有人不这么认为。

【解决方案2】：

在这种情况下curve_fit 有几个缺点，因此需要考虑 MNewille 的解决方案。此外，curve_fit 没有参数args（与例如leastsq 形成对比），这可能允许关闭第二个斜率。没有m2 的第二个拟合函数可能是一个解决方案，在这里。但是，如果curve_fit 是必须的，并且需要在这两种情况下都工作的通用拟合函数，则解决方案可能如下所示（注意从数据中提取的起始参数）：

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as sio

"""
 we know t0 > 0, t1 > t0, b>0, m1 > 0, m2 < 0
"""
def piecewise_linear( t, t0, a , b, m1, m2 ):
    t0 = abs( t0 )
    t1 = abs( a ) * t0
    b = abs( b )
    m1 = abs( m1 )
    m2 = - abs( m2 )
    condlist = [ t < t0,
                 ( t >= t0 ) & ( t < t1 ),
                 t >= t1
               ]
    funclist = [ lambda t: b,
                 lambda t: b + m1 * ( t - t0 ),
                 lambda t: b + m1 * ( t - t0 ) + m2 * ( t - t1 )
               ]
    return np.piecewise( t, condlist, funclist )

t = np.arange( 0, 15, 0.1 )
y_full = piecewise_linear( t, 5, 2, 2, 2, -4 )
y_cut = piecewise_linear( t, 5, 3, 2, 2, -4 )

####################

#~ plt.plot( t, y_full )
#~ plt.plot( t, y_cut )
#~ plt.legend( [ 'surface', 'deep ocean'] )

####################

#~ noise = np.random.normal( 0, 1, len( y_full ) ) * 1
#~ y = y_full
#~ yy = y_full + noise

#~ bounds = ( [ 0, 0, 0, 0, -np.inf ], [ np.inf, np.inf, np.inf, np.inf, 0 ] )
#~ fit,_ = sio.curve_fit( piecewise_linear, t, yy, bounds=bounds )
#~ print( fit )
#~ y_fit = piecewise_linear( t, *tuple( fit ) )
#~ plt.plot( t, yy, color='0.5' )
#~ plt.plot( t, y_fit, linewidth=3 )
#~ plt.plot( t, y, linestyle='--', linewidth=3 )

####################

noise = np.random.normal( 0, 1, len( y_full ) ) * 1
y = y_cut
yy = y_cut + noise
tPos = np.argmax( yy )
t1Start = t[ tPos ]
t0Start = t[ tPos // 2 ]
bStart = yy[ 0 ]
aStart = 2
m1Start = ( yy[ tPos ] - yy[ tPos // 2 ] ) / ( t1Start - t0Start )

p0 = [ t0Start, aStart, bStart, m1Start, 0 ])
fit,_ = sio.curve_fit( piecewise_linear, t, yy, p0=p0 )
print( fit )
y_fit = piecewise_linear( t, *tuple( fit ) )

plt.plot( t, yy, color='0.5' )
plt.plot( t, y_fit, linewidth=3 )
plt.plot( t, y, linestyle='--', linewidth=3 )
plt.legend( [ 'noisy data', 'fit', 'original' ] )
plt.show()

它适用于测试数据。必须记住，返回的拟合参数可能是负数。由于函数采用模数，因此也需要对返回的参数执行此操作。另请注意，t1 不再直接拟合，而是作为t0 的倍数。因此，需要相应地传播错误。新结构不需要bounds。

还要注意，起始参数p0 的选择也应该适用于案例1。

【讨论】：

非常感谢。从数据中提取起点是一个很好的补充。

【解决方案3】：

由于我不知道 Python 中使用的回归算法到底是什么，我无法真正回答您的问题。可能算法和往常一样是一个迭代过程。

作为附加信息，我将展示一个非常简单的方法，它可以在没有迭代过程或初始猜测的情况下给出近似答案。基于积分方程拟合的理论可以在https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales中找到，分段函数的一些使用示例在：https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf

在由三个线性段组成的分段函数的情况下，微积分的方法在上面第二篇论文的第 30 页给出。用任何计算机语言编写代码都非常容易。我想 Python 也可以。

从扫描原图得到的数据：

积分方程回归的方法导致下一个结果：

拟合方程为：

H 是 Heaviside 函数。

上图给出了参数a1,a2,p1,q1,p2,q2,p3,q3的取值。

可以看到第一段并不像预期的那样完全水平。但是斜率很小：0.166

由于算法第二部分的微小变化，可以指定斜率=0（即p1=0）。修改后的算法如下图：

现在，结果是：

【讨论】：