Putting limits on a fitting in a plot: answers

【Question title】: Putting limits on a fitting in a plot
【Posted】: 2021-02-15 08:55:52
【Question】:

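I have a plot of my data and applied a linear fit to it, but for reasons I can't work out, the fit line sits far away from the data itself. How can I put limits on this line so that it fits my data properly (and ideally make the data the focus of the plot as well)?

The plot output and code are below: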

plt.plot(np.log(ts), amps, "1", ms=5, label="Schmitt Analysis (Amplitude against Log(Time))")

##Plot Linear Fit
y1, r, *_ = np.polyfit(amps, np.log(ts), 1, full=True)
f1 = np.poly1d(y1)
plt.plot(amps, f1(amps), label=f"linear ($\chi^2$ = {r[0]:0.2f})")

plt.xlabel("Log(Time)")
plt.ylabel("Amplitude")
plt.title("Schmitt Analysis (Amplitude against Log(Time))")
plt.xlim(0,10)
plt.ylim(-40,80)
plt.legend()

plt.savefig('A_Schmitt.jpg')

The actual data used:

log(ts) = [-inf 2.89037176 3.58351894 3.98898405 4.49980967 4.68213123 4.83628191 4.9698133 5.08759634 5.19295685 5.28826703 5.37527841 5.45532112 5.52942909 5.59842196 5.7235851 5.78074352 5.83481074 5.9348942 6.02586597 6.06842559 6.10924758 6.1484683 6.22257627 6.25766759 6.32435896 6.38687932 6.41673228 6.44571982 6.50128967 6.52795792 6.5539334 6.71901315 6.78219206]

amps = [77.78630383833547, 62.92926582239441, 63.84025706577048, 55.489066870438165, 38.60797989548756, 40.771390484048545, 14.679073842876978, 33.95959972488966, 29.41960790300141, 32.93241034391399, 30.927428194781815, 31.086396885182356, 21.52771899125612, 4.27684299160886, 6.432975528727562, 7.500376934048583, 18.730555740591637, 4.355896959987761, 11.677509915219987, 12.865482314301719, 0.6120306267606219, 12.614420497451556, 2.2025029753442404, 9.447046999592711, 4.0688197216393425, 0.546672901996845, 1.12780050608251, 2.2030852358874635, 2.202804718915858, 0.5726686031033587, 0.5465322281618783, 0.5185100682386156, 0.575055917739342, 0.5681697592593679]

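I noticed I had made a mistake and managed to get the plot to update a bit, but now the fit fails completely.

I have also updated the code above to the new version.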

【Comments】:

If you could post some sample data, that would help.
All done. Also added updates to the code/plot.

Tags: python numpy matplotlib curve-fitting

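【Solution 1】:

There are two problems here: a typo and the presence of -inf.
First, the typo: in your fit you passed logts and amps as y and x respectively, when it should be the other way round.
Second, the fit routine does not handle the -inf in the log-transformed time array well. We can exclude the first point manually with logts[1:].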
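To illustrate the first point, here is a minimal sketch of how the argument order of np.polyfit changes the result. The data is made up (an exact line y = 3x + 1, not the asker's measurements), and the variable names are only for illustration.

```python
import numpy as np

# Made-up, perfectly linear data: y = 3*x + 1 (not the data from the question)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * x + 1.0

# np.polyfit(x, y, deg): independent variable first, dependent variable second
right = np.polyfit(x, y, 1)

# Swapping the arguments fits x as a function of y instead,
# so the coefficients describe a different line
swapped = np.polyfit(y, x, 1)

print("x then y:", right)
print("y then x:", swapped)
```

The question's call, np.polyfit(amps, np.log(ts), 1), corresponds to the swapped case: it models log-time as a function of amplitude, so plotting the result on axes of amplitude against log-time puts the line in the wrong place.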

import numpy as np
from matplotlib import pyplot as plt

#recreating your input - seemingly log(ts) is a numpy array
logts = np.asarray([-np.inf, 2.89037176, 3.58351894, 3.98898405, 4.49980967, 4.68213123, 4.83628191, 4.9698133, 5.08759634, 5.19295685, 5.28826703, 5.37527841, 5.45532112, 5.52942909, 5.59842196, 5.7235851, 5.78074352, 5.83481074, 5.9348942, 6.02586597, 6.06842559, 6.10924758, 6.1484683, 6.22257627, 6.25766759, 6.32435896, 6.38687932, 6.41673228, 6.44571982, 6.50128967, 6.52795792, 6.5539334, 6.71901315, 6.78219206])
amps = [77.78630383833547, 62.92926582239441, 63.84025706577048, 55.489066870438165, 38.60797989548756, 40.771390484048545, 14.679073842876978, 33.95959972488966, 29.41960790300141, 32.93241034391399, 30.927428194781815, 31.086396885182356, 21.52771899125612, 4.27684299160886, 6.432975528727562, 7.500376934048583, 18.730555740591637, 4.355896959987761, 11.677509915219987, 12.865482314301719, 0.6120306267606219, 12.614420497451556, 2.2025029753442404, 9.447046999592711, 4.0688197216393425, 0.546672901996845, 1.12780050608251, 2.2030852358874635, 2.202804718915858, 0.5726686031033587, 0.5465322281618783, 0.5185100682386156, 0.575055917739342, 0.5681697592593679]

#plot raw data
plt.plot(logts, amps, "1", ms=5, label="Schmitt Analysis (Amplitude against Log(Time))")

#linear fit excluding the first point that is an outlier
y1, r, *_ = np.polyfit(logts[1:], amps[1:], 1, full=True)
f1 = np.poly1d(y1)

#get min and max of logts excluding nan and inf values
logtsmin = np.floor(np.nanmin(logts[logts != -np.inf]))
logtsmax = np.ceil(np.nanmax(logts[logts != np.inf]))
#evenly spaced x-values for the fit line plot 
xlogts = np.linspace(logtsmin, logtsmax, 1000)
#raw f-string so that \chi is not treated as an escape sequence
plt.plot(xlogts, f1(xlogts), label=rf"linear ($\chi^2$ = {r[0]:0.2f})")

plt.xlabel("Log(Time)")
plt.ylabel("Amplitude")
plt.title("Schmitt Analysis (Amplitude against Log(Time))")
plt.xlim(logtsmin, logtsmax)
plt.legend()

plt.show()

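Sample output:

【Comments】:

Yes, that's it, it worked. Thanks! Now on to why my data is garbage... Cheers
Out of interest, is there an easy way to remove a series from the legend so that only the best-fit line label is shown, i.e. "linear ..." and not "Schmitt ..."?
The legend shows whatever the label argument defines. In your case, remove the label="Schmitt Analysis (Amplitude against Log(Time))" part.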
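As a rough sketch of that last comment: a series plotted without a label is left out of the legend. The data here is made up, and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, only to have something to draw
x = np.linspace(1.0, 5.0, 20)
y = 10.0 - 2.0 * x

fig, ax = plt.subplots()
ax.plot(x, y, "1", ms=5)                  # no label, so no legend entry
ax.plot(x, 10.0 - 2.0 * x, label="linear fit")  # labelled, so it appears
ax.legend()

# Inspect which entries the legend will actually contain
handles, labels = ax.get_legend_handles_labels()
print(labels)
```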

【Solution 2】:

Use xlim and ylim

    plt.plot(np.log(ts), amps, "1", ms=5, label="Schmitt Analysis (Log(Amplitude) against Time)")

    y1, r, *_ = np.polyfit(amps, ts, 1, full=True)
    f1 = np.poly1d(y1)
    plt.plot(amps, f1(amps), label=f"linear ($\chi^2$ = {r[0]:0.2f})")

    plt.xlabel("Log(Time)")
    plt.ylabel("Amplitude")
    plt.title("Schmitt Analysis (Amplitude against Log(Time))")
    plt.xlim(0, 10)
    plt.ylim(0, 10)
    plt.legend()

    plt.savefig('A_Schmitt.jpg')

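【Comments】:

Hi, thanks for the reply. I followed the code, but it just produced a plot with the visible limits set to that range. The line of best fit is still very far from the data, and now it runs off the edge of the plot. Ideally I'm looking for a way to get the line of best fit plotted correctly.
Then this sounds like a problem with your model: if your data points are clustered, the line of best fit will always start where the data starts and end where the data ends. Assuming it starts from 0, you can move xlim and ylim to start closer to your data instead of at 0 as I did.
Giving us the data you are working with would help us understand what you need, because either your model is not doing what you intended or the data lies far from the limits I gave. Remember that the 0, 10 I used is only an example; you can use any values you like.
Updated the plot/code and added sample data as requested. The line of best fit now returns NaN.
Is the NaN caused by the infinity at the start of log(ts) (since the first value of ts is 0)? How would I remove that point from the dataset? If it matters, I'm not entirely sure how to tell whether log(ts) is a list, an array, or an ndarray.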
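One possible way to approach the questions in that last comment, as a minimal sketch: check the type with isinstance, convert with np.asarray, and drop non-finite points with a boolean mask instead of by position. The short ts and amps lists are illustrative stand-ins, not the full dataset, and keep is a name made up for this example.

```python
import numpy as np

# Illustrative stand-in values; the first time is 0, so its log is -inf
ts = [0.0, 18.0, 36.0, 54.0]
amps = [77.8, 62.9, 63.8, 55.5]

# Silence the divide-by-zero warning that log(0) would otherwise emit
with np.errstate(divide="ignore"):
    logts = np.log(ts)

# np.log returns an ndarray even when it is given a plain list
print(type(logts), isinstance(logts, np.ndarray))

# Convert both to float arrays so boolean masks can be applied
logts = np.asarray(logts, dtype=float)
amps = np.asarray(amps, dtype=float)

# Keep only the points where both coordinates are finite (drops inf and nan)
keep = np.isfinite(logts) & np.isfinite(amps)
slope, intercept = np.polyfit(logts[keep], amps[keep], 1)
print(slope, intercept)
```

Compared with slicing off the first element, the mask does not depend on where the non-finite value sits in the array.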