【Question title】: Plotly: How to display regression errors with lines between the observations and the regression line?
【Posted】: 2021-04-12 04:35:51
【Question】:

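I generated the following Plotly chart in Python. I fitted a regression to a limited set of points and got the following chart:

I would like to draw a vertical line between each point and the fitted curve, as in the example below:

I am using plotly.graph_objects and pandas to generate these charts, but I don't know how to draw these lines. Here is the code I'm using: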

import pandas as pd
import plotly.graph_objects as go

for point, curve in zip(points, curves):

    point_plot = go.Scatter(x=df['Duration'],
                            y=df[point],
                            name=point,
                            # text=df['Nome'],
                            mode='markers+text',
                            line_color=COLOR_CODE[point],
                            textposition='top center')

    line_plot = go.Scatter(x=df['Duration'],
                            y=df[curve],
                            name='', 
                            line_color=COLOR_CODE[point],
                            mode='lines')
    

    # XXX: this doesn't solve the problem, but it's what I could think of for now
    to_bar = df[points].diff(axis=1).copy()
    to_bar['Nome'] = df['Nome']
    bar_plot = go.Bar(x=to_bar['Nome'], y=to_bar[point], name='', marker_color=COLOR_CODE[point])

                            
    fig.add_trace(line_plot, row=1, col=1)
    fig.add_trace(point_plot, row=1, col=1)
    fig.add_trace(bar_plot, row=2, col=1)

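【Comments】:

  • For each point, take its x and compute its y using the function that produced the regression line. Once you have the original x, y and the newly computed new_x, new_y, you can draw a line (x, y, new_x, new_y).
  • @Occhima How did my suggestion work out for you?
  • Thanks @vestland, it worked great!
  • @Occhima Glad to hear it! And thanks for accepting my answer.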
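
The suggestion in the first comment can be sketched roughly as follows. This is only an illustration: the sample data is hypothetical (the question does not include any), a straight-line fit via np.polyfit is assumed as the regression function, and the name segments is introduced here for convenience.

```python
import numpy as np

# Hypothetical sample data; the question did not include any.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Assumed model: a straight line y = slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)

# Evaluate the fitted line at the observed x-values.
new_x = x
new_y = slope * x + intercept

# One (x0, y0, x1, y1) tuple per observation: a vertical segment from
# the observed point to the regression line.
segments = list(zip(x, y, new_x, new_y))

for x0, y0, x1, y1 in segments:
    print(f"x={x0:.1f}: observed {y0:.2f}, fitted {y1:.2f}, error {y0 - y1:+.2f}")
```

Each tuple holds the two endpoints of one error line, so it can be passed to whatever drawing mechanism the figure uses.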

Tags: python plotly regression plotly-python


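【Answer 1】:

You haven't provided a working code snippet with a data sample, so I'll base my suggestion on my earlier answer, Plotly: How to plot a regression line using plotly?. If your figure is limited to two series, as in your example, you can:

1. retrieve the x-values from one of the series using xVals = fig.data[0]['x'], and

2. organize all points for the regression line and the observation markers in a dict, errors = {}, and

3. populate that dict using: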

for d in fig.data:
    errors[d['mode']]=d['y']

4. Then you can add a line shape for each distance between the line and a marker (your errors) using:

for i, x in enumerate(xVals):
    shapes.append(go.layout.Shape(type="line", [...])

Result:

Complete code:

import plotly.graph_objects as go
import statsmodels.api as sm
import pandas as pd
import numpy as np

# data
np.random.seed(123)
numdays=20

X = (np.random.randint(low=-20, high=20, size=numdays).cumsum()+100).tolist()
Y = (np.random.randint(low=-20, high=20, size=numdays).cumsum()+100).tolist()

df = pd.DataFrame({'X': X, 'Y':Y})

# regression
df['bestfit'] = sm.OLS(df['Y'],sm.add_constant(df['X'])).fit().fittedvalues

# plotly figure setup
fig=go.Figure()
fig.add_trace(go.Scatter(name='X vs Y', x=df['X'], y=df['Y'].values, mode='markers'))
fig.add_trace(go.Scatter(name='line of best fit', x=X, y=df['bestfit'], mode='lines'))


# plotly figure layout
fig.update_layout(xaxis_title = 'X', yaxis_title = 'Y')

# retrieve x-values from one of the series
xVals = fig.data[0]['x']

errors = {} # container for prediction errors

# organize data for errors in a dict
for d in fig.data:
    errors[d['mode']]=d['y']

shapes = [] # container for shapes

# make a line shape for each error == distance between each marker and line points
for i, x in enumerate(xVals):
    shapes.append(go.layout.Shape(type="line",
                                    x0=x,
                                    y0=errors['markers'][i],
                                    x1=x,
                                    y1=errors['lines'][i],
                                    line=dict(
                                        #color=np.random.choice(colors,1)[0],
                                        color = 'black',
                                        width=1),
                                    opacity=0.5,
                                    layer="above")
                 )

# include shapes in layout
fig.update_layout(shapes=shapes)
fig.show()
