[Question title]: Predicting confidence interval with statsmodels
[Posted]: 2018-12-21 04:40:56
[Question description]:

I am building a linear model like this:

import statsmodels.api as sm
from statsmodels.stats.outliers_influence import summary_table
import numpy as np
import random

x = np.arange(1,101, 1)
y = random.sample(range(1,1000), 100)

X = sm.add_constant(x)
regr = sm.OLS(y, X)
fit = regr.fit()

st, data, ss2 = summary_table(fit, alpha=0.05)

From data I can determine the standard errors and confidence intervals.

Now I want to predict the confidence intervals for new data, which I am attempting like this:

new_data = [102, 103, 104, 105]

fit.get_prediction(new_data)

But this returns:

Traceback (most recent call last):

  File "<ipython-input-168-372d2610946d>", line 14, in <module>
    fit.get_prediction(new)

  File "/Users/spotter/anaconda3/lib/python3.6/site-packages/statsmodels/regression/linear_model.py", line 2138, in get_prediction
    weights=weights, row_labels=row_labels, **kwds)

  File "/Users/user/anaconda3/lib/python3.6/site-packages/statsmodels/regression/_prediction.py", line 163, in get_prediction
    predicted_mean = self.model.predict(self.params, exog, **pred_kwds)

  File "/Users/user/anaconda3/lib/python3.6/site-packages/statsmodels/regression/linear_model.py", line 261, in predict
    return np.dot(exog, params)

ValueError: shapes (1,4) and (2,) not aligned: 4 (dim 1) != 2 (dim 0)

[Question discussion]:

    Tags: python statsmodels


    [Solution 1]:

    Because you fitted the model with an intercept, new_data must include it as well (that is, add a column of 1s).

    new_data = sm.add_constant([102, 103, 104, 105])
    result = fit.get_prediction(new_data)
    result.conf_int()
    

    [Discussion]:
