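Error in Threading SARIMAX model: Answers

【Question title】: Error in Threading SARIMAX model
【Posted】: 2018-02-19 03:42:27
【Question】:

I am using the threading library for the first time to speed up the training of my SARIMAX models, but the code keeps failing with the following error: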

Bad direction in the line search; refresh the lbfgs memory and restart the iteration.
This problem is unconstrained.
This problem is unconstrained.
This problem is unconstrained.

Here is my code:

import numpy as np
import pandas as pd
from statsmodels.tsa.arima_model import ARIMA
import statsmodels.tsa.api as smt
from threading import Thread

def process_id(ndata):
    # Hold out the last 7 observations as the test set.
    train = ndata[0:-7]
    test = ndata[len(train):]
    try:
        model = smt.SARIMAX(train.asfreq(freq='1d'), exog=None, order=(0, 1, 1), seasonal_order=(0, 1, 1, 7)).fit()
        pred = model.get_forecast(len(test))
        fcst = pred.predicted_mean
        fcst.index = test.index
        # Mean absolute percentage error over the hold-out window.
        mapelist = []
        for i in range(len(fcst)):
            mapelist.append(np.absolute(test.iloc[i] - fcst.iloc[i]) / test.iloc[i])
        mape = np.mean(mapelist) * 100
        print(mape)
    except Exception:
        # Any failure is reported as a MAPE of 0.
        mape = 0
    return mape

def process_range(ndata, store=None):
    if store is None:
        store = {}
    for id in ndata:
        store[id] = process_id(ndata[id])
    return store


def threaded_process_range(nthreads, ndata):
    store = {}
    threads = []
    # create the threads, giving each one a contiguous slice of the columns
    k = 0
    tk = ndata.columns
    step = len(tk) // nthreads  # integer division so the slice bounds are ints
    for i in range(nthreads):
        dk = tk[k:k + step]
        k = k + step
        t = Thread(target=process_range, args=(ndata[dk], store))
        threads.append(t)
    # start all threads, then wait for all of them to finish
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store

outdata = threaded_process_range(4, ndata)
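
A side note on the column split in threaded_process_range: the slice bounds have to be integers (under Python 3, `/` yields a float), and plain floor division leaves out the trailing columns whenever the column count is not a multiple of nthreads. Below is a minimal sketch of a split that covers every column. The helper name split_columns is hypothetical and not part of the original code, and this is unrelated to the lbfgs error itself.

```python
def split_columns(columns, nthreads):
    """Split `columns` into `nthreads` contiguous chunks that cover every item."""
    columns = list(columns)
    base, extra = divmod(len(columns), nthreads)
    chunks = []
    start = 0
    for i in range(nthreads):
        # The first `extra` chunks take one additional column each.
        size = base + (1 if i < extra else 0)
        chunks.append(columns[start:start + size])
        start += size
    return chunks


# Ten illustrative column names split across four workers.
chunks = split_columns(["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"], 4)
print(chunks)
```

Each chunk could then be passed to process_range as ndata[chunk], in place of the manual k bookkeeping.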

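A few points I would like to mention:

The data is a daily stock time series in a DataFrame
Threading works for the ARIMA model
The SARIMAX model works when run in a for loop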
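
Since the model fits in a plain loop but not under threads, one way to narrow the problem down is to run the same worker both ways and compare the results. Below is a minimal sketch using the standard library's concurrent.futures. The function fit_one is a hypothetical stand-in for process_id (it only averages numbers), and run_threaded and run_sequential are illustrative names, not part of the original code.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_one(series_id, values):
    # Hypothetical stand-in for process_id: returns the id and the mean of the values.
    return series_id, sum(values) / len(values)


def run_sequential(data):
    # Baseline: process every series one after another in the calling thread.
    return dict(fit_one(k, v) for k, v in data.items())


def run_threaded(data, nthreads=4):
    # Each worker returns its own result instead of writing into a shared dict.
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        futures = [pool.submit(fit_one, k, v) for k, v in data.items()]
        # result() re-raises any exception from the worker, so failures stay visible.
        return dict(f.result() for f in futures)


data = {"a": [1, 2, 3], "b": [4, 6]}
print(run_sequential(data))
print(run_threaded(data))
```

If the two runs disagree, or only the threaded one fails, the problem is more likely in how the fitting routine behaves under concurrency than in the model specification.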

Any insights would be greatly appreciated!

【Comments on the question】:

Tags: python parallel-processing arima

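【Solution 1】:

I ran into the same error when using lbfgs. I am not sure why lbfgs fails during gradient evaluation, but changing the optimizer is worth a try. Pick one of these optimizers:

'newton' for Newton-Raphson, 'nm' for Nelder-Mead

'bfgs' for Broyden-Fletcher-Goldfarb-Shanno (BFGS)

'lbfgs' for limited-memory BFGS with optional box constraints

'powell' for modified Powell's method

'cg' for conjugate gradient

'ncg' for Newton-conjugate gradient

'basinhopping' for global basin-hopping solver

Change it in your code: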

model = smt.SARIMAX(train.asfreq(freq='1d'), exog=None, order=(0, 1, 1), seasonal_order=(0, 1, 1, 7)).fit(method='cg')
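
When fitting many series, the optimizer switch can be automated by trying several methods in turn. Below is a minimal sketch. The helper fit_with_fallback and the toy fake_fit are hypothetical names introduced for illustration. It only helps when a failed fit surfaces as a Python exception; the line-search message from the question is printed by the optimizer and may not raise one.

```python
def fit_with_fallback(fit, methods=("lbfgs", "cg", "nm")):
    """Call fit(method) for each method in turn; return (method, result) of the first success."""
    errors = {}
    for method in methods:
        try:
            return method, fit(method)
        except Exception as exc:
            # Record the failure and move on to the next optimizer.
            errors[method] = exc
    raise RuntimeError(f"all optimizers failed: {errors}")


# Intended use with the model from the question (not executed here):
#   method, res = fit_with_fallback(
#       lambda m: smt.SARIMAX(train.asfreq(freq='1d'), order=(0, 1, 1),
#                             seasonal_order=(0, 1, 1, 7)).fit(method=m, disp=False))

# Toy demonstration with a fake fit function that fails for "lbfgs".
def fake_fit(method):
    if method == "lbfgs":
        raise ValueError("line search failed")
    return f"fitted with {method}"


print(fit_with_fallback(fake_fit))
```

Returning the method name along with the result makes it easy to log which series needed a different optimizer.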

This is an old question, but I am answering it anyway in case someone runs into the same problem in the future.

【Comments】: