【Question title】: Forecasting stock prices using SVR: A timeseries based problem
【Posted】: 2021-09-08 23:55:19
【Question】:

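I am trying to forecast stock prices (Adj Close) using SVR. I can train the model on the training data, but I get an error with the test data. The training data is stored in the dataframe df and covers 2014 through 2018; the test data is stored in the dataframe test_df and covers 2019 to the present. The code is as follows: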

import pandas as pd
import pandas_datareader.data as web
import datetime
import numpy as np
import matplotlib.pyplot as plt  # needed for the plt.* calls below
from matplotlib import style

# Plot with the ggplot style; the stock data below comes from the Yahoo API
style.use('ggplot')

# get 2014-2018 data to train our model
start = datetime.datetime(2014,1,1)
end = datetime.datetime(2018,12,30)
df = web.DataReader("TSLA", 'yahoo', start, end) 

# get data from 2019 to today to test our model on
start = datetime.datetime(2019,1,1)
end = datetime.date.today()
test_df = web.DataReader("TSLA", 'yahoo', start, end) 


# sort by date ('Date' is already the index, so no reset/set round trip is needed)
df = df.sort_index()
test_df = test_df.sort_index()

df.tail()

# Converting dates

import matplotlib.dates as mdates

# change the dates into numbers for training
dates_df = df.copy()
dates_df = dates_df.reset_index()

# Store the original dates for plotting the predictions
org_dates = dates_df['Date']

# convert to matplotlib date numbers (floats counting days)
dates_df['Date'] = dates_df['Date'].map(mdates.date2num)

dates_df.tail()

# Use sklearn support vector regression to predict our data:
from sklearn.svm import SVR

dates = dates_df['Date'].to_numpy()
prices = df['Adj Close'].to_numpy()

# SVR expects a 2D feature matrix of shape (n_samples, 1) and a 1D target vector
dates = np.reshape(dates, (len(dates), 1))
prices = np.ravel(prices)

svr_rbf = SVR(kernel= 'rbf', C= 1e3, gamma= 0.1)
svr_rbf.fit(dates, prices)


plt.figure(figsize = (12,6))
plt.plot(org_dates, prices, color= 'black', label= 'Data')  # same x-axis as the prediction line
plt.plot(org_dates, svr_rbf.predict(dates), color= 'red', label= 'RBF model') 
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.show()

Up to this point it works fine for the training data. Next, how do I predict on the test data (test_df)?

【Comments】:

    Tags: time-series regression svm forecasting stock


    【Solution 1】:

    Following your conventions, it should look like this:

    # change the test dates into numbers, exactly as for the training data
    test_dates_df = test_df.copy()
    test_dates_df = test_dates_df.reset_index()
    
    # Store the original dates for plotting the predictions
    test_org_dates = test_dates_df['Date']
    
    # convert to matplotlib date numbers (floats counting days)
    test_dates_df['Date'] = test_dates_df['Date'].map(mdates.date2num)
    
    test_dates = test_dates_df['Date'].to_numpy()
    test_prices = test_df['Adj Close'].to_numpy()
    
    # SVR expects a 2D feature matrix of shape (n_samples, 1); prices stay 1D
    test_dates = np.reshape(test_dates, (len(test_dates), 1))
    test_prices = np.ravel(test_prices)
    
    # Predict on unseen test data
    y_hat_test = svr_rbf.predict(test_dates)
    
    # Visualize predictions against real values
    plt.figure(figsize = (12,6))
    plt.plot(test_org_dates, test_prices, color= 'black', label= 'Data')  # same x-axis as the prediction line
    plt.plot(test_org_dates, y_hat_test, color= 'red', label= 'RBF model (test)') 
    plt.xlabel('Date')
    plt.ylabel('Price')
    plt.legend()
    plt.show()
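
The train and test steps above repeat the same date conversion, so one way to keep them in sync is a small helper. The sketch below is only an illustration under stated assumptions: `to_xy` is a hypothetical helper name that does not appear in the original post, and the frames are synthetic random walks so the example runs without downloading anything. It also scores the test predictions with mean absolute error, and checks one property of this particular setup: with `gamma=0.1` on day numbers, the RBF kernel between two points 60 days apart is `exp(-0.1 * 60**2)`, which is effectively zero, so predictions far beyond the training range collapse to the model's intercept.

```python
import numpy as np
import pandas as pd
import matplotlib.dates as mdates
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error


def to_xy(frame, price_col="Adj Close"):
    """Hypothetical helper: turn a date-indexed frame into SVR inputs.

    X is an (n_samples, 1) array of matplotlib day numbers,
    y is a 1D array of prices.
    """
    X = mdates.date2num(frame.index.to_pydatetime()).reshape(-1, 1)
    y = frame[price_col].to_numpy()
    return X, y


# Synthetic stand-ins for df and test_df (assumption: random walks, not real prices)
rng = np.random.default_rng(0)
train_idx = pd.bdate_range("2018-01-01", "2018-12-31")
test_idx = pd.bdate_range("2019-03-01", "2019-06-28")
synthetic_train = pd.DataFrame(
    {"Adj Close": 60 + rng.normal(0, 1, len(train_idx)).cumsum()}, index=train_idx
)
synthetic_test = pd.DataFrame(
    {"Adj Close": 65 + rng.normal(0, 1, len(test_idx)).cumsum()}, index=test_idx
)

# The same conversion is applied to both sets
X_train, y_train = to_xy(synthetic_train)
X_test, y_test = to_xy(synthetic_test)

# Same hyperparameters as in the question
model = SVR(kernel="rbf", C=1e3, gamma=0.1).fit(X_train, y_train)
y_hat_test = model.predict(X_test)

# Score the predictions on the unseen data
mae = mean_absolute_error(y_test, y_hat_test)
print(f"test MAE: {mae:.2f}")

# Every test date is at least 60 days past the last training date, so all
# kernel terms vanish and each prediction equals the intercept.
is_constant = np.allclose(y_hat_test, model.intercept_[0])
print("predictions equal the intercept:", is_constant)
```

If the test plot shows a flat red line, this is the likely reason rather than a bug in the prediction code: the model is being asked to extrapolate in time, which an RBF kernel on raw dates cannot do.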
    

    【Comments】:
