【问题标题】:bad prediction when having noise on the data: LSTM time-series regression数据有噪声时的错误预测:LSTM 时间序列回归
【发布时间】:2022-11-22 00:50:19
【问题描述】:

我想使用用于时间序列预测的 LSTM 模型使用智能鞋垫来预测测力板。测力板上的数据有正值和负值(我认为得到的正值是噪音)。如果我忽略正值,那么数据测试的预测结果就会很糟糕。但如果我将正值更改为 0,那么预测结果会很好。如果我想保持正值而不改变它但预测结果很好怎么办?

测力板形状

2050,1

智能鞋垫形状

2050,89

以下是我的代码:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import math
from tensorflow.keras.layers import Dense,RepeatVector, LSTM, Dropout
from tensorflow.keras.layers import Flatten, Conv1D, MaxPooling1D
from tensorflow.keras.layers import Bidirectional, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import plot_model
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from keras.callbacks import ModelCheckpoint, EarlyStopping
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import MinMaxScaler
%matplotlib inline

## Load Data
Insole = pd.read_csv('1113_Rwalk40s1_list.txt', header=None, low_memory=False)
SIData =  np.asarray(Insole)

df = pd.read_csv('1113_Rwalk40s1.csv', low_memory=False)
columns = ['Fx']
selected_df = df[columns]
FCDatas = selected_df[:2050]
## End Load Data

## Concatenate Data
SmartInsole = np.array(SIData[:2050])
FCData = np.array(FCDatas)
# FCData = np.where(FCData>0, 0, FCData) #making positive value to 0
Dataset = np.concatenate((SmartInsole, FCData), axis=1)
## End Concatenate Data


## Normalization Data
scaler_in = MinMaxScaler(feature_range=(0, 1))
scaler_out = MinMaxScaler(feature_range=(0, 1))
data_scaled_in = scaler_in.fit_transform(Dataset[:,0:89])
data_scaled_out = scaler_out.fit_transform(Dataset[:,89:90])
## End Normalization Data

steps= 50
inp = []
out = []
for i in range(len(data_scaled_out) - (steps)):
    inp.append(data_scaled_in[i:i+steps])
    out.append(data_scaled_out[i+steps])

inp= np.asanyarray(inp)
out= np.asanyarray(out)

x_train, x_test, y_train, y_test = train_test_split(inp, out, test_size=0.25,random_state=2)

## Model Building
model = Sequential()
model.add(LSTM(64, activation='relu',  return_sequences= False, input_shape= (50,89)))
model.add(Dense(32,activation='relu'))
model.add(Dense(16,activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss = 'mse', optimizer=Adam(learning_rate=0.002), metrics=['mse'])
model.summary()
## End Model Building

## Model fit
history = model.fit(x_train,y_train, epochs=50, verbose=2, batch_size=64, validation_data=(x_test, y_test))
## End Model fit

## Model Loss Plot
import matplotlib.pyplot as plt

plt.figure(figsize=(10,6))
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Test Loss')
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epochs')
plt.legend(loc='upper right')
plt.show()
## End Model Loss Plot

## Prediction and Model Evaluation
model.evaluate(inp, out)
predictions=model.predict(inp)

print('MSE: ',mean_squared_error(out, predictions))
print('RMSE: ',math.sqrt(mean_squared_error(out, predictions)))
print('Coefficient of determination (r2 Score): ', r2_score(out, predictions))

#invert normalize
predictions = scaler_out.inverse_transform(predictions) 
out = scaler_out.inverse_transform(out) 

x=[]
colors=['red','green','brown','teal','gray','black','maroon','orange','purple']
colors2=['green','red','orange','black','maroon','teal','blue','gray','brown']
x = np.arange(0,2000)*40/2000 
for i in range(0,1):
    plt.figure(figsize=(15,6))
    plt.plot(x,out[0:2000,i],color=colors[i])
    plt.plot(x,predictions[0:2000,i],markerfacecolor='none',color=colors2[i])
    plt.title('LSTM Regression (Training Data)')
    plt.ylabel('Force/Fx (N)')
    plt.xlabel('Time(s)')
    plt.legend(['Real value', 'Predicted Value'], loc='lower left')
    plt.savefig('Regression Result.png'[i])
    plt.show()

## End Prediction and Model Evaluation

## Model Validation
Test_Insole = pd.read_csv('1113_Rwalk40s2_list.txt', header=None, low_memory=False)
TestSIData =  np.asarray(Test_Insole)

Test_df = pd.read_csv('1113_Rwalk40s2.csv', low_memory=False)
Test_columns = ['Fx']
Test_selected_df = Test_df[Test_columns]
Test_FCDatas = Test_selected_df[:2050]

test_SmartInsole = np.array(TestSIData[:2050]) 
test_FCData = np.array(Test_FCDatas)
# test_FCData = np.where(test_FCData>0, 0, test_FCData) #making positive value to 0
test_Dataset = np.concatenate((test_SmartInsole, test_FCData), axis=1)

test_scaler_in = MinMaxScaler(feature_range=(0, 1))
test_scaler_out = MinMaxScaler(feature_range=(0, 1))
test_data_scaled_in = test_scaler_in.fit_transform(test_Dataset[:,0:89])
test_data_scaled_out = test_scaler_out.fit_transform(test_Dataset[:,89:90])

test_steps= 50
test_inp = []
test_out = []
for i in range(len(test_data_scaled_out) - (test_steps)):
    test_inp.append(test_data_scaled_in[i:i+test_steps])
    test_out.append(test_data_scaled_out[i+test_steps])

test_inp= np.asanyarray(test_inp)
test_out= np.asanyarray(test_out)

model.evaluate(test_inp, test_out)
test_predictions=model.predict(test_inp)

test_predictions = test_scaler_out.inverse_transform(test_predictions) 
test_out = test_scaler_out.inverse_transform(test_out) 

x=[]
colors=['red','green','brown','teal','gray','black','maroon','orange','purple']
colors2=['green','red','orange','black','maroon','teal','blue','gray','brown']
x = np.arange(0,2000)*40/2000 
for i in range(0,1):
    plt.figure(figsize=(15,6))
    plt.plot(x,test_out[0:2000,i],color=colors[i])
    plt.plot(x,test_predictions[0:2000,i],markerfacecolor='none',color=colors2[i])
    plt.title('LSTM Regression (Testing Data)')
    plt.ylabel('Force/Fx (N)')
    plt.xlabel('Time(s)')
    plt.legend(['Real value', 'Predicted Value'], loc='lower left')
    plt.savefig('Regression Result.png'[i])
    plt.show()

## End Model validation

不改变正值的结果

如果我将正值更改为 0 的结果

【问题讨论】:

    标签: python regression lstm data-analysis


    【解决方案1】:

    为什么你不能尝试使用 Normalizer 或 RobustScaler 或 StandardScaler 而不是 MinMax Scaler

    【讨论】:

    • 我试过了,还是不行
    猜你喜欢
    • 2018-09-22
    • 1970-01-01
    • 2018-11-16
    • 1970-01-01
    • 1970-01-01
    • 2021-09-08
    • 1970-01-01
    • 2018-10-07
    • 2018-12-28
    相关资源
    最近更新 更多