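【Posted】: 2020-11-30 15:58:11
【Question】:
I am building a hidden Markov model and an LSTM neural network to make predictions on the same dataset, so that I can compare the performance of the two models. My HMM works well, but when I try to train my LSTM on the same dataset, I cannot get the network to learn anything. For reference, here is a generalized diagram of what I am trying to accomplish:
To implement the LSTM neural network, I followed this article, which uses a small Keras model to make predictions on a dataset with multiple inputs, like my problem. However, after implementing a model very similar to the one in the tutorial (code below), my accuracy never exceeds 40%. In fact, the accuracy is exactly the same from epoch 1 through whatever epoch I choose to stop training at. For some reason my loss is also extremely low regardless, which makes me think the accuracy should be higher. Because the loss and the accuracy disagree, I suspect that either my model represents my data completely wrong or the model's parameters are completely wrong.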
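One possible reading of the "tiny loss, flat accuracy" symptom can be sketched in plain Python. The sketch assumes that categorical cross-entropy on non-logit outputs first rescales each prediction row so it sums to 1 and then clips it, which is how Keras's implementation is commonly described. The function name and the `EPSILON` constant here are illustrative, not Keras's own code. Under that assumption, a network with a single output unit always produces the same near-zero loss, so there is nothing for the optimizer to improve.

```python
import math

EPSILON = 1e-7  # assumed clipping constant, chosen for illustration


def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for one sample, with predictions rescaled to sum to 1."""
    total = sum(y_pred)
    # Rescale so the predicted "probabilities" sum to 1 across the class axis.
    probs = [p / total for p in y_pred]
    # Clip away exact 0 and 1 so the logarithm stays finite.
    probs = [min(max(p, EPSILON), 1.0 - EPSILON) for p in probs]
    return -sum(t * math.log(p) for t, p in zip(y_true, probs))


# With a single output unit, every nonzero prediction rescales to 1.0,
# so the loss is the same tiny value no matter what the network outputs.
single_unit_losses = [
    categorical_crossentropy([1.0], [raw_output])
    for raw_output in (0.05, 0.4, 0.9, 3.0)
]
print(single_unit_losses)

# With one unit per class, the loss does depend on the prediction.
good = categorical_crossentropy([0.0, 0.0, 1.0], [0.1, 0.1, 0.8])
bad = categorical_crossentropy([0.0, 0.0, 1.0], [0.8, 0.1, 0.1])
print(good, bad)
```

If this is what is happening in the model below, the loss carries no gradient signal, the weights barely move, and the reported accuracy would stay wherever the initial predictions put it.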
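My dataset is very basic, so I feel like I am missing something big. I previously built a CNN fairly easily, and I assumed that making an LSTM would be easy as long as I followed a tutorial. If I want to create a very basic LSTM to make very basic predictions, what kind of model should I create? What loss function should I use for categorical classification with an LSTM? The last specific question I can think of: what typically causes accuracy to never improve and always stay the same?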
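On the loss-function question, categorical cross-entropy is normally paired with one-hot targets and a final layer that has one unit per class with a softmax activation, rather than a min-max-scaled target and a single output unit. The sketch below shows only the target-encoding step in plain Python. The `one_hot` helper and the label values are hypothetical, since the actual contents of the Result column are not shown in the post.

```python
def one_hot(labels):
    """Map each label to a one-hot row; classes are sorted for a stable order."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    rows = []
    for label in labels:
        row = [0.0] * len(classes)
        row[index[label]] = 1.0
        rows.append(row)
    return classes, rows


# Hypothetical Result values: 0 = loss, 1 = draw, 2 = win.
results = [2, 0, 1, 2, 2, 0]
classes, y_one_hot = one_hot(results)
print(classes)        # [0, 1, 2]
print(y_one_hot[0])   # [0.0, 0.0, 1.0]
```

In Keras, `keras.utils.to_categorical` performs a similar encoding for integer labels, and `sparse_categorical_crossentropy` accepts integer labels directly. Either way the output layer would need as many units as there are classes.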
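Here is what I have so far for my LSTM implementation:
# Imports assumed from the names used below.
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
# Number of games to go back for next prediction.
TIME_STEPS = 1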
# Gets the game data from the generated CSV file.
# Column 1 - Game Number
# Column 2 - Result
# Column 3 - My Rating
# Column 4 - Opponent's Rating
dataFile = 'ChessData.csv'
data = pd.read_csv(dataFile, index_col='Game Number')
df = data.copy()
# Splits the CSV file into training and validation data.
train_size = int(len(df) * 0.8)
train_dataset, test_dataset = df.iloc[:train_size], df.iloc[train_size:]
# Splits the data based on target/dependent variables.
# Also creates the X and y for supervised learning.
X_train = train_dataset.drop('Result', axis=1)
y_train = train_dataset.loc[:, ['Result']]
# Splits the test data into X and y as well.
X_test = test_dataset.drop('Result', axis=1)
y_test = test_dataset.loc[:, ['Result']]
# Different scaler for input and output
scaler_x = MinMaxScaler(feature_range = (0,1))
scaler_y = MinMaxScaler(feature_range = (0,1))
# Fit the scaler using available training data
input_scaler = scaler_x.fit(X_train)
output_scaler = scaler_y.fit(y_train)
# Apply the scaler to training data
y_train = output_scaler.transform(y_train)
X_train = input_scaler.transform(X_train)
# Apply the scaler to test data
y_test = output_scaler.transform(y_test)
X_test = input_scaler.transform(X_test)
# Create a 3D input
def create_dataset(X, y, time_steps=1):
    Xs, ys = [], []
    for i in range(len(X) - time_steps):
        v = X[i:i + time_steps, :]
        Xs.append(v)
        ys.append(y[i + time_steps])
    return np.array(Xs), np.array(ys)
# Creates the 3D input by calling create_dataset for both
# the training data and the testing data.
X_test, y_test = create_dataset(X_test, y_test, TIME_STEPS)
X_train, y_train = create_dataset(X_train, y_train, TIME_STEPS)
# Defines the LSTM Model
def create_model(units, m):
    model = Sequential()
    model.add(m(units=units, return_sequences=True,
                input_shape=[X_train.shape[1], X_train.shape[2]]))
    model.add(Dropout(0.2))
    model.add(m(units=units))
    model.add(Dropout(0.2))
    model.add(Dense(units=1))
    # Compile model
    model.compile(optimizer=keras.optimizers.Adam(0.001),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
# Creates an LSTM model instance
model_lstm = create_model(128, LSTM)
# Fits the LSTM Model
def fit_model(model):
    early_stop = keras.callbacks.EarlyStopping(monitor='val_loss',
                                               patience=10)
    history = model.fit(X_train, y_train, epochs=100,
                        validation_split=0.2, batch_size=32,
                        shuffle=False, callbacks=[early_stop])
    return history
history_lstm = fit_model(model_lstm)
# Make prediction
def prediction(model):
    prediction = model.predict(X_test)
    prediction = scaler_y.inverse_transform(prediction)
    return prediction
prediction_lstm = prediction(model_lstm)
print(prediction_lstm)
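To check what the windowing step above feeds the network, the following is a list-based analogue of `create_dataset` that runs without NumPy. The `create_windows` name and the sample rows are hypothetical, used only to make the alignment visible. It suggests that with `TIME_STEPS = 1` each sample holds the ratings of a single earlier game and is paired with the result of the following game.

```python
def create_windows(X, y, time_steps=1):
    """List-based analogue of create_dataset: pair each window with the next target."""
    Xs, ys = [], []
    for i in range(len(X) - time_steps):
        Xs.append(X[i:i + time_steps])
        ys.append(y[i + time_steps])
    return Xs, ys


# Hypothetical scaled rows: [my rating, opponent's rating] for four games.
X = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
# Placeholder targets named after the game they belong to.
y = ["g0", "g1", "g2", "g3"]

Xs, ys = create_windows(X, y, time_steps=1)
print(Xs[0], ys[0])
print(len(Xs))
```

Here the first window contains only game 0's ratings while its target is game 1's result, so the ratings of the game being predicted are never part of the input. Whether that alignment is intended is not stated in the post.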
【Discussion】:
Tags: python tensorflow machine-learning keras lstm