【发布时间】:2020-08-06 14:42:35
【问题描述】:
拥有用户每月活动的数据集,按国家和浏览器细分。每行是 1 天的用户活动总和以及该日常活动的分数。例如:每天的会话数是一项功能。分数是根据每日特征计算的浮点数。
我的目标是尝试仅使用 2 天的用户数据来预测月底的“平均用户”得分。
我有 25 个月的数据,有些已满,有些只有总天数的一部分,为了获得固定的批量大小,我像这样填充了序列:
from keras.preprocessing.sequence import pad_sequences
padded_sequences = pad_sequences(sequences, maxlen=None, dtype='float64', padding='pre', truncating='post', value=-10.)
所以序列少于最大值,其中填充了 -10 行。
我决定创建一个 LSTM 模型来消化数据,因此在每批结束时,该模型应该预测平均用户得分。然后稍后我将尝试仅使用 2 天的样本进行预测。
我的模型看起来像这样:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout,Dense,Masking
from tensorflow.keras import metrics
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.optimizers import Adam
import datetime, os
model = Sequential()
opt = Adam(learning_rate=0.0001, clipnorm=1)
num_samples = train_x.shape[1]
num_features = train_x.shape[2]
model.add(Masking(mask_value=-10., input_shape=(num_samples, num_features)))
model.add(LSTM(64, return_sequences=True, activation='relu'))
model.add(Dropout(0.3))
#this is the last LSTM layer, use return_sequences=False
model.add(LSTM(64, return_sequences=False, stateful=False, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam' ,metrics=['acc',metrics.mean_squared_error])
logdir = os.path.join(logs_base_dir, datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = TensorBoard(log_dir=logdir, update_freq=1)
model.summary()
Model: "sequential_13"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
masking_5 (Masking) (None, 4283, 16) 0
_________________________________________________________________
lstm_20 (LSTM) (None, 4283, 64) 20736
_________________________________________________________________
dropout_14 (Dropout) (None, 4283, 64) 0
_________________________________________________________________
lstm_21 (LSTM) (None, 64) 33024
_________________________________________________________________
dropout_15 (Dropout) (None, 64) 0
_________________________________________________________________
dense_9 (Dense) (None, 1) 65
=================================================================
Total params: 53,825
Trainable params: 53,825
Non-trainable params: 0
_________________________________________________________________
训练时我在第 19 个 epoch 得到 NaN 值
Epoch 16/1000
16/16 [==============================] - 14s 855ms/sample - loss: 298.8135 - acc: 0.0000e+00 - mean_squared_error: 298.8135 - val_loss: 220.7307 - val_acc: 0.0000e+00 - val_mean_squared_error: 220.7307
Epoch 17/1000
16/16 [==============================] - 14s 846ms/sample - loss: 290.3051 - acc: 0.0000e+00 - mean_squared_error: 290.3051 - val_loss: 205.3393 - val_acc: 0.0000e+00 - val_mean_squared_error: 205.3393
Epoch 18/1000
16/16 [==============================] - 14s 869ms/sample - loss: 272.1889 - acc: 0.0000e+00 - mean_squared_error: 272.1889 - val_loss: nan - val_acc: 0.0000e+00 - val_mean_squared_error: nan
Epoch 19/1000
16/16 [==============================] - 14s 852ms/sample - loss: nan - acc: 0.0000e+00 - mean_squared_error: nan - val_loss: nan - val_acc: 0.0000e+00 - val_mean_squared_error: nan
Epoch 20/1000
16/16 [==============================] - 14s 856ms/sample - loss: nan - acc: 0.0000e+00 - mean_squared_error: nan - val_loss: nan - val_acc: 0.0000e+00 - val_mean_squared_error: nan
Epoch 21/1000
我尝试应用here 描述的方法,但没有真正成功。
更新: 我已将激活从 relu 更改为 tanh,它解决了 NaN 问题。但是,当损失下降时,我的模型的准确性似乎保持为 0
Epoch 100/1000
16/16 [==============================] - 14s 869ms/sample - loss: 22.8179 - acc: 0.0000e+00 - mean_squared_error: 22.8179 - val_loss: 11.7422 - val_acc: 0.0000e+00 - val_mean_squared_error: 11.7422
问:我在这里做错了什么?
【问题讨论】:
-
我可以想象这与在 LSTM 层中使用 relu 激活有关——因为它没有界限,这将增加激活/梯度爆炸的可能性。您是否尝试过使用默认的 tanh 激活?
-
我会尝试发布我的反馈
-
请看我的更新
标签: python tensorflow keras