Train the neural network with negative likelihood for regression: answers

【Question title】: Train the neural network with negative likelihood for regression
【Posted】: 2021-01-14 21:45:48
【Question】:

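I am trying to train a simple feed-forward neural network with a negative log-likelihood loss in order to estimate uncertainty for a regression task. My network outputs the mean and the variance as two outputs, and I wrote a custom loss function as follows: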

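from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def nll_loss(y_true, y_pred):
    epsilon = 1e-6
    mean = y_pred[:, 0]      # predicted mean
    sigma_sq = y_pred[:, 1]  # raw (unconstrained) variance output
    # hand-written softplus to keep the variance positive
    sigma_sq_sp = K.log(1 + K.exp(sigma_sq)) + 1e-06
    # Gaussian negative log-likelihood, up to an additive constant
    nll_loss = 0.5 * K.mean(K.log(sigma_sq_sp + epsilon) + K.square(y_true - mean) / (sigma_sq_sp + epsilon))

    return nll_loss

inp = Input(shape=(1,))
x = Dense(10, activation="relu")(inp)
x = Dense(20, activation="relu")(x)
x = Dense(30, activation="relu")(x)
output = Dense(2, activation="linear")(x)  # two outputs: mean and raw variance

model = Model(inp, output)

model.compile(loss=nll_loss, optimizer='adam')

model.fit(x_train, y_train, epochs=50)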

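My x_train and y_train both have shape (200,), i.e. scalar features and scalar labels (200 examples). Is it correct to use the slices y_pred[:,0] and y_pred[:,1] to extract the first and second outputs from the output layer?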

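My model trains for a few epochs and then the loss becomes nan. Am I doing something wrong when computing the loss function? Can y_true and y_pred have different shapes?

Thanks.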

【Comments】:

Tags: tensorflow keras deep-learning

【Solution 1】:

It looks like the problem was in how the softplus activation was computed:

sigma_sq_sp = K.log(1 + K.exp(sigma_sq)) + 1e-06

When I replaced the line above with the following, it worked fine:

sigma_sq_sp = tf.keras.activations.softplus(sigma_sq)

Perhaps Keras's built-in softplus activation is implemented in a numerically stable way. Thanks.
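
One plausible reading of why the hand-written version fails, offered as an assumption rather than something confirmed in this thread: log(1 + exp(x)) evaluates exp(x) directly, and that overflows once x is large (roughly x > 88 in float32, roughly x > 709 in float64). The forward value becomes inf, and gradient terms built from it can turn into nan. The sketch below uses plain Python floats (float64) and two hypothetical helper names, naive_softplus and stable_softplus, to contrast the direct formula with the algebraically equivalent form max(x, 0) + log1p(exp(-|x|)). It only illustrates the idea and is not a copy of Keras's implementation.

```python
import math


def naive_softplus(x: float) -> float:
    """Direct formula log(1 + exp(x)); exp(x) overflows for large x."""
    try:
        return math.log(1.0 + math.exp(x))
    except OverflowError:
        # math.exp raises on overflow; tensor libraries typically return
        # inf instead, so inf is used here to mimic that behaviour.
        return math.inf


def stable_softplus(x: float) -> float:
    """Equivalent form that only exponentiates a non-positive number."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Compare both versions on small and large inputs.
for x in (-5.0, 0.0, 5.0, 1000.0):
    print(x, naive_softplus(x), stable_softplus(x))
```

For x = 1000 the direct formula gives inf, while the rewritten form returns 1000, which is the value softplus approaches for large inputs. For moderate inputs the two versions agree.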

【Comments】: