Create a weighted MSE loss function in TensorFlow: answers

【Question title】: Create a weighted MSE loss function in Tensorflow
【Posted】: 2021-05-07 15:21:17
【Question description】:

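I want to train a recurrent neural network with TensorFlow. My model outputs a 1 x 100 vector for each training sample. Suppose y = [y_1, y_2, ..., y_100] is my output for training sample x, and the expected output is y' = [y'_1, y'_2, ..., y'_100].

I would like to write a custom loss function that computes the loss for this particular sample as follows:

Loss = 1/sum(weights) * sqrt(w_1*(y_1-y'_1)^2 + ... + w_100*(y_100-y'_100)^2)

where weights = [w_1, ..., w_100] is a given array of weights.

Could someone help me implement such a custom loss function? (I also use mini-batches during training.)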

【Comments】:

Tags: tensorflow keras loss-function

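【Answer 1】:

I want to point out that, depending on your problem, you have two options:

[1] If the weights are the same for all samples:

You can build a loss wrapper. Here is a dummy example: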

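import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Dummy data: 200 samples, 10 input features, 100 outputs
n_sample = 200
X = np.random.uniform(0,1, (n_sample,10))
y = np.random.uniform(0,1, (n_sample,100))
# One weight per output, shared by every sample
W = np.random.uniform(0,1, (100,)).astype('float32')

def custom_loss_wrapper(weights):
    def loss(true, pred):
        # sum(weights) multiplied by the number of samples in the batch
        sum_weights = tf.reduce_sum(weights) * tf.cast(tf.shape(pred)[0], tf.float32)
        # Square root of the weighted squared error, summed over the whole batch
        resid = tf.sqrt(tf.reduce_sum(weights * tf.square(true - pred)))
        return resid/sum_weights
    return loss

inp = Input((10,))
x = Dense(256)(inp)
pred = Dense(100)(x)

model = Model(inp, pred)
model.compile('adam', loss=custom_loss_wrapper(W))

model.fit(X, y, epochs=3)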
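
Note that the wrapper above takes one square root over the whole mini-batch. If you would rather apply the formula from the question to each sample separately and then average over the batch, a variant that reduces over the last axis only could look like the sketch below. The name per_sample_weighted_loss and the small eps inside the square root (there to keep the gradient finite when a residual is exactly zero) are my own assumptions, not part of the original answer.

```python
import numpy as np
import tensorflow as tf

def per_sample_weighted_loss(weights, eps=1e-12):
    """Hypothetical helper: sqrt(sum_i w_i * (y_i - y'_i)^2) / sum(w), per sample."""
    w = tf.cast(weights, tf.float32)
    w_sum = tf.reduce_sum(w)

    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, tf.float32)
        y_pred = tf.cast(y_pred, tf.float32)
        # Reduce over the output axis only, so each sample keeps its own value.
        weighted_sq = tf.reduce_sum(w * tf.square(y_true - y_pred), axis=-1)
        per_sample = tf.sqrt(weighted_sq + eps) / w_sum
        # Average the per-sample losses over the mini-batch.
        return tf.reduce_mean(per_sample)

    return loss

# Tiny check on two samples with four outputs each.
w = np.array([0.25, 0.5, 1.0, 0.75], dtype="float32")
y_true = np.array([[0., 1., 1., 0.], [0., 0., 1., 1.]], dtype="float32")
y_pred = np.array([[0., 1., 0., 1.], [1., 0., 1., 1.]], dtype="float32")
value = float(per_sample_weighted_loss(w)(y_true, y_pred))
print(value)
```

It plugs into compile the same way, e.g. model.compile('adam', loss=per_sample_weighted_loss(W)).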

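[2] If the weights differ between samples:

You should build the model with add_loss so that each sample's weights are taken into account dynamically. Here is a dummy example: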

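import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Dummy data: every sample has its own vector of 100 weights
n_sample = 200
X = np.random.uniform(0,1, (n_sample,10))
y = np.random.uniform(0,1, (n_sample,100))
W = np.random.uniform(0,1, (n_sample,100))

def custom_loss(true, pred, weights):
    # Sum of all weights in the batch
    sum_weights = tf.reduce_sum(weights)
    # Square root of the weighted squared error, summed over the whole batch
    resid = tf.sqrt(tf.reduce_sum(weights * tf.square(true - pred)))
    return resid/sum_weights

# The targets and the weights enter the model as extra inputs
inp = Input((10,))
true = Input((100,))
weights = Input((100,))
x = Dense(256)(inp)
pred = Dense(100)(x)

model = Model([inp,true,weights], pred)
model.add_loss(custom_loss(true, pred, weights))
# The loss is already attached through add_loss, so compile gets loss=None
model.compile('adam', loss=None)

model.fit([X,y,W], y=None, epochs=3)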

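When using add_loss, you should pass every tensor involved in the loss as an input layer and feed them into the loss computation.

At inference time you can compute predictions as usual; just drop the true values and the weights as inputs: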

final_model = Model(model.input[0], model.output)
final_model.predict(X)

【Comments】:

【Answer 2】:

You can implement a custom weighted MSE in the following way:

import numpy as np
from tensorflow.keras import backend as K

def custom_mse(class_weights):
    def weighted_mse(gt, pred):
        # Formula:
        # (w_1*(y_1-y'_1)^2 + ... + w_100*(y_100-y'_100)^2) / sum(weights)
        return K.sum(class_weights * K.square(gt - pred)) / K.sum(class_weights)
    return weighted_mse

y_true  = np.array([[0., 1., 1., 0.], [0., 0., 1., 1.]])
y_pred  = np.array([[0., 1., 0., 1.], [1., 0., 1., 1.]])
weights = np.array([0.25, 0.50, 1., 0.75])

print(y_true.shape, y_pred.shape, weights.shape)
# (2, 4) (2, 4) (4,)

loss = custom_mse(class_weights=weights)
print(loss(y_true, y_pred).numpy())
# 0.8

Use it when compiling the model.

# Cast the weights to float32 so they match the model's default float32 outputs
model.compile(loss=custom_mse(weights.astype('float32')))

This computes the MSE using the weight vector you provide. However, your question mentions sqrt..., which I take to mean root MSE (RMSE). For that, you can use K.sqrt(K.sum(...)) / K.sum(...) inside custom_mse.
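
Spelled out, the square-root variant might look like the sketch below. The names custom_rmse and weighted_rmse are made up for illustration, and I use tf ops with explicit float32 casts (my assumption, so that NumPy float64 inputs and float32 model outputs both work) in place of the K.* calls.

```python
import numpy as np
import tensorflow as tf

def custom_rmse(class_weights):
    """Hypothetical variant of custom_mse: sqrt(weighted squared error) / sum(weights)."""
    w = tf.cast(class_weights, tf.float32)

    def weighted_rmse(gt, pred):
        gt = tf.cast(gt, tf.float32)
        pred = tf.cast(pred, tf.float32)
        # Same weighted sum as before, with the square root applied to the numerator only.
        return tf.sqrt(tf.reduce_sum(w * tf.square(gt - pred))) / tf.reduce_sum(w)

    return weighted_rmse

y_true = np.array([[0., 1., 1., 0.], [0., 0., 1., 1.]])
y_pred = np.array([[0., 1., 0., 1.], [1., 0., 1., 1.]])
weights = np.array([0.25, 0.50, 1., 0.75])

rmse_value = float(custom_rmse(weights)(y_true, y_pred))
print(rmse_value)
```

On the toy arrays the weighted squared error is 2.0 and the weights sum to 2.5, so the result should come out close to sqrt(2.0) / 2.5.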

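FYI, you may also be interested in the class_weight and sample_weight arguments of Model.fit. From the source:

class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.

sample_weight: Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. This argument is not supported when x is a dataset, generator, or keras.utils.Sequence instance, instead provide the sample_weights as the third element of x.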
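
To get a feel for what sample_weight does, here is a small sketch that calls a built-in loss object directly with one weight per sample. Keep in mind that this weights whole samples, not the individual outputs inside a sample, so on its own it does not reproduce the per-output weights from the question. The toy arrays are made up for illustration.

```python
import numpy as np
import tensorflow as tf

y_true = np.array([[0., 1.], [0., 0.]], dtype="float32")
y_pred = np.array([[1., 1.], [1., 0.]], dtype="float32")

mse = tf.keras.losses.MeanSquaredError()

# Per-sample MSE is [0.5, 0.5]; the default reduction averages over the batch.
unweighted = float(mse(y_true, y_pred))

# One weight per sample: each per-sample loss is multiplied by its weight
# before the reduction, giving roughly (0.7*0.5 + 0.3*0.5) / 2.
sample_weight = np.array([0.7, 0.3], dtype="float32")
weighted = float(mse(y_true, y_pred, sample_weight=sample_weight))

print(unweighted, weighted)
```

During training the same kind of array goes into fit, e.g. model.fit(X, y, sample_weight=w) with w of shape (n_sample,).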

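There is also loss_weights in Model.compile. From the source:

loss_weights: Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model's outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.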
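
As a rough illustration of loss_weights, the sketch below builds a toy model with two outputs and weights their losses 1.0 and 0.5. The layer sizes, the output names out_a and out_b, and the random data are all made up; the point is only that loss_weights balances separate model outputs rather than the entries inside one output vector.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Random toy data for a model with two outputs (hypothetical shapes).
X = np.random.uniform(0, 1, (32, 10)).astype("float32")
y_a = np.random.uniform(0, 1, (32, 100)).astype("float32")
y_b = np.random.uniform(0, 1, (32, 1)).astype("float32")

inp = Input((10,))
h = Dense(16, activation="relu")(inp)
out_a = Dense(100, name="out_a")(h)
out_b = Dense(1, name="out_b")(h)

model = Model(inp, [out_a, out_b])
# Minimized loss = 1.0 * mse(out_a) + 0.5 * mse(out_b)
model.compile("adam", loss=["mse", "mse"], loss_weights=[1.0, 0.5])
history = model.fit(X, [y_a, y_b], epochs=1, verbose=0)
print(history.history["loss"])
```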

【Comments】: