Unbalanced data and weighted cross entropy: answers

【Question title】: Unbalanced data and weighted cross entropy
【Posted】: 2017-11-17 12:53:28
【Question description】:

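I am trying to train a network with unbalanced data. I have A (198 samples), B (436 samples), C (710 samples), and D (272 samples). I have read about "weighted_cross_entropy_with_logits", but all the examples I found are for binary classification, so I am not very confident about how to set those weights.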

Total samples: 1616

A_weight: 198/1616 = 0.12?

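If I understand correctly, the idea is to penalize errors on the majority classes and to reward hits on the minority classes more strongly, right?

My piece of code: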

weights = tf.constant([0.12, 0.26, 0.43, 0.17])
cost = tf.reduce_mean(tf.nn.weighted_cross_entropy_with_logits(logits=pred, targets=y, pos_weight=weights))

I have read this one and other examples with binary classification, but it is still not very clear to me.

Thanks in advance.

【Question discussion】:

Tags: python machine-learning tensorflow deep-learning

【Solution 1】:

TensorFlow 2.0 compatible answer: for the benefit of the community, this migrates the code given in P-Gn's answer to 2.0.

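import tensorflow as tf

# your class weights (onehot_labels and logits are assumed to be defined elsewhere, with 3 classes)
class_weights = tf.compat.v2.constant([[1.0, 2.0, 3.0]])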
# deduce weights for batch samples based on their true label
weights = tf.compat.v2.reduce_sum(class_weights * onehot_labels, axis=1)
# compute your (unweighted) softmax cross entropy loss
unweighted_losses = tf.compat.v2.nn.softmax_cross_entropy_with_logits(onehot_labels, logits)
# apply the weights, relying on broadcasting of the multiplication
weighted_losses = unweighted_losses * weights
# reduce the result to get your final loss
loss = tf.reduce_mean(weighted_losses)

For more information on migrating code from TensorFlow 1.x to 2.x, see the Migration Guide.

【Discussion】:

【Solution 2】:

See this answer for an alternative solution that works with sparse_softmax_cross_entropy:

import tensorflow as tf
import numpy as np

np.random.seed(123)
sess = tf.InteractiveSession()

# let's say we have the logits and labels of a batch of size 6 with 5 classes
logits = tf.constant(np.random.randint(0, 10, 30).reshape(6, 5), dtype=tf.float32)
labels = tf.constant(np.random.randint(0, 5, 6), dtype=tf.int32)

# specify some class weightings
class_weights = tf.constant([0.3, 0.1, 0.2, 0.3, 0.1])

# specify the weights for each sample in the batch (without having to compute the onehot label matrix)
weights = tf.gather(class_weights, labels)

# compute the loss
tf.losses.sparse_softmax_cross_entropy(labels, logits, weights).eval()
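
The tf.gather lookup above picks one class weight per sample directly from the integer labels. As a minimal sketch in plain Python (the label values below are made up for illustration), it gives the same per-sample weights as multiplying a one-hot matrix by the class weights and summing each row:

```python
# Hypothetical toy values: 5 classes and a batch of 6 integer labels.
class_weights = [0.3, 0.1, 0.2, 0.3, 0.1]
labels = [2, 4, 2, 1, 3, 2]

# Gather-style lookup: index the class weights directly with each label.
gathered = [class_weights[label] for label in labels]

# One-hot route: build the one-hot rows, multiply by the class weights, sum per row.
num_classes = len(class_weights)
onehot = [[1.0 if c == label else 0.0 for c in range(num_classes)]
          for label in labels]
summed = [sum(w * h for w, h in zip(class_weights, row)) for row in onehot]

print(gathered)
print(summed)
```

The lookup avoids materializing the (batch_size, num_classes) one-hot matrix, which is the efficiency gain for sparse labels.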

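【Discussion】:

Upvoting because this answer had -1 votes. I think it deserves at least 0, since it led me to discover tf.gather, which made my code much more efficient because my labels are sparse rather than dense.
@DankMasterDan: Links are good for giving credit and context, but please copy the referenced code into your answer so that it is self-contained.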

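【Solution 3】:

Note that weighted_cross_entropy_with_logits is the weighted variant of sigmoid_cross_entropy_with_logits. Sigmoid cross entropy is typically used for binary classification. Yes, it can handle multiple labels, but sigmoid cross entropy basically makes a (binary) decision on each of them. For example, for a face recognition net, those (not mutually exclusive) labels could be "Does the subject wear glasses?", "Is the subject female?", etc.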

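In binary classification, each output channel corresponds to a binary (soft) decision. The weighting therefore has to happen inside the computation of the loss. This is what weighted_cross_entropy_with_logits does, by weighting one term of the cross entropy over the other.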
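
A minimal sketch of that weighting in plain Python, following the formula given in the TensorFlow documentation for weighted_cross_entropy_with_logits (the positive term is multiplied by pos_weight, the negative term is left as is). The function names and input values here are illustrative, not TensorFlow API:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_sigmoid_cross_entropy(logit, target, pos_weight):
    """Loss for one output channel: only the positive term is scaled by pos_weight."""
    p = sigmoid(logit)
    return -(pos_weight * target * math.log(p)
             + (1.0 - target) * math.log(1.0 - p))

# With pos_weight = 1 this is the plain sigmoid cross entropy.
plain = weighted_sigmoid_cross_entropy(0.5, 1.0, pos_weight=1.0)
# Doubling pos_weight doubles the loss of a positive example ...
boosted = weighted_sigmoid_cross_entropy(0.5, 1.0, pos_weight=2.0)
# ... but leaves the loss of a negative example unchanged.
neg_plain = weighted_sigmoid_cross_entropy(0.5, 0.0, pos_weight=1.0)
neg_boosted = weighted_sigmoid_cross_entropy(0.5, 0.0, pos_weight=2.0)

print(plain, boosted, neg_plain, neg_boosted)
```

This direct form is only for illustration; it is not numerically safe for large-magnitude logits.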

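In mutually exclusive multilabel classification (that is, multi-class classification), we use softmax_cross_entropy_with_logits, which behaves differently: each output channel corresponds to the score of a class candidate. The decision comes afterwards, by comparing the respective outputs of each channel.

Weighting before the final decision is therefore a simple matter of modifying the scores before comparing them, typically by multiplying them with weights. For example, for a ternary classification task,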

# your class weights
class_weights = tf.constant([[1.0, 2.0, 3.0]])
# deduce weights for batch samples based on their true label
weights = tf.reduce_sum(class_weights * onehot_labels, axis=1)
# compute your (unweighted) softmax cross entropy loss
unweighted_losses = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
# apply the weights, relying on broadcasting of the multiplication
weighted_losses = unweighted_losses * weights
# reduce the result to get your final loss
loss = tf.reduce_mean(weighted_losses)

You could also rely on tf.losses.softmax_cross_entropy to handle the last three steps.
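
To make the arithmetic of these steps concrete, here is a minimal plain-Python sketch of the same computation without TensorFlow. The two-sample batch is hypothetical, and softmax_cross_entropy is a helper written for this example, not a library call:

```python
import math

def softmax_cross_entropy(onehot, logits):
    """Unweighted softmax cross entropy for a single sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_sum_exp) for t, x in zip(onehot, logits))

class_weights = [1.0, 2.0, 3.0]

# Hypothetical batch of two samples with three classes.
onehot_labels = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
logits = [[2.0, 1.0, 0.1], [0.5, 0.5, 0.5]]

# Per-sample weight: the class weight of the true label.
weights = [sum(w * t for w, t in zip(class_weights, row)) for row in onehot_labels]
# Per-sample unweighted loss.
unweighted_losses = [softmax_cross_entropy(t, x) for t, x in zip(onehot_labels, logits)]
# Scale each loss by its weight, then average over the batch.
weighted_losses = [l * w for l, w in zip(unweighted_losses, weights)]
loss = sum(weighted_losses) / len(weighted_losses)

print(weights, loss)
```

The second sample has uniform logits, so its unweighted loss is log(3), and its true class (weight 3.0) triples that contribution.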

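In your case, where you need to tackle data imbalance, the class weights could indeed be inversely proportional to their frequency in your training data. Normalizing them so that they sum to one, or to the number of classes, also makes sense.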
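
As a minimal sketch, inverse-frequency weights for the sample counts given in the question could be computed as follows. Both normalizations mentioned above are shown; the variable names are illustrative:

```python
# Sample counts from the question.
counts = {"A": 198, "B": 436, "C": 710, "D": 272}
total = sum(counts.values())

# Raw inverse-frequency weights: rarer classes get larger weights.
inverse = {name: total / n for name, n in counts.items()}

# Normalization 1: weights sum to one.
scale = sum(inverse.values())
sum_to_one = {name: v / scale for name, v in inverse.items()}

# Normalization 2: weights sum to the number of classes.
num_classes = len(counts)
sum_to_num_classes = {name: v * num_classes for name, v in sum_to_one.items()}

for name in counts:
    print(name, round(sum_to_one[name], 3), round(sum_to_num_classes[name], 3))
```

Note that this is the opposite of the frequency-based weights in the question (198/1616 for A), which would give the rarest class the smallest weight.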

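Note that in the above, we penalized the loss based on the true label of the samples. We could also have penalized the loss based on the estimated labels by simply defining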

weights = class_weights

and the rest of the code need not change, thanks to broadcasting magic.

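In the general case, you would want weights that depend on the kind of error you make. In other words, for each pair of labels X and Y, you could choose how to penalize choosing label X when the true label is Y. You would end up with a whole prior weight matrix, which results in weights above being a full (num_samples, num_classes) tensor. This goes a bit beyond what you want, but it might be useful to know that only your definition of the weight tensor needs to change in the code above.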
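
One possible reading of this idea, sketched in plain Python. The penalty values, labels, and logits are hypothetical, and collapsing each weight row into a scalar via the expected penalty under the predicted probabilities is an assumption made for this example, not something the answer specifies:

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical penalty matrix: penalty[y][x] is the cost of choosing
# label x when the true label is y. Confusing class 0 with class 2 is costly.
penalty = [
    [1.0, 1.0, 4.0],
    [1.0, 1.0, 1.0],
    [4.0, 1.0, 1.0],
]

true_labels = [0, 2]
logits = [[1.0, 0.5, 2.0], [0.2, 0.1, 3.0]]

# One matrix row per sample: a (num_samples, num_classes) weight table.
weights = [penalty[y] for y in true_labels]

losses = []
for y, row_weights, row_logits in zip(true_labels, weights, logits):
    probs = softmax(row_logits)
    # Assumption: reduce the row to a scalar as the expected penalty
    # under the predicted class probabilities.
    sample_weight = sum(w * p for w, p in zip(row_weights, probs))
    cross_entropy = -math.log(probs[y])
    losses.append(sample_weight * cross_entropy)

loss = sum(losses) / len(losses)
print(losses, loss)
```

Other reductions are possible, for example looking up penalty[y][argmax of the logits]; which one fits depends on the task.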

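【Discussion】:

Thanks for the explanation, user1735003! It is much clearer now. But your answer raises a new question for me: my data is correlated, in that A should be more similar to B than to C, so I would like to penalize a false C more than a false B. In that case, and relating to your last paragraph, could I have a (num_classes, num_classes) tensor in which I penalize this kind of misclassification more? And if so, should the diagonal of the tensor be the inverse frequencies, or something different? Thanks in advance.
It is hard to give definitive advice on choosing weights. Even for imbalanced data, using inverse frequency is sometimes inappropriate because, for example, a class that represents 99.9% of the data may also be easy to classify. The case you describe (unbalanced multilabel data with an uneven prior on confusion) is even more involved. You could start by running without weights, or with standard balancing weights, and then decide how to modify the weights based on the confusion matrix you obtain. Unfortunately, the weights end up being yet more hyperparameters to tune.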
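I see, @user1735003. I was just wondering whether you could point me to some references to check? I mean, I don't even know how to start tuning the weights, or whether I should increase or decrease a single entry or an entire row (in case I can use a (num_classes, num_classes) matrix). Thanks a lot for your help.
In the context of binary cross entropy, is there a way to weight true positives, true negatives, false positives, and false negatives? I think I have a related question, described here: stackoverflow.com/questions/48744092/…
FYI, the function tf.nn.softmax_cross_entropy_with_logits has been deprecated in favor of tf.nn.softmax_cross_entropy_with_logits_v2.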
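
The suggestion above to inspect a confusion matrix before adjusting weights can be sketched in plain Python. The label lists are made up for illustration (classes A to D mapped to 0 to 3), and confusion_matrix is a helper written for this example:

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """matrix[t][p] counts samples with true label t predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

# Hypothetical validation results for classes A=0, B=1, C=2, D=3.
true_labels = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
predicted_labels = [0, 2, 2, 1, 1, 2, 2, 2, 3, 1]

matrix = confusion_matrix(true_labels, predicted_labels, num_classes=4)

# Per-class recall: the diagonal entry divided by the row total.
recall = [row[i] / sum(row) for i, row in enumerate(matrix)]

for row in matrix:
    print(row)
print(recall)
```

A row with low recall and a large off-diagonal entry (here, class 0 mostly predicted as class 2) points at the class, or the class pair, whose weight may be worth raising.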