Channel-wise CrossEntropyLoss for image segmentation in PyTorch: answers

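【Question title】: Channel wise CrossEntropyLoss for image segmentation in pytorch
【Posted】: 2018-11-26 12:41:49
【Question】:

I am working on an image segmentation task. There are 7 classes in total, so the final output is a tensor of shape [batch, 7, height, width], which is a softmax output. Intuitively I wanted to use CrossEntropy loss, but the PyTorch implementation does not work on channel-wise one-hot encoded vectors.

So I was planning to write a function of my own. With the help of some Stack Overflow posts, my code so far looks like this: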

from torch.autograd import Variable
import torch
import torch.nn.functional as F


def cross_entropy2d(input, target, weight=None, size_average=True):
    # input: (n, c, w, z), target: (n, w, z)
    n, c, w, z = input.size()
    # log_p: (n, c, w, z)
    log_p = F.log_softmax(input, dim=1)
    # log_p: (n*w*z, c)
    log_p = log_p.permute(0, 3, 2, 1).contiguous().view(-1, c)  # make class dimension last dimension
    log_p = log_p[
       target.view(n, w, z, 1).repeat(0, 0, 0, c) >= 0]  # this looks wrong -> Should rather be a one-hot vector
    log_p = log_p.view(-1, c)
    # target: (n*w*z,)
    mask = target >= 0
    target = target[mask]
    loss = F.nll_loss(log_p, target.view(-1), weight=weight, size_average=False)
    if size_average:
        loss /= mask.data.sum()
    return loss


images = Variable(torch.randn(5, 3, 4, 4))
labels = Variable(torch.LongTensor(5, 3, 4, 4).random_(3))
cross_entropy2d(images, labels)

I get two errors. One is mentioned in the code itself: it expects a one-hot vector. The second one says the following:

RuntimeError: invalid argument 2: size '[5 x 4 x 4 x 1]' is invalid for input with 3840 elements at ..\src\TH\THStorage.c:41

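As an example, I was trying to make it work on a 3-class problem. So the targets and labels are (leaving out the batch dimension for simplicity!):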

Target:

 Channel 1    Channel 2    Channel 3
[[0 1 1 0 ]   [0 0 0 1 ]   [1 0 0 0 ]
 [0 0 1 1 ]   [0 0 0 0 ]   [1 1 0 0 ]
 [0 0 0 1 ]   [0 0 0 0 ]   [1 1 1 0 ]
 [0 0 0 0 ]   [0 0 0 1 ]   [1 1 1 0 ]]

Labels:

 Channel 1    Channel 2    Channel 3
[[0 1 1 0 ]   [0 0 0 1 ]   [1 0 0 0 ]
 [0 0 1 1 ]   [.2 0 0 0]   [.8 1 0 0 ]
 [0 0 0 1 ]   [0 0 0 0 ]   [1 1 1 0 ]
 [0 0 0 0 ]   [0 0 0 1 ]   [1 1 1 0 ]]

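So how can I fix my code to compute a channel-wise CrossEntropy loss?

【Question comments】:

I am trying to achieve something similar. Were you able to use the built-in CrossEntropyLoss function in the end?
Yes. Sort of!
Could you post your solution as an answer?
Sure, I will. It may take some time to recreate the code, though.
Actually, I have figured it out myself by now. If you like, I can post my code as an answer.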

Tags: image-segmentation pytorch loss-function cross-entropy semantic-segmentation

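【Solution 1】:

As Shai's answer already points out, torch.nn.CrossEntropyLoss is covered in the official PyTorch documentation, and its code can be read in the PyTorch source. The built-in function does indeed already support KD cross-entropy loss.

In the 3D case, torch.nn.CrossEntropyLoss expects two arguments: a 4D input tensor and a 3D target tensor. The input has shape (Minibatch, Classes, H, W). The target has shape (Minibatch, H, W) and holds integer class indices from 0 to (Classes-1). If you start from a one-hot encoded matrix, you have to convert it to class indices with np.argmax().

Example with three classes and a minibatch size of 1: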

import torch
import numpy as np

# Predictions: (Minibatch, Classes, H, W) raw scores
input_torch = torch.randn(1, 3, 2, 5, requires_grad=True)

# One-hot target for a single image: (Classes, H, W)
one_hot = np.array([[[1, 1, 1, 0, 0], [0, 0, 0, 0, 0]],
                    [[0, 0, 0, 0, 0], [1, 1, 1, 0, 0]],
                    [[0, 0, 0, 1, 1], [0, 0, 0, 1, 1]]])

# Collapse the class axis and add the minibatch axis: (1, H, W)
target = np.argmax(one_hot, axis=0)[np.newaxis]
target_torch = torch.tensor(target, dtype=torch.long)

loss = torch.nn.CrossEntropyLoss()
output = loss(input_torch, target_torch)
output.backward()
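
The same conversion can be done for a whole batch without NumPy. The following is a minimal sketch, not part of the original answer: the sizes and the variable names (logits, true_classes) are made up for illustration, and the one-hot target is generated from random indices only so that the result can be checked. Note that CrossEntropyLoss applies log-softmax internally, so it should be given raw scores rather than a softmax output.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes: batch of 2, 3 classes, 2x5 images.
n, c, h, w = 2, 3, 2, 5
logits = torch.randn(n, c, h, w, requires_grad=True)  # raw scores, no softmax applied

# Build a one-hot target of shape (N, C, H, W) from random class indices.
true_classes = torch.randint(0, c, (n, h, w))
one_hot = F.one_hot(true_classes, num_classes=c).permute(0, 3, 1, 2)

# Collapse the channel dimension: (N, C, H, W) -> (N, H, W), dtype int64.
target = one_hot.argmax(dim=1)

loss = torch.nn.CrossEntropyLoss()(logits, target)
loss.backward()
```

Taking the argmax over dim=1 recovers exactly the class indices the one-hot tensor was built from, which is the (Minibatch, H, W) layout the loss expects.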

【Comments】:

Instead of using argmax, I reshaped the target matrix to (Minibatch, Class, H*W), and that was all it took.
Thanks! This was very helpful!
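
One possible reading of the reshaping comment above is that the spatial dimensions can be merged as long as the class dimension stays at index 1. The sketch below illustrates that reading under assumed sizes; it is not the commenter's code. The target still holds class indices, with shape (Minibatch, H*W).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

n, c, h, w = 2, 3, 4, 4  # hypothetical sizes
logits = torch.randn(n, c, h, w)
target = torch.randint(0, c, (n, h, w))

# Reference: spatial dimensions kept as they are.
loss_4d = F.cross_entropy(logits, target)

# Flattened: the class dimension stays at index 1, the pixels are merged.
loss_flat = F.cross_entropy(logits.reshape(n, c, h * w),
                            target.reshape(n, h * w))
```

Both calls average over the same set of pixels, so the two losses agree.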

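【Solution 2】:

2D (or KD) cross entropy is a very basic building block in neural networks. It is unlikely that PyTorch has no "out-of-the-box" implementation of it.
Looking at torch.nn.CrossEntropyLoss and the underlying torch.nn.functional.cross_entropy, you will see that the loss can handle 2D inputs (that is, a 4D input prediction tensor).
Moreover, you can look at the code that actually implements this and see how it handles the different cases according to the dim of the input tensor.

So, don't bother, it is already done for you!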
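
As a rough illustration of "already done for you", the hand-written cross_entropy2d from the question can be sketched as a thin wrapper around F.cross_entropy. This is an assumption-laden sketch rather than the asker's final code: it assumes that negative labels mark pixels to skip (which is what the mask `target >= 0` in the question suggests) and uses -1 as the ignore value.

```python
import torch
import torch.nn.functional as F

def cross_entropy2d(logits, target, weight=None, ignore_index=-1):
    """Mean cross entropy over all pixels whose label is not ignore_index.

    logits: (N, C, H, W) raw scores; target: (N, H, W) int64 class indices.
    """
    return F.cross_entropy(logits, target, weight=weight,
                           ignore_index=ignore_index, reduction="mean")

torch.manual_seed(0)
n, c, h, w = 5, 3, 4, 4
images = torch.randn(n, c, h, w)
labels = torch.randint(0, c, (n, h, w))
labels[0, 0, 0] = -1  # mark one pixel as "ignore" (assumed convention)

loss = cross_entropy2d(images, labels)

# Manual check: log-softmax over the class dimension, then pick the
# log-probability of the true class at every valid pixel.
log_p = F.log_softmax(images, dim=1)               # (N, C, H, W)
log_p = log_p.permute(0, 2, 3, 1).reshape(-1, c)   # (N*H*W, C)
flat = labels.reshape(-1)                          # (N*H*W,)
valid = flat >= 0
picked = log_p[valid].gather(1, flat[valid].unsqueeze(1)).squeeze(1)
manual = -picked.mean()
```

The manual computation mirrors the permute-and-flatten approach from the question, but with labels of shape (N, H, W) instead of (N, C, H, W).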

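【Comments】:

Thanks for the answer, but I am still confused about what you mean. I had already read those docs before posting here, and I also looked at the PyTorch forums. Could you give an example?
@FarshidRayhan have a look at this example
@FarshidRayhan you only need to give your criterion 4D predictions (instead of 2D) and 3D targets (instead of 1D).
Yes, I tried that a long time ago. It says "dimension mismatch (input: 1x3x2x2, target: 1x3x4)".
btw, for the predictions I am passing [batch, channel, height, width] and for the targets [batch, channel, class]!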
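
If the targets really have to keep a channel dimension, for example because they are soft labels like the .2/.8 values in the question, the loss can be written out by hand. The function name channelwise_cross_entropy below is hypothetical, and the sketch assumes the targets have the same (N, C, H, W) shape as the predictions and sum to 1 over the class dimension. For one-hot targets it should give the same value as the built-in loss with index targets. Recent PyTorch releases can also take class-probability targets in CrossEntropyLoss directly; check the documentation for your version.

```python
import torch
import torch.nn.functional as F

def channelwise_cross_entropy(logits, target_probs):
    """Cross entropy for targets that keep a class dimension.

    logits:       (N, C, H, W) raw scores.
    target_probs: (N, C, H, W) one-hot or soft labels summing to 1 over C.
    """
    log_p = F.log_softmax(logits, dim=1)
    per_pixel = -(target_probs * log_p).sum(dim=1)  # (N, H, W)
    return per_pixel.mean()

torch.manual_seed(0)
n, c, h, w = 1, 3, 4, 4  # hypothetical sizes
logits = torch.randn(n, c, h, w)
indices = torch.randint(0, c, (n, h, w))
one_hot = F.one_hot(indices, num_classes=c).permute(0, 3, 1, 2).float()

loss_channelwise = channelwise_cross_entropy(logits, one_hot)
loss_builtin = F.cross_entropy(logits, indices)
```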