【Question title】: Trying to understand cross_entropy loss in PyTorch
【Posted】: 2019-12-01 08:16:20
【Question】:

This is a very basic question, but I am trying to wrap my head around the cross_entropy loss in Torch, so I wrote the following code:

import torch

x = torch.FloatTensor([[1., 0., 0.],
                       [0., 1., 0.],
                       [0., 0., 1.]])

print(x.argmax(dim=1))

y = torch.LongTensor([0, 1, 2])
loss = torch.nn.functional.cross_entropy(x, y)

print(loss)

The output is:

tensor([0, 1, 2])
tensor(0.5514)

Given that my input matches the expected output, why is the loss not 0?

【Comments】:

    Tags: python machine-learning pytorch


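    【Solution 1】:

    That is because the input you pass to the cross-entropy function is not treated as probabilities, as you assumed, but as logits, which are converted into probabilities with this formula: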

    probas = np.exp(logits) / np.sum(np.exp(logits), axis=1, keepdims=True)
    

    So the probability matrix PyTorch uses in your case is:

    [0.5761168847658291,  0.21194155761708547,  0.21194155761708547]
    [0.21194155761708547, 0.5761168847658291, 0.21194155761708547]
    [0.21194155761708547,  0.21194155761708547, 0.5761168847658291]
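As a quick check, the softmax step and the resulting loss can be reproduced in NumPy. This is a minimal sketch rather than PyTorch's actual implementation; the variable names (`logits`, `targets`, `probas`, `true_class_probas`, `loss`) are chosen here for illustration only.

```python
import numpy as np

# Logits and class indices from the question
logits = np.array([[1., 0., 0.],
                   [0., 1., 0.],
                   [0., 0., 1.]])
targets = np.array([0, 1, 2])

# Row-wise softmax; keepdims=True lets each row be divided by its own sum
exp = np.exp(logits)
probas = exp / exp.sum(axis=1, keepdims=True)

# Probability assigned to the true class of each sample
true_class_probas = probas[np.arange(len(targets)), targets]

# Cross-entropy: mean negative log-probability of the true class
loss = -np.log(true_class_probas).mean()

print(probas)
print(f"{loss:.4f}")  # 0.5514
```

Each true class only receives a probability of about 0.576, so the mean of -ln(0.576) is about 0.5514 instead of 0.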
    

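    【Comments】:

    • From a mathematical point of view, the OP needs to convert y into a probability distribution.
    • Yes. If we change the input to something like x = torch.FloatTensor([[10.,0.,0.], [0.,10.,0.], [0.,0.,10.]]), the F.cross_entropy result gets close to zero. So F.cross_entropy rewards a larger gap between the ground-truth logit and the logits of the other classes; a ground-truth logit of exactly 1 is not the optimum.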
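The effect described in the last comment can be checked with a short sketch. The scale factors and the `losses` dictionary below are illustrative choices, not part of the original thread.

```python
import torch
import torch.nn.functional as F

y = torch.tensor([0, 1, 2])

# Scale the identity pattern so the true-class logit stands out more and more
losses = {}
for scale in (1.0, 2.0, 5.0, 10.0):
    x = scale * torch.eye(3)
    losses[scale] = F.cross_entropy(x, y).item()
    print(f"scale={scale:>4}: loss={losses[scale]:.6f}")
```

The loss shrinks as the gap between the true-class logit and the other logits grows, but softmax never assigns a probability of exactly 1 to a finite logit, so the loss only approaches 0.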
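    【Solution 2】:

    The torch.nn.functional.cross_entropy function combines log_softmax (softmax followed by a logarithm) and nll_loss (negative log-likelihood loss) in a single function, i.e. it is equivalent to F.nll_loss(F.log_softmax(x, 1), y).

    Code: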

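    import torch
    import torch.nn.functional as F

    x = torch.FloatTensor([[1., 0., 0.],
                           [0., 1., 0.],
                           [0., 0., 1.]])
    y = torch.LongTensor([0, 1, 2])

    # Built-in cross-entropy on raw logits
    print(torch.nn.functional.cross_entropy(x, y))

    # log(softmax(x)) and log_softmax(x) give the same log-probabilities
    print(F.softmax(x, 1).log())
    print(F.log_softmax(x, 1))

    # nll_loss on the log-probabilities reproduces cross_entropy
    print(F.nll_loss(F.log_softmax(x, 1), y))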
    

    Output:

    tensor(0.5514)
    tensor([[-0.5514, -1.5514, -1.5514],
            [-1.5514, -0.5514, -1.5514],
            [-1.5514, -1.5514, -0.5514]])
    tensor([[-0.5514, -1.5514, -1.5514],
            [-1.5514, -0.5514, -1.5514],
            [-1.5514, -1.5514, -0.5514]])
    tensor(0.5514)
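To make the nll_loss step concrete, here is a minimal sketch that computes it by hand, once by indexing the true class and once with y written as one-hot rows. The names `manual_nll`, `one_hot`, and `manual_ce` are illustrative; this mirrors the default mean reduction and is not how PyTorch implements the function internally.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1., 0., 0.],
                  [0., 1., 0.],
                  [0., 0., 1.]])
y = torch.tensor([0, 1, 2])

log_probs = F.log_softmax(x, dim=1)

# nll_loss by hand: take the log-probability of the true class, negate, average
manual_nll = -log_probs[torch.arange(len(y)), y].mean()

# Same quantity with y expressed as one-hot probability distributions
one_hot = F.one_hot(y, num_classes=3).float()
manual_ce = -(one_hot * log_probs).sum(dim=1).mean()

print(manual_nll, manual_ce, F.cross_entropy(x, y))
```

All three values agree, which shows that the integer targets act as one-hot distributions over the classes.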
    

    Read more about the torch.nn.functional.cross_entropy loss function here.

    【Comments】:

      【Solution 3】:

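      A complete, copy/paste-runnable example that computes the categorical cross-entropy loss in three ways:

      - paper + pencil + calculator
      - NumPy
      - PyTorch

      Apart from tiny rounding differences, all 3 results are identical: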

      import torch
      import torch.nn.functional as F
      
      import numpy as np
      
      def main():
      
          ### paper + pencil + calculator calculation #################
      
          """
          predictions before softmax:
                        columns
                     (4 categories)
              rows     1, 4, 1, 1
          (3 samples)  5, 1, 2, 1
                       1, 2, 5, 1
      
          ground truths (NOT one hot encoded)
                1, 0, 2
      
          preds softmax calculation:
          (e^1/(e^1+e^4+e^1+e^1)), (e^4/(e^1+e^4+e^1+e^1)), (e^1/(e^1+e^4+e^1+e^1)), (e^1/(e^1+e^4+e^1+e^1))
          (e^5/(e^5+e^1+e^2+e^1)), (e^1/(e^5+e^1+e^2+e^1)), (e^2/(e^5+e^1+e^2+e^1)), (e^1/(e^5+e^1+e^2+e^1))
          (e^1/(e^1+e^2+e^5+e^1)), (e^2/(e^1+e^2+e^5+e^1)), (e^5/(e^1+e^2+e^5+e^1)), (e^1/(e^1+e^2+e^5+e^1))
      
          preds after softmax:
          0.04332, 0.87005, 0.04332, 0.04332
          0.92046, 0.01686, 0.04583, 0.01686
          0.01686, 0.04583, 0.92046, 0.01686
      
          categorical cross-entropy loss calculation:
          (-ln(0.87005) + -ln(0.92046) + -ln(0.92046)) / 3 = 0.10166
      
          Note the loss ends up relatively low because all 3 predictions are correct
          """
      
      
          ### calculation via NumPy ###################################
      
          # predictions from model (just made up example data in this case)
          # rows = 3 samples, cols = 4 categories
          preds = np.array([[1, 4, 1, 1],
                            [5, 1, 2, 1],
                            [1, 2, 5, 1]], dtype=np.float32)
      
          # ground truths, NOT one hot encoded
          gndTrs = np.array([1, 0, 2], dtype=np.int64)
      
          preds = softmax(preds)
      
          loss = calcCrossEntropyLoss(preds, gndTrs)
      
          print('\n' + 'NumPy loss = ' + str(loss) + '\n')
      
          ### calculation via PyTorch #################################
      
          # predictions from model (just made up example data in this case)
          # rows = 3 samples, cols = 4 categories
          preds = torch.tensor([[1, 4, 1, 1],
                                [5, 1, 2, 1],
                                [1, 2, 5, 1]], dtype=torch.float32)
      
          # ground truths, NOT one hot encoded
          gndTrs = torch.tensor([1, 0, 2], dtype=torch.int64)
      
          loss = F.cross_entropy(preds, gndTrs)
      
          print('PyTorch loss = ' + str(loss) + '\n')
      # end function
      
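      def softmax(x: np.ndarray) -> np.ndarray:
          # Work on a copy so the caller's array is not modified in place
          x = x.copy()
          numSamps = x.shape[0]
      
          for i in range(numSamps):
              # Normalize each row's exponentials so the row sums to 1
              x[i] = np.exp(x[i]) / np.sum(np.exp(x[i]))
          # end for
      
          return x
      # end function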
      
      def calcCrossEntropyLoss(preds: np.ndarray, gndTrs: np.ndarray) -> np.ndarray:
          assert len(preds.shape) == 2
          assert len(gndTrs.shape) == 1
          assert preds.shape[0] == gndTrs.shape[0]
      
          numSamps = preds.shape[0]
      
          mySum = 0.0
          for i in range(numSamps):
              # Note: in numpy, "log" is actually natural log (ln)
              mySum += -1 * np.log(preds[i, gndTrs[i]])
          # end for
      
          crossEntLoss = mySum / numSamps
          return crossEntLoss
      # end function
      
      if __name__ == '__main__':
          main()
      

      Program output:

      NumPy loss = 0.10165966302156448
      
      PyTorch loss = tensor(0.1017)
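The NumPy part above can also be written without Python loops. The following is a minimal sketch under a few assumptions: the function name `cross_entropy_np` and the variable `gnd_trs` are hypothetical, the inputs are float64, and each row's maximum is subtracted before exponentiating, a common numerical-stability step that the original answer does not use.

```python
import numpy as np

def cross_entropy_np(logits: np.ndarray, targets: np.ndarray) -> float:
    """Mean cross-entropy of integer class targets, computed from raw logits."""
    # Subtracting the row maximum leaves softmax unchanged but avoids overflow in exp
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for every row at once
    picked = log_probs[np.arange(len(targets)), targets]
    return float(-picked.mean())

# Same made-up data as above: 3 samples, 4 categories
preds = np.array([[1, 4, 1, 1],
                  [5, 1, 2, 1],
                  [1, 2, 5, 1]], dtype=np.float64)
gnd_trs = np.array([1, 0, 2])

loss = cross_entropy_np(preds, gnd_trs)
print(f"vectorized NumPy loss = {loss:.4f}")  # 0.1017
```

Working with log-probabilities directly also avoids taking the log of a probability that has underflowed to 0.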
      

      【Comments】:
