Differences between F.relu(X) and torch.max(X, 0): answers

[Question title]: Differences between F.relu(X) and torch.max(X, 0)
[Posted]: 2019-04-06 02:35:29
[Question description]:

I am trying to implement the following loss function

To me, the most straightforward implementation uses torch.max

losses = torch.max(ap_distances - an_distances + margin, torch.Tensor([0]))

However, I have seen other implementations on GitHub that use F.relu

losses = F.relu(ap_distances - an_distances + margin)

They give essentially the same output, but I wonder whether there is any fundamental difference between the two approaches.
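The "same output" observation can be checked on a small example. The following is a minimal sketch, assuming a recent PyTorch release; the distance values and the margin are made up for illustration and are not taken from the question.

```python
import torch
import torch.nn.functional as F

# Hypothetical anchor-positive and anchor-negative distances (illustrative only).
ap_distances = torch.tensor([0.2, 1.5, 0.9])
an_distances = torch.tensor([1.0, 0.5, 0.9])
margin = 0.5

# Pre-activation value shared by both variants.
raw = ap_distances - an_distances + margin

# Variant 1: element-wise maximum against a zero tensor (broadcast to raw's shape).
losses_max = torch.max(raw, torch.tensor([0.0]))

# Variant 2: ReLU, which clamps negative entries to zero.
losses_relu = F.relu(raw)

# Compare the two results element by element.
same_output = torch.equal(losses_max, losses_relu)
print(losses_max, losses_relu, same_output)
```

Both variants clamp negative entries of the same tensor to zero, so on the forward pass they should agree element by element.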

[Question comments]:

[Solution 1]:

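According to this discussion, torch.max is not differentiable. A loss function needs to be continuous and differentiable for backpropagation to work. relu is differentiable because it can be approximated, so it can be used in a loss function.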
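Whether autograd can backpropagate through each form is something that can be tested directly rather than taken on trust. The sketch below assumes a recent PyTorch release; `grad_of` is a hypothetical helper introduced here, and the inputs deliberately avoid the kink at exactly zero, where the two functions may assign different subgradients.

```python
import torch
import torch.nn.functional as F

def grad_of(loss_fn):
    """Return d(sum(loss_fn(x)))/dx for a fixed illustrative input."""
    x = torch.tensor([-1.0, 2.0], requires_grad=True)
    loss_fn(x).sum().backward()
    return x.grad

# Element-wise maximum against zero, as in the question.
grad_max = grad_of(lambda x: torch.max(x, torch.tensor([0.0])))

# ReLU on the same input.
grad_relu = grad_of(lambda x: F.relu(x))

print(grad_max, grad_relu)
```

If both calls return a gradient (0 where the input is negative, 1 where it is positive), then autograd handles the element-wise torch.max the same way it handles relu away from zero, which is worth confirming on your own version before ruling torch.max out.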

[Comments]:

[Solution 2]:

If you want to bound the output values the way ReLU6 (https://pytorch.org/docs/stable/generated/torch.nn.ReLU6.html) does, you can use

import torch.nn.functional as F

x1 = F.hardtanh(x, min_value, max_value)

This keeps the model differentiable. It produces a result like the one below (the minimum and maximum values will differ).
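As a concrete illustration of the clamping behaviour, here is a minimal sketch assuming a recent PyTorch release. The bounds 0 and 6 are chosen only to mirror ReLU6, and the input values are made up.

```python
import torch
import torch.nn.functional as F

# Illustrative input with values below, inside, and above the clamp range.
x = torch.tensor([-2.0, 0.5, 3.0, 8.0], requires_grad=True)

# Assumed bounds, picked so that hardtanh behaves like ReLU6.
min_value, max_value = 0.0, 6.0

# Clamp x to [min_value, max_value]; values inside the range pass through.
x1 = F.hardtanh(x, min_value, max_value)

# Backpropagate through the clamp to inspect the gradient.
x1.sum().backward()

# With these bounds the forward result should match F.relu6.
matches_relu6 = torch.equal(x1, F.relu6(x))
print(x1, x.grad, matches_relu6)
```

Entries strictly inside the range receive gradient 1 and entries clamped at either bound receive gradient 0, so the operation stays usable in a loss while bounding the output.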

[Comments]: