Why is there no gradient after a leaf tensor is updated? (Answers)

【Question title】: Why is there no gradient after the leaf tensor is updated?
【Posted】: 2021-07-30 14:16:51
【Question description】:

Here is my toy example:

import torch
x = torch.tensor(3.0, requires_grad = True)
y = x**2
y.backward(retain_graph = True)
print(x.grad)
x = x + 4
y.backward(retain_graph = True)
print(x.grad)

The first print shows the gradient of x, while the second prints None. Why does the gradient of x disappear after x is updated by x = x + 4? Thanks.

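Follow-up question:

The code below does what I want: it updates x iteratively. However, I have to set x.requires_grad = True after every update. Is there a better way that avoids setting x.requires_grad = True each time? Thanks.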

x = torch.tensor(3.0, requires_grad = True)
y = x**2
y.backward(retain_graph = True)
with torch.no_grad():
  x = x + x.grad
x.requires_grad = True
y = x**2
y.backward(retain_graph = True)
print(x.grad)

Update: my solution

x = torch.tensor(3.0, requires_grad = True)
y = x**2
y.backward(retain_graph = True)
print(x.grad)
x.data = x.data + x.grad.data
x.grad.zero_()
y = x**2
y.backward(retain_graph = True)
print(x.grad)

The output of the code is

tensor(6.)
tensor(18.)

which is exactly what I wanted. Thanks.

【Comments】:

Tags: pytorch gradient

【Solution 1】:

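That is mainly because x = x + 4 does not update the tensor; it creates a new tensor and assigns it to the variable x.

I modified the code to print the tensor's data pointer before and after x = x + 4, as follows: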

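import torch
x = torch.tensor(3.0, requires_grad = True)
y = x**2
y.backward(retain_graph = True)
print(x.data_ptr())  # address of the original tensor's storage
print(x.grad)
z = x  # keep a second reference to the original tensor
x = x + 4  # rebinds the name x to a newly created tensor
y.backward(retain_graph = True)  # y still depends on the original tensor (now z)
print(x.data_ptr())  # a different address: x is a new tensor
print(x.grad)
print(x is z)
print(z.grad)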

The output is:

3007755542592
tensor(6.)
3007755541184
None
False
tensor(12.)

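First, you will notice that the data pointer of the tensor in x changes after x = x + 4. That is because x + 4 creates a new tensor, and the name x now refers to it.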
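Besides the data pointer, the attributes is_leaf and grad_fn show the same thing. This is only a minimal sketch; the variable name original is just for illustration. The new tensor is the result of an operation, so it is not a leaf, and by default autograd only populates .grad for leaf tensors.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
original = x  # hypothetical name: a second reference to the leaf tensor

x = x + 4  # x now names the result of an addition, not the leaf

# The original tensor was created by the user: it is a leaf with no grad_fn.
print(original.is_leaf, original.grad_fn)

# The new tensor was produced by an operation: it is not a leaf and has a grad_fn.
print(x.is_leaf, x.grad_fn is not None)
```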

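Second, I saved the original tensor in another variable, z. As you can see, x is z returns False, and the gradient of z is doubled after y.backward(retain_graph = True) has been called twice, because gradients accumulate across backward calls. z now holds the tensor that x referred to before the line x = x + 4.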
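If the goal is to update the same leaf tensor iteratively, one common pattern (a sketch, not the only option) is to modify it in place inside torch.no_grad(). The tensor stays a leaf with requires_grad=True, so there is no need to set requires_grad again or to go through .data. The update rule x += x.grad is taken from the question.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

y = x ** 2
y.backward()
print(x.grad)  # gradient of x**2 at x = 3

with torch.no_grad():
    x += x.grad      # in-place update: x is still the same leaf tensor
    x.grad.zero_()   # reset the accumulated gradient before the next pass

# Rebuild the graph with the updated value and backpropagate again.
y = x ** 2
y.backward()
print(x.grad)  # gradient of x**2 at the updated x
```

The key difference from x = x + x.grad is the in-place operator: it changes the values stored in the existing tensor instead of binding the name x to a new one.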

【Discussion】:

Thank you very much, @Yahia Zakaria.