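How to find which variable is causing RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation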

【Question title】: How to find which variable is causing RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
【Posted】: 2019-09-20 02:08:57
【Question】:

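I am trying to build an extremely simple network with a GRUCell layer for the following task: a cue is given in one of two locations. After T time steps, the agent must learn to take a specific action at the opposite location.

When I try to compute the backward gradients, I get the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Part of the problem is that I do not fully understand which piece of my code is performing an in-place operation.

I have read other posts on Stack Overflow and the PyTorch forums, all of which suggest using the .clone() operation. I have added it everywhere in my code where I thought it might make a difference, but without success.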

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.gru    = nn.GRUCell(2,50) # GRU layer taking 2 inputs (L or R), has 50 units
        self.actor  = nn.Linear(50,2)  # Linear Actor layer with 2 outputs, takes GRU as input
        self.critic = nn.Linear(50,1)  # Linear Critic layer with 1 output, takes GRU as input

    def forward(self, s, h):
        h  = self.gru(s,h)  # give the input and previous hidden state to the GRU layer
        c  = self.critic(h) # estimate the value of the current state
        pi = F.softmax(self.actor(h),dim=1) # calculate the policy 
        return (h,c,pi)

    def backward_rollout(self, gamma, R_t, c_t, t):
        R_t[0,t] = gamma*R_t[0,t+1].clone()
        # calculate the reward prediction error
        Delta_t[0,t] = c_t[0,t].clone() - R_t[0,t].clone()

        #calculate the loss for the critic 
        crit = c_t[0,t].clone()
        ret  = R_t[0,t].clone()
        Value_l[0,t] = F.smooth_l1_loss(crit,ret)


###################################
# Run a trial

# parameters
N      = 1   # number of trials to run
T      = 10   # number of time-steps in a trial
gamma  = 0.98 # temporal discount factor

# for each trial
for n in range(N):   
    sample  = np.random.choice([0,1],1)[0] # pick the sample input for this trial
    s_t     = torch.zeros((1,2,T))   # state at each time step

    h_0     = torch.zeros((1,50))    # initial hidden state
    h_t     = torch.zeros((1,50,T))  # hidden state at each time step

    c_t     = torch.zeros((1,T))    # critic at each time step
    pi_t    = torch.zeros((1,2,T))  # policy at each time step

    R_t     = torch.zeros((1,T))  # return at each time step
    Delta_t = torch.zeros((1,T))  # difference between critic and true return at each step
    Value_l = torch.zeros((1,T))  # value loss

    # set the input (state) vector/tensor
    s_t[0,sample,0] = 1.0 # set first time-step stimulus
    s_t[0,0,-1]     = 1.0 # set last time-step stimulus
    s_t[0,1,-1]     = 1.0 # set last time-step stimulus

    # step through the trial
    for t in range(T):  
        # run a forward step
        state = s_t[:,:,t].clone()
        if t == 0:
            (hidden_state, critic, policy) = net(state, h_0)

        else:
            (hidden_state, critic, policy) = net(state, h_t[:,:,t-1])

        h_t[:,:,t]  = hidden_state.clone()
        c_t[:,t]    = critic.clone()
        pi_t[:,:,t] = policy.clone()

    # select an action using the policy
    action = np.random.choice([0,1], 1, p = policy[0,:].detach().numpy()) 
    #action = int(np.random.uniform() < pi[0,1])

    # compare the action to the sample
    if action[0] == sample:
        r = 0
        print("WRONG!")
    else:
        r = 1
        print("RIGHT!")

    #h_t_old = h_t
    #s_t_old = s_t

    # step backwards through the trial to calculate gradients
    R_t[0,-1]     = r
    Delta_t[0,-1] = c_t[0,-1].clone() - r
    Value_l[0,-1] = F.smooth_l1_loss(c_t[0,-1],R_t[0,-1]).clone()

    for t in np.arange(T-2,-1,-1): #backwards rollout 
        net.backward_rollout(gamma, R_t, c_t, t)

    Vl = Value_l.clone().sum()#calculate total loss

    Vl.backward() #calculate the derivatives 
    opt.step() #update the weights
    opt.zero_grad() #zero gradients before next trial

【Comments】:

Tags: python pytorch

【Answer 1】:

You can try PyTorch's anomaly detection (torch.autograd.set_detect_anomaly) to pinpoint the exact offending in-place operation: https://github.com/pytorch/pytorch/issues/15803
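
To illustrate what anomaly detection reports, here is a minimal sketch on a toy case rather than on the network from the question. The names w and buf are hypothetical; buf stands in for a preallocated buffer such as R_t. With anomaly detection enabled, the failing backward() is accompanied by a warning containing the traceback of the forward operation whose gradient could not be computed.

```python
import torch

w = torch.ones(3, requires_grad=True)   # hypothetical trainable tensor
buf = torch.zeros(3)                    # hypothetical buffer, similar in role to R_t

error_message = None
with torch.autograd.detect_anomaly():
    loss = (w * buf).sum()   # the multiplication saves `buf` for the backward pass
    buf[0] = 1.0             # in-place write after `buf` was saved
    try:
        loss.backward()      # autograd notices that `buf` changed and raises
    except RuntimeError as exc:
        error_message = str(exc)

print(error_message)
```

The same mode can be switched on globally with torch.autograd.set_detect_anomaly(True). It slows execution down, so it is usually enabled only while debugging.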

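Value_l[0,-1] = ... and similar assignments are in-place operations. You can sidestep the check by writing Value_l.data[0,-1] = ... instead, but the assignment is then not recorded in the computation graph, so it is probably not a good idea. Related discussion here: https://discuss.pytorch.org/t/how-to-get-around-in-place-operation-error-if-index-leaf-variable-for-gradient-update/14554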
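
One common way to avoid indexed assignment altogether, which is not part of the answer above but is offered as a possible alternative, is to append each step's output to a Python list and combine the list with torch.stack at the end. The sketch below uses a toy recurrence with hypothetical names (w, h, values), not the GRU network from the question.

```python
import torch

torch.manual_seed(0)
w = torch.randn(1, requires_grad=True)  # hypothetical trainable weight
T = 5                                   # number of time steps

# Instead of writing into a preallocated tensor (values[t] = ...),
# keep every step's output in a list so nothing is overwritten.
outputs = []
h = torch.ones(1)
for t in range(T):
    h = torch.tanh(w * h)   # produces a new tensor each step
    outputs.append(h)

values = torch.stack(outputs)   # shape (T, 1), still attached to the graph
loss = values.sum()
loss.backward()                 # no in-place error
print(w.grad)
```

Because no tensor saved by autograd is modified afterwards, the backward pass runs without the version check failing, and the gradients still flow through every time step.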

【Discussion】:

The example given there points directly at the problem line. When I use it on my own code, I get an error that I do not know how to interpret. Its last lines are: File "", line 63, in Value_l[0,-1] = F.smooth_l1_loss(c_t[0,-1],R_t[0,-1]).clone() File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 2114, in smooth_l1_loss ret = torch._C._nn.smooth_l1_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
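There is also a comment on that issue: anomaly detection does not necessarily point you at the in-place operation that caused the failure. Instead, it points you at the operation whose gradient could not be computed in the backward pass. The in-place operation to blame can occur anywhere after that, modifying one of the tensors that took part in the line found by anomaly detection. How do I find the offending in-place operation? Or what would be a suitable workaround?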
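
One rough way to narrow this down, offered as a sketch and not as an established recipe: every tensor carries a version counter that is incremented by in-place writes, and it can be read through the internal attribute _version. Printing it for the tensors involved in the line reported by anomaly detection, first right after that line and then again further down, shows between which two points the in-place write happened. The names w and buf below are hypothetical.

```python
import torch

w = torch.ones(3, requires_grad=True)   # hypothetical trainable tensor
buf = torch.zeros(3)                    # hypothetical buffer, similar in role to R_t

loss = (w * buf).sum()                  # line that anomaly detection would report
version_when_saved = buf._version       # counter at the time `buf` was saved

buf[0] = 1.0                            # the in-place write we are hunting for
version_after_write = buf._version      # counter has moved on

# A slice is a view of `buf` and shares its version counter,
# so writing through the parent also invalidates saved slices.
view_version = buf[0:1]._version

print(version_when_saved, version_after_write, view_version)
```

Because views share the counter of their base tensor, a slice such as R_t[0,-1] that was passed to a loss function is invalidated by a later write to any other element of R_t.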