1. PyTorch: Tensors and autograd
A fully-connected ReLU network with one hidden layer and no biases, trained to
predict y from x by minimizing squared Euclidean distance.
This implementation computes the forward pass using operations on PyTorch
Tensors, and uses PyTorch autograd to compute gradients.
A PyTorch Tensor represents a node in a computational graph. If x is a
Tensor with x.requires_grad=True, then x.grad is another Tensor
holding the gradient of some scalar value with respect to x.
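As a minimal illustration of this behavior (a standalone sketch, separate from the training script below): for y = sum(x²), calling y.backward() fills x.grad with dy/dx = 2x.

```python
import torch

# Minimal autograd sketch: y = sum(x**2), so dy/dx = 2*x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # y is a 0-dimensional Tensor (a scalar)
y.backward()         # populates x.grad with dy/dx
print(x.grad)        # tensor([2., 4., 6.])
```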
import torch
import matplotlib.pyplot as plt
import torch.optim as optim
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold input and outputs.
# Setting requires_grad=False (the default) indicates that we do not need to
# compute gradients for these Tensors during the backward pass.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# Create random Tensors for weights.
# Setting requires_grad=True indicates that we want to compute gradients for
# these Tensors during the backward pass.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)
learning_rate = 1e-6
optimizer = optim.SGD([{'params': w1},
                       {'params': w2}],
                      lr=learning_rate)
# Create the figure and name it
plt.figure('Loss')
ax = plt.gca()
# Label the x and y axes
ax.set_xlabel('iter')
ax.set_ylabel('loss')
iter_plot = []
loss_plot = []
for t in range(500):
    # Forward pass: compute predicted y using operations on Tensors; these
    # are exactly the same operations we used to compute the forward pass using
    # Tensors, but we do not need to keep references to intermediate values since
    # we are not implementing the backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute and print loss using operations on Tensors.
    # Now loss is a 0-dimensional Tensor;
    # loss.item() gets the scalar value held in the loss.
    loss = (y_pred - y).pow(2).sum()
    iter_plot.append(t)
    loss_plot.append(loss.item())

    # Use autograd to compute the backward pass. This call will compute the
    # gradient of loss with respect to all Tensors with requires_grad=True.
    # After this call w1.grad and w2.grad will be Tensors holding the gradient
    # of the loss with respect to w1 and w2 respectively.
    loss.backward()

    # Manually update weights using gradient descent. Wrap in torch.no_grad()
    # because the weights have requires_grad=True, but autograd does not need
    # to track the update itself.
    # An alternative way is to operate on weight.data and weight.grad.data.
    # Recall that tensor.data gives a tensor that shares the storage with
    # tensor, but doesn't track history.
    # You can also use torch.optim.SGD to achieve this, which is what this
    # script does.
    # with torch.no_grad():
    #     w1 -= learning_rate * w1.grad
    #     w2 -= learning_rate * w2.grad
    #     # Manually zero the gradients after updating weights
    #     w1.grad.zero_()
    #     w2.grad.zero_()
    optimizer.step()       # Does the update
    optimizer.zero_grad()  # Zero the gradients after every update
ax.plot(iter_plot, loss_plot, color='r', linewidth=1, alpha=0.6)
plt.show()
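The optimizer.zero_grad() call above matters because backward() accumulates into .grad rather than overwriting it. A small standalone sketch of that behavior:

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
x.sum().backward()
x.sum().backward()   # gradients accumulate: .grad is now twice the ones vector
print(x.grad)        # tensor([2., 2.])
x.grad.zero_()       # what optimizer.zero_grad() does for each parameter
x.sum().backward()
print(x.grad)        # tensor([1., 1.])
```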
2. PyTorch: Defining new autograd functions
A fully-connected ReLU network with one hidden layer and no biases, trained to
predict y from x by minimizing squared Euclidean distance.
This implementation computes the forward pass using operations on PyTorch
Tensors, and uses PyTorch autograd to compute gradients.
In this section, we implement the ReLU function as a custom autograd function.
How to implement a custom autograd function:
A custom autograd function defines two methods: forward computes output Tensors from input Tensors; backward receives the gradient of some scalar value with respect to the output Tensors and computes the gradient of that same scalar with respect to the input Tensors.
import torch
import matplotlib.pyplot as plt
class MyReLU(torch.autograd.Function):
    """
    To implement a custom autograd function, subclass torch.autograd.Function
    and override the forward and backward methods, which operate on Tensors.
    """
    @staticmethod
    def forward(ctx, input):
        """
        In forward we receive the input Tensor and return the output Tensor.
        ctx is a context object that stores information needed for the
        backward computation. Arbitrary objects can be cached for use in
        backward via ctx.save_for_backward.
        """
        ctx.save_for_backward(input)  # cache input for the backward pass
        return input.clamp(min=0)     # return the output Tensor

    @staticmethod
    def backward(ctx, grad_output):
        """
        In backward we receive a Tensor containing the gradient of the loss
        with respect to the output, and we compute the gradient of the loss
        with respect to the input (chain rule).
        """
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input
dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold input and outputs.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# Create random Tensors for weights.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)
# Create the figure and name it
plt.figure('Loss')
ax = plt.gca()
# Label the x and y axes
ax.set_xlabel('iter')
ax.set_ylabel('loss')
iter_plot = []
loss_plot = []
learning_rate = 1e-6
for t in range(500):
    # To use the custom Function, alias its Function.apply method.
    relu = MyReLU.apply

    # Forward pass: compute predicted y using operations; we compute
    # ReLU using our custom autograd operation.
    y_pred = relu(x.mm(w1)).mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum()
    iter_plot.append(t)
    loss_plot.append(loss.item())

    # Use autograd to compute the backward pass.
    loss.backward()

    # Update weights using gradient descent
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()
ax.plot(iter_plot, loss_plot, color='r', linewidth=1, alpha=0.6)
plt.show()
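One way to sanity-check a custom autograd function such as MyReLU is torch.autograd.gradcheck, which compares the analytic backward against numerical finite differences (it requires double-precision inputs). A self-contained sketch, restating the class:

```python
import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

# gradcheck perturbs each input element numerically and compares the
# result against the gradient computed by backward(); it returns True
# on success and raises on mismatch.
x = torch.randn(4, 5, dtype=torch.double, requires_grad=True)
print(torch.autograd.gradcheck(MyReLU.apply, (x,), eps=1e-6, atol=1e-4))
```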