How to implement my own ResNet with torch.nn.Sequential in PyTorch? Answers

【Question title】: How to implement my own ResNet with torch.nn.Sequential in PyTorch?
【Posted】: 2023-03-14 16:55:01
【Question】:

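I want to implement a ResNet network (or rather, residual blocks), but I really want it to be in the sequential network form.

What I mean by sequential network form is the following: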

## mdl5, from cifar10 tutorial
from collections import OrderedDict
import torch.nn as nn

# Keys in an OrderedDict must be unique, so each layer name appears once
mdl5 = nn.Sequential(OrderedDict([
    ('pool1', nn.MaxPool2d(2, 2)),
    ('relu1', nn.ReLU()),
    ('conv1', nn.Conv2d(3, 6, 5)),
    ('relu2', nn.ReLU()),
    ('conv2', nn.Conv2d(6, 16, 5)),
    ('Flatten', nn.Flatten()),
    ('fc1', nn.Linear(1024, 120)),  # 16 channels * 8 * 8 for a 3x32x32 input
    ('relu4', nn.ReLU()),
    ('fc2', nn.Linear(120, 84)),
    ('relu5', nn.ReLU()),
    ('fc3', nn.Linear(84, 10))
]))

but of course with the NN lego blocks being "ResNet" blocks.

I know the equation is something like:
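
The equation itself is not reproduced in this copy of the post. Presumably it is the usual residual mapping, which is commonly written as below; the symbols $\mathcal{F}$ (the stacked layers inside the block), $x$ (the block input) and $y$ (the block output) are assumed notation, not taken from the post.

```latex
y = \mathcal{F}(x) + x
```

This only makes sense when $\mathcal{F}(x)$ and $x$ have the same shape, which is why the layers inside a block must preserve the number of channels and the spatial size.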

But I'm not sure how to do this in PyTorch AND with Sequential. Sequential is key for me!

Cross-posted:

【Comments】:

Tags: machine-learning neural-network deep-learning conv-neural-network pytorch

【Answer 1】:

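You can't do it using torch.nn.Sequential alone because, as the name suggests, it applies its modules one after another, while a residual block has two parallel paths: the layers and the shortcut.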
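
To make the "sequential only" limitation concrete, here is a small sketch (the layer sizes are arbitrary and chosen only for illustration). It shows that calling a `torch.nn.Sequential` is the same as feeding each module's output into the next one, so there is no place where the original input could be added back in.

```python
import torch

torch.manual_seed(0)

# A plain Sequential: each module consumes the previous module's output.
seq = torch.nn.Sequential(
    torch.nn.Linear(4, 4),
    torch.nn.ReLU(),
    torch.nn.Linear(4, 4),
)

x = torch.randn(2, 4)

# Equivalent manual chain. The original `x` is lost after the first layer,
# which is exactly what a skip connection would need to keep around.
out = x
for layer in seq:
    out = layer(out)

same = torch.equal(seq(x), out)
```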

You could, in principle, construct your own block really easily like this:

import torch

class ResNet(torch.nn.Module):
    def __init__(self, module):
        super().__init__()
        self.module = module

    def forward(self, inputs):
        return self.module(inputs) + inputs
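
As a quick sanity check of this wrapper, the sketch below repeats the class so it runs on its own and wraps a convolution whose weights and bias are zeroed. This setup is hypothetical and only for illustration: with a zero branch, the block should return its input unchanged. Note that `padding=1` is assumed so that the 3x3 convolution keeps the height and width, otherwise the addition would fail.

```python
import torch


class ResNet(torch.nn.Module):
    """Same wrapper as above: adds the input back onto the wrapped module's output."""

    def __init__(self, module):
        super().__init__()
        self.module = module

    def forward(self, inputs):
        return self.module(inputs) + inputs


# padding=1 keeps H and W unchanged for a 3x3 kernel, so shapes match for the add.
conv = torch.nn.Conv2d(8, 8, kernel_size=3, padding=1)

# Zero the branch so that module(x) == 0 and the block reduces to the identity.
torch.nn.init.zeros_(conv.weight)
torch.nn.init.zeros_(conv.bias)

block = ResNet(conv)
x = torch.randn(2, 8, 16, 16)
y = block(x)
```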

which can then be used like this:

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 32, kernel_size=7),
    # 32 filters in and out, no max pooling, and padding=1 on the 3x3 convs,
    # so the block output has the same shape as its input and can be added
    ResNet(
        torch.nn.Sequential(
            torch.nn.Conv2d(32, 32, kernel_size=3, padding=1),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(32),
            torch.nn.Conv2d(32, 32, kernel_size=3, padding=1),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(32),
        )
    ),
    # Another ResNet block, you could make more of them
    # Downsampling using maxpool and others could be done in between etc. etc.
    ResNet(
        torch.nn.Sequential(
            torch.nn.Conv2d(32, 32, kernel_size=3, padding=1),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(32),
            torch.nn.Conv2d(32, 32, kernel_size=3, padding=1),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(32),
        )
    ),
    # Pool each of the 32 feature maps down to 1x1, then flatten
    # (N, 32, 1, 1) to (N, 32) so the linear layer can consume it
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
    # 32 features -> 10 classes
    torch.nn.Linear(32, 10),
)
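
For the named, OrderedDict-style layout asked about in the question, a minimal sketch might look like the following. The `conv_block` helper and the layer names (`stem`, `res1`, and so on) are made up for illustration. The sketch also assumes `padding=1` on the 3x3 convolutions and a `torch.nn.Flatten()` before the linear layer, so that the residual addition and the classifier see compatible shapes.

```python
from collections import OrderedDict

import torch


class ResNet(torch.nn.Module):
    """Residual wrapper: output = module(inputs) + inputs."""

    def __init__(self, module):
        super().__init__()
        self.module = module

    def forward(self, inputs):
        return self.module(inputs) + inputs


def conv_block(channels):
    """Hypothetical helper: two shape-preserving conv -> ReLU -> BN stages."""
    return torch.nn.Sequential(
        torch.nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        torch.nn.ReLU(),
        torch.nn.BatchNorm2d(channels),
        torch.nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        torch.nn.ReLU(),
        torch.nn.BatchNorm2d(channels),
    )


model = torch.nn.Sequential(OrderedDict([
    ("stem", torch.nn.Conv2d(3, 32, kernel_size=7)),
    ("res1", ResNet(conv_block(32))),
    ("res2", ResNet(conv_block(32))),
    ("pool", torch.nn.AdaptiveAvgPool2d(1)),
    ("flatten", torch.nn.Flatten()),
    ("fc", torch.nn.Linear(32, 10)),
]))

# A dummy batch of four 3x32x32 images (CIFAR-10-sized) to check the shapes.
logits = model(torch.randn(4, 3, 32, 32))
```

The outer container stays a plain `torch.nn.Sequential`; only the parallel part is hidden inside the wrapper, so more blocks can be added by appending further `("resN", ResNet(conv_block(32)))` entries.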

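A fact that is usually overlooked (and has no real consequences for shallower networks) is that the skip connection itself should be left without any nonlinearities like ReLU or convolutional layers, and that is what you can see above (source: Identity Mappings in Deep Residual Networks).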
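
The cited paper also proposes a "pre-activation" ordering inside the branch (BN, then ReLU, then the convolution), with nothing applied after the addition. A rough sketch of that variant with the same wrapper is shown below. The `preact_branch` helper name, the channel count and `padding=1` are assumptions made for illustration, not something taken from the answer.

```python
import torch


class ResNet(torch.nn.Module):
    """Residual wrapper: the shortcut is a pure identity, no ReLU or conv on it."""

    def __init__(self, module):
        super().__init__()
        self.module = module

    def forward(self, inputs):
        return self.module(inputs) + inputs


def preact_branch(channels):
    """Hypothetical helper: BN -> ReLU -> conv, twice (pre-activation ordering)."""
    return torch.nn.Sequential(
        torch.nn.BatchNorm2d(channels),
        torch.nn.ReLU(),
        torch.nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        torch.nn.BatchNorm2d(channels),
        torch.nn.ReLU(),
        torch.nn.Conv2d(channels, channels, kernel_size=3, padding=1),
    )


block = ResNet(preact_branch(16))
x = torch.randn(2, 16, 8, 8)
y = block(x)

# If the last conv of the branch is zeroed, the branch contributes nothing and
# the block returns x unchanged, because nothing else sits on the shortcut.
last_conv = block.module[-1]
torch.nn.init.zeros_(last_conv.weight)
torch.nn.init.zeros_(last_conv.bias)
identity_ok = torch.allclose(block(x), x)
```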

【Comments】:

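Isn't it possible to have a "residual block" rather than wrapping the whole network? With only some layers having skip connections?
You're just wrapping this "blue" connection while providing the purple shortcut. Sure, you could do it, but there are a lot of ResNet blocks and variations on them, and with this version you can easily use any of them.
You mean like a lot of them in PyTorch? Links? :)
Here are some; others are all over arXiv, e.g. the one I linked in the answer. What they have in common is the shortcut, not relu-conv-batch norm.
The base class's init must be called for this code to work. Add super(ResNet, self).__init__() to the init method of ResNet.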
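
To illustrate the last comment, here is a small sketch of what happens when the base-class initializer is skipped. The `BrokenResNet` class is a deliberately wrong, hypothetical example: assigning a submodule before `torch.nn.Module.__init__()` has run raises an `AttributeError`, because the module's internal registries do not exist yet.

```python
import torch


class BrokenResNet(torch.nn.Module):
    def __init__(self, module):
        # Deliberately missing: super().__init__()
        self.module = module  # fails, submodule registry is not set up yet

    def forward(self, inputs):
        return self.module(inputs) + inputs


try:
    BrokenResNet(torch.nn.Identity())
    error = None
except AttributeError as exc:
    # Keep the message around for inspection instead of crashing.
    error = str(exc)
```

In Python 3, `super().__init__()` and `super(ResNet, self).__init__()` are equivalent, so either form avoids the error.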