[Question title]: PyTorch: Size mismatch when running a test image through a trained CNN
[Posted]: 2018-09-16 22:22:59
[Question]:

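I am working through a tutorial on training and testing a convolutional neural network (CNN), and I am having trouble preparing a test image to run through the trained network. My initial guess is that it has to do with getting the input tensor into the format the network expects.

Here is the code for the network.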

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init as I


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()

        ## 1. This network takes in a square (same width and height), grayscale image as input
        ## 2. It ends with a linear layer that represents the keypoints
        ## this last layer outputs 136 values, 2 for each of the 68 keypoint (x, y) pairs

        # input size 224 x 224
        # after the first conv layer, (W-F)/S + 1 = (224-5)/1 + 1 = 220
        # after one pool layer, this becomes (32, 110, 110)
        self.conv1 = nn.Conv2d(1, 32, 5)

        # maxpool layer
        # pool with kernel_size = 2, stride = 2
        self.pool = nn.MaxPool2d(2, 2)

        # second conv layer: 32 inputs, 64 outputs, 3x3 conv
        ## output size = (W-F)/S + 1 = (110-3)/1 + 1 = 108
        ## output dimension: (64, 108, 108)
        ## after another pool layer, this becomes (64, 54, 54)
        self.conv2 = nn.Conv2d(32, 64, 3)

        # third conv layer: 64 inputs, 128 outputs, 3x3 conv
        ## output size = (W-F)/S + 1 = (54-3)/1 + 1 = 52
        ## output dimension: (128, 52, 52)
        ## after another pool layer, this becomes (128, 26, 26)
        self.conv3 = nn.Conv2d(64, 128, 3)

        self.conv_drop = nn.Dropout(p=0.2)
        self.fc_drop = nn.Dropout(p=0.4)

        # 128 channels * 26x26 pooled map = 86528 inputs
        self.fc1 = nn.Linear(128*26*26, 1000)
        self.fc2 = nn.Linear(1000, 1000)
        self.fc3 = nn.Linear(1000, 136)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = self.conv_drop(x)

        # prep for linear layer: flatten to (batch_size, features)
        x = x.view(x.size(0), -1)

        # three linear layers with dropout in between
        x = F.relu(self.fc1(x))
        x = self.fc_drop(x)
        x = self.fc2(x)
        x = self.fc_drop(x)
        x = self.fc3(x)

        return x

Maybe my calculation of the layer input sizes is wrong?
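The arithmetic in the network's comments can be checked without PyTorch. The following is a minimal sketch, assuming unpadded convolutions with stride 1 and 2x2 max pooling with stride 2, as in the code above; the helper names `conv_out`, `pool_out`, and `flattened_size` are made up for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv layer: (W + 2P - F) // S + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output side length of a max-pool layer (floor division, no padding)."""
    return (size - kernel) // stride + 1


def flattened_size(input_side, conv_kernels=(5, 3, 3), out_channels=128):
    """Number of features reaching fc1 for a square input of the given side."""
    side = input_side
    for kernel in conv_kernels:
        # each stage is conv -> relu -> pool; relu does not change the shape
        side = pool_out(conv_out(side, kernel))
    return out_channels * side * side


# For the 224x224 input assumed in the comments, this matches 128*26*26.
print(flattened_size(224))
```

Calling `flattened_size` with any other side length shows what `fc1` would receive for that input, which makes it easy to compare against the `in_features` the layer was built with.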

Here is the code block for the test run (you can treat "roi" as a standard numpy image):

# loop over the detected faces from your haar cascade
for i, (x,y,w,h) in enumerate(faces):

    plt.figure(figsize=(10,5))
    ax = plt.subplot(1, len(faces), i+1)
    # Select the region of interest that is the face in the image 
    roi = image_copy[y:y+h, x:x+w]

    ## TODO: Convert the face region from RGB to grayscale
    roi = cv2.cvtColor(roi, cv2.COLOR_RGB2GRAY)

    ## TODO: Normalize the grayscale image so that its color range falls in [0,1] instead of [0,255]
    roi = np.multiply(roi, 1/255)

    ## TODO: Rescale the detected face to be the expected square size for your CNN (224x224, suggested)
    roi = cv2.resize(roi, (244,244))

    roi = roi.reshape(roi.shape[0], roi.shape[1], 1)
    roi = roi.transpose((2, 0, 1))

    ## TODO: Change to tensor
    roi = torch.from_numpy(roi)
    roi = roi.type(torch.FloatTensor)
    roi = roi.unsqueeze(0)
    print (roi.shape)

    ## TODO: run it through the net 
    output_pts = net(roi)

I get this error message:

RuntimeError: size mismatch, m1: [1 x 100352], m2: [86528 x 1000] at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/TH/generic/THTensorMath.c:2033
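The two shapes in this message can be decoded by hand. The sketch below assumes that `m1` is the flattened activation passed to `fc1`, that `m2` is the weight matrix of `fc1` as reported by this PyTorch build, and that the feature map before flattening is square with 128 channels (the `out_channels` of `conv3`). The helpers `parse_mismatch` and `spatial_side` are hypothetical names, not PyTorch functions.

```python
import math
import re

ERROR = "RuntimeError: size mismatch, m1: [1 x 100352], m2: [86528 x 1000]"
CONV3_CHANNELS = 128  # assumption: taken from nn.Conv2d(64, 128, 3) above


def parse_mismatch(message):
    """Return ((m1_rows, m1_cols), (m2_rows, m2_cols)) parsed from the error text."""
    dims = re.findall(r"\[(\d+) x (\d+)\]", message)
    m1, m2 = [(int(rows), int(cols)) for rows, cols in dims[:2]]
    return m1, m2


def spatial_side(flat_features, channels):
    """Side of a square feature map, or None if the count does not factor cleanly."""
    area, remainder = divmod(flat_features, channels)
    side = math.isqrt(area)
    return side if remainder == 0 and side * side == area else None


(_, got), (expected, _) = parse_mismatch(ERROR)
print("features fed to fc1:", got, "side:", spatial_side(got, CONV3_CHANNELS))
print("features fc1 expects:", expected, "side:", spatial_side(expected, CONV3_CHANNELS))
```

Under those assumptions, the activation reaching `fc1` is a 28x28 map per channel while the layer was built for 26x26, which points at the spatial size of the input rather than at the channel or batch dimension.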

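Note that if I run the trained network with the provided test suite (where the input tensors are already prepared), it runs as expected without errors. I take this to mean that there is nothing wrong with the design of the network architecture itself, and that the problem lies in how I prepare the image.

The output of 'roi.shape' is: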

torch.Size([1, 1, 244, 244]) 

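That should be fine, since the layout is ([batch_size, color_channel, x, y]).

Update: I printed the layer shapes while running an image through the network. It turns out that the flattened input dimension for the FC layer differs between my own test image and the test images from the provided test suite. So I am now almost 80% sure that I am preparing the input image incorrectly. But if both input tensors have exactly the same dimensions ([1,1,244,244]), how can their flattened dimensions differ?

With the provided test suite (runs fine):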

input: torch.Size([1, 1, 224, 224])
layer before 1st CV: torch.Size([1, 1, 224, 224])
layer after 1st CV pool: torch.Size([1, 32, 110, 110])
layer after 2nd CV pool: torch.Size([1, 64, 54, 54])
layer after 3rd CV pool: torch.Size([1, 128, 26, 26])
flattend layer for the 1st FC: torch.Size([1, 86528])

When preparing and running my own test image:

input: torch.Size([1, 1, 244, 244])
layer before 1st CV: torch.Size([1, 1, 244, 244])
layer after 1st CV pool: torch.Size([1, 32, 120, 120]) #<- what happened here??
layer after 2nd CV pool: torch.Size([1, 64, 59, 59])
layer after 3rd CV pool: torch.Size([1, 128, 28, 28])
flattend layer for the 1st FC: torch.Size([1, 100352])

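[Comments on the question]:

  • Are you running this on a GPU? If so, try running it on your CPU for debugging. The error messages are usually more helpful, and you don't need to train there anyway. Maybe you could also post the updated error message.
  • @dennlinger I am running it on the CPU. If you know how I can produce more debugging information, I'd be happy to hear it.
  • @dennlinger I updated the question.
  • When this happens to me, it is usually because the number of inputs to fc1 is wrong. Indeed, it would be strange if you printed the tensor size after conv3 and it were the same in both cases. Can you confirm that it is the same?
  • @Demplo Please see the update :)

Tags: python deep-learning computer-vision conv-neural-network pytorch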


[Answer 1]:

Did you notice this line in your image preparation?

## TODO: Rescale the detected face to be the expected square size for your CNN (224x224, suggested)
roi = cv2.resize(roi, (244,244))

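So you are resizing the face to 244x244 instead of 224x224. That is why the input tensor is [1, 1, 244, 244] and the flattened size no longer matches what fc1 expects.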
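A typo like this is easier to spot if the input shape is checked before the forward pass. Below is a minimal sketch of such a guard; `check_input_shape` and `EXPECTED_SHAPE` are hypothetical names introduced here, and the expected size of 224x224 with one channel is taken from the comments in the network code.

```python
EXPECTED_SHAPE = (1, 224, 224)  # assumption: (channels, height, width) from the network comments


def check_input_shape(shape, expected=EXPECTED_SHAPE):
    """Raise ValueError unless `shape` is (batch, C, H, W) with C, H, W equal to `expected`."""
    shape = tuple(shape)
    if len(shape) != 4 or shape[1:] != tuple(expected):
        raise ValueError(
            f"expected input of shape (N, {expected[0]}, {expected[1]}, {expected[2]}), "
            f"got {shape}"
        )
    return True


# A correctly resized face passes the check.
check_input_shape((1, 1, 224, 224))

# The shape produced by cv2.resize(roi, (244, 244)) is rejected with a readable message.
try:
    check_input_shape((1, 1, 244, 244))
except ValueError as err:
    print(err)
```

In the loop from the question, this could be called as `check_input_shape(roi.shape)` just before `output_pts = net(roi)`, since a `torch.Size` converts cleanly to a tuple.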
