Why do the ResNet models in TensorFlow and PyTorch give different feature lengths? Answers

【Question title】: Why do the ResNet models in TensorFlow and PyTorch give different feature lengths?
【Posted】: 2021-09-02 08:09:16
【Question description】:

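I am trying to extract image features with a ResNet model pretrained on the ImageNet dataset. The network should produce a feature vector of length 2048. With TensorFlow I do get that length, but with the PyTorch version of ResNet I get a length of 1000.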

The code for TensorFlow is as follows:

import numpy as np
from numpy.linalg import norm
import pickle
from tqdm import tqdm, tqdm_notebook
import os
import random
import time
import math
import tensorflow
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.applications.mobilenet import MobileNet
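from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.applications.xception import Xception  # needed by the 'xception' branch below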
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Flatten, Dense, Dropout, GlobalAveragePooling2D

def model_picker(name):
    if (name == 'vgg16'):
        model = VGG16(weights='imagenet',
                      include_top=False,
                      input_shape=(224, 224, 3),
                      pooling='max')
    elif (name == 'vgg19'):
        model = VGG19(weights='imagenet',
                      include_top=False,
                      input_shape=(224, 224, 3),
                      pooling='max')
    elif (name == 'mobilenet'):
        model = MobileNet(weights='imagenet',
                          include_top=False,
                          input_shape=(224, 224, 3),
                          pooling='max',
                          depth_multiplier=1,
                          alpha=1)
    elif (name == 'inception'):
        model = InceptionV3(weights='imagenet',
                            include_top=False,
                            input_shape=(224, 224, 3),
                            pooling='max')
    elif (name == 'resnet'):
        model = ResNet50(weights='imagenet',
                         include_top=False,
                         input_shape=(224, 224, 3),
                         pooling='max')
    elif (name == 'xception'):
        model = Xception(weights='imagenet',
                         include_top=False,
                         input_shape=(224, 224, 3),
                         pooling='max')
    else:
        # Fail loudly instead of returning an unbound variable
        raise ValueError("Specified model not available: " + name)
    return model


model_architecture = 'resnet'
model = model_picker(model_architecture)

def extract_features(img_path, model):
    input_shape = (224, 224, 3)
    img = image.load_img(img_path,
                         target_size=(input_shape[0], input_shape[1]))
    img_array = image.img_to_array(img)
    expanded_img_array = np.expand_dims(img_array, axis=0)
    preprocessed_img = preprocess_input(expanded_img_array)
    features = model.predict(preprocessed_img)
    flattened_features = features.flatten()
    normalized_features = flattened_features / norm(flattened_features)
    return normalized_features
features = extract_features('dog.jpg', model)
print(len(features))

> 2048

As you can see, the ResNet50 model yields a feature vector of length 2048.

Below is the PyTorch code:

import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms

res_model = models.resnet50(pretrained=True)

def image_loader(image, model, use_gpu=False):
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor()
    ])
    img = Image.open(image)
    img = transform(img)
    print(img.shape)

    # Add the batch dimension: (3, 224, 224) -> (1, 3, 224, 224)
    x = torch.unsqueeze(img, dim=0).float()
    print(x.shape)
    if use_gpu:
        x = x.cuda()
        model = model.cuda()
    model.eval()  # use BatchNorm running statistics at inference time
    # Inference only, so no gradients are needed
    with torch.no_grad():
        y = model(x).cpu()
    print(y.size())
    y = torch.squeeze(y)
    y = y.numpy()
    print(y.shape)
    print(len(y))
    np.savetxt('features.txt', y, delimiter=',')

image_loader('dog.jpg', res_model)

> torch.Size([3, 224, 224])
> torch.Size([1, 3, 224, 224])
> torch.Size([1, 1000])
> (1000,)
> 1000

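As you can see, the features extracted with the PyTorch ResNet model have a length of 1000. Why do I get a different length? Shouldn't the architecture give the same length of 2048 in both cases, or am I doing something wrong?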

【Discussion】:

Tags: python tensorflow machine-learning deep-learning pytorch

【Solution 1】:

Printing the layers of the PyTorch ResNet yields:

(fc): Linear(in_features=2048, out_features=1000, bias=True)

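This is the last layer of the ResNet in PyTorch, because by default the model is set up as a classifier for ImageNet data (1000 classes). If you want the 2048 features, you can simply delete the last layer.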

del model.fc

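Your output will then have the desired dimension.

Edit: it is probably better to overwrite model.fc with an identity function instead of deleting it, so that calling forward does not raise an error: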

model.fc = torch.nn.Identity()

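【Discussion】:

Thanks for your answer, but when I run inference after deleting the FC layer I get the error AttributeError: 'ResNet' object has no attribute 'fc'.
Ah, yes. You would need to remove the reference to model.fc in the forward function, which is a bit of a hassle because you would have to override its original definition. Instead, you can replace the original layer with a dummy layer that does nothing. See the edited answer above.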