【Question title】: RuntimeError: Input type (torch.cuda.LongTensor) and weight type (torch.cuda.FloatTensor) should be the same
【Posted】: 2021-07-18 07:01:07
【Question】:

I am trying to train a CNN on my own data, following the PyTorch example. My training loop is the same as the one in the PyTorch tutorial:

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for i, batch in enumerate(loaders[phase]):
                inputs = batch["image"].type(torch.cuda.LongTensor).to(device)
                labels = batch["label"].type(torch.cuda.LongTensor).to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs.type(torch.cuda.LongTensor).to(device))
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

However, I get this error:

Epoch 0/24
----------
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-24-79684c739f29> in <module>()
----> 1 model_ft = train_model(resnet_cnn, criterion, optimizer_ft, exp_lr_scheduler, num_epochs=25)

6 frames
<ipython-input-21-393aa43e7b06> in train_model(model, criterion, optimizer, scheduler, num_epochs)
     30                 # track history if only in train
     31                 with torch.set_grad_enabled(phase == 'train'):
---> 32                     outputs = model(inputs.type(torch.cuda.LongTensor).to(device))
     33                     _, preds = torch.max(outputs, 1)
     34                     loss = criterion(outputs, labels)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/usr/local/lib/python3.7/dist-packages/torchvision/models/resnet.py in forward(self, x)
    247 
    248     def forward(self, x: Tensor) -> Tensor:
--> 249         return self._forward_impl(x)
    250 
    251 

/usr/local/lib/python3.7/dist-packages/torchvision/models/resnet.py in _forward_impl(self, x)
    230     def _forward_impl(self, x: Tensor) -> Tensor:
    231         # See note [TorchScript super()]
--> 232         x = self.conv1(x)
    233         x = self.bn1(x)
    234         x = self.relu(x)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in forward(self, input)
    397 
    398     def forward(self, input: Tensor) -> Tensor:
--> 399         return self._conv_forward(input, self.weight, self.bias)
    400 
    401 class Conv3d(_ConvNd):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    394                             _pair(0), self.dilation, self.groups)
    395         return F.conv2d(input, weight, bias, self.stride,
--> 396                         self.padding, self.dilation, self.groups)
    397 
    398     def forward(self, input: Tensor) -> Tensor:

RuntimeError: Input type (torch.cuda.LongTensor) and weight type (torch.cuda.FloatTensor) should be the same

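As shown above, I tried converting my data with torch.cuda.LongTensor, but for some reason it does not work. Does anyone have any ideas? Thank you very much in advance!

Edit 1: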

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for i, batch in enumerate(loaders[phase]):
                inputs = batch["image"].type(torch.cuda.FloatTensor).to(device)
                labels = batch["label"].type(torch.cuda.FloatTensor).to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs.to(device))
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

This returns a new error:

Epoch 0/24
----------
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-38-79684c739f29> in <module>()
----> 1 model_ft = train_model(resnet_cnn, criterion, optimizer_ft, exp_lr_scheduler, num_epochs=25)

4 frames
<ipython-input-36-9b4381de034f> in train_model(model, criterion, optimizer, scheduler, num_epochs)
     32                     outputs = model(inputs.to(device))
     33                     _, preds = torch.max(outputs, 1)
---> 34                     loss = criterion(outputs, labels)
     35 
     36                     # backward + optimize only if in training phase

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
   1046         assert self.weight is None or isinstance(self.weight, Tensor)
   1047         return F.cross_entropy(input, target, weight=self.weight,
-> 1048                                ignore_index=self.ignore_index, reduction=self.reduction)
   1049 
   1050 

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
   2691     if size_average is not None or reduce is not None:
   2692         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2693     return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
   2694 
   2695 

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2386         )
   2387     if dim == 2:
-> 2388         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2389     elif dim == 4:
   2390         ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target' in call to _thnn_nll_loss_forward

【Comments】:

    Tags: python pandas pytorch


    【Solution 1】:

    I would not recommend your first edit. Instead, change the model's type to LongTensor with model = model.type(torch.cuda.LongTensor).

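    【Discussion】:

    • Hi @Prajot Kuvalekar - thanks for your reply. I have already tried model = model.type(torch.cuda.LongTensor), but I get the same initial error that I posted.
    • Keep everything in the LongTensor type, i.e. model, inputs, target.
    • Hi @Prajot Kuvalekar - how is this done? Do I use the .type(torch.cuda.LongTensor) method on model, inputs, and target as well? Thanks.
    • Yes, everywhere.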
    【Solution 2】:

    By default, a model's parameters have the FloatTensor data type.

    inputs = batch["image"].type(torch.cuda.FloatTensor).to(device)
    labels = batch["label"].type(torch.cuda.FloatTensor).to(device)
    

    This should fix the error. Alternatively, you can change the dtype in your data loader class itself.
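The dtype mismatch reported in the first traceback can be checked directly before the model is called. The following is a minimal sketch, assuming a single `nn.Conv2d` layer as a stand-in for the ResNet in the question and running on CPU so that no GPU is needed. `same_dtype` is a hypothetical helper name, not a PyTorch function.

```python
import torch
import torch.nn as nn

def same_dtype(module: nn.Module, x: torch.Tensor) -> bool:
    """Return True if x has the same dtype as the module's first parameter."""
    return next(module.parameters()).dtype == x.dtype

# Stand-in for the first layer of a CNN (assumption, not the question's model).
conv = nn.Conv2d(3, 8, kernel_size=3)

# Integer pixel values; torch.randint produces int64 (Long) by default.
images = torch.randint(0, 256, (4, 3, 32, 32))

print(next(conv.parameters()).dtype)      # parameters are float32 by default
print(same_dtype(conv, images))           # Long input vs Float weights
print(same_dtype(conv, images.float()))   # matches after casting the input

# With matching dtypes the forward pass runs.
out = conv(images.float())
```

A check like this only covers the model input. The target passed to the loss has its own dtype requirement.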

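    【Discussion】:

    • Hi @Nivesh Gadipudi - thank you very much for your reply. I tried to apply your suggestion, but I ran into a new error. I have added it to the post under Edit 1.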
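The error under Edit 1 is raised by the loss, not the model. The traceback goes through `F.cross_entropy`, which expects class-index targets of type Long, while the convolution weights expect Float inputs. Below is a minimal sketch of a batch conversion that satisfies both, assuming `batch["label"]` holds integer class indices. `prepare_batch`, the small stand-in model, and the fake batch are hypothetical, and the device falls back to CPU when no GPU is available.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def prepare_batch(batch, device):
    """Cast images to float32 for the model and labels to int64 for the loss."""
    inputs = batch["image"].to(device, dtype=torch.float32)
    labels = batch["label"].to(device, dtype=torch.long)
    return inputs, labels

# Stand-in for the ResNet in the question: one conv layer plus a classifier.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 5),
).to(device)
criterion = nn.CrossEntropyLoss()

# Fake batch shaped like the question's loader output (dict with two keys).
batch = {
    "image": torch.randint(0, 256, (2, 3, 16, 16)),  # int64 pixel values
    "label": torch.tensor([1.0, 3.0]),               # class indices stored as float
}

inputs, labels = prepare_batch(batch, device)
outputs = model(inputs)             # Float input matches Float weights
loss = criterion(outputs, labels)   # Long targets are accepted as class indices
print(inputs.dtype, labels.dtype, loss.item())
```

In the question's loop, the two `.type(...)` lines would become `inputs, labels = prepare_batch(batch, device)`, and the extra cast inside the `model(...)` call would no longer be needed.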