【Question title】: Train loss and test loss are equally high in a PyTorch CNN (FashionMNIST)
【Posted】: 2019-04-30 17:35:00
【Question】:

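The problem is that the training loss and the test loss are identical, and neither the loss nor the accuracy changes. What is wrong with my CNN structure and training process?

Training results:

Epoch: 1/30.. Training Loss: 2.306.. Test Loss: 2.306.. Test Accuracy: 0.100

Epoch: 2/30.. Training Loss: 2.306.. Test Loss: 2.306.. Test Accuracy: 0.100

Model class: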

class Model(nn.Module):

    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
        #the output will be 0~9 (10)

Here are my forward method (it belongs to the Model class above) and the training loop:

def forward(self, t):
    # implement the forward pass 
    # (1)input layer
    t = t 
    # (2) hidden conv layer
    t = self.conv1(t)
    t = F.relu(t)
    t = F.max_pool2d(t, kernel_size=2, stride=2)

    # (3) hidden conv layer
    t = self.conv2(t)
    t = F.relu(t)
    t = F.max_pool2d(t, kernel_size=2, stride=2)
    
    # (4) hidden linear layer
    t = t.reshape(-1, 12 * 4 * 4)
    t = self.fc1(t)
    t = F.relu(t)

    # (5) hidden linear layer
    t = self.fc2(t)
    t = F.relu(t)
    # (6) output layer
    t = self.out(t)
    #t = F.softmax(t, dim=1)
    return t

epoch = 30

train_losses, test_losses = [], []

for e in range(epoch):
    train_loss = 0
    test_loss = 0
    accuracy = 0

    for images, labels in train_loader:

        optimizer.zero_grad()
        op = model(images) #output 
        loss = criterion(op, labels)
        train_loss += loss.item()
        loss.backward()
        optimizer.step()

    else:
        with torch.no_grad():
            model.eval()
            for images,labels in testloader:
                log_ps = model(images)
                prob = torch.exp(log_ps)
                top_probs, top_classes = prob.topk(1, dim=1)
                equals = labels == top_classes.view(labels.shape)
                accuracy += equals.type(torch.FloatTensor).mean()
                test_loss += criterion(log_ps, labels)
        model.train()
    print("Epoch: {}/{}.. ".format(e+1, epoch),
              "Training Loss: {:.3f}.. ".format(train_loss/len(train_loader)),
              "Test Loss: {:.3f}.. ".format(test_loss/len(testloader)),
              "Test Accuracy: {:.3f}".format(accuracy/len(testloader)))
    train_losses.append(train_loss/len(train_loader))
    test_losses.append(test_loss/len(testloader))

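【Comments】:

  • Could you tell me which criterion you are using?
  • My best guess is that your last layer is missing a softmax (or some other form of normalization). If you uncomment F.softmax(t, dim=1), do you observe the same result?
  • @DavidNg Dear sir, I use the nn.CrossEntropyLoss() function as the criterion.
  • @dennlinger Dear sir, I tried that, but the result is the same. Isn't the cross-entropy function supposed to include softmax already?

Tags: python-3.x neural-network pytorch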


【Answer 1】:

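Be careful when using nn.CrossEntropyLoss and nn.NLLLoss, and do not confuse them: nn.CrossEntropyLoss expects raw logits and applies log_softmax internally, while nn.NLLLoss expects log-probabilities.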
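
To make the distinction concrete, here is a minimal sketch. The tensors are made-up illustration data, not taken from the question. It checks that nn.CrossEntropyLoss on raw logits matches nn.NLLLoss on log_softmax outputs, and shows that applying softmax before nn.CrossEntropyLoss (as suggested in the comments) normalizes twice and gives a different value.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical batch: 4 samples, 10 classes (illustration data only).
logits = torch.randn(4, 10)
labels = torch.tensor([3, 0, 9, 1])

# CrossEntropyLoss expects raw logits; it applies log_softmax itself.
ce = nn.CrossEntropyLoss()(logits, labels)

# NLLLoss expects log-probabilities, so log_softmax must be applied first.
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), labels)

# Both routes compute the same quantity.
same = torch.allclose(ce, nll)

# Passing softmax probabilities to CrossEntropyLoss normalizes twice.
# The inputs are squeezed into [0, 1], so the loss stays close to ln(10).
double = nn.CrossEntropyLoss()(F.softmax(logits, dim=1), labels)

print("CrossEntropyLoss(logits):          ", ce.item())
print("NLLLoss(log_softmax(logits)):      ", nll.item())
print("CrossEntropyLoss(softmax(logits)): ", double.item())
print("first two equal:", same)
```

With a network that returns logits, as the model in the question does, nn.CrossEntropyLoss is therefore the matching criterion and no extra softmax is needed in forward.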

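I don't think there is a problem with your code; I tried running it the way you defined it. Perhaps you left out the initialization code for the other parts, and the problem may lie there.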

  • log_ps is supposed to hold log_softmax values, but your network only produces logits (as you said, you used CrossEntropyLoss). These lines can be changed as follows:
log_ps = model(images)
prob = torch.exp(log_ps)
top_probs, top_classes = prob.topk(1, dim=1)

# These three lines can be simplified to:
logits = model(images)
output = logits.argmax(dim=-1) # should give you the class of predicted label
  • I just wrote a version very close to your code, and it works fine:

    1. Define your model

import torch
import torch.nn as nn
import torch.nn.functional as F


class Model(nn.Module):

    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
        #the output will be 0~9 (10)

    def forward(self, t):
        # implement the forward pass 
        # (1)input layer
        t = t 
        # (2) hidden conv layer
        t = self.conv1(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # (3) hidden conv layer
        t = self.conv2(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # (4) hidden linear layer
        t = t.reshape(-1, 12 * 4 * 4)
        t = self.fc1(t)
        t = F.relu(t)

        # (5) hidden linear layer
        t = self.fc2(t)
        t = F.relu(t)
        # (6) output layer
        t = self.out(t)
        return t
    2. Prepare the dataset
import torchvision
import torchvision.transforms as T

train_dataset = torchvision.datasets.FashionMNIST('./data', train=True, 
                                            transform=T.ToTensor(),
                                            download=True)

test_dataset = torchvision.datasets.FashionMNIST('./data', train=False, 
                                            transform=T.ToTensor(),
                                            download=True)

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)
    3. Start training
epoch = 5
model = Model()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

train_losses, test_losses = [], []

for e in range(epoch):
    train_loss = 0
    test_loss = 0
    accuracy = 0

    for images, labels in train_loader:

        optimizer.zero_grad()
        logits = model(images) #output 
        loss = criterion(logits, labels)
        train_loss += loss.item()
        loss.backward()
        optimizer.step()

    else:
        with torch.no_grad():
            model.eval()
            for images,labels in test_loader:
                logits = model(images)
                output = logits.argmax(dim=-1)
                equals = (labels == output)
                accuracy += equals.to(torch.float).mean().item()
                test_loss += criterion(logits, labels).item()
        model.train()
    print("Epoch: {}/{}.. ".format(e+1, epoch),
              "Training Loss: {:.3f}.. ".format(train_loss/len(train_loader)),
              "Test Loss: {:.3f}.. ".format(test_loss/len(test_loader)),
              "Test Accuracy: {:.3f}".format(accuracy/len(test_loader)))
    train_losses.append(train_loss/len(train_loader))
    test_losses.append(test_loss/len(test_loader))

Here is the result; at least it converges:

Epoch: 1/5..  Training Loss: 0.721..  Test Loss: 0.525..  Test Accuracy: 0.809
Epoch: 2/5..  Training Loss: 0.473..  Test Loss: 0.464..  Test Accuracy: 0.829
Epoch: 3/5..  Training Loss: 0.408..  Test Loss: 0.391..  Test Accuracy: 0.858
Epoch: 4/5..  Training Loss: 0.370..  Test Loss: 0.396..  Test Accuracy: 0.858
Epoch: 5/5..  Training Loss: 0.348..  Test Loss: 0.376..  Test Accuracy: 0.858
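
For comparison, the value of 2.306 at which the question's run was stuck is roughly ln(10) ≈ 2.303, the cross-entropy of a classifier that gives equal probability to all 10 classes, and 0.100 is the matching chance-level accuracy. In other words, the original run was not learning at all. A small sketch of that arithmetic in plain Python, assuming only the 10 balanced classes of FashionMNIST:

```python
import math

num_classes = 10  # FashionMNIST has 10 classes

# Uniform prediction: every class gets probability 1/10.
uniform_prob = 1.0 / num_classes

# Cross-entropy for one sample is -log(probability assigned to the true class).
chance_loss = -math.log(uniform_prob)

# Accuracy of a predictor that carries no information, on balanced classes.
chance_accuracy = 1.0 / num_classes

print(f"chance-level loss:     {chance_loss:.3f}")      # 2.303
print(f"chance-level accuracy: {chance_accuracy:.3f}")  # 0.100
```

A loss that stays near this value from the first epoch onward is a sign that the training setup is at fault, for example the data loading or optimizer initialization, and not a sign of slow convergence.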

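【Comments】:

  • Dear sir, thank you very much! I just found that my train_set was not set up with batch_size = 64 and shuffle = True, and now it works! The final test accuracy is around 0.92. Thank you very much!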
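
The comment does not show the original loader, so the following is only a sketch of what the difference looks like, assuming the loader had been created with the DataLoader defaults (batch_size=1, shuffle=False). The tensors are hypothetical stand-ins with FashionMNIST-like shapes, not the real dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data with FashionMNIST-like shapes (not the real dataset).
images = torch.zeros(256, 1, 28, 28)
labels = torch.zeros(256, dtype=torch.long)
dataset = TensorDataset(images, labels)

# DataLoader defaults: batch_size=1, shuffle=False.
default_loader = DataLoader(dataset)

# Settings used in the answer above.
batched_loader = DataLoader(dataset, batch_size=64, shuffle=True)

default_batch, _ = next(iter(default_loader))
batched_batch, _ = next(iter(batched_loader))

print(default_batch.shape)  # torch.Size([1, 1, 28, 28])
print(batched_batch.shape)  # torch.Size([64, 1, 28, 28])
print(len(default_loader), len(batched_loader))  # 256 4
```

Printing the shape of one batch like this before training is a quick way to confirm that the loader is configured as intended.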