Using SGD on the MNIST dataset with PyTorch, loss not decreasing

【Question title】: Using SGD on MNIST dataset with PyTorch, loss not decreasing
【Posted】: 2021-02-26 15:59:36
【Question description】:

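I am trying to use SGD on the MNIST dataset with a batch size of 32, but the loss does not decrease at all. I have checked my model and my loss function and read the documentation, but I cannot figure out what I am doing wrong.

I defined my neural network as follows: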

import torch.nn as nn

class classification(nn.Module):
    def __init__(self):
        super(classification, self).__init__()

        # construct layers for a neural network
        self.classifier1 = nn.Sequential(
            nn.Linear(in_features=28*28, out_features=20*20),
            nn.Sigmoid(),
        )
        self.classifier2 = nn.Sequential(
            nn.Linear(in_features=20*20, out_features=10*10),
            nn.Sigmoid(),
        )
        self.classifier3 = nn.Sequential(
            nn.Linear(in_features=10*10, out_features=10),
            nn.LogSoftmax(dim=1),
        )

    def forward(self, inputs):                 # [batchSize, 1, 28, 28]
        x = inputs.view(inputs.size(0), -1)    # [batchSize, 28*28]
        x = self.classifier1(x)                # [batchSize, 20*20]
        x = self.classifier2(x)                # [batchSize, 10*10]
        out = self.classifier3(x)              # [batchSize, 10]

        return out

I defined my training procedure as follows:


classifier = classification().to("cuda")
#optimizer
optimizer = torch.optim.SGD(classifier.parameters(), lr=learning_rate_value)
#loss function
criterion = nn.NLLLoss()
batch_size=32
epoch = 30
#array to save loss history
loss_train_arr=np.zeros(epoch)

#used DataLoader to make split batch
batched_train = torch.utils.data.DataLoader(training_set, batch_size, shuffle=True)

for i in range(epoch):
    
    loss_train=0
    
    #train and compute loss, accuracy
    for img, label in batched_train:
        img=img.to(device)
        label=label.to(device)

        optimizer.zero_grad()
        predicted = classifier(img)
        
        label_predicted = torch.argmax(predicted,dim=1)
        loss = criterion(predicted, label)
        loss.backward
        optimizer.step()
        loss_train += loss.item()
        
    loss_train_arr[i]=loss_train/(len(batched_train.dataset)/batch_size)

My model ends with a LogSoftmax layer, so pairing it with NLLLoss as the loss function looks correct to me. Still, the loss is not decreasing.
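The claim that the loss function fits the model can be checked in isolation. The sketch below is not part of the original post: it uses synthetic scores and labels (the tensors `logits` and `label` are made up for illustration) and compares LogSoftmax followed by `nn.NLLLoss` against `nn.CrossEntropyLoss` applied to the raw scores, which should give the same value.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic raw scores for 32 samples over 10 classes (a stand-in for the
# output of the last Linear layer), plus random integer class labels.
logits = torch.randn(32, 10)
label = torch.randint(0, 10, (32,))

# Pairing used in the question: LogSoftmax output fed into NLLLoss.
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, label)

# Single-step alternative applied directly to the raw scores.
ce = nn.CrossEntropyLoss()(logits, label)

# The two values should agree up to floating-point tolerance.
same = torch.allclose(nll, ce)
print(nll.item(), ce.item(), same)
```

If the two values match, the LogSoftmax and NLLLoss pairing is a valid classification loss, which points to the training loop rather than the loss function.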

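【Comments】:

To rule out the loss function as the cause, put a softmax activation on the last layer, use MSE as the loss function, and check whether that works.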
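The check suggested in this comment could look roughly like the sketch below. It is an illustration only: the one-layer `model`, the learning rate, and the synthetic batch are assumptions, not the asker's setup. Note that `nn.MSELoss` needs float one-hot targets instead of integer class indices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical one-layer model ending in Softmax instead of LogSoftmax.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10), nn.Softmax(dim=1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # assumed learning rate

# Synthetic MNIST-shaped batch; convert labels to float one-hot vectors for MSE.
img = torch.rand(32, 1, 28, 28)
label = torch.randint(0, 10, (32,))
target = F.one_hot(label, num_classes=10).float()

# Repeat a few SGD steps on the same batch and record the loss each time.
losses = []
for _ in range(20):
    optimizer.zero_grad()
    loss = criterion(model(img), target)
    loss.backward()          # note the parentheses: this actually runs backprop
    optimizer.step()
    losses.append(loss.item())

print(losses[0], losses[-1])
```

A falling loss here only shows that the optimizer and data pipeline can learn something; it does not by itself explain why the original loop is stuck.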

Tags: neural-network pytorch mnist stochastic-gradient

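【Solution 1】:

If the code you posted is exactly the code you are running, the problem is that you never actually call backward on the loss: the parentheses () are missing. Written as loss.backward, the line only references the method without running it, so no gradients are computed and optimizer.step() leaves the parameters unchanged. Change it to loss.backward().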
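The effect of the missing parentheses can be reproduced on a small scale. The following is a minimal sketch, not the asker's code: the one-layer `model`, the learning rate, and the synthetic batch are stand-ins chosen so that it runs on CPU without downloading MNIST.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the question's classifier, kept tiny for the demo.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10), nn.LogSoftmax(dim=1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # assumed learning rate
criterion = nn.NLLLoss()

# Synthetic batch shaped like MNIST: 32 images and 32 integer labels in [0, 10).
img = torch.rand(32, 1, 28, 28)
label = torch.randint(0, 10, (32,))

weight = model[1].weight
before = weight.detach().clone()

# Buggy step: `loss.backward` only names the method and never runs it.
optimizer.zero_grad()
loss = criterion(model(img), label)
loss.backward            # no call, so no gradients are computed
optimizer.step()         # there is nothing to apply
buggy_grad_missing = weight.grad is None
buggy_unchanged = torch.equal(weight.detach(), before)

# Fixed step: calling backward() fills .grad, and step() then updates the weights.
optimizer.zero_grad()
loss = criterion(model(img), label)
loss.backward()
optimizer.step()
fixed_changed = not torch.equal(weight.detach(), before)

print(buggy_grad_missing, buggy_unchanged, fixed_changed)
```

All three flags should come out True: the buggy step leaves the weights exactly as they were, while the fixed step changes them. In the original training loop, the same one-character change on the `loss.backward` line should be enough.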

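【Discussion】:

@CodeYong Then please accept the answer by clicking the check mark at the top left of the answer.
Sorry, I did not know there was an accept button. This helped me a lot!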