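【Posted】: 2019-04-30 17:35:00
【Question】:
My training loss and test loss are identical, and neither the loss nor the accuracy changes from epoch to epoch. What is wrong with my CNN architecture or my training procedure?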
Training results:
Epoch: 1/30.. Training Loss: 2.306.. Test Loss: 2.306.. Test Accuracy: 0.100
Epoch: 2/30.. Training Loss: 2.306.. Test Loss: 2.306.. Test Accuracy: 0.100
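A note for readers of these numbers: a loss near 2.30 with 10% accuracy is close to what a 10-class classifier scores when it assigns equal probability to every class, because the cross-entropy of a uniform prediction is -ln(1/10). The short check below only verifies that arithmetic; it is not a diagnosis of the model in the post, and `num_classes` is simply the 10 output units from the class code.

```python
import math

num_classes = 10  # taken from the 10 output units of the network in the post

# Cross-entropy of a uniform prediction over num_classes classes: -ln(1/num_classes)
chance_loss = -math.log(1.0 / num_classes)

# Expected accuracy when guessing uniformly (assumes balanced classes)
chance_accuracy = 1.0 / num_classes

print(f"{chance_loss:.4f} {chance_accuracy:.3f}")  # prints: 2.3026 0.100
```

The reported 2.306 sits just above this value, which suggests the network's outputs are nearly uniform and are not moving during training.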
Class code:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
        # the output has 10 units, one per class (0-9)
Below is the rest of my CNN (the forward pass) and my training process:
    def forward(self, t):
        # implement the forward pass
        # (1) input layer
        t = t
        # (2) hidden conv layer
        t = self.conv1(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (3) hidden conv layer
        t = self.conv2(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (4) hidden linear layer
        t = t.reshape(-1, 12 * 4 * 4)
        t = self.fc1(t)
        t = F.relu(t)
        # (5) hidden linear layer
        t = self.fc2(t)
        t = F.relu(t)
        # (6) output layer (raw logits; softmax is left commented out)
        t = self.out(t)
        #t = F.softmax(t, dim=1)
        return t
# model, criterion, optimizer, train_loader and testloader are defined
# elsewhere; their definitions are not shown in the post.
epoch = 30
train_losses, test_losses = [], []
for e in range(epoch):
    train_loss = 0
    test_loss = 0
    accuracy = 0
    for images, labels in train_loader:
        optimizer.zero_grad()
        op = model(images)  # output
        loss = criterion(op, labels)
        train_loss += loss.item()
        loss.backward()
        optimizer.step()
    else:
        # runs once the training loop above finishes: evaluate on the test set
        with torch.no_grad():
            model.eval()
            for images, labels in testloader:
                log_ps = model(images)
                prob = torch.exp(log_ps)
                top_probs, top_classes = prob.topk(1, dim=1)
                equals = labels == top_classes.view(labels.shape)
                accuracy += equals.type(torch.FloatTensor).mean()
                # .item() keeps test_loss a plain float, like train_loss
                test_loss += criterion(log_ps, labels).item()
        model.train()
        print("Epoch: {}/{}.. ".format(e+1, epoch),
              "Training Loss: {:.3f}.. ".format(train_loss/len(train_loader)),
              "Test Loss: {:.3f}.. ".format(test_loss/len(testloader)),
              "Test Accuracy: {:.3f}".format(accuracy/len(testloader)))
    train_losses.append(train_loss/len(train_loader))
    test_losses.append(test_loss/len(testloader))
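As a sanity check on the `12 * 4 * 4` used in `fc1` and in the `reshape` call, the spatial size can be traced through the two conv/pool blocks by hand. The sketch below assumes a 28x28 single-channel input (for example an MNIST-style image); the post does not state the image size, so that value and the helper `conv_out` are illustrative assumptions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

side = 28  # ASSUMPTION: 28x28 single-channel input; not stated in the post
side = conv_out(side, kernel=5)            # conv1 (kernel_size=5)  -> 24
side = conv_out(side, kernel=2, stride=2)  # max_pool2d (2, stride 2) -> 12
side = conv_out(side, kernel=5)            # conv2 (kernel_size=5)  -> 8
side = conv_out(side, kernel=2, stride=2)  # max_pool2d (2, stride 2) -> 4
flat_features = 12 * side * side           # 12 channels come out of conv2

print(side, flat_features)  # prints: 4 192
```

Under that assumption the flattened size matches `in_features=12 * 4 * 4`, so the layer dimensions themselves are consistent. With a different input size the `reshape(-1, 12 * 4 * 4)` call would either fail or silently change the batch dimension.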
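【Comments】:
- Can you tell me which criterion you are using?
- My best guess is that your last layer is missing a softmax (or some other form of normalization). If you uncomment F.softmax(t, dim=1), do you observe the same result?
- @DavidNg Dear sir, I am using nn.CrossEntropyLoss() as the criterion.
- @dennlinger Dear sir, I tried that, but the result is the same. Isn't the cross-entropy function supposed to include softmax already?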
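On the last comment's question: in PyTorch, `nn.CrossEntropyLoss` does apply `log_softmax` internally and expects raw logits, so an explicit softmax in the model is not needed with it. The sketch below illustrates this with random logits; the tensors and variable names are made up for the demonstration and are not taken from the post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # hypothetical raw outputs: 4 samples, 10 classes
labels = torch.tensor([3, 0, 7, 1])    # hypothetical targets

criterion = nn.CrossEntropyLoss()

# CrossEntropyLoss applied directly to logits ...
ce_on_logits = criterion(logits, labels)
# ... equals log_softmax followed by negative log-likelihood loss.
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Passing softmax outputs instead applies softmax twice and gives a different loss.
ce_on_softmax = criterion(F.softmax(logits, dim=1), labels)

print(torch.allclose(ce_on_logits, manual))
print(ce_on_logits.item(), ce_on_softmax.item())
```

This is consistent with the observation that uncommenting `F.softmax(t, dim=1)` did not help: it feeds already-normalized values in [0, 1] into a loss that normalizes again, which flattens the predictions rather than sharpening them.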
Tags: python-3.x neural-network pytorch