[Posted]: 2021-10-20 06:05:08
[Question]:
I am trying to train a Siamese network using binary cross-entropy.
I get the following error in train_epoch:
y_true_2[range(y_true_2.shape[0]), y_true.long()] = 1
IndexError: index -9223372036854775808 is out of bounds for dimension 1 with size 2
Here is the code snippet for reference:
def train_epoch(train_loader, model, loss_fn, optimizer, cuda, log_interval, metrics, logging):
    for metric in metrics:
        metric.reset()

    model.train()
    losses = []
    total_loss = 0

    for batch_idx, ((x0, x1), y) in enumerate(train_loader):
        x0, x1, y_true = x0.cpu(), x1.cpu(), y.cpu()
        gc.collect()
        optimizer.zero_grad()
        output1, output2 = model(x0, x1)

        # Distance metric - PairwiseDistance
        p_dist = torch.nn.PairwiseDistance(keepdim=True)
        dy = p_dist(output1, output2)
        dy = torch.nan_to_num(dy)
        y_true = torch.nan_to_num(y_true)

        # Normalize dy to the range 0 to 1 by dividing it by its max value
        maximum_dy = torch.max(dy)
        maximum_dy = torch.nan_to_num(maximum_dy)
        dy = dy / maximum_dy
        maximum_y_true = torch.max(y_true)
        maximum_y_true = torch.nan_to_num(maximum_y_true)
        y_true = y_true / maximum_y_true
        dy = torch.squeeze(dy, 1)

        # Output tensor of dimension [4,2] and input tensor of dimension [4] to BCE loss function
        input_dy = torch.empty(dy.size(0), 2)
        input_dy[:, 0] = 1 - dy
        input_dy[:, 1] = dy
        y_true_2 = torch.zeros(dy.size(0), 2)
        y_true_2[range(y_true_2.shape[0]), y_true.long()] = 1

        m = nn.Sigmoid()
        loss = loss_fn(m(input_dy), y_true_2)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
        total_loss += loss.item()

        input_dy_metric = torch.round(input_dy)
        for metric in metrics:
            metric(input_dy_metric, y_true_2)
            metric.total += y_true_2.shape[0]

        if batch_idx % log_interval == 0:
            message = 'Train: [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                batch_idx, len(train_loader),
                100. * batch_idx / len(train_loader), np.mean(losses))
            for metric in metrics:
                message += '\t{}: {}'.format(metric.name(), metric.value())
            print(message)
            losses = []

    total_loss /= (batch_idx + 1)
    return total_loss, metrics
Please help me figure out what the problem might be. Thanks in advance.
[Discussion]:
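One plausible cause, offered as a sketch rather than a confirmed diagnosis: the index in the traceback equals -2**63, the smallest int64 value, which is what many CPU builds produce when NaN is cast with `.long()`. In train_epoch, `nan_to_num` is applied to y_true before the division by its maximum, not after, so a batch whose labels are all 0 would give 0 / 0 = NaN. The all-zero batch and the `maximum_y_true > 0` guard below are assumptions for illustration only.

```python
import torch

# Hypothetical batch in which every label is 0 (an assumption, not from the post).
y_true = torch.zeros(4)

# Same normalization as in train_epoch: divide the labels by their maximum.
maximum_y_true = torch.max(y_true)      # the maximum is 0 here
normalized = y_true / maximum_y_true    # 0 / 0 gives NaN for every element

print(normalized)
# Casting NaN to an integer type is not well defined; on many CPU builds it
# yields the smallest int64 value, which matches the index in the traceback.
print(normalized.long())
print(-2 ** 63)

# One possible guard: only rescale when the maximum is positive.
safe = y_true / maximum_y_true if maximum_y_true > 0 else y_true
labels = safe.long()

# Build the two-column target exactly as the original code does.
y_true_2 = torch.zeros(labels.size(0), 2)
y_true_2[torch.arange(labels.size(0)), labels] = 1
print(y_true_2)
```

If the labels are already 0 or 1, dropping the division of y_true altogether would avoid the problem as well; whether that suits the dataset used here is not something the post makes clear.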
-
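Besides using a debugger, you could also try printing
y_true_2.shape[0] and y_true.long(). That will at least give you a hint about which index triggers the IndexError. -
Also, take a look at y_true before it is converted with
Tensor.long -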
Here is the output of the checks you suggested:
1. print(y_true_2) = tensor([[0., 1.], [0., 1.], [0., 1.], [0., 1.]])
2. print(y_true) = tensor([1., 1., 1., 1.])
3. print(y_true_2.shape[0]) = 4
4. print(range(y_true_2.shape[0])) = range(0, 4)
5. print(y_true.long()) = tensor([1, 1, 1, 1])
-
Please post the full stack trace of the error so that the failing line can be confirmed.
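The checks suggested in the comments can be folded into a small guard that runs on every batch, since the batch printed above is a healthy one and the failure presumably comes from a different batch. This is only a sketch: `check_labels` and its `num_classes` parameter are hypothetical names, not part of the original code or of PyTorch.

```python
import torch

def check_labels(y_true, num_classes=2):
    """Hypothetical helper: fail with a readable message if labels cannot index columns."""
    # NaN or inf cannot be converted to a meaningful integer index.
    if not torch.isfinite(y_true).all():
        raise ValueError(f"non-finite labels: {y_true}")
    idx = y_true.long()
    # Every index must select one of the num_classes columns.
    if idx.min() < 0 or idx.max() >= num_classes:
        raise ValueError(f"labels out of range [0, {num_classes}): {idx}")
    return idx

# A healthy batch, like the one printed above, passes through.
print(check_labels(torch.tensor([1., 1., 1., 1.])))

# A batch containing NaN is reported before it reaches the indexing line.
try:
    check_labels(torch.tensor([float("nan"), 1.]))
except ValueError as err:
    print("bad batch:", err)
```

Calling such a helper right before the `y_true_2[...] = 1` line would show the offending label values instead of only the out-of-bounds index.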
Tags: python pytorch conv-neural-network training-data loss-function