Create a custom convolutional loss function that only uses part of the tensor (answers)

【Question title】: Create custom convolutional Loss function that only takes parts of the tensor
【Posted】: 2022-01-04 20:57:31
【Question description】:

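I have a convolutional network that takes images, and every image has a colored border that feeds extra information into the network. Now I want to compute the loss, but the usual loss functions also take the predicted border into account. The border is completely random and is only an input to the system. I don't want the model to think it performed badly when it predicts the wrong border color. This happens in the data loader's __getitem__: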

def __getitem__(self, index):
        path = self.input_data[index]
        imgs_path = sorted(glob.glob(path + '/*.png'))
        #read light conditions
        lightConditions = []
        with open(path +"/lightConditions.json", 'r') as file:
            lightConditions = json.load(file)
        #shift light conditions
        lightConditions.pop(0)
        lightConditions.append(False)
        frameNumber = 0
        imgs = []
        for img_path in imgs_path:
            img = cv2.imread(img_path)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            im_pil = Image.fromarray(img)
            #img = cv2.resize(img, (256,448))
            if lightConditions[frameNumber] ==False:
                imgBorder = ImageOps.expand(im_pil,border = 6, fill='black')
            else:
                imgBorder = ImageOps.expand(im_pil, border = 6, fill='orange')
            img = np.asarray(imgBorder)
            img = cv2.resize(img, (256,448))
            #img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5, interpolation=cv2.INTER_CUBIC) #has been 0.5 for official data, new is fx = 2.63 and fy = 2.84

            img_tensor = ToTensor()(img).float()
            imgs.append(img_tensor)
            frameNumber +=1
        imgs = torch.stack(imgs, dim=0)
        return imgs
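One detail in the snippet above matters for any loss that tries to ignore the border: the 6-pixel border is added before cv2.resize, so the border in the final tensor is generally not 6 pixels thick, and its thickness can differ between the height and width axes. Note also that cv2.resize takes its size as (width, height), so each tensor ends up as 3 x 448 x 256 after ToTensor. The following is a minimal sketch of the arithmetic, not part of the original post; the helper name scaled_border and the source frame sizes used in the example are hypothetical, since the post does not state the original image size.

```python
import math


def scaled_border(orig_w, orig_h, border=6, target_w=256, target_h=448):
    """Return (border_h, border_w): border thickness in pixels after resizing.

    border_h is the thickness of the top/bottom strips (height axis),
    border_w is the thickness of the left/right strips (width axis).
    Values are rounded up so the result never underestimates the border.
    """
    padded_w = orig_w + 2 * border  # width after ImageOps.expand
    padded_h = orig_h + 2 * border  # height after ImageOps.expand
    border_w = math.ceil(border * target_w / padded_w)
    border_h = math.ceil(border * target_h / padded_h)
    return border_h, border_w


# Hypothetical source sizes (assumptions, not taken from the post):
print(scaled_border(244, 436))  # no rescaling happens -> (6, 6)
print(scaled_border(500, 876))  # roughly 2x downscale -> (4, 3)
```

Because interpolation during resizing can blur the edge between border and image, it may be safer to exclude one extra pixel on each side than to rely on the exact computed value.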

Then this is done in training:

for idx_epoch in range(startEpoch, nEpochs):

        #set epoch in dataloader for right shuffle ->set seed really random
        val_loader.sampler.set_epoch(idx_epoch)
        #Remember time for displaying time for epoch
        startTimeEpoch = datetime.now()
        i = 0
        if processGPU==0:
            running_loss = 0
        beenValuated = False
        for index, data_sr in enumerate(train_loader):
            #Transfer Data to GPU but don't block other processes because this only affects this single process
            data_sr = data_sr.cuda(processGPU, non_blocking=True)

            startTimeIteration = time.time()
            #Remove all dimensions of size 1
            data_sr = data_sr.squeeze()
            # calculate the index of the input images and GT images
            num_f = len(data_sr)
            #If model_type is 0 -> only calculate one frame that is marked with gt
            if cfg.model_type == 0:
                idx_start = random.randint(-2, 2)
                idx_all = list(np.arange(idx_start, idx_start + num_f).clip(0, num_f - 1))
                idx_gt = [idx_all.pop(int(num_f / 2))]
                idx_input = idx_all
            #Else when model_type is 1 then input frames 1,2,3 and predict frame 4 to number of cfg.dec_frames. Set all images that will be predicted to 'gt' images
            else:
                idx_all = np.arange(0, num_f)
                idx_input = list(idx_all[0:4])
                idx_gt = list(idx_all[4:4+cfg.dec_frames])
            imgs_input = data_sr[idx_input]
            imgs_gt = data_sr[idx_gt]

            # get predicted result
            imgs_pred = model(imgs_input)

I use cfg.model_type = 1. The model then gives me new images with colored borders. Normally the loss would be computed here:

loss = criterion_mse(imgs_pred, imgs_gt)

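But I can no longer use that. Does anyone know how to write a custom loss function that only considers certain parts of the tensor, or how to tell which parts of the tensor represent which parts of the image?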

【Question comments】:

Tags: image-processing pytorch tensor loss mse

【Solution 1】:

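You can slice tensors just as you would in NumPy. A batch of images has dimensions NCHW. If b is your border size and the border is the same on all sides, simply crop the tensors: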

loss = criterion_mse(imgs_pred[:, :, b:-b, b:-b], imgs_gt[:, :, b:-b, b:-b])
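The cropping one-liner can be wrapped in a small helper. What follows is a minimal sketch under stated assumptions rather than the answerer's own code: the function name cropped_mse and its parameters border_h and border_w are hypothetical. It differs from the one-liner in two ways. It uses explicit end indices, because a slice like b:-b is empty when b is 0, and it allows a different border thickness per axis, which may be needed here since the question's code resizes the image after adding the border.

```python
import torch
import torch.nn.functional as F


def cropped_mse(pred, target, border_h, border_w):
    """MSE over the image interior, ignoring a border of the given thickness.

    pred, target: tensors whose last two dimensions are (H, W), e.g. NCHW.
    border_h: thickness of the top and bottom strips to ignore.
    border_w: thickness of the left and right strips to ignore.
    """
    h, w = pred.shape[-2], pred.shape[-1]
    # Explicit end indices: with a border of 0 the whole axis is kept,
    # whereas x[0:-0] would select nothing.
    interior = (
        ...,
        slice(border_h, h - border_h),
        slice(border_w, w - border_w),
    )
    return F.mse_loss(pred[interior], target[interior])


# Toy check: the two tensors differ only inside a 2-pixel border.
pred = torch.zeros(2, 3, 8, 8)
target = torch.zeros(2, 3, 8, 8)
target[..., :2, :] = 1.0
target[..., -2:, :] = 1.0
target[..., :, :2] = 1.0
target[..., :, -2:] = 1.0

full_loss = F.mse_loss(pred, target)            # penalizes the border
inner_loss = cropped_mse(pred, target, 2, 2)    # ignores the border
print(full_loss.item(), inner_loss.item())
```

Slicing is differentiable, so gradients flow only through the interior pixels and the network is never penalized for the border colors it produces.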

【Comments】:

Thank you very much for your answer! It worked for me. My network's favorite color seems to be green :)