How to predict segmentation mask on non-cropped original images?

【Question title】: How to predict segmentation mask on non-cropped original images?
【Posted】: 2022-06-13 18:15:20
【Question description】:

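I am trying to build a segmentation model on an eye disease dataset that has only 50 images of dimension (648, 432, 3). First, I used the patchify library to cut each image into non-overlapping patches/crops of size 256x256x3, which converted my dataset from (50, 500, 500, 3) to (817, 256, 256, 3), and then trained a pretrained segmentation model on the patches.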

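The model tends to produce good masks on the cropped images, but when I pass it a full image (648, 432, 3) resized to the crop dimensions (256x256x3), the masks it produces are poor.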

Does anyone know how to predict on the original images when the model was trained on crops? Or a possible way to solve the problem?
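As a rough illustration of why the resized full image may look different to the model than the training crops, the arithmetic below computes how much a (648, 432) image shrinks along each axis when squeezed to 256x256. The 100-pixel structure is a made-up example value, not something measured on the dataset.

```python
# Scale factors when a (648, 432) image is resized to (256, 256).
orig_h, orig_w = 648, 432
target = 256

scale_h = target / orig_h  # vertical shrink factor
scale_w = target / orig_w  # horizontal shrink factor

print(f"vertical scale:   {scale_h:.3f}")
print(f"horizontal scale: {scale_w:.3f}")

# Hypothetical structure that spans 100 x 100 px in a training crop:
# after resizing the full image it spans far fewer pixels, and the
# two axes shrink by different amounts (the aspect ratio changes).
resized_h = round(100 * scale_h)
resized_w = round(100 * scale_w)
print(f"100 x 100 px structure becomes about {resized_h} x {resized_w} px")
```

The crops were taken at full resolution, so the model only ever saw structures at their original scale and aspect ratio.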

import numpy as np
from patchify import patchify


def create_patches(dataset):
    """Cut every image and mask in `dataset` into 256x256 patches."""
    all_img_patches = []
    for instance in dataset:
        large_image = instance[0]  # Shape is (648, 432, 3)

        # step=256 with 256x256 patches means no overlap
        patches_img = patchify(large_image, (256, 256, 3), step=256)

        for i in range(patches_img.shape[0]):
            for j in range(patches_img.shape[1]):
                # Shape (1, 256, 256, 3); the extra axis is squeezed out below
                single_patch_img = patches_img[i, j, :, :]
                single_patch_img = single_patch_img.astype('float32') / 255.
                all_img_patches.append(single_patch_img)

    images = np.array(all_img_patches)

    all_mask_patches = []
    for instance in dataset:
        large_mask = instance[1]  # Shape is (648, 432)

        # Same step as for the images, so every image patch has a matching mask patch
        patches_mask = patchify(large_mask, (256, 256), step=256)

        for i in range(patches_mask.shape[0]):
            for j in range(patches_mask.shape[1]):
                single_patch_mask = patches_mask[i, j, :, :]
                all_mask_patches.append(single_patch_mask)

    masks = np.array(all_mask_patches)
    masks = np.expand_dims(masks, -1)  # Add a channel axis: (N, 256, 256, 1)

    images = np.squeeze(images, axis=1)  # (N, 1, 256, 256, 3) -> (N, 256, 256, 3)
    # print(images.shape)
    # print(masks.shape)
    # print("Pixel values in the mask are: ", np.unique(masks))

    return images, masks
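The number of patches produced depends on the image size, the patch size, and the step, so images and masks need the same step to stay paired. The sketch below assumes patchify follows the usual sliding-window rule, `(size - patch) // step + 1` windows per axis; `patches_per_axis` is a hypothetical helper written for this illustration, not part of the library.

```python
def patches_per_axis(size, patch, step):
    """Number of windows of length `patch` that fit along one axis."""
    return (size - patch) // step + 1


height, width = 648, 432  # image size quoted in the question
patch = 256

for step in (256, 400):
    rows = patches_per_axis(height, patch, step)
    cols = patches_per_axis(width, patch, step)
    print(f"step={step}: {rows} x {cols} = {rows * cols} patches per image")

# Area actually covered with step=256: windows that do not fit are dropped,
# so the bottom and right borders of the image never reach the model.
covered_h = (patches_per_axis(height, patch, 256) - 1) * 256 + patch
covered_w = (patches_per_axis(width, patch, 256) - 1) * 256 + patch
print(f"covered region: {covered_h} x {covered_w} of {height} x {width}")
```

Under this rule, a step of 256 gives two patches per (648, 432) image while a step of 400 gives one, and part of each image is never used for training.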

import os

import segmentation_models as sm
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split


def my_image_mask_generator(image_generator, mask_generator):
    """Yield (image batch, mask batch) pairs from two generators."""
    for img, mask in zip(image_generator, mask_generator):
        yield img, mask


if __name__ == '__main__':
    config_path = os.path.join(os.curdir, 'Configuration', 'config.yml')
    configuration = return_configuration(config_path)
    configuration = configure_all_paths(configuration, os.path.dirname(__file__))
    image, mask = setup_paths(configuration)  # Sets up the path to image and mask

    # read_image reads every image and its corresponding mask and returns a nested list:
    # dataset = [[img1, mask1], [img2, mask2], ...]
    dataset = read_image(image, mask)

    images, mask = create_patches(dataset)

    print(images.shape)  # (817, 256, 256, 3)
    print(mask.shape)    # (817, 256, 256, 1)

    BACKBONE = 'resnet34'
    preprocess_input1 = sm.get_preprocessing(BACKBONE)

    # Preprocess the input for the chosen backbone
    images1 = preprocess_input1(images)

    X_train, X_test, y_train, y_test = train_test_split(images1, mask, test_size=0.10, random_state=42)

    # sanity_check(X_train, y_train)

    seed = 24  # The same seed is passed to the image and mask generators

    img_data_gen_args = dict(rotation_range=90,
                             width_shift_range=0.3,
                             height_shift_range=0.3,
                             shear_range=0.5,
                             zoom_range=0.3,
                             horizontal_flip=True,
                             vertical_flip=True,
                             fill_mode='reflect')

    mask_data_gen_args = dict(rotation_range=90,
                              width_shift_range=0.3,
                              height_shift_range=0.3,
                              shear_range=0.5,
                              zoom_range=0.3,
                              horizontal_flip=True,
                              vertical_flip=True,
                              fill_mode='reflect',
                              preprocessing_function=lambda x: np.where(x > 0, 1, 0).astype(
                                  x.dtype))  # Binarize the output again.

    image_data_generator = ImageDataGenerator(**img_data_gen_args)
    image_data_generator.fit(X_train, augment=True, seed=seed)

    image_generator = image_data_generator.flow(X_train, seed=seed)
    valid_img_generator = image_data_generator.flow(X_test, seed=seed)

    mask_data_generator = ImageDataGenerator(**mask_data_gen_args)
    mask_data_generator.fit(y_train, augment=True, seed=seed)
    mask_generator = mask_data_generator.flow(y_train, seed=seed)
    valid_mask_generator = mask_data_generator.flow(y_test, seed=seed)

    my_generator = my_image_mask_generator(image_generator, mask_generator)

    validation_datagen = my_image_mask_generator(valid_img_generator, valid_mask_generator)

    model = sm.Unet(BACKBONE, encoder_weights='imagenet')

    model.compile('Adam', loss=sm.losses.bce_jaccard_loss, metrics=[sm.metrics.iou_score])
    earlystop = tf.keras.callbacks.EarlyStopping(patience=30, restore_best_weights=True)
    model.summary()  # summary() prints the table itself and returns None

    history = model.fit(my_generator, validation_data=validation_datagen, steps_per_epoch=100, validation_steps=100,
                        epochs=200, callbacks=[earlystop])

    model.save(configuration['train']['Output']['weight'])
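One possible way to predict on a full-size image with a model trained on 256x256 patches is to tile the image at inference time as well, instead of resizing it. The sketch below is an illustration under assumptions, not a tested solution for this dataset: it pads the image by reflection to a multiple of the patch size, predicts tile by tile, stitches the tiles together, and crops the padding away. `predict_full_image`, `predict_fn`, and `dummy_predict` are hypothetical names; the dummy predictor only stands in for the trained model so that the stitching can be checked.

```python
import numpy as np


def predict_full_image(image, predict_fn, patch=256):
    """Tile `image` (H, W, C), predict every tile, and stitch the masks.

    `predict_fn` takes a batch (N, patch, patch, C) and returns
    (N, patch, patch, 1), like a Keras model's predict method would.
    """
    h, w = image.shape[:2]
    # Pad bottom/right so both sides become multiples of the patch size.
    pad_h = (-h) % patch
    pad_w = (-w) % patch
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="reflect")

    out = np.zeros(padded.shape[:2] + (1,), dtype=np.float32)
    for top in range(0, padded.shape[0], patch):
        for left in range(0, padded.shape[1], patch):
            tile = padded[top:top + patch, left:left + patch]
            pred = predict_fn(tile[np.newaxis])[0]  # (patch, patch, 1)
            out[top:top + patch, left:left + patch] = pred

    # Remove the padded border so the mask matches the input size.
    return out[:h, :w]


def dummy_predict(batch):
    """Stand-in for a trained model: thresholds the mean channel value."""
    return (batch.mean(axis=-1, keepdims=True) > 0.5).astype(np.float32)


rng = np.random.default_rng(0)
image = rng.random((648, 432, 3)).astype(np.float32)

full_mask = predict_full_image(image, dummy_predict)
print(full_mask.shape)
```

With the trained model, `predict_fn` would wrap `model.predict` together with the same scaling (`/ 255.`) and `preprocess_input1` call used during training. Overlapping tiles with averaged predictions are a common refinement if seams show up at tile borders.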

【Discussion】:

Please provide enough code so that others can better understand or reproduce the problem.

Tags: deep-learning conv-neural-network tensorflow2.0 image-segmentation unity3d-unet