Dice coefficient not increasing for U-net image segmentation

【Question title】: Dice coefficient not increasing for U-net image segmentation
【Posted】: 2021-07-05 04:13:26
【Question description】:

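Problem

I am using the Image segmentation guide by fchollet to perform semantic segmentation. I adapted the guide to my dataset by labelling the 8-bit mask values as 1 and 2, as in the Oxford Pets dataset; the Generator(keras.utils.Sequence) class then subtracts one so that they become 0 and 1. The input images are RGB images.

What I tried

I don't know why, but my Dice coefficient is not increasing at all. I have tried lowering the learning rate, switching the optimizer to SGD/RMSProp, normalizing the data, and accounting for the label imbalance, but the results are very strange: the model's accuracy/IoU decreases as the number of epochs increases.

In case it helps, I previously asked a question here about which metrics I should use for an imbalanced dataset. The visualized predictions look fine, but the metrics do not.

What can I do next to debug this? Is there anything wrong with my code? Any advice would be greatly appreciated.

Here are the results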

Epoch 1/10
304/304 [==============================] - 693s 2s/step - loss: 0.7648 - accuracy: 0.5100 - dice_metric: 0.6664 - IOU: 0.5100 - jaccard_distance_loss: 50.0260 - val_loss: 0.6799 - val_accuracy: 0.5178 - val_dice_metric: 0.6664 - val_IOU: 0.5178 - val_jaccard_distance_loss: 50.0260
Epoch 2/10
304/304 [==============================] - 176s 579ms/step - loss: 0.6727 - accuracy: 0.3135 - dice_metric: 0.6664 - IOU: 0.3135 - jaccard_distance_loss: 50.0257 - val_loss: 0.6396 - val_accuracy: 0.1632 - val_dice_metric: 0.6664 - val_IOU: 0.1632 - val_jaccard_distance_loss: 50.0260
Epoch 3/10
304/304 [==============================] - 176s 579ms/step - loss: 0.6377 - accuracy: 0.1728 - dice_metric: 0.6664 - IOU: 0.1728 - jaccard_distance_loss: 50.0258 - val_loss: 0.6574 - val_accuracy: 0.2565 - val_dice_metric: 0.6664 - val_IOU: 0.2565 - val_jaccard_distance_loss: 50.0260
Epoch 4/10
304/304 [==============================] - 176s 579ms/step - loss: 0.5886 - accuracy: 0.0689 - dice_metric: 0.6664 - IOU: 0.0689 - jaccard_distance_loss: 50.0264 - val_loss: 0.5933 - val_accuracy: 0.0334 - val_dice_metric: 0.6664 - val_IOU: 0.0334 - val_jaccard_distance_loss: 50.0260
Epoch 5/10
304/304 [==============================] - 176s 579ms/step - loss: 0.5710 - accuracy: 0.0281 - dice_metric: 0.6664 - IOU: 0.0281 - jaccard_distance_loss: 50.0260 - val_loss: 0.5643 - val_accuracy: 0.0130 - val_dice_metric: 0.6664 - val_IOU: 0.0130 - val_jaccard_distance_loss: 50.0260
Epoch 6/10
304/304 [==============================] - 176s 579ms/step - loss: 0.5601 - accuracy: 0.0188 - dice_metric: 0.6664 - IOU: 0.0188 - jaccard_distance_loss: 50.0252 - val_loss: 0.5457 - val_accuracy: 0.0082 - val_dice_metric: 0.6664 - val_IOU: 0.0082 - val_jaccard_distance_loss: 50.0260
Epoch 7/10
304/304 [==============================] - 176s 580ms/step - loss: 0.5494 - accuracy: 0.0147 - dice_metric: 0.6664 - IOU: 0.0147 - jaccard_distance_loss: 50.0254 - val_loss: 0.5353 - val_accuracy: 0.0068 - val_dice_metric: 0.6664 - val_IOU: 0.0068 - val_jaccard_distance_loss: 50.0260
Epoch 8/10
304/304 [==============================] - 176s 580ms/step - loss: 0.5383 - accuracy: 0.0115 - dice_metric: 0.6664 - IOU: 0.0115 - jaccard_distance_loss: 50.0264 - val_loss: 0.5241 - val_accuracy: 0.0051 - val_dice_metric: 0.6664 - val_IOU: 0.0051 - val_jaccard_distance_loss: 50.0260
Epoch 9/10
304/304 [==============================] - 176s 579ms/step - loss: 0.5268 - accuracy: 0.0090 - dice_metric: 0.6664 - IOU: 0.0090 - jaccard_distance_loss: 50.0268 - val_loss: 0.5115 - val_accuracy: 0.0039 - val_dice_metric: 0.6664 - val_IOU: 0.0039 - val_jaccard_distance_loss: 50.0260
Epoch 10/10
304/304 [==============================] - 176s 579ms/step - loss: 0.5149 - accuracy: 0.0069 - dice_metric: 0.6664 - IOU: 0.0069 - jaccard_distance_loss: 50.0254 - val_loss: 0.4960 - val_accuracy: 0.0033 - val_dice_metric: 0.6664 - val_IOU: 0.0033 - val_jaccard_distance_loss: 50.0260

Here is my code

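import os
import random

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import (
    Activation, BatchNormalization, Conv2D, Conv2DTranspose, Dropout,
    Input, MaxPooling2D, concatenate,
)
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import load_img

batch_size = 4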
num_classes = 2
img_size = (512, 512)

# |--------------------- Load image and masks ---------------------|   
input_dir = "/content/drive/MyDrive/input_dir"
target_dir = "/content/drive/MyDrive/target_dir"

# Sort images and masks
input_img_paths = sorted(
    [
        os.path.join(input_dir, fname)
        for fname in os.listdir(input_dir)
        if fname.endswith(".png")
    ]
)
target_img_paths = sorted(
    [
        os.path.join(target_dir, fname)
        for fname in os.listdir(target_dir)
        if fname.endswith(".png") and not fname.startswith(".")
    ]
)

# |--------------------- Define custom generator ---------------------|   
class Generator(keras.utils.Sequence):
    """Helper to iterate over the data (as Numpy arrays)."""

    def __init__(self, batch_size, img_size, input_img_paths, target_img_paths):
        self.batch_size = batch_size
        self.img_size = img_size
        self.input_img_paths = input_img_paths
        self.target_img_paths = target_img_paths

    def __len__(self):
        return len(self.target_img_paths) // self.batch_size

    def __getitem__(self, idx):
        """Returns tuple (input, target) correspond to batch #idx."""
        i = idx * self.batch_size
        batch_input_img_paths = self.input_img_paths[i : i + self.batch_size]
        batch_target_img_paths = self.target_img_paths[i : i + self.batch_size]
        x = np.zeros((self.batch_size,) + self.img_size + (3,), dtype="float32")
        x /= 255.0 # normalize data
        for j, path in enumerate(batch_input_img_paths):
            img = load_img(path, target_size=self.img_size)
            x[j] = img
        y = np.zeros((self.batch_size,) + self.img_size + (1,), dtype="uint8")
        for j, path in enumerate(batch_target_img_paths):
            img = load_img(path, target_size=self.img_size, color_mode="grayscale")
            y[j] = np.expand_dims(img, 2)
            # Ground truth labels are 1, 2. Subtract one to make them 0, 1:
            y[j] -= 1
        return x, y

# |--------------------- Train Validation Split ---------------------|   
val_samples = 304
random.Random(7).shuffle(input_img_paths)
random.Random(7).shuffle(target_img_paths)
train_input_img_paths = input_img_paths[:-val_samples]
train_target_img_paths = target_img_paths[:-val_samples]
val_input_img_paths = input_img_paths[-val_samples:]
val_target_img_paths = target_img_paths[-val_samples:]

# Instantiate data Sequences for each split
train_gen = Generator(batch_size, img_size, train_input_img_paths, train_target_img_paths)
val_gen = Generator(batch_size, img_size, val_input_img_paths, val_target_img_paths)

# |--------------------- Define U-net Model ---------------------|   
def conv_block(tensor, nfilters, size=3, padding='same', initializer="he_normal"):
    x = Conv2D(filters=nfilters, kernel_size=(size, size), padding=padding, kernel_initializer=initializer)(tensor)
    x = BatchNormalization()(x)
    x = Activation("relu")(x)
    x = Conv2D(filters=nfilters, kernel_size=(size, size), padding=padding, kernel_initializer=initializer)(x)
    x = BatchNormalization()(x)
    x = Activation("relu")(x)
    return x


def deconv_block(tensor, residual, nfilters, size=3, padding='same', strides=(2, 2)):
    y = Conv2DTranspose(nfilters, kernel_size=(size, size), strides=strides, padding=padding)(tensor)
    y = concatenate([y, residual], axis=3)
    y = conv_block(y, nfilters)
    return y

def Unet(img_height, img_width, nclasses=2, filters=64):
# down
    input_layer = Input(shape=(img_height, img_width, 3), name='image_input')
    conv1 = conv_block(input_layer, nfilters=filters)
    conv1_out = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = conv_block(conv1_out, nfilters=filters*2)
    conv2_out = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = conv_block(conv2_out, nfilters=filters*4)
    conv3_out = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = conv_block(conv3_out, nfilters=filters*8)
    conv4_out = MaxPooling2D(pool_size=(2, 2))(conv4)
    conv4_out = Dropout(0.5)(conv4_out)
    conv5 = conv_block(conv4_out, nfilters=filters*16)
    conv5 = Dropout(0.5)(conv5)
# up
    deconv6 = deconv_block(conv5, residual=conv4, nfilters=filters*8)
    deconv6 = Dropout(0.5)(deconv6)
    deconv7 = deconv_block(deconv6, residual=conv3, nfilters=filters*4)
    deconv7 = Dropout(0.5)(deconv7) 
    deconv8 = deconv_block(deconv7, residual=conv2, nfilters=filters*2)
    deconv9 = deconv_block(deconv8, residual=conv1, nfilters=filters)
# output
    output_layer = Conv2D(filters=3, kernel_size=(1, 1))(deconv9)
    output_layer = BatchNormalization()(output_layer)
    output_layer = Conv2D(nclasses, 3, activation="softmax", padding="same")(output_layer)

    model = Model(inputs=input_layer, outputs=output_layer, name='Unet')
    return model

model = Unet(512,512)

# |--------------------- Define custom metrics ---------------------|   
def jaccard_distance_loss(y_true, y_pred, smooth=100):
    intersection = K.sum(K.sum(K.abs(y_true * y_pred), axis=-1))
    sum_ = K.sum(K.sum(K.abs(y_true) + K.abs(y_pred), axis=-1))
    jac = (intersection + smooth) / (sum_ - intersection + smooth)
    return (1 - jac) * smooth

def IOU(y_true, y_pred):
    true_pixels = K.argmax(y_true, axis=-1)
    pred_pixels = K.argmax(y_pred, axis=-1)
    true_pixels=K.flatten(true_pixels)
    pred_pixels=K.flatten(pred_pixels)
    true_labels = K.equal(true_pixels, 0) # target label
    pred_labels = K.equal(pred_pixels, 0) # target label
    inter = tf.cast(true_labels & pred_labels,tf.int32)
    union = tf.cast(true_labels | pred_labels,tf.int32)
    iou = K.sum(inter)/K.sum(union)
    return iou

def dice_metric(y_pred, y_true):
    intersection = K.sum(K.sum(K.abs(y_true * y_pred), axis=-1))
    union = K.sum(K.sum(K.abs(y_true) + K.abs(y_pred), axis=-1))
    return 2*intersection / union

# |--------------------- Compile Model ---------------------|
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5), 
              loss='sparse_categorical_crossentropy',
              sample_weight_mode='temporal',
              metrics=['accuracy', dice_metric, IOU, jaccard_distance_loss])

# |--------------------- Train Model ---------------------|
model.fit(train_gen, 
          epochs=10, 
          validation_data=val_gen)

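【Comments】:

Hi. Your accuracy problem may be caused by your labels. The background should be 0 and the foreground (the object to detect) should be 1.
Hmm, yes, I suspected that might be the case too. I swapped background and foreground, but the Dice coefficient still stays constant, now at 0.0023. Here are the results (codeshare.io/G7vm3k)
I realized the model output was wrong; I should be using a sigmoid activation with 1 channel.
I ran into a similar problem. I'm curious why you switched to sigmoid. I'm doing multi-class image segmentation, so I need softmax, right?

Tags: python keras deep-learning metrics image-segmentation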
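
On the label convention raised in the first comment: a quick sanity check is to look at the distinct values in a mask before and after the shift by one. The sketch below uses a small synthetic mask, since the real masks are not shown in the post, and deliberately includes a stray 0 to show what the in-place subtraction does to it on a uint8 array.

```python
import numpy as np

# Synthetic 4x4 mask standing in for a real file (assumption: the actual
# masks are not available here). It contains a stray 0 besides 1 and 2.
mask = np.array(
    [[1, 1, 2, 2],
     [1, 1, 2, 2],
     [1, 0, 2, 2],
     [1, 1, 1, 1]],
    dtype="uint8",
)

raw_values = np.unique(mask)

shifted = mask.copy()
shifted -= 1  # same in-place shift as in Generator.__getitem__
shifted_values = np.unique(shifted)

print("raw values:    ", raw_values)
print("shifted values:", shifted_values)
```

Because uint8 arithmetic wraps around, any pixel that was 0 before the shift ends up as 255 rather than as a negative number, so it is worth confirming that the shifted values really are only 0 and 1.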

【Solution 1】:

Edit (solution)

The model output was wrong. It should be a sigmoid activation with 1 output channel. Changing output_layer = Conv2D(nclasses, 3, activation="softmax", padding="same")(output_layer) to output_layer = Conv2D(1, 1, activation="sigmoid", padding="same")(output_layer) solved my problem.

Also, after reading this post, I decided to use the true positive rate (TPR), also known as recall, sensitivity, or probability of detection (POD), as my main metric.
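
One way to see why the Dice value stayed frozen with the original softmax head: the mask has a single channel, while the softmax output has two channels that sum to 1 at every pixel. After broadcasting, both sums in dice_metric therefore depend only on the labels. The following is a minimal NumPy sketch of that effect, not the author's code; dice_as_posted and softmax are helper names introduced here for illustration, and the arrays are random synthetic data.

```python
import numpy as np

def dice_as_posted(y_true, y_pred):
    # NumPy transcription of dice_metric from the question.
    intersection = np.sum(np.sum(np.abs(y_true * y_pred), axis=-1))
    union = np.sum(np.sum(np.abs(y_true) + np.abs(y_pred), axis=-1))
    return 2 * intersection / union

def softmax(z):
    # Numerically stable softmax over the channel axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Sparse labels with ONE channel, as produced by the generator.
y_true = rng.integers(0, 2, size=(1, 8, 8, 1)).astype("float64")

# Two unrelated "predictions" with TWO softmax channels each.
pred_a = softmax(rng.normal(size=(1, 8, 8, 2)))
pred_b = softmax(rng.normal(size=(1, 8, 8, 2)))

dice_a = dice_as_posted(y_true, pred_a)
dice_b = dice_as_posted(y_true, pred_b)
print(dice_a, dice_b)  # the two values agree up to floating-point rounding
```

Both calls evaluate to 2s / (2s + N), where s is the sum of the labels and N the number of pixels, so training cannot move the metric. With a single sigmoid channel, y_true and y_pred have the same shape and the products are taken pixel by pixel. Under this formula, a value near 2/3, as in the log above, would correspond to a mean label value close to 1, which may be worth checking against the masks.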

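from tensorflow.keras import backend as K

def POD(y_true, y_pred):
    """Probability of detection (recall): TP / (TP + FN)."""
    y_true_pos = K.flatten(y_true)
    y_pred_pos = K.flatten(y_pred)
    # Soft counts: y_pred holds sigmoid probabilities, not thresholded labels.
    true_pos = K.sum(y_true_pos * y_pred_pos)
    false_neg = K.sum(y_true_pos * (1 - y_pred_pos))
    # K.epsilon() avoids division by zero when a batch has no positive pixels.
    return true_pos / (true_pos + false_neg + K.epsilon())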
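
Note that POD multiplies the labels by the raw sigmoid outputs, so it behaves as a "soft" recall rather than a count of thresholded pixels. The NumPy sketch below mirrors the same formula on four made-up pixels to show the difference; pod_numpy and the 0.5 threshold are assumptions introduced for illustration.

```python
import numpy as np

def pod_numpy(y_true, y_pred):
    # Same formula as POD, written with NumPy for a quick offline check.
    y_true = y_true.ravel()
    y_pred = y_pred.ravel()
    true_pos = np.sum(y_true * y_pred)
    false_neg = np.sum(y_true * (1 - y_pred))
    return true_pos / (true_pos + false_neg)

# Four made-up pixels: two foreground, two background.
y_true = np.array([1.0, 1.0, 0.0, 0.0])
y_prob = np.array([0.9, 0.2, 0.8, 0.1])  # hypothetical sigmoid outputs

# Soft recall: uses the probabilities directly -> (0.9 + 0.2) / 2
soft_recall = pod_numpy(y_true, y_prob)

# Hard recall: threshold first, then count -> 1 of 2 foreground pixels found
hard_recall = pod_numpy(y_true, (y_prob > 0.5).astype("float64"))

print(soft_recall, hard_recall)
```

The soft version is about 0.55 and the thresholded version is 0.5 here. Either can serve as a monitoring metric, but only the thresholded one matches the usual definition of recall on the final binary mask.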

【Comments】: