Training with a grayscale image dataset - MASKRCNN Matterport (answers)

[Question title]: Training with Grey images dataset - MASKRCNN Matterport
[Posted]: 2021-07-08 06:11:22
[Question]:

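I am trying to train on a custom dataset of grayscale images.

Matterport's Mask R-CNN is designed for RGB datasets.

These are the steps I have followed so far for the grayscale dataset.

Step 1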

> class DetectorConfig(Config):
>     """Configuration for training pneumonia detection on the RSNA pneumonia dataset.
>     Overrides values in the base Config class https://github.com/matterport/Mask_RCNN/blob/master/mrcnn/config.py.
>     """
>     IMAGE_CHANNEL_COUNT = 1
>     MEAN_PIXEL = [123.7] # this value is the one that I chose

Step 2

> def load_image(self, image_id):
>         # Load image
>         image = skimage.io.imread(self.image_info[image_id]['path'])
>         # Convert to grayscale for consistency.
>         if image.ndim != 1:
>             image = skimage.color.gray2rgb(image) #Instead of rgb2gray(image)
> 
>         # Extending the size of the image to be (h,w,1)
>         image = image[..., np.newaxis]
>         return image

Alternative step 2

>     def load_image(self, image_id):
>         """Load the specified image and return a [H,W,3] Numpy array."""
>         # Load image
>         image = skimage.io.imread(self.image_info[image_id]['path'])         
>         image = image[..., np.newaxis] # Extending the size of the image to be (h,w,1)
>         return image

Alternative step 2a

>     def load_image(self, image_id):
>         """Load the specified image and return a [H,W,3] Numpy array."""
>         # Load image
>         image = cv2.imread(self.image_info[image_id]['path'])         
>         image = image[..., np.newaxis] # Extending the size of the image to be (h,w,1)
>         return image

Step 3

> model.load_weights(COCO_MODEL_PATH, by_name=True,
>                         exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", 
>                                  "mrcnn_bbox", "mrcnn_mask", "conv1"])

Step 4

>  layer_regex = {
>             # all layers but the backbone
>             "heads": r"(conv1\_.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
> 

> def load_image(self, image_id):
>      image = image[..., np.newaxis]

Step 5

> def resize_image(image, min_dim=None, max_dim=None, min_scale=None, mode="square"):
> padding = [(top_pad, bottom_pad), (left_pad, right_pad)]
> image = np.pad(image, padding, mode='constant', constant_values=0)

Step 6

>  if len(image.shape) != 3 or image.shape[2] != 3:
>         image = np.squeeze(image, axis = -1)
>         image = np.stack((image,) * 3, -1)

When I run this code, I get this error from train(model): ValueError: len(output_shape) cannot be smaller than the image dimensions

If I use rgb2gray(image) in step 2 instead, the error is different:

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,2) and requested shape (3,2)

Alternative step 2

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,2) and requested shape (3,2)

Alternative step 2a

ValueError: len(output_shape) cannot be smaller than the image dimensions

Please help. All images in the dataset have the same dimensions.

[Comments]:

Tags: image-segmentation

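[Solution 1]:

The problem you are facing is caused by a dimension mismatch.

The cause of len(output_shape) cannot be smaller than the image dimensions is that your source image is 4D, (H, W, 3, 1): three channels plus an extra trailing axis. The likely culprit is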
```
image = skimage.color.gray2rgb(image)
image = image[..., np.newaxis]
```
Here you first convert the image to 3 channels and then add a new axis. To avoid this, use either image = skimage.color.gray2rgb(image) or image = image[..., np.newaxis] in load_image, but not both. Here is what I use, for reference:
```
def load_image(self, image_id):
    """Load the specified grayscale image and return a [H,W,1] Numpy array.
    """
    # Load image
    image = skimage.io.imread(self.image_info[image_id]['path'])
    image = image[..., np.newaxis]
    return image
```
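To see why combining the two calls fails, here is a minimal sketch of the shapes involved. It only needs NumPy: np.stack is used as a stand-in for skimage.color.gray2rgb (an assumption made so the example runs without scikit-image), and the 4x5 array is a made-up placeholder for a grayscale image loaded from disk.

```python
import numpy as np

# Placeholder for a grayscale image as loaded from disk: shape (H, W).
gray = np.zeros((4, 5), dtype=np.uint8)

# Stand-in for skimage.color.gray2rgb: replicate the single channel three times.
rgb = np.stack((gray,) * 3, axis=-1)

both = rgb[..., np.newaxis]            # gray2rgb followed by np.newaxis
only_newaxis = gray[..., np.newaxis]   # np.newaxis alone

print(rgb.shape)           # (4, 5, 3)
print(both.shape)          # (4, 5, 3, 1)
print(only_newaxis.shape)  # (4, 5, 1)
```

Only the last shape matches IMAGE_CHANNEL_COUNT = 1; the 4D array is what triggers the output_shape error during resizing.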
As for the related error, ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,2) and requested shape (3,2) occurs because your image is 3D, i.e. (H, W, 1) after image = image[..., np.newaxis], while your padding = [(top_pad, bottom_pad), (left_pad, right_pad)] only covers two axes. Reverting to the original padding
```
padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
```
will fix this error.
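The padding mismatch can be reproduced outside Mask R-CNN with a few lines of NumPy. This is only a sketch: the image size and the pad values are hypothetical, chosen just to make the shapes easy to follow.

```python
import numpy as np

image = np.zeros((4, 5, 1), dtype=np.uint8)  # (H, W, 1) grayscale image
top_pad, bottom_pad, left_pad, right_pad = 1, 1, 2, 2  # hypothetical pad sizes

# Two (before, after) pairs for a three-axis array: np.pad raises ValueError.
try:
    np.pad(image, [(top_pad, bottom_pad), (left_pad, right_pad)],
           mode='constant', constant_values=0)
    raised = False
except ValueError:
    raised = True

# One pair per axis, with (0, 0) leaving the channel axis untouched.
padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
padded = np.pad(image, padding, mode='constant', constant_values=0)

print(raised, padded.shape)  # True (6, 9, 1)
```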

Hope this solves your problem.

[Comments]: