TensorFlow: How to randomly crop input images and labels in the same way?

【Question title】: Tensorflow: How to randomly crop input images and labels in the same way?
【Posted】: 2017-02-11 00:15:27
【Question】:

I am trying to take random crops from images, as is done in Caffe for data augmentation.

I know that TensorFlow already has a function for this:

    img = tf.random_crop(img, [h, w, 3])
    label = tf.random_crop(label, [h, w, 1])

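But I am not sure whether it takes the same crop for the image and the label. Also, this function cannot automatically zero-pad images that are smaller than the crop size [h, w] in one or both dimensions.

Zero-padding, in turn, is done by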

    img = tf.image.resize_image_with_crop_or_pad(img, h, w)
    label = tf.image.resize_image_with_crop_or_pad(label, h, w)

But it only takes center crops, not random crops.

EDIT:
Here is some code showing how the padding can be done:

    # Cropping dimensions (crops of 700 x 800)
    crp_h = tf.constant(700)
    crp_w = tf.constant(800)

    shape = tf.shape(img)
    img_h = shape[0]
    img_w = shape[1]
    img = tf.cond(img_h < crp_h, lambda: tf.image.pad_to_bounding_box(img, 0, 0, crp_h, img_w), lambda: img)
    # Update image dimensions
    shape = tf.shape(img)
    img_h = shape[0]
    img = tf.cond(img_w < crp_w, lambda: tf.image.pad_to_bounding_box(img, 0, 0, img_h, crp_w), lambda: img)
    # Update image dimensions
    shape = tf.shape(img)
    img_w = shape[1]

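Unfortunately, a Python if condition cannot be used here, because the image dimensions are tensors whose values are only known when the graph runs, so the ugly tf.cond(...) is needed.

【Comments】:

Tags: tensorflow

【Solution 1】:

I would suggest concatenating the image with the labels and then randomly cropping them together: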

    import tensorflow as tf

    def random_crop_and_pad_image_and_labels(image, labels, size):
      """Randomly crops `image` together with `labels`.

      Args:
        image: A Tensor with shape [D_1, ..., D_K, N]
        labels: A Tensor with shape [D_1, ..., D_K, M]
        size: A Tensor with shape [K] indicating the crop size.
      Returns:
        A tuple of (cropped_image, cropped_label).
      """
      combined = tf.concat([image, labels], axis=2)
      image_shape = tf.shape(image)
      combined_pad = tf.image.pad_to_bounding_box(
          combined, 0, 0,
          tf.maximum(size[0], image_shape[0]),
          tf.maximum(size[1], image_shape[1]))
      last_label_dim = tf.shape(labels)[-1]
      last_image_dim = tf.shape(image)[-1]
      combined_crop = tf.random_crop(
          combined_pad,
          size=tf.concat([size, [last_label_dim + last_image_dim]],
                         axis=0))
      return (combined_crop[:, :, :last_image_dim],
              combined_crop[:, :, last_image_dim:])

An example:

    cropped_image, cropped_labels = random_crop_and_pad_image_and_labels(
        image=tf.reshape(tf.range(4*4*3), [4, 4, 3]),
        labels=tf.reshape(tf.range(4*4), [4, 4, 1]),
        size=[2, 2])

    with tf.Session() as session:
      print(session.run([cropped_image, cropped_labels]))

This prints something like:

    [array([[[30, 31, 32],
            [33, 34, 35]],

           [[42, 43, 44],
            [45, 46, 47]]], dtype=int32), array([[[10],
            [11]],

           [[14],
            [15]]], dtype=int32)]

And a second example, where the image is smaller than the crop size:

    cropped_image, cropped_labels = random_crop_and_pad_image_and_labels(
        image=tf.reshape(tf.range(4*1*3), [4, 1, 3]),
        labels=tf.reshape(tf.range(4*1), [4, 1, 1]),
        size=[2, 2])

    with tf.Session() as session:
      print(session.run([cropped_image, cropped_labels]))

This prints, for example:

    [array([[[3, 4, 5],
            [0, 0, 0]],

           [[6, 7, 8],
            [0, 0, 0]]], dtype=int32), array([[[1],
            [0]],

           [[2],
            [0]]], dtype=int32)]
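
For readers who want to see the underlying idea without the concatenation trick, here is a minimal NumPy sketch (not part of the original answer). It zero-pads both arrays at the bottom/right, draws a single random offset, and applies it to both. The function name `random_crop_pair` and the `rng` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np

def random_crop_pair(image, labels, size, rng=None):
    """Zero-pads, then crops `image` and `labels` with one shared random offset.

    image: array of shape [H, W, N]; labels: array of shape [H, W, M];
    size: (crop_h, crop_w).
    """
    rng = np.random.default_rng() if rng is None else rng
    crop_h, crop_w = size
    h, w = image.shape[:2]
    # Zero-pad at the bottom/right so both arrays are at least crop-sized.
    pad = ((0, max(crop_h - h, 0)), (0, max(crop_w - w, 0)), (0, 0))
    image = np.pad(image, pad, mode="constant")
    labels = np.pad(labels, pad, mode="constant")
    # Draw the offset once and reuse it for both arrays.
    top = int(rng.integers(0, image.shape[0] - crop_h + 1))
    left = int(rng.integers(0, image.shape[1] - crop_w + 1))
    return (image[top:top + crop_h, left:left + crop_w],
            labels[top:top + crop_h, left:left + crop_w])

# Same toy data as in the examples above.
image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
labels = np.arange(4 * 4).reshape(4, 4, 1)
img_crop, lab_crop = random_crop_pair(image, labels, (2, 2))
```

Because the first image channel equals three times the label value in this toy data, comparing `img_crop[..., 0]` with `3 * lab_crop[..., 0]` is a quick way to check that both crops come from the same window.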

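【Discussion】:

Thank you very much for your answer. It turns out that the padding part still needs work. I assume the function tf.image.pad_to_bounding_box(...) could be used, but unfortunately a Python if cannot be used for the condition. I added pseudocode to my question. I am working on it, but I would appreciate any help.
I think I found it (see my edit to the question).
I added a padding step.
It turns out that TensorFlow throws an error with this because it cannot infer the third dimension of the image and labels (the static shape) at graph construction time. This causes tf.train.batch(...) to fail. I will try to fix it and report back here.
Those are always 3 and 1 respectively, right? If so, you can use image.set_shape((None, None, 3)) and labels.set_shape((None, None, 1)).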
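
A minimal sketch of the set_shape suggestion, assuming a TensorFlow build that provides tf.compat.v1 (1.13 or later, including 2.x). The placeholders are hypothetical stand-ins for the cropped tensors, whose channel count is not known statically.

```python
import tensorflow as tf

with tf.Graph().as_default():
    # Hypothetical stand-ins for the cropped image and labels: rank 3,
    # but with every dimension unknown in the static shape.
    image = tf.compat.v1.placeholder(tf.float32, shape=[None, None, None])
    labels = tf.compat.v1.placeholder(tf.int32, shape=[None, None, None])

    # Fill in the channel dimension of the static shape.
    # Height and width stay unknown here.
    image.set_shape((None, None, 3))
    labels.set_shape((None, None, 1))

    static_image_shape = image.shape.as_list()
    static_labels_shape = labels.shape.as_list()
```

set_shape only refines the static shape information and does not change the data. If the crop height and width are known Python integers, they can presumably be passed in place of None so that the shape is fully defined for batching.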