Getting started with Tensorflow - Split image into sub-images

【Question title】: Getting started with Tensorflow - Split image into sub-images
【Posted】: 2016-11-09 04:36:27
【Question】:

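This is my first time working with convolutional neural networks and Tensorflow.

I am trying to implement a convolutional neural network that can extract blood vessels from digital retinal images. I am using the publicly available Drive database (the images are in .tif format).

Since my images are very large, my idea is to split them into sub-images of size 28x28x1 (the "1" is the green channel, the only one I need). To create the training set, I iteratively crop a random 28x28 batch from each image and train the network on that set.

Now I would like to test my trained network on one of the large images in the database (that is, I want to apply the network to a complete eye). Since my network was trained on sub-images of size 28x28, the idea is to split the eye into "n" sub-images, pass them through the network, reassemble them, and display the result, as shown in Fig. 1:

Fig1

I tried a few functions, such as tf.extract_image_patches or tf.train.batch, but I would like to know the right way to do this.

Below is a snippet of my code. The function I am stuck on is split_image(image).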

import numpy
import os
import random

from PIL import Image
import tensorflow as tf

BATCH_WIDTH = 28;
BATCH_HEIGHT = 28;

NUM_TRIALS = 10;

class Drive:
    def __init__(self,train):
        self.train = train

class Dataset:
    def __init__(self, inputs, labels):
        self.inputs = inputs
        self.labels = labels
        self.current_batch = 0

    def next_batch(self):
        batch = self.inputs[self.current_batch], self.labels[self.current_batch]
        self.current_batch = (self.current_batch + 1) % len(self.inputs)
        return batch


#counts the number of black pixel in the batch
def mostlyBlack(image):
    pixels = image.getdata()
    black_thresh = 50
    nblack = 0
    for pixel in pixels:
        if pixel < black_thresh:
            nblack += 1

    return nblack / float(len(pixels)) > 0.5

#crop the image starting from a random point
def cropImage(image, label):
    width  = image.size[0]
    height = image.size[1]
    x = random.randrange(0, width - BATCH_WIDTH)
    y = random.randrange(0, height - BATCH_HEIGHT)
    image = image.crop((x, y, x + BATCH_WIDTH, y + BATCH_HEIGHT)).split()[1]
    label = label.crop((x, y, x + BATCH_WIDTH, y + BATCH_HEIGHT)).split()[0]
    return image, label

def split_image(image):

    ksizes_ = [1, BATCH_WIDTH, BATCH_HEIGHT, 1]
    strides_ = [1, BATCH_WIDTH, BATCH_HEIGHT, 1]

    input = numpy.array(image.split()[1])
    #input = tf.reshape((input), [image.size[0], image.size[1]])

    #input = tf.train.batch([input],batch_size=1)
    split = tf.extract_image_patches(input, padding='VALID', ksizes=ksizes_, strides=strides_, rates=[1,28,28,1], name="asdk")

#creates NUM_TRIALS images from a dataset
def create_dataset(images_path, label_path):
    files = os.listdir(images_path)
    label_files = os.listdir(label_path)

    images = [];
    labels = [];
    t = 0
    while t < NUM_TRIALS:
        index = random.randrange(0, len(files))
        if files[index].endswith(".tif"):
            image_filename = images_path + files[index]
            label_filename = label_path  + label_files[index]
            image = Image.open(image_filename)
            label = Image.open(label_filename)
            image, label = cropImage(image, label)
            if not mostlyBlack(image):
                #images.append(tf.convert_to_tensor(numpy.array(image)))
                #labels.append(tf.convert_to_tensor(numpy.array(label)))
                images.append(numpy.array(image))
                labels.append(numpy.array(label))

                t+=1

    image = Image.open(images_path + files[1])
    split_image(image)

    train = Dataset(images, labels)
    return Drive(train)

【Comments】:

I think you mean patch rather than batch, which is confusing.

Tags: python dataset tensorflow image-segmentation

【Solution 1】:

You can cut the image into tiles without any loops by combining reshape and transpose calls:

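def split_image(image3, tile_size):
    # image3: [H, W, C] tensor; tile_size: [tile_h, tile_w], which must divide H and W
    image_shape = tf.shape(image3)
    # Cut the width into vertical strips: [H, W / tile_w, tile_w, C]
    tile_rows = tf.reshape(image3, [image_shape[0], -1, tile_size[1], image_shape[2]])
    # Bring the strip index to the front: [W / tile_w, H, tile_w, C]
    serial_tiles = tf.transpose(tile_rows, [1, 0, 2, 3])
    # Cut each strip into tiles: [B, tile_h, tile_w, C]
    return tf.reshape(serial_tiles, [-1, tile_size[0], tile_size[1], image_shape[2]])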

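Here image3 is a 3-dimensional tensor (e.g. an image) and tile_size is a pair of values [H, W] specifying the size of a tile. The output is a tensor of shape [B, H, W, C]. In your case, the call would be: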

tiles = split_image(image, [28, 28])

which yields a tensor of shape [B, 28, 28, 1]. You can also reassemble the original image from the tiles by performing these operations in reverse:

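def unsplit_image(tiles4, image_shape):
    # tiles4: [B, tile_h, tile_w, C]; the tile width is axis 2
    tile_width = tf.shape(tiles4)[2]
    # Stack the tiles back into vertical strips: [W / tile_w, H, tile_w, C]
    serialized_tiles = tf.reshape(tiles4, [-1, image_shape[0], tile_width, image_shape[2]])
    # Put the strips side by side again: [H, W / tile_w, tile_w, C]
    rowwise_tiles = tf.transpose(serialized_tiles, [1, 0, 2, 3])
    return tf.reshape(rowwise_tiles, [image_shape[0], image_shape[1], image_shape[2]])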

Here tiles4 is a 4-D tensor of shape [B, H, W, C] and image_shape is the shape of the original image. In your case, the call could be:

image = unsplit_image(tiles, tf.shape(image))
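As a quick sanity check of the reshape/transpose logic, here is a minimal sketch of the same two steps in NumPy, whose reshape and transpose use the same row-major semantics as tf.reshape and tf.transpose. The names split_image_np and unsplit_image_np and the 56x84 test image are assumptions made for illustration; they are not part of the answer above.

```python
import numpy as np

def split_image_np(image3, tile_size):
    """NumPy analog of split_image: [H, W, C] -> [B, tile_h, tile_w, C]."""
    h, w, c = image3.shape
    th, tw = tile_size
    strips = image3.reshape(h, w // tw, tw, c)   # [H, W/tw, tw, C]
    strips = strips.transpose(1, 0, 2, 3)        # [W/tw, H, tw, C]
    return strips.reshape(-1, th, tw, c)         # [B, th, tw, C]

def unsplit_image_np(tiles4, image_shape):
    """Inverse of split_image_np: [B, tile_h, tile_w, C] -> [H, W, C]."""
    h, w, c = image_shape
    tw = tiles4.shape[2]
    strips = tiles4.reshape(-1, h, tw, c)        # [W/tw, H, tw, C]
    rowwise = strips.transpose(1, 0, 2, 3)       # [H, W/tw, tw, C]
    return rowwise.reshape(h, w, c)

# Hypothetical test image: 2 tile rows x 3 tile columns of 28x28 tiles
image = np.arange(56 * 84, dtype=np.float32).reshape(56, 84, 1)
tiles = split_image_np(image, (28, 28))
restored = unsplit_image_np(tiles, image.shape)

print(tiles.shape)                                      # (6, 28, 28, 1)
print(np.array_equal(tiles[1], image[28:56, 0:28]))     # True
print(np.array_equal(restored, image))                  # True
```

Note the tile order: tiles run down each vertical strip first, so tiles[1] is the tile below tiles[0], not the one to its right. This matters only if you index individual tiles; the round trip restores the image either way.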

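Note that this only works if the image size is divisible by the tile size. If it is not, you need to pad the image up to the next multiple of the tile size: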

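def pad_image_to_tile_multiple(image3, tile_size, padding="CONSTANT"):
    imagesize = tf.shape(image3)[0:2]
    # Round each dimension up to the next multiple of the tile size.
    # tf.cast and tf.math.ceil stand in for tf.to_int32 and tf.ceil,
    # which are not available in TensorFlow 2.
    padding_ = tf.cast(tf.math.ceil(imagesize / tile_size), tf.int32) * tile_size - imagesize
    return tf.pad(image3, [[0, padding_[0]], [0, padding_[1]], [0, 0]], padding)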

You would call it like this:

image = pad_image_to_tile_multiple(image, [28,28])

Then, after reassembling the image from the tiles, remove the padding by slicing:

image = image[0:original_size[0], 0:original_size[1], :]
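To make the padding arithmetic concrete, the following is a small NumPy sketch of the pad-then-crop round trip. The helper name pad_to_tile_multiple_np and the 584x565 image size are assumptions chosen for illustration only; substitute the real dimensions of your images.

```python
import numpy as np

def pad_to_tile_multiple_np(image3, tile_size):
    """Zero-pad [H, W, C] at the bottom and right up to a multiple of the tile size."""
    h, w = image3.shape[:2]
    th, tw = tile_size
    pad_h = -h % th   # rows needed to reach the next multiple of th (0 if already aligned)
    pad_w = -w % tw   # columns needed to reach the next multiple of tw
    return np.pad(image3, [(0, pad_h), (0, pad_w), (0, 0)], mode="constant")

# Hypothetical image size, not taken from the question
original = np.ones((584, 565, 1), dtype=np.float32)
padded = pad_to_tile_multiple_np(original, (28, 28))
num_tiles = (padded.shape[0] // 28) * (padded.shape[1] // 28)

print(padded.shape)   # (588, 588, 1)
print(num_tiles)      # 441

# After processing and reassembling, slice the padding off again
restored = padded[0:original.shape[0], 0:original.shape[1], :]
print(np.array_equal(restored, original))   # True
```

Because the padding is added only at the bottom and right edges, slicing from index 0 up to the original height and width is enough to recover the unpadded image.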

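【Comments】:

【Solution 2】:

A simple way to cut a batch of images of shape (-1, X, Y, 3) into an N x N grid of tiles (X and Y must be divisible by N):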

crops = tf.reshape(tensor_images, (-1, N, tensor_images.shape[1]//N, N, tensor_images.shape[2]//N, tensor_images.shape[3]))
crops = tf.transpose(crops, [0, 1, 3, 2, 4, 5])

You can check the result like this:

import matplotlib.pyplot as plt
import tensorflow as tf

def show_images(segs, x, y):
  # Plot an x-by-y grid of tiles; figsize is (width, height)
  fig, axs = plt.subplots(x, y, figsize=(y*2, x*2))
  for i in range(x):
    for j in range(y):
      axs[i, j].imshow(segs[i][j], cmap=plt.cm.binary, vmin=0, vmax=1)
  plt.show()
  plt.close()

# image_batch: array of shape (batch, X, Y, C), with X and Y divisible by 8
tensor_images = tf.convert_to_tensor(image_batch, dtype=tf.float32)
crops = tf.reshape(tensor_images, (-1, 8, tensor_images.shape[1]//8, 8,
                                   tensor_images.shape[2]//8, tensor_images.shape[3]))
crops = tf.transpose(crops, [0, 1, 3, 2, 4, 5])
show_images(crops.numpy()[0], 8, 8)
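The grid layout can also be verified numerically rather than visually. The sketch below repeats the same reshape and transpose in NumPy (which orders elements the same way as tf.reshape and tf.transpose) and compares one tile against a direct slice of the input. The function name grid_crops_np and the random 64x96 batch are assumptions made for this example.

```python
import numpy as np

def grid_crops_np(images, n):
    """Cut (batch, X, Y, C) into an n x n grid: (batch, n, n, X//n, Y//n, C)."""
    b, x, y, c = images.shape
    crops = images.reshape(b, n, x // n, n, y // n, c)
    # Move both grid indices in front of the within-tile indices
    return crops.transpose(0, 1, 3, 2, 4, 5)

# Hypothetical batch: 2 RGB images of 64x96 pixels
images = np.random.rand(2, 64, 96, 3).astype(np.float32)
crops = grid_crops_np(images, 8)
print(crops.shape)   # (2, 8, 8, 8, 12, 3)

# crops[b, i, j] is the tile in grid row i, grid column j of image b
tile_h, tile_w = 64 // 8, 96 // 8
i, j = 3, 5
expected = images[1, i * tile_h:(i + 1) * tile_h, j * tile_w:(j + 1) * tile_w]
print(np.array_equal(crops[1, i, j], expected))   # True
```

Unlike the strip-based split in Solution 1, this layout keeps the two grid indices separate, so a tile can be addressed directly by its row and column.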

【Comments】: