How to read the tiles into the tensor if the images are tif float32?

【Question title】: How to read the tiles into the tensor if the images are tif float32?
【Posted】: 2021-04-15 08:24:48
【Question】:

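I am trying to run a CNN in which the input images have three channels (rgb) and the label (target) images are grayscale (1 channel). Both the input and label images are float32 and in tif format.

I build the list of image and label tile pairs as follows: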

import glob
import os

def get_train_test_lists(imdir, lbldir):
    # Collect the base name (no directory, no extension) of every .tif image
    imgs = glob.glob(os.path.join(imdir, "*.tif"))
    dset_list = [os.path.splitext(os.path.basename(img))[0] for img in imgs]

    # Image and label tiles share the same file name in different directories
    x_filenames = []
    y_filenames = []
    for img_id in dset_list:
        x_filenames.append(os.path.join(imdir, "{}.tif".format(img_id)))
        y_filenames.append(os.path.join(lbldir, "{}.tif".format(img_id)))

    print("number of images: ", len(dset_list))
    return dset_list, x_filenames, y_filenames

train_list, x_train_filenames, y_train_filenames = get_train_test_lists(img_dir, label_dir)
test_list, x_test_filenames, y_test_filenames = get_train_test_lists(test_img_dir, test_label_dir)

from sklearn.model_selection import train_test_split
x_train_filenames, x_val_filenames, y_train_filenames, y_val_filenames = train_test_split(
    x_train_filenames, y_train_filenames, test_size=0.1, random_state=42)

num_train_examples = len(x_train_filenames)
num_val_examples = len(x_val_filenames)
num_test_examples = len(x_test_filenames)

To read the tiles into tensors, I first defined the image dimensions and batch size:

img_shape = (128, 128, 3)
batch_size = 2

Based on this link, I noticed that TensorFlow has no decoder for tif images. tfio.experimental.image.decode_tiff can be used, but it decodes to a uint8 tensor.

Here is the sample code for png images:

import tensorflow as tf

def _process_pathnames(fname, label_path):
  # We map this function onto each pathname pair
  img_str = tf.io.read_file(fname)
  img = tf.image.decode_png(img_str, channels=3)

  label_img_str = tf.io.read_file(label_path)

  # decode_png returns a uint8 tensor of shape (h, w, c)
  label_img = tf.image.decode_png(label_img_str, channels=1)
  # The label image should have values between 0 and 9, indicating the pixel-wise
  # crop type class or background (0). We take the first channel only.
  label_img = label_img[:, :, 0]
  label_img = tf.expand_dims(label_img, axis=-1)
  return img, label_img

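Is it possible to modify this code, with tf.convert_to_tensor or any other option, to get float32 tensors from the tif images? (I asked this question before, but I don't know how to integrate tf.convert_to_tensor with the code above.)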

【Comments】:

Tags: python tensorflow conv-neural-network

【Solution 1】:

You can read almost any image format and convert it to a numpy array using the Pillow imaging package:

from PIL import Image
import numpy as np

img = Image.open("image.tiff")
img = np.array(img)

print(img.shape, img.dtype)
# (986, 1853, 4) uint8

You can integrate this function into your code, then convert the numpy array to a TensorFlow tensor and apply the appropriate image transformations.
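One way this integration might look is sketched below: the Pillow loader is wrapped in tf.numpy_function so it can be mapped over the filename pairs, in place of the decode_png calls from the question. The names _load_tif, make_dataset and img_channels are hypothetical and not part of the original post. The sketch assumes Pillow can open the files; it reads single-band float32 TIFFs as mode "F", but multi-band float32 TIFFs may need a different reader.

```python
import numpy as np
import tensorflow as tf
from PIL import Image


def _load_tif(path):
    """Read one TIFF with Pillow and return a float32 array of shape (h, w, c)."""
    # tf.numpy_function hands string tensors to Python as bytes objects.
    arr = np.array(Image.open(path.decode("utf-8")), dtype=np.float32)
    if arr.ndim == 2:
        # Single-band image: add a trailing channel axis.
        arr = arr[..., np.newaxis]
    return arr


def make_dataset(x_filenames, y_filenames, img_channels=3, batch_size=2):
    """Build a tf.data pipeline yielding (image, label) float32 batches."""

    def _process_pathnames(fname, label_path):
        # Run the Python loader from inside the tf.data pipeline.
        img = tf.numpy_function(_load_tif, [fname], tf.float32)
        label_img = tf.numpy_function(_load_tif, [label_path], tf.float32)
        # numpy_function drops static shape information, so declare it again.
        img.set_shape([None, None, img_channels])
        label_img.set_shape([None, None, 1])
        return img, label_img

    # str() lets the function accept pathlib.Path objects as well as strings.
    x = [str(p) for p in x_filenames]
    y = [str(p) for p in y_filenames]
    ds = tf.data.Dataset.from_tensor_slices((x, y))
    return ds.map(_process_pathnames).batch(batch_size)
```

With the lists from the question, a call such as make_dataset(x_train_filenames, y_train_filenames, batch_size=batch_size) would then yield float32 batches without passing through uint8. Because the loading runs as ordinary Python, it is not serialized with the graph, which may matter for large datasets.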

Side note: you can use the pathlib package (part of the Python 3 standard library, like os, but simpler to use) to simplify the get_train_test_lists function considerably.

from pathlib import Path

def get_train_test_lists(imdir, lbldir):
    x_filenames = list(Path(imdir).glob("*.tif"))
    y_filenames = [Path(lbldir) / f.name for f in x_filenames]
    dset_list = [f.stem for f in x_filenames]
    return dset_list, x_filenames, y_filenames

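Note that x_filenames and y_filenames now hold pathlib.Path objects rather than strings (they are absolute only if imdir and lbldir are absolute). This should not be a problem in your code; convert them with str() wherever a plain string is expected.

【Discussion】: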

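Thanks for your reply. I know from this link [stackoverflow.com/questions/67093424/… that OpenCV or Pillow can be used to read the images, but I have a large number of images and I am not sure where in the code I should call them. I am also not sure how tf.convert_to_tensor should be added to the code.
So what is your actual question? Converting the images from uint8 to float32? How and where to handle the database reading?
uint8 is an integer in the range 0 to 255. CNNs usually accept float32 tensors normalized to the range 0 to 1, optionally with imagenet normalization. To convert uint8 to normalized float32, divide it by 255.
My first question is about handling the images. The images I use are originally float32 and contain decimal values. They are grayscale images with continuous values: [0 to 790.65], [-2.74174 to 2.4126], [150.87 to 260.45], [-32.927 to 69.333]. The first three are the inputs that form the rgb image, and the last one is the target. I am confused about how to normalize them and feed them to the CNN. Do they need to be converted to uint8?
I plan to split the images into 128x128 tiles, but first I don't know how to normalize them, since they are already float32. I would really appreciate it if you could point me in the right direction.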
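The thread leaves the normalization question open. One common option, shown here only as a sketch, is per-band min-max scaling followed by tiling, which keeps the data in float32 with no conversion to uint8. It assumes the three input ranges quoted in the comment are dataset-wide minima and maxima. The names INPUT_RANGES, min_max_scale and split_into_tiles are hypothetical.

```python
import numpy as np

# Per-band (min, max) ranges quoted in the comment; treating them as
# dataset-wide statistics is an assumption.
INPUT_RANGES = np.array(
    [[0.0, 790.65], [-2.74174, 2.4126], [150.87, 260.45]], dtype=np.float32
)


def min_max_scale(img, ranges=INPUT_RANGES):
    """Scale each band of an (h, w, c) float32 image to the range [0, 1]."""
    lo = ranges[:, 0]
    hi = ranges[:, 1]
    scaled = (img.astype(np.float32) - lo) / (hi - lo)
    # Clip in case a pixel falls slightly outside the assumed range.
    return np.clip(scaled, 0.0, 1.0)


def split_into_tiles(img, tile=128):
    """Cut an (h, w, c) array into non-overlapping tiles, dropping any remainder."""
    h, w, c = img.shape
    rows, cols = h // tile, w // tile
    img = img[: rows * tile, : cols * tile]
    tiles = img.reshape(rows, tile, cols, tile, c).swapaxes(1, 2)
    return tiles.reshape(rows * cols, tile, tile, c)


# Synthetic 300 x 260 image with three bands, standing in for a real raster.
image = np.stack(
    [
        np.full((300, 260), 395.325, dtype=np.float32),
        np.full((300, 260), -2.74174, dtype=np.float32),
        np.full((300, 260), 260.45, dtype=np.float32),
    ],
    axis=-1,
)
tiles = split_into_tiles(min_max_scale(image))
print(tiles.shape, tiles.dtype)
```

Standardizing each band to zero mean and unit variance is another reasonable choice. Whether the target band should also be rescaled depends on the loss and the output activation of the network.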