TensorFlow: Data input for a convolutional network (answers)

[Question title]: TensorFlow: Data input for a convolutional network
[Posted]: 2020-11-20 02:19:34
[Question]:

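I have an "image" whose data consists of:

Points: x and y columns

Features: red, green, blue, and weight columns (per pixel), one entry for each point

They are saved as tensors of length 100 per column, and I have about 1000 rows in total. I don't really understand how the input works, so I'd like to know how to feed this data to a basic convolutional network like the one shown in the TensorFlow example: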

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(1))

https://www.tensorflow.org/tutorials/images/cnn

[Question comments]:

Tags: python tensorflow machine-learning input deep-learning

[Answer 1]:

There are two reasons to feed input to a model: training and inference (prediction). For training, you compile the model and then fit it.
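A minimal sketch of that compile-then-fit flow, followed by a prediction. The random stand-in data, the reduced model, and the choice of optimizer and loss are all assumptions made for illustration; they are not taken from the question.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical stand-in data: 16 random 32x32 RGB "images" with one target each.
rng = np.random.default_rng(0)
images = rng.random((16, 32, 32, 3)).astype("float32")
targets = rng.random((16, 1)).astype("float32")

# A reduced model with the same input shape as the one in the question.
model = models.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1),
])

# Training: compile picks the optimizer and loss, fit runs the training loop.
model.compile(optimizer="adam", loss="mse")
history = model.fit(images, targets, batch_size=8, epochs=1, verbose=0)

# Inference: predict expects the same (batch, height, width, channels) layout.
predictions = model.predict(images[:4], verbose=0)
print(predictions.shape)  # (4, 1)
```

The point to take away is the array layout: whatever the data source, the model receives a batch of shape (batch, height, width, channels).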

There are a few ways to load images. You can load one image at a time, or load many images in batches. To load images from a directory, use tf.keras.preprocessing.image_dataset_from_directory.

data_file = 'image_file_path'

img = keras.preprocessing.image.load_img(
    data_file, target_size=(img_height, img_width)
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch

predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])
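To make the last three lines of the snippet above concrete, here is a NumPy sketch of what adding a batch axis and applying softmax do. The image size and the logits are hypothetical values, and NumPy is used only as a stand-in for the TensorFlow calls.

```python
import numpy as np

# Hypothetical decoded image: height 4, width 4, 3 color channels.
img_array = np.zeros((4, 4, 3), dtype=np.float32)

# Add a leading batch axis, the NumPy analogue of tf.expand_dims(img_array, 0).
batch = np.expand_dims(img_array, 0)
print(batch.shape)  # (1, 4, 4, 3)

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)

# Hypothetical raw model outputs (logits) for one image and three classes.
logits = np.array([2.0, 1.0, 0.1])
score = softmax(logits)
print(int(np.argmax(score)))  # 0
```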

data_dir = 'image_dir_path'

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  seed=0,
  image_size=(img_height, img_width),
  batch_size=batch_size)

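Reference: https://www.tensorflow.org/tutorials/load_data/images

[Comments]:

Thanks, but this doesn't really answer my question about how to use the data I described as input.

[Answer 2]:

Here are three examples you can read through, play with, debug, and hack on to get hands-on experience in a notebook or IDE.

1. Generate a tf.data.Dataset from image files in a directory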

tf.keras.preprocessing.image_dataset_from_directory(
    directory, labels='inferred', label_mode='int', class_names=None,
    color_mode='rgb', batch_size=32, image_size=(256, 256), shuffle=True, seed=None,
    validation_split=None, subset=None, interpolation='bilinear', follow_links=False
)

2. TensorFlow 2.0 - Loading images into TensorFlow

import pathlib
data_dir = tf.keras.utils.get_file(origin='https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',fname='flower_photos', untar=True)
data_dir = pathlib.Path(data_dir)

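pathlib is a module that offers classes representing file system paths with semantics appropriate for different operating systems (from docs.python.org).
We use tf.keras.utils.get_file to download the archive and store it under the name fname='flower_photos'. (As a reminder, the CSV-loading post used train_file_path = tf.keras.utils.get_file("train.csv", TRAIN_DATA_URL).)
Next, we wrap the result in pathlib.Path to get an explicit file system path; data_dir then displays as PosixPath('/root/.keras/datasets/flower_photos').
Our flower_photos files are located in the '.keras/datasets' directory. (FYI, .keras is a hidden directory: it is not visible by default, but it exists.)

Loading with tf.data.Dataset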

dataset = tf.data.Dataset.from_tensor_slices((df.values, target.values)) 
### Or...
train_dataset = tf.data.Dataset.from_tensor_slices((train_examples, 
train_labels))
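For in-memory data like the point columns described in the question, one possible route is to scatter the points into a dense image array first. The sketch below is only an illustration under loud assumptions: x and y are integer pixel coordinates on a 32x32 grid, each sample has 100 points with four features (red, green, blue, weight), and the random data is a hypothetical stand-in.

```python
import numpy as np

GRID = 32        # assumed image height and width
N_POINTS = 100   # points per sample, as described in the question
N_SAMPLES = 10   # hypothetical number of samples

rng = np.random.default_rng(0)
# Hypothetical stand-in data: integer pixel coordinates and 4 features per point.
xs = rng.integers(0, GRID, size=(N_SAMPLES, N_POINTS))
ys = rng.integers(0, GRID, size=(N_SAMPLES, N_POINTS))
feats = rng.random((N_SAMPLES, N_POINTS, 4)).astype(np.float32)  # r, g, b, weight

def rasterize(x, y, f, grid=GRID):
    """Scatter per-point features into a dense (grid, grid, channels) array."""
    image = np.zeros((grid, grid, f.shape[-1]), dtype=np.float32)
    image[y, x] = f  # row index is y, column index is x; duplicate points overwrite one another
    return image

images = np.stack([rasterize(x, y, f) for x, y, f in zip(xs, ys, feats)])
print(images.shape)  # (10, 32, 32, 4)
```

An array in this (samples, height, width, channels) layout can be passed to tf.data.Dataset.from_tensor_slices together with the labels, and the first Conv2D layer would then need input_shape=(32, 32, 4) rather than (32, 32, 3).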


BATCH_SIZE = 32
IMG_HEIGHT = 224
IMG_WIDTH = 224

Create a dataset of the file paths.

list_ds = tf.data.Dataset.list_files(str(data_dir/'*/*'))

To make this easier to follow, let's see what list_ds looks like.

for f in list_ds.take(1):
  print(f)

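str(data_dir/'*/*') converts data_dir/'*/*' to a string: /root/.keras/datasets/flower_photos/*/*. The first asterisk is a placeholder for the class name (the subdirectory), and the second is a placeholder for the photo's file name.
tf.data.Dataset.list_files(...) stores the matching file paths as string tensors.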
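The path handling can be tried without TensorFlow. This sketch uses only the standard library; the example file name is hypothetical, and fnmatchcase is only a rough stand-in for the glob matching that list_files performs.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Path taken from the text above; PurePosixPath avoids touching the real file system.
data_dir = PurePosixPath('/root/.keras/datasets/flower_photos')

# The "/" operator joins path components; str() turns the result into a plain string.
pattern = str(data_dir / '*/*')
print(pattern)  # /root/.keras/datasets/flower_photos/*/*

# Hypothetical file name, used only to show what the pattern matches.
example = '/root/.keras/datasets/flower_photos/roses/example.jpg'
print(fnmatchcase(example, pattern))  # True
```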

Get the label from the file path with the get_label() function:

def get_label(file_path):
  parts = tf.strings.split(file_path, '/')
  return parts[-2] == CLASS_NAMES

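tf.strings.split splits file_path using '/' as the delimiter. The resulting path components are stored in parts, so parts[-2] is the name of the class directory; comparing it with CLASS_NAMES yields a boolean vector that is True at the matching class.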
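The same labeling logic in plain Python, as a sketch. CLASS_NAMES is not defined anywhere in this answer, so the class order below is a hypothetical choice, and the file path is made up for illustration.

```python
# Hypothetical class order; the answer does not show how CLASS_NAMES is defined.
CLASS_NAMES = ['tulips', 'sunflowers', 'roses', 'dandelion', 'daisy']

def get_label(file_path):
    """Return one boolean per class, True for the file's parent directory."""
    parts = file_path.split('/')  # e.g. [..., 'flower_photos', 'dandelion', 'example.jpg']
    return [name == parts[-2] for name in CLASS_NAMES]

# Hypothetical file path.
label = get_label('/root/.keras/datasets/flower_photos/dandelion/example.jpg')
print(label)  # [False, False, False, True, False]
```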

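Decode the image with the decode_img() function. "Decoding" is a loose term, but it essentially means converting the compressed file a person works with into the array form a computer works with. Here, decode_img turns the image into a grid of RGB values.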

def decode_img(img):
  # Decode the JPEG bytes into a 3D uint8 tensor (color image, 3 channels).
  img = tf.image.decode_jpeg(img, channels=3)
  # Convert the uint8 tensor to floats in the [0, 1] range.
  img = tf.image.convert_image_dtype(img, tf.float32)
  # Resize the image to 224x224; tf.image.resize expects [height, width].
  return tf.image.resize(img, [IMG_HEIGHT, IMG_WIDTH])

Combine get_label() and decode_img() in a process_path() function so that we get an (image, label) pair for a given file_path.

def process_path(file_path):
  label = get_label(file_path)
  img = tf.io.read_file(file_path)
  img = decode_img(img)
  return img, label

Two lines of the code above have not been explained yet.

def decode_img(img):
  img = tf.image.decode_jpeg(img, channels=3)

This line converts the compressed string into a 3D uint8 tensor. It is needed because process_path calls tf.io.read_file(file_path), which reads the entire contents of the file and returns them as a string.

def process_path(file_path):
  img = tf.io.read_file(file_path)
Create a dataset of (image, label) pairs

We use Dataset.map and set num_parallel_calls so that multiple images are loaded in parallel.

AUTOTUNE = tf.data.experimental.AUTOTUNE  # let tf.data choose the level of parallelism
labeled_ds = list_ds.map(process_path, num_parallel_calls=AUTOTUNE)

Let's check what is in labeled_ds.

for image, label in labeled_ds.take(1):
  print("Image shape: ", image.numpy().shape)
  print("Label: ", label.numpy())

Image shape:  (224, 224, 3)
Label:  [False False False True False]

The image is 224x224 because decode_img() resized it, and the 3 is for the RGB color channels. The label [False False False True False] means this example belongs to the dandelion class.

Prepare for training

For efficiency, we use the tf.data API.

def prepare_for_training(ds, cache=True, shuffle_buffer_size=1000):
  
  if cache:
    if isinstance(cache, str):
      ds = ds.cache(cache)
    else:
      ds = ds.cache()
  ds = ds.shuffle(buffer_size=shuffle_buffer_size)
  ds = ds.repeat() #repeat forever
  ds = ds.batch(BATCH_SIZE)
  ds = ds.prefetch(buffer_size=AUTOTUNE)
  return ds

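isinstance(cache, str) returns True if cache is a string; in that case it is used as the file name of an on-disk cache, otherwise the dataset is cached in memory.
Caching saves time (we will check this later) because the data is kept in memory and does not need to be loaded repeatedly. (In a Colab environment, it is kept in Colab's memory.)
prefetch prepares the next batch in the background while the model is training.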
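To build intuition for the shuffle, repeat, and batch steps in prepare_for_training, here is a pure-Python sketch with generators. It is an illustration of the idea (a fixed-size shuffle buffer, an endless restart, grouping into batches), not tf.data's actual implementation, and it leaves out caching and prefetching.

```python
import random
from itertools import islice

def shuffle(items, buffer_size, rng):
    """Yield items in random order using a fixed-size buffer."""
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            # Buffer is full: emit one random element to make room.
            yield buffer.pop(rng.randrange(len(buffer)))
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

def repeat(make_items):
    """Restart the input forever, in the spirit of ds.repeat()."""
    while True:
        yield from make_items()

def batch(items, batch_size):
    """Group consecutive items into lists of batch_size."""
    items = iter(items)
    while True:
        chunk = list(islice(items, batch_size))
        if not chunk:
            return
        yield chunk

rng = random.Random(0)
# Ten hypothetical samples, shuffled with a buffer of 4, repeated, in batches of 3.
stream = batch(repeat(lambda: shuffle(range(10), 4, rng)), batch_size=3)
first_batches = list(islice(stream, 5))
print([len(b) for b in first_batches])  # [3, 3, 3, 3, 3]
```

Because repeat comes before batch, a batch can span the boundary between two passes over the data, which is why every batch here has the full size.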

train_ds = prepare_for_training(labeled_ds)
image_batch, label_batch = next(iter(train_ds))
# show_batch is a plotting helper that is not defined in this answer.
show_batch(image_batch.numpy(), label_batch.numpy())

Performance

We check performance by measuring the time it takes to load images.

import time
default_timeit_steps = 1000
def timeit(ds, steps=default_timeit_steps):
  start = time.time()
  it = iter(ds)
  for i in range(steps):
    batch = next(it)
    if i%10 == 0:
      print('.',end='')
  print()
  end = time.time()
  duration = end-start
  print("{} batches: {} s".format(steps, duration))
  print("{:0.5f} Images/s".format(BATCH_SIZE*steps/duration))

Comparing keras.preprocessing and tf.data

#keras.preprocessing - *see example #1 above*
timeit(train_data_gen)

1000 batches: 97.74200057983398 s
327.39252 Images/s

#tf.data
timeit(train_ds)

1000 batches: 16.27811074256897 s
1965.83010 Images/s

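In this run, tf.data is about 6 times faster than the keras.preprocessing "data generator", and using .cache makes tf.data about 2 times faster again.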
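The speedup can be checked from the two reported durations with the same formula that timeit uses. Only the durations are taken from the output above; the variable names are made up for this sketch.

```python
BATCH_SIZE = 32
STEPS = 1000

# Durations in seconds, copied from the two timeit runs above.
keras_duration = 97.74200057983398
tfdata_duration = 16.27811074256897

# Same formula as in timeit: images per second = batch size * steps / duration.
keras_rate = BATCH_SIZE * STEPS / keras_duration
tfdata_rate = BATCH_SIZE * STEPS / tfdata_duration

print(round(keras_rate, 2))   # 327.39
print(round(tfdata_rate, 2))  # 1965.83
print(round(tfdata_rate / keras_rate, 1))  # 6.0
```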

3. Loading and processing an image dataset

Reading a directory of images on disk

Set up your code:

import numpy as np
import os
import PIL
import PIL.Image
import tensorflow as tf
import tensorflow_datasets as tfds

Set the image dataset path

import pathlib
dataset_path = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file(origin=dataset_path, 
                                   fname='flower_photos', 
                                   untar=True)
data_dir = pathlib.Path(data_dir)
image_count = len(list(data_dir.glob('*/*.jpg')))
print(image_count)

Each folder in the dataset contains a different class of flowers

roses = list(data_dir.glob('roses/*'))
PIL.Image.open(str(roses[0])) # index 0, 1, 2, 3...

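Load and create the dataset using keras.preprocessing

Define some parameters for the loader, then load the images from the directory on disk.

Use a validation split when developing your model. We will use 80% of the images for training and 20% for validation.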

batch_size = 32
img_height = 180
img_width = 180


train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)


val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
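Both calls above pass the same seed and the same validation_split, which keeps the two subsets consistent with each other. The sketch below illustrates that idea with the standard library only; it is not how Keras implements the split internally, and split_files and the file names are hypothetical.

```python
import random

def split_files(file_names, validation_split, subset, seed):
    """Shuffle deterministically, then return the requested part of the split."""
    shuffled = sorted(file_names)
    random.Random(seed).shuffle(shuffled)  # same seed gives the same order every time
    n_val = int(len(shuffled) * validation_split)
    n_train = len(shuffled) - n_val
    return shuffled[:n_train] if subset == "training" else shuffled[n_train:]

# Hypothetical file names.
files = [f"img_{i:03d}.jpg" for i in range(100)]
train = split_files(files, 0.2, "training", seed=123)
val = split_files(files, 0.2, "validation", seed=123)
print(len(train), len(val))  # 80 20
```

With different seeds in the two calls, the shuffled orders would differ and the same file could end up in both subsets.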

Print the class names of the dataset:

class_names = train_ds.class_names
print(class_names)

Visualize the data and get the tensor shapes

import matplotlib.pyplot as plt

plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")

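image_batch is a tensor of shape (32, 180, 180, 3). This is a batch of 32 images of shape 180x180x3 (the last dimension refers to the RGB color channels). labels_batch is a tensor of shape (32,); these are the labels corresponding to the 32 images.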

for image_batch, labels_batch in train_ds:
  print(image_batch.shape)
  print(labels_batch.shape)
  break

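Standardizing the data

RGB channel values are in the [0, 255] range. This is not ideal for a neural network; in general, you should try to make your input values small. Here, we standardize the values to [0, 1] with a Rescaling layer. You can apply the layer to the dataset with map, or you can include it in the model definition to simplify deployment. We will use the second approach here.

Note: If you want to scale pixel values to [-1, 1], you can write Rescaling(1./127.5, offset=-1) instead.

Note: We previously resized the images using the image_size argument of image_dataset_from_directory. If you want to include the resizing logic in your model as well, you can use the Resizing layer instead.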
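The arithmetic behind the two scale settings is a simple affine map, value * scale + offset. A sketch in plain Python, where rescale is a hypothetical helper and not the Keras layer itself:

```python
def rescale(value, scale, offset=0.0):
    """Apply the affine map value * scale + offset to one pixel value."""
    return value * scale + offset

# Scale [0, 255] to [0, 1].
print(rescale(0, 1 / 255), round(rescale(255, 1 / 255), 6))  # 0.0 1.0

# Scale [0, 255] to [-1, 1].
print(rescale(0, 1 / 127.5, offset=-1), round(rescale(255, 1 / 127.5, offset=-1), 6))  # -1.0 1.0
```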

from tensorflow.keras import layers

normalization_layer = tf.keras.layers.experimental.preprocessing.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
# Notice the pixels values are now in `[0,1]`.
print(np.min(first_image), np.max(first_image))

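Configuring for performance

.cache() keeps the images in memory after they are loaded from disk during the first epoch. This ensures the dataset does not become a bottleneck while training your model. If your dataset is too large to fit in memory, you can also use this method to create a performant on-disk cache.

.prefetch() overlaps data preprocessing and model execution while training.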

AUTOTUNE = tf.data.experimental.AUTOTUNE

train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

Train the model

num_classes = 5

model = tf.keras.Sequential([
  layers.experimental.preprocessing.Rescaling(1./255),
  layers.Conv2D(32, 3, activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])
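To see how the input size flows through this stack, the spatial dimensions can be traced by hand. The sketch assumes the layer defaults used above (3x3 kernels with stride 1 and no padding, 2x2 pooling) and the 180x180 input from this example; conv_out and pool_out are hypothetical helper names.

```python
def conv_out(size, kernel=3):
    """Output size of a stride-1 convolution without padding ('valid')."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling without padding."""
    return size // pool

size = 180  # img_height and img_width used in this example
for _ in range(3):  # three Conv2D + MaxPooling2D pairs
    size = pool_out(conv_out(size))

filters = 32
flattened = size * size * filters
print(size, flattened)  # 20 12800
```

So the Flatten layer hands a vector of 12800 values per image to the first Dense layer.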

model.compile(
  optimizer='adam',
  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
  metrics=['accuracy'])

model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=3
)
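The loss used in compile takes integer class labels and raw logits, which is why the last Dense layer has no activation. A plain-Python sketch of the computation for a single sample, with hypothetical logits:

```python
import math

def sparse_categorical_crossentropy(label, logits):
    """Cross-entropy for one sample with an integer label and raw logits."""
    peak = max(logits)
    shifted = [z - peak for z in logits]  # stabilize the exponentials
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    log_prob = shifted[label] - log_sum_exp  # log softmax of the true class
    return -log_prob

# Five classes, matching num_classes above; the logits are hypothetical.
uniform = sparse_categorical_crossentropy(2, [0.0, 0.0, 0.0, 0.0, 0.0])
confident = sparse_categorical_crossentropy(2, [0.0, 0.0, 8.0, 0.0, 0.0])
print(round(uniform, 4))  # 1.6094
```

An untrained model that spreads its score evenly over five classes starts near log(5), and the loss shrinks as the logit of the true class grows.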

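[Comments]:

Thanks, but this doesn't really answer my question about how to use the data I described as input.
OK, I see what you mean. Sit tight, I'm going to add a whole bunch of code.
Sitting tight.
OK, I've set up three examples for you.
Did you see my answer above?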