Sending cv.imread() data to a Keras model: answers

【Question title】: Sending cv.imread() data to Keras model
【Posted】: 2020-07-31 16:38:57
【Question description】:

I can't figure out how to send data from cv.imread() to my machine learning model.

My image-reading function gives me a list of numpy arrays, each holding an image of shape (256, 256, 3).

# image reading
res_img = []
for i in files:
    img = cv2.imread(os.path.join("temp", i))
    res = cv2.resize(img, (256, 256))
    res_img.append(res)
return res_img

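I then store this list in a DataFrame and send it to the model. However, the DataFrame is detected as having shape (56, 1), where 56 is the number of samples and the 1 appears because each numpy array is treated as a single object.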

# train model
model = create_model(trainX)
model_history = model.fit(trainX, trainY, validation_data=(testX, testY), epochs=..., batch_size=...)

# create model
def create_model(data):
    model = Sequential()
    model.add(Conv2D(32, kernel_size=4, activation='relu', input_shape=(256, 256, 3)))
    ...
    return model

However, this returns

ValueError: Error when checking input: expected conv2d_input to have 4 dimensions, but got array with shape (56, 1)
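
To see why the model rejects this input, it may help to look at what a column of image arrays becomes once it is converted to numpy. The sketch below is not from the original post: it builds a small object array as a stand-in for `trainX.to_numpy()`, and the names `n_samples` and `column` are made up for illustration.

```python
import numpy as np

n_samples = 56  # same count as in the question

# Stand-in for trainX.to_numpy(): a (56, 1) array whose cells each hold
# one image. NumPy stores such cells as opaque Python objects.
column = np.empty((n_samples, 1), dtype=object)
for row in range(n_samples):
    column[row, 0] = np.zeros((256, 256, 3), dtype=np.uint8)

print(column.shape)        # (56, 1)
print(column.dtype)        # object
print(column[0, 0].shape)  # (256, 256, 3)
```

The outer array has only two dimensions, so a layer that expects (samples, height, width, channels) never sees the image dimensions stored inside each cell.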

Another thing I tried was combining all the numpy arrays in the data into one large numpy array, which does have the right dimensions:

trainX_arr = []
trainX = trainX.to_numpy()
for i in trainX:
    trainX_arr.append(i)
trainX_arr = np.asarray(trainX_arr)

This does give the right shape:

print(trainX_arr.shape)
# (56, 256, 256, 3)

However, when sent to the model it returns

ValueError: No data provided for "conv2d_input". Need data for each key in: ['conv2d_input']

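I assume this is because the input is not a DataFrame. Finally, I tried combining the numpy arrays in the first step and then storing the result in a DataFrame, like this: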

res_img = []
for i in files:
    img = cv2.imread(os.path.join("temp", i))
    res = cv2.resize(img, (256, 256))
    res_img.append(res)
img_arr = []
for i in res_img:
    img_arr.append(i)
img_arr = np.asarray(img_arr)
return img_arr

However, when I try to insert it into the DataFrame:

df.insert(0, "x", img_arr)

ValueError: Wrong number of dimensions. values.ndim != ndim [4 != 2]

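I assume this is because a DataFrame cannot hold a multidimensional array, but that brings me back to where I started. I'm really confused about what I should do to get this working, and any help would be greatly appreciated.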

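【Question comments】:

What kind of object is df? Please provide a minimal working example.
df is a pandas DataFrame.
Do you have to convert it to a DataFrame? If not, you can train with the numpy array that has the correct shape: trainX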
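
Following the suggestion in the last comment, one way to avoid the DataFrame altogether might be to keep the images in a single 4-D array and the labels in a parallel 1-D array, then split both with the same shuffled indices. This is only a sketch under assumptions: `split_arrays` is a hypothetical helper that does not appear in the post, and the random images and labels stand in for real data.

```python
import numpy as np

def split_arrays(images, labels, test_size=0.25, seed=42):
    """Shuffle and split images and labels using the same indices."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_test = int(np.ceil(len(images) * test_size))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return images[train_idx], images[test_idx], labels[train_idx], labels[test_idx]

# Placeholder data: 56 random "images" and binary labels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(56, 256, 256, 3), dtype=np.uint8)
labels = rng.integers(0, 2, size=56)

trainX, testX, trainY, testY = split_arrays(images, labels)
print(trainX.shape, testX.shape)  # (42, 256, 256, 3) (14, 256, 256, 3)
```

scikit-learn's `train_test_split` also accepts several arrays in one call, so passing the image array and the label array to it together should give an equivalent split without a DataFrame.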

Tags: python numpy opencv keras

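【Solution 1】:

I managed to get it working: my second approach, combining the numpy arrays into one large array with valid dimensions, was able to train a model successfully. I'm not sure why it didn't work before, but here is my code.

I defined a function (arbitrarily named numpyfy) to perform this array merge: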

import os

import cv2
import numpy as np
import pandas as pd
import requests  # used by process_link to download the images
from keras.layers import Conv2D, Dense, Flatten, Input, MaxPooling2D
from keras.models import Model
from sklearn.model_selection import train_test_split


# download images
def process_link(link_list):
    counter = 0
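    for i in link_list:
        if i.find("jpg") != -1:
            ext = ".jpg"
        elif i.find("png") != -1:
            ext = ".png"
        else:
            # fail loudly instead of reusing a stale or undefined extension
            raise ValueError("unsupported image link: " + i)
        # the with-block closes the file even if the download raises
        with open(os.path.join("temp", str(counter) + ext), "wb") as f:
            f.write(requests.get(i).content)
        counter += 1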
    files = os.listdir("temp") # images stored in temp directory
    res_img = []
    for i in files:
        img = cv2.imread(os.path.join("temp", i))
        res = cv2.resize(img, (256, 256))
        res_img.append(res)
    return res_img

# process data
def process_data():
    # link_list and y are a list of: links to images, and data labels, respectively
    df = pd.DataFrame()
    df.insert(0, "Y", y)
    df.insert(0, "img", process_link(link_list))
    (train, test) = train_test_split(df, test_size=0.25, random_state=42)
    return (train, test)

# numpy array combining
def numpyfy(df):
    arr = []
    df_numpy = df.to_numpy()
    print(df_numpy[:2])
    for i in df_numpy:
        arr.append(i)
    arr = np.asarray(arr)
    #print(arr.shape), returns 4 dimensional array
    return arr

# Deep learning model, changed to use the Functional API
def create_model():
    input1 = Input(shape=(256, 256, 3))
    # the input shape is already set by Input above, so it is not repeated here
    conv1 = Conv2D(32, (3, 3), activation="relu")(input1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(32, (3, 3), activation="relu")(pool1)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    flat1 = Flatten()(pool2)
    dense1 = Dense(16, activation="relu")(flat1)
    dense2 = Dense(1, activation="sigmoid")(dense1)
    model = Model(inputs=input1, outputs=dense2)
    model.compile(loss='mse', optimizer='adadelta', metrics=['mse', 'mae'])
    return model

# train model
def train_model():
    (train, test) = process_data() #Returns 2 dataframes (train, test)
    train_img, test_img = numpyfy(train["img"]), numpyfy(test["img"])
    model = create_model()
    # epochs and batch_size are not defined in the post; set them before calling
    model.fit(train_img, train["Y"], validation_data=(test_img, test["Y"]),
              epochs=epochs, batch_size=batch_size)

In the end I'm not sure what I did differently that stopped the error from being raised, but this works.
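
Keras accepts a plain numpy array of shape (samples, height, width, channels), so the image column only needs to be turned into such an array before training. A more compact alternative to the loop in numpyfy might be `np.stack`, sketched below. `stack_images` is a hypothetical name that does not appear in the post, and the object array is only a stand-in for `train["img"].to_numpy()`.

```python
import numpy as np

def stack_images(images):
    """Combine equally sized (H, W, C) images into one (N, H, W, C) array."""
    images = list(images)
    shapes = {img.shape for img in images}
    if len(shapes) != 1:
        raise ValueError(f"images have differing shapes: {sorted(shapes)}")
    return np.stack(images)

# Stand-in for train["img"].to_numpy(): a 1-D object array of images.
column = np.empty(4, dtype=object)
for i in range(4):
    column[i] = np.full((256, 256, 3), i, dtype=np.uint8)

batch = stack_images(column)
print(batch.shape)  # (4, 256, 256, 3)
print(batch.dtype)  # uint8
```

The shape check is a design choice: it turns a forgotten resize into an explicit error rather than a confusing object array further down the pipeline.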

【Comments】: