Why doesn't my Keras CNN for diabetic retinopathy detection work at all? (Answers)

【Question title】: Why my Keras CNN for Diabetic Retinopathy detections isn't work at all
【Posted】: 2019-09-30 12:10:41
【Question】:

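I have to build a CNN that detects stage 4 diabetic retinopathy (DR). It only has to detect whether stage 4 DR is present; the other grades do not need to be detected. The input will be images like this one: https://i.imgur.com/DsU06Xv.jpg

To make classification easier, I am enhancing my images: https://i.imgur.com/X1p9G1c.png

So, I have a dataset with 700 images of grade 0 retinas and 700 images of grade 4 retinas.

The problem is that none of the models I have tried works; usually it turns into an overfitting problem.

I have already tried the Sequential model and the Functional API. In another question I asked here, a user recommended using VGG16: https://datascience.stackexchange.com/questions/60706/how-do-i-handle-with-my-keras-cnn-overfitting

Now I am trying VGG16, but it still does not work: all my predictions are 0, and I don't know how to deal with it.

Here is my train.py: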

import cv2
import os
import numpy as np

from keras.layers.core import Flatten, Dense, Dropout, Reshape
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras import regularizers
from keras.models import Model
from keras.layers import Input, ZeroPadding2D, Dropout
from keras import optimizers
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.utils import to_categorical 

from keras.applications.vgg16 import VGG16

# example of using a pre-trained model as a classifier
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions

TRAIN_DIR = 'train/'
TEST_DIR = 'test/'
v = 'v/'
BATCH_SIZE = 32
NUM_EPOCHS = 5

def ReadImages(Path):
    LabelList = list()
    ImageCV = list()
    classes = ["nonPdr", "pdr"]

    # Get all subdirectories
    FolderList = [f for f in os.listdir(Path) if not f.startswith('.')]
    
    # Loop over each directory
    for File in FolderList:
        for index, Image in enumerate(os.listdir(os.path.join(Path, File))):
            # Convert the path into a file
            ImageCV.append(cv2.resize(cv2.imread(os.path.join(Path, File) + os.path.sep + Image), (224,224)))
            #ImageCV[index]= np.array(ImageCV[index]) / 255.0
            LabelList.append(classes.index(os.path.splitext(File)[0])) 
            
            ImageCV[index] = cv2.addWeighted(ImageCV[index],4, cv2.GaussianBlur(ImageCV[index],(0,0), 224/30), -4, 128)

    return ImageCV, LabelList

data, labels = ReadImages(TRAIN_DIR)
valid, vlabels = ReadImages(TEST_DIR)

vgg16_model = VGG16(weights="imagenet", include_top=True)
 
# (1) visualize layers
print("VGG16 model layers")
for i, layer in enumerate(vgg16_model.layers):
    print(i, layer.name, layer.output_shape)

# (2) remove the top layer
base_model = Model(input=vgg16_model.input, 
                   output=vgg16_model.get_layer("block5_pool").output)

# (3) attach a new top layer
base_out = base_model.output
base_out = Reshape((25088,))(base_out)
top_fc1 = Dropout(0.5)(base_out)
# output layer: (None, 5)
top_preds = Dense(1, activation="sigmoid")(top_fc1)

# (4) freeze weights until the last but one convolution layer (block4_pool)
for layer in base_model.layers[0:14]:
    layer.trainable = False

# (5) create new hybrid model
model = Model(input=base_model.input, output=top_preds)

# (6) compile and train the model
sgd = SGD(lr=1e-4, momentum=0.9)
model.compile(optimizer=sgd, loss="binary_crossentropy", metrics=["accuracy"])

datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)

# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(data)

# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(np.array(data), np.array(labels), batch_size=32), 
                    steps_per_epoch=len(np.array(data)) / 32, epochs=5)


#history = model.fit([data], [labels], nb_epoch=NUM_EPOCHS, 
#                    batch_size=BATCH_SIZE, validation_split=0.1)

# evaluate final model
#vlabels = model.predict(np.array(valid))

model.save('model.h5')

When I run it, it reports an accuracy of about 0.99 or 1.0 and a loss as low as about 0.01.

Here is my predict.py:

from keras.models import load_model
import cv2
import os
import json
import h5py
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input

TEST_DIR = 'v/'

def fix_layer0(filename, batch_input_shape, dtype):
    with h5py.File(filename, 'r+') as f:
        model_config = json.loads(f.attrs['model_config'].decode('utf-8'))
        layer0 = model_config['config']['layers'][0]['config']
        layer0['batch_input_shape'] = batch_input_shape
        layer0['dtype'] = dtype
        f.attrs['model_config'] = json.dumps(model_config).encode('utf-8')

fix_layer0('model.h5', [None, 224, 224, 3], 'float32')

model = load_model('model.h5')

for filename in os.listdir(r'v/'):
    if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg") or filename.endswith(".png"):
        ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR) + filename), (224,224))
        
        x = image.img_to_array(ImageCV)
        x = np.expand_dims(x, axis=0)
        x = preprocess_input(x)
        print(np.argmax(model.predict(x)))

When I run it, all my predictions are 0. If I remove 'np.argmax' and just run model.predict, it returns the following results:

[[0.03993018]]
[[0.9984968]]
[[1.]]
[[1.]]
[[0.]]
[[0.9999999]]
[[0.8691623]]
[[1.01611796e-07]]
[[1.]]
[[0.]]
[[1.]]
[[0.17786741]]

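Given that the first 2 images are class 0 and the others are class 1 (grade 4), these results do not match an accuracy of 0.99 or 1.0.

What should I do? I would really appreciate any help!

Update

As @Manoj suggested, I have updated my code. I added validation and early stopping: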

es = EarlyStopping(monitor='val_loss', verbose=1)

# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(np.array(data), np.array(labels), batch_size=32), 
                    steps_per_epoch=len(np.array(data)) / 32, epochs=5,
                    validation_data=(np.array(valid), np.array(vlabels)),
                    nb_val_samples=72, callbacks=[es])

and it returns these numbers:

Epoch 1/5
44/43 [==============================] - 452s 10s/step - loss: 0.2377 - acc: 0.9162 - val_loss: 1.9521 - val_acc: 0.8472
Epoch 2/5
44/43 [==============================] - 445s 10s/step - loss: 0.0229 - acc: 0.9991 - val_loss: 1.8908 - val_acc: 0.8611
Epoch 3/5
44/43 [==============================] - 447s 10s/step - loss: 0.0107 - acc: 0.9993 - val_loss: 1.7658 - val_acc: 0.8611
Epoch 4/5
44/43 [==============================] - 458s 10s/step - loss: 0.0090 - acc: 0.9993 - val_loss: 1.6805 - val_acc: 0.8750
Epoch 5/5
44/43 [==============================] - 463s 11s/step - loss: 0.0052 - acc: 0.9993 - val_loss: 1.6730 - val_acc: 0.8750

But after that, my predictions (previously 7/12 correct) are now only 5/12 correct.

What can I do to deal with this?

Update 2

I put this code in my train.py:

mean = datagen.mean  
std = datagen.std

print(mean, "mean")
print(std, "std")

and I inserted the values returned by these prints into my predict.py:

def normalize(x, mean, std):
    x[..., 0] -= mean[0]
    x[..., 1] -= mean[1]
    x[..., 2] -= mean[2]
    x[..., 0] /= std[0]
    x[..., 1] /= std[1]
    x[..., 2] /= std[2]
    return x

for filename in os.listdir(r'v/'):
    if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg") or filename.endswith(".png"):
        ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR) + filename), (224,224))
        
        x = image.img_to_array(ImageCV)
        x = np.expand_dims(x, axis=0)
        x = normalize(x, [59.5105,61.141457,61.141457], [60.26705,61.85445,63.139835])

        prob = model.predict(x)
        if prob < 0.5:
            print("nonPDR")
        else:
            print("PDR")
        print(filename)

Now all my predictions are PDR (class 1). What am I doing wrong?

Update 3

I have dropped the GaussianBlur that I was using in ReadImages and added the following:

data = np.asarray(data)
valid = np.asarray(valid)

data = data.astype('float32')
valid = valid.astype('float32')

data /= 255
valid /= 255

After running my train.py:

Epoch 1/15

44/43 [==============================] - 476s 11s/step - loss: 0.7153 - acc: 0.5788 - val_loss: 0.6937 - val_acc: 0.5556

Epoch 2/15

44/43 [==============================] - 468s 11s/step - loss: 0.5526 - acc: 0.7275 - val_loss: 0.6838 - val_acc: 0.5833

Epoch 3/15

44/43 [==============================] - 474s 11s/step - loss: 0.5068 - acc: 0.7595 - val_loss: 0.6927 - val_acc: 0.5694

Epoch 00003: early stopping

After that, I updated the std and mean in predict.py:

for filename in os.listdir(r'v/'):
    if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg") or filename.endswith(".png"):
        ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR) + filename), (224,224))
        
        ImageCV = np.asarray(ImageCV)
        
        ImageCV = ImageCV.astype('float32')
        
        ImageCV /= 255  
        x = ImageCV
        
        x = np.expand_dims(x, axis=0)
        x = normalize(x, [0.12810835, 0.17897758, 0.23883381], [0.14304605, 0.18229756, 0.2362126])
        
        prob = model.predict(x)
        if prob <= 0.70: # I CHANGE THE THRESHOLD TO 0.7
            print("nonPDR >>>", filename)
            nonPdr += 1
        else:
            print("PDR >>>", filename)
            pdr += 1
        print(prob)
print("Number of retinas with PDR: ",pdr)
print("Number of retinas without PDR: ",nonPdr)

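After running this code, the accuracy on my test directory is about 75%.

So, is there anything I can improve, or is this the best I can get with so few images?

【Comments on the question】:

Changing the optimizer, and increasing the learning rate and momentum with a strong decay factor, might help with the overfitting problem. That is all I can say in the limited time I have, but I will take a closer look today to help you.
Also, you can add EarlyStopping and train for more epochs. It looks like you only ran 5 epochs.
Provide direct links to the jpg/png images rather than to the gallery on imgur.
I have updated the post.

Tags: python machine-learning keras neural-network conv-neural-network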

【Solution 1】:

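The preprocessing steps applied to the data should be the same for training and testing. I see at least two inconsistencies. First, the training code applies GaussianBlur to the images in ReadImages, while predict.py does not. Usually this kind of transformation is used as a data augmentation strategy rather than applied to the entire training set. Second, the normalization used for training and testing should be the same. In the code snippets above, vgg16.preprocess_input is applied for prediction, which normalizes with ImageNet statistics, whereas during training the mean and standard deviation are computed from the training data itself. What you can do is read the datagen.mean and datagen.std values after calling datagen.fit and use them to normalize the data during prediction, instead of preprocess_input.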
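
As a minimal sketch of keeping the two normalizations identical (not the asker's exact pipeline): compute per-channel statistics once on the training images, save them, and reuse the same function at prediction time. The helper names `fit_stats` and `normalize`, the file name `norm_stats.npz`, and the random stand-in images are assumptions for illustration; with Keras the statistics would instead come from `datagen.mean` and `datagen.std` after `datagen.fit`.

```python
import os
import tempfile

import numpy as np


def fit_stats(train_images):
    """Per-channel mean and std over all training pixels (channels last)."""
    pixels = train_images.astype("float64")
    return pixels.mean(axis=(0, 1, 2)), pixels.std(axis=(0, 1, 2))


def normalize(images, mean, std, eps=1e-6):
    """Apply the same centering and scaling at training and prediction time."""
    return ((images.astype("float64") - mean) / (std + eps)).astype("float32")


# Stand-in for the training images (hypothetical data, shape N x H x W x 3).
rng = np.random.default_rng(0)
train_images = rng.uniform(0, 255, size=(8, 32, 32, 3)).astype("float32")

# train.py side: compute the statistics once and save them next to the model.
mean, std = fit_stats(train_images)
stats_path = os.path.join(tempfile.mkdtemp(), "norm_stats.npz")
np.savez(stats_path, mean=mean, std=std)

# predict.py side: load the saved statistics instead of calling preprocess_input.
stats = np.load(stats_path)
x = np.expand_dims(train_images[0], axis=0)
x = normalize(x, stats["mean"], stats["std"])

# Sanity check: the normalized training set is centered with unit variance.
train_normalized = normalize(train_images, stats["mean"], stats["std"])
```

Saving the statistics to a file avoids copying printed numbers by hand into predict.py, which is easy to get wrong when the training data changes.
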
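You have not defined a validation generator. During training you should use both a training set and a validation set, and stop training when the validation loss stops improving. Otherwise, the model will overfit the training dataset.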

https://gist.github.com/fchollet/7eb39b44eb9e16e59632d25fb3119975 https://keras.io/callbacks/#earlystopping
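
To make the stopping rule concrete, here is a simplified, framework-free sketch of the patience logic behind early stopping. It is not Keras's own implementation, and the function name `early_stop_epoch` is an assumption for illustration. The validation losses are the ones reported in the question's updates.

```python
def early_stop_epoch(val_losses, patience=0, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for at least `patience` epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss   # new best validation loss: reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# val_loss values from "Update 3": the third epoch is worse than the second.
stop_update_3 = early_stop_epoch([0.6937, 0.6838, 0.6927])

# val_loss values from the first update: the loss keeps decreasing for all 5 epochs.
stop_update_1 = early_stop_epoch([1.9521, 1.8908, 1.7658, 1.6805, 1.6730])

print(stop_update_3, stop_update_1)
```

With the default patience of 0, a single noisy epoch ends training, which is what happened at epoch 3 in "Update 3"; a larger patience gives the model more room before stopping.
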
Since the last layer of the network is a sigmoid like this:

top_preds = Dense(1, activation="sigmoid")(top_fc1)

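there is only one output, which is a probability value between 0 and 1. np.argmax is not relevant here: applied to a single-element array it always returns index 0, which is why every prediction was printed as 0.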

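np.argmax is used when the last layer has a softmax activation with two outputs whose probabilities sum to 1; the index with the higher probability is chosen as the result.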
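
A small NumPy illustration of the difference, using hand-written arrays in place of real model outputs (the softmax values are made up for the example):

```python
import numpy as np

# Shape (1, 1), like model.predict(x) for a Dense(1, activation="sigmoid") head.
sigmoid_out = np.array([[0.9984968]])
argmax_on_sigmoid = int(np.argmax(sigmoid_out))  # only one element, so always 0

# For a sigmoid head, compare the probability with a threshold instead.
label_from_sigmoid = int(sigmoid_out[0, 0] >= 0.5)

# Shape (1, 2), like a Dense(2, activation="softmax") head (hypothetical values).
softmax_out = np.array([[0.2, 0.8]])
label_from_softmax = int(np.argmax(softmax_out, axis=1)[0])

print(argmax_on_sigmoid, label_from_sigmoid, label_from_softmax)
```
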

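Coming back to the results you got with sigmoid: typically a threshold is chosen to decide whether to classify a sample as class 0 or class 1. The default threshold is 0.5. A ROC curve computed from the predicted probabilities can be used to find the best threshold.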

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html
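
As a rough sketch of how a ROC-based threshold could be picked, the snippet below computes the ROC points by hand in plain Python and selects the threshold that maximizes TPR minus FPR (Youden's J). That criterion is only one common choice and is an assumption here, not something prescribed above; the helper name `roc_points` is also illustrative. The scores and labels are the 12 predictions from the question. In practice the threshold should be chosen on a larger validation set, not on 12 test images.

```python
def roc_points(labels, scores):
    """Return (threshold, tpr, fpr) tuples, predicting 1 when score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        points.append((threshold, tp / positives, fp / negatives))
    return points


# The first two images are class 0, the remaining ten are class 1.
labels = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
scores = [0.03993018, 0.9984968, 1.0, 1.0, 0.0, 0.9999999,
          0.8691623, 1.01611796e-07, 1.0, 0.0, 1.0, 0.17786741]

# Youden's J statistic: pick the point with the largest TPR - FPR.
best_threshold, best_tpr, best_fpr = max(
    roc_points(labels, scores), key=lambda point: point[1] - point[2]
)
print(best_threshold, best_tpr, best_fpr)
```

On this data the selected threshold sits almost at 1.0, which mostly shows how poorly calibrated the scores are; the `roc_curve` function linked above returns the same kind of (fpr, tpr, thresholds) information for real use.
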

Using a threshold of 0.5,
```
prob = model.predict(x)
if prob < 0.5:
    output = 0
else:
    output = 1

[[0.03993018]] => < 0.5, class 0 correct
[[0.9984968]]  => > 0.5, class 1 incorrect
[[1.]]         => > 0.5, class 1 correct
[[1.]]         => > 0.5, class 1 correct
[[0.]]         => < 0.5, class 0 incorrect
[[0.9999999]]  => > 0.5, class 1 correct
[[0.8691623]]  => > 0.5, class 1 correct
[[1.01611796e-07]] => < 0.5, class 0 incorrect
[[1.]]             => > 0.5, class 1 correct
[[0.]]             => < 0.5, class 0 incorrect
[[1.]]             => > 0.5, class 1 correct
[[0.17786741]]     => < 0.5, class 0 incorrect
```
Accuracy = 7/12 = 58%
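
The same arithmetic can be checked in a few lines of plain Python, using the 12 probabilities from the question and the stated ground truth (first two images class 0, the rest class 1):

```python
probs = [0.03993018, 0.9984968, 1.0, 1.0, 0.0, 0.9999999,
         0.8691623, 1.01611796e-07, 1.0, 0.0, 1.0, 0.17786741]
labels = [0, 0] + [1] * 10

# Same rule as above: below 0.5 is class 0, otherwise class 1.
preds = [0 if p < 0.5 else 1 for p in probs]

correct = sum(pred == label for pred, label in zip(preds, labels))
accuracy = correct / len(labels)
print(f"{correct}/{len(labels)} correct, accuracy = {accuracy:.0%}")
```
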

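【Comments】:

I have trained other models with validation and early stopping, but the predictions got worse... it only predicts 5 samples correctly.
Hi... do you only answer once?
Is it like this: std = datagen.std(data), mean = datagen.mean(data)?
datagen.mean and datagen.std are numpy arrays containing the mean and the standard deviation.
And how do I call them in my predict.py?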