How to fix constant accuracy and val_accuracy

【Question title】: How to fix constant accuracy and val_accuracy
【Posted】: 2021-07-31 07:34:17
【Question】:

After seeing this [post][1], I tried to fix the problem by adding dropout, but it did not work. I am still getting a constant accuracy, so any help would be appreciated.

import os
os.environ['KAGGLE_CONFIG_DIR'] = "/content"
!kaggle datasets download -d jakeshbohaju/brain-tumor
!unzip \*.zip -d brain_tumor_dataset 
!rm -rf yes
!rm -rf no
!rm -rf *.zip

# Commented out IPython magic to ensure Python compatibility.
import pandas as pd
import numpy as np
import os

import tensorflow as tf
import cv2
from tensorflow import keras
from tensorflow.keras import layers, Input
from keras.layers import InputLayer, MaxPooling2D, Flatten, Dense, Conv2D, Dropout, BatchNormalization
from keras.losses import BinaryCrossentropy
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions, ResNet50
from tensorflow.keras.optimizers import Adam, SGD

from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from PIL.Image import open

from  matplotlib import pyplot as plt
import matplotlib.image as mpimg
import random
# %matplotlib inline

# Constants
IMAGE_DATASET = "/content/brain_tumor_dataset/Brain Tumor/Brain Tumor"
IMAGE_DATASET_RAW = '/content/brain_tumor_dataset/Brain Tumor/Brain Tumor'
WORKING_FOLDER = "/content/brain_tumor_dataset/working"
IMG_HEIGHT = 224
IMG_WIDTH = 224
EPOCHS = 100

# # Image3202
plt.figure(figsize=(20,20))
test_folder="/content/brain_tumor_dataset/Brain Tumor/Brain Tumor/Image100.jpg" 
img=mpimg.imread(test_folder)
print(img.size)
ax=plt.subplot(1,5,4)
# # ax.title.set_text(file)
plt.imshow(img)

# We will import the csv file containing the features and the classes of the images
cortex_df = pd.read_csv("/content/brain_tumor_dataset/Brain Tumor.csv")
cortex_df.head()

plt.figure(figsize=(20,20))
test_folder="/content/brain_tumor_dataset/Brain Tumor/Brain Tumor" 
for i in range(5):
    file = random.choice(os.listdir(test_folder))
    image_path= os.path.join(test_folder, file)
    img=mpimg.imread(image_path)
    ax=plt.subplot(1,5,i+1)
    ax.title.set_text(file)
    plt.imshow(img)

dataset_df = pd.DataFrame()
dataset_df["Image"] = cortex_df["Image"]
dataset_df["Class"] = cortex_df["Class"]
path_list = []
for img_path in os.listdir(IMAGE_DATASET):
    path_list.append( os.path.join(IMAGE_DATASET,img_path))
path_dict = {os.path.splitext(os.path.basename(x))[0]: x for x in path_list}
dataset_df["paths"] = cortex_df["Image"].map(path_dict.get)
dataset_df["pixels"] = dataset_df["paths"].map(lambda x:np.asarray(open(x).resize((IMG_HEIGHT,IMG_WIDTH))))
dataset_df.head()

image_list = []
for i in range(len(dataset_df)):
    brain_image = dataset_df["pixels"][i].astype(np.float32)
    brain_image /= 255
    image_list.append(brain_image)
X = np.array(image_list)
print(X.shape)

y = np.array(dataset_df.Class)
#y.shape

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print('The shape of the X_train :'+' '+str(X_train.shape))
print('The size of the X_train :'+' '+str(X_train.shape[0]))
print('The shape of the X_test :'+' '+str(X_test.shape))
print('The size of the X_test:'+' '+str(X_test.shape[0]))

def model(input_shape):
#     res_conv = ResNet50(include_top=False, weights="imagenet", input_tensor=None, input_shape=input_shape, pooling=None)
    model = Sequential()
    
    model.add(Input(shape=input_shape))
    
    model.add(Conv2D(16, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(16, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.25))
    model.add(MaxPooling2D(pool_size=(2, 2), data_format="channels_last", padding='same'))
            
    model.add(Conv2D(32, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(32, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.25))
    model.add(MaxPooling2D(pool_size=(2, 2), data_format="channels_last", padding='same'))
    
    model.add(Conv2D(64, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(64, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(64, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(64, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.25))
    model.add(MaxPooling2D(pool_size=(2, 2), data_format="channels_last", padding='same'))
    
    model.add(Conv2D(128, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(128, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(128, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(128, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.25))
    model.add(MaxPooling2D(pool_size=(2, 2), data_format="channels_last", padding='same'))
    
    model.add(Conv2D(256, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(256, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(256, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Conv2D(256, kernel_size=3, strides=(2, 2), padding="same", activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.25))
    model.add(MaxPooling2D(pool_size=(2, 2), data_format="channels_last", padding='same'))
    
    model.add(Flatten())
    model.add(Dense(256, activation="relu"))
    model.add(Dropout(0.5))
    model.add(Dense(128, activation="relu"))
    model.add(Dropout(0.4))
    model.add(Dense(1, activation="sigmoid"))    # Never use sigmoid for binary classification
    
    return model

model = model(input_shape = (IMG_HEIGHT, IMG_WIDTH, 3))

model.summary()

# optimizer = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False, name="Adam",)
optimizer = SGD(learning_rate=0.01)
loss_fn = BinaryCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=loss_fn, metrics=['accuracy'])

# Training the model
history = model.fit(x=X_train, y=y_train, epochs=EPOCHS, batch_size=10)


loss = history.history["loss"]
acc = history.history["accuracy"]

epoch = np.arange(EPOCHS)
plt.plot(epoch, loss)
# plt.plot(epoch, val_loss)
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training Loss')
plt.legend(['train', 'val'])

epoch = np.arange(EPOCHS)
plt.plot(epoch, acc)
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Training Accuracy');

eval_score = model.evaluate(X_test, y_test)
print("Test loss:", eval_score[0])
print("Test accuracy:", eval_score[1])

Some of the output:

Epoch 70/100
301/301 [==============================] - 4s 13ms/step - loss: 0.6864 - accuracy: 0.5577
Epoch 71/100
301/301 [==============================] - 4s 13ms/step - loss: 0.6867 - accuracy: 0.5577
Epoch 72/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6866 - accuracy: 0.5577
Epoch 73/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6866 - accuracy: 0.5577
Epoch 74/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6867 - accuracy: 0.5577
Epoch 75/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6868 - accuracy: 0.5577
Epoch 76/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6869 - accuracy: 0.5577
Epoch 77/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6867 - accuracy: 0.5577
Epoch 78/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6866 - accuracy: 0.5577
Epoch 79/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6867 - accuracy: 0.5577
Epoch 80/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6864 - accuracy: 0.5577
Epoch 81/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6866 - accuracy: 0.5577
Epoch 82/100
301/301 [==============================] - 4s 12ms/step - loss: 0.6866 - accuracy: 0.5577
Epoch 83/100
301/301 [==============================] - 4s 13ms/step - loss: 0.6867 - accuracy: 0.5577
Epoch 84/100
301/301 [==============================] - 4s 13ms/step - loss: 0.6867 - accuracy: 0.557

I tried some other techniques, such as training for more epochs or adding dropout, but the accuracy stays the same. [1]: Keras model gets constant loss and accuracy

【Question comments】:

Tags: python deep-learning conv-neural-network artificial-intelligence

【Solution 1】:

One problem in your case is that you are using from_logits=True together with a sigmoid activation on the output layer.

A logit is the model's unnormalized prediction; in other words, it is the network's output before sigmoid or softmax is applied.
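To make the distinction concrete, here is a minimal sketch in plain Python (no Keras involved; the helper names `sigmoid` and `double_sigmoid` are illustrative, not part of any library). It shows what happens when a value that has already passed through a sigmoid is treated as a logit and squashed again, which is effectively what from_logits=True does to a sigmoid output layer.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def double_sigmoid(z: float) -> float:
    """Sigmoid applied twice: the output layer squashes the logit,
    then the loss (with from_logits=True) squashes it again."""
    return sigmoid(sigmoid(z))


for z in (-10.0, -2.0, 0.0, 2.0, 10.0):
    # A single sigmoid can get close to 0 or 1; the doubled one cannot.
    print(f"logit={z:6.1f}  prob={sigmoid(z):.4f}  double={double_sigmoid(z):.4f}")
```

Because the first sigmoid only produces values between 0 and 1, the second one can only produce values between sigmoid(0) = 0.5 and sigmoid(1), roughly 0.731. However confident the network is, the loss never sees a probability below 0.5.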

The default arguments of BinaryCrossentropy are as follows:

tf.keras.losses.BinaryCrossentropy(
    from_logits=False, label_smoothing=0, reduction=losses_utils.ReductionV2.AUTO,
    name='binary_crossentropy'
)

If you use from_logits=True, you must change the last layer to model.add(Dense(1)), which has no activation and is therefore effectively a linear activation.

Your network should then start learning.
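The following plain-Python sketch illustrates why the two settings have to match. The functions are hand-written stand-ins, not the Keras implementations: `bce_from_logits` uses the standard numerically stable formula max(z, 0) - z*y + log(1 + exp(-|z|)), and `bce_from_probs` clips its input with an assumed epsilon of 1e-7.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def bce_from_probs(p: float, y: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy when the model outputs a probability."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def bce_from_logits(z: float, y: float) -> float:
    """Binary cross-entropy when the model outputs a raw logit."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


z, y = -4.0, 0.0  # a confident, correct prediction for class 0

linear_output = bce_from_logits(z, y)          # Dense(1) + from_logits=True
sigmoid_output = bce_from_probs(sigmoid(z), y)  # sigmoid + from_logits=False
mismatched = bce_from_logits(sigmoid(z), y)     # sigmoid + from_logits=True

print(linear_output, sigmoid_output, mismatched)
```

The two consistent pairings give the same small loss, while the mismatched one stays above ln 2 (about 0.693) for a class-0 sample no matter how confident the network is. The plateau near 0.69 in the training log above is at least consistent with that picture.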

Another suggestion is to reduce the initial learning_rate to 0.0001.
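The suggestions above can be combined roughly as follows. This is a sketch, not the asker's full network: `build_small_classifier` and its layer sizes are hypothetical, chosen only to keep the example short. It uses one consistent pairing, a sigmoid output with from_logits=False, together with the lower learning rate. The pairing described above, Dense(1) with no activation and from_logits=True, is equally consistent.

```python
import tensorflow as tf


def build_small_classifier(input_shape=(224, 224, 3), learning_rate=1e-4):
    """Tiny binary classifier whose output layer and loss agree."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.GlobalAveragePooling2D(),
        # The output is already a probability ...
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
        # ... so the loss must not apply another sigmoid.
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
        metrics=["accuracy"],
    )
    return model


model = build_small_classifier(input_shape=(32, 32, 3))
model.summary()
```

In the original code the same effect could be had by changing only the loss line to BinaryCrossentropy(from_logits=False) and leaving the sigmoid layer in place.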

【Comments】: