【Question title】: Adding a layer stops learning Keras
【Posted】: 2018-08-01 05:33:07
【Question description】:

Code

import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential,Model
from keras.layers import LeakyReLU,Dropout, Flatten, Dense,Input
from keras import applications
from keras.preprocessing import image
from keras import backend as K
from keras import regularizers
from keras.optimizers import adam
K.set_image_dim_ordering('tf')
input_tensor = Input(shape=(150,150,3))

img_width, img_height = 150,150

top_model_weights_path = 'bottleneck_fc_model.h5'
train_data_dir = 'Cats and Dogs Dataset/train'
validation_data_dir = 'Cats and Dogs Dataset/validation'
nb_train_samples = 20000
nb_validation_samples = 5000
epochs = 50
batch_size = 128

base_model=applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_tensor=input_tensor, pooling=None)
i=0;
for layer in base_model.layers:
    layer.trainable = False
    i+=1
base_model.output
top_model=Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(1024,activation="relu"))
top_model.add(Dropout(0.5))
top_model.add(Dense(10,activation="relu"))  # Layer with issue
top_model.add(Dropout(0.8))  # Layer with issue (the dropout that follows it)
top_model.add(Dense(2, activation='softmax'))
model = Model(inputs=base_model.input,outputs=top_model(base_model.output))

model.summary()
datagen = ImageDataGenerator(rescale=1. / 255)

train_data = datagen.flow_from_directory(train_data_dir,target_size=(img_width, img_height),batch_size=batch_size,classes=[ 'cats','dogs'])#,class_mode="binary",shuffle=True)


validation_data = datagen.flow_from_directory(validation_data_dir,target_size=(img_width, img_height), batch_size=batch_size,classes=['cats','dogs'])#,class_mode="binary",shuffle=True)

adm=adam(lr=0.02)
model.compile(optimizer=adm,loss='categorical_crossentropy', metrics=['accuracy'])

model.fit_generator(train_data, steps_per_epoch=nb_train_samples//batch_size, epochs=epochs,validation_data=validation_data, shuffle=True,verbose=1)

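I have implemented an image classifier in Keras on the cats and dogs dataset (https://www.kaggle.com/c/dogs-vs-cats/data), using transfer learning with the Inception network. The code runs without errors, but from the first epoch onward the accuracy on both the validation and training sets stays at 50% and the loss does not decrease. I am using Atom with Hydrogen.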

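The problem disappears when I remove the marked layer, and I cannot understand why this happens. I have tried to fix the issue with:

  1. Different batch sizes: 4, 16, 64, 256
  2. Changing the optimizer: tried adam, rmsprop, and sgd with modified learning rates
  3. Different activations for the layer: relu, sigmoid, and leakyrelu
  4. Changing the dropout: the problem disappears when dropout is 0.9 (the layer is useless at that point, so this is a valid workaround, but it also suggests that I am missing something)
  5. Changing the final activation to sigmoid

Can someone tell me what I am missing? I cannot think of any reason why adding a layer would stop the network from learning.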

【Comments】:

    Tags: keras deep-learning conv-neural-network transfer-learning


    【Solution 1】:
    import numpy as np
    from keras.preprocessing.image import ImageDataGenerator
    from keras.models import Sequential,Model
    from keras.layers import LeakyReLU,Dropout, Flatten, Dense,Input
    from keras import applications
    from keras.preprocessing import image
    from keras import backend as K
    from keras import regularizers
    from keras.optimizers import adam
    K.set_image_dim_ordering('tf')
    input_tensor = Input(shape=(150,150,3))
    
    img_width, img_height = 150,150
    
    top_model_weights_path = 'bottleneck_fc_model.h5'
    train_data_dir = 'Cats and Dogs Dataset/train'
    validation_data_dir = 'Cats and Dogs Dataset/validation'
    nb_train_samples = 20000
    nb_validation_samples = 5000
    epochs = 50
    batch_size = 64
    
    base_model=applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_tensor=input_tensor, pooling=None)
    i=0;
    for layer in base_model.layers:
        layer.trainable = False
        i+=1
    base_model.output
    top_model=Sequential()
    top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
    top_model.add(Dense(512,activation="relu"))  # fewer units
    top_model.add(Dropout(0.4))  # changed dropout
    top_model.add(Dense(128,activation="relu"))  # more units
    top_model.add(Dropout(0.2))  # lower dropout
    top_model.add(Dense(2, activation='softmax'))
    model = Model(inputs=base_model.input,outputs=top_model(base_model.output))
    
    model.summary()
    datagen = ImageDataGenerator(rescale=1. / 255)
    
    train_data = datagen.flow_from_directory(train_data_dir,target_size=(img_width, img_height),batch_size=batch_size,classes=[ 'cats','dogs'])#,class_mode="binary",shuffle=True)
    
    
    validation_data = datagen.flow_from_directory(validation_data_dir,target_size=(img_width, img_height), batch_size=batch_size,classes=['cats','dogs'])#,class_mode="binary",shuffle=True)
    
    adm=adam(lr=0.02)
    model.compile(optimizer=adm,loss='categorical_crossentropy', metrics=['accuracy'])
    
    model.fit_generator(train_data, steps_per_epoch=nb_train_samples//batch_size, epochs=epochs,validation_data=validation_data, shuffle=True,verbose=1)
    

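    I reduced the number of units in the first dense layer, increased the number of units in the second dense layer, and lowered the dropout rates. Run this code and let me know how it goes. Also, the more complex the network, the higher the chance of overfitting, and the increased dropout value may have prevented that layer from learning. Try to keep your network simple.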

    【Comments】:

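    • It didn't work. One thing I noticed is that as long as the extra layer is present, my validation loss stays stuck at 8.0590 no matter what I change: the number of epochs, the number of nodes in the layer, or the activation.
    • Yes, as I said below: reduce the complexity. Remove that layer; it is overfitting on the training data.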
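
The stuck value of 8.0590 reported in the first comment is consistent with a softmax output that has saturated on a single class. Below is a minimal sketch under two assumptions: that Keras clips predicted probabilities to [1e-7, 1 - 1e-7] before taking the logarithm (the default `K.epsilon()`), and that the validation set is evenly split between cats and dogs. The helper `clipped_categorical_crossentropy` is a hypothetical NumPy stand-in, not the Keras function itself.

```python
import numpy as np

EPSILON = 1e-7  # assumption: Keras' default fuzz factor, K.epsilon()


def clipped_categorical_crossentropy(y_true, y_pred, eps=EPSILON):
    """Mean categorical cross-entropy with probabilities clipped to [eps, 1 - eps]."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))


# Balanced one-hot labels: half cats ([1, 0]) and half dogs ([0, 1]).
y_true = np.array([[1.0, 0.0], [0.0, 1.0]] * 50)

# A collapsed model that assigns probability 1 to "cats" for every image.
y_pred = np.tile([1.0, 0.0], (100, 1))

loss = clipped_categorical_crossentropy(y_true, y_pred)
accuracy = float(np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1)))

print("loss:", round(loss, 4))
print("accuracy:", accuracy)
```

Under these assumptions the loss comes out at about 8.059, which is half of -ln(1e-7) ≈ 16.118, and the accuracy is 0.5, matching the symptoms described in the question. The sketch only shows that the output has collapsed to one class; it does not show why.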