Validation accuracy initially high, then low: answers

【Question title】: Validation accuracy initially high then low
【Posted】: 2021-11-03 06:25:09
【Question】:

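I am training a text sentiment classification model with a CNN. The validation accuracy is initially higher than the training accuracy and then drops. Is this behavior acceptable? If not, what could be the cause, and how can it be fixed?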

My model:

class hyper():
  def __init__(self,embedding_dim,filter_sizes,num_filters,dropout_prob,hidden_dims,batch_size,num_epochs):
    # Model Hyperparameters
    self.embedding_dim = embedding_dim
    self.filter_sizes = filter_sizes
    self.num_filters = num_filters
    self.dropout_prob = dropout_prob
    self.hidden_dims = hidden_dims
    # Training parameters
    self.batch_size = batch_size
    self.num_epochs = num_epochs

class prep_hyper():
  def __init__(self,sequenceLength,max_words):
    # Preprocessing parameters
    self.sequenceLength = sequenceLength
    self.max_words = max_words
    
m_hyper=hyper(embedding_dim=embed_dim,filter_sizes=(3,4,5,6,8),num_filters=80,dropout_prob=(0.2,0.5),
              hidden_dims=50,batch_size=128,num_epochs= 30)

pr_hyper = prep_hyper(sequenceLength=sequence_length,max_words=vocab_size)

Model architecture:

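# Imports assumed here; the original post did not show them.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import regularizers
from tensorflow.keras.layers import (Concatenate, Convolution1D, Dense, Dropout,
                                     Embedding, GlobalMaxPooling1D, Input)
from tensorflow.keras.models import Model

def build_model(pr_hyper,m_hyper):
    
    # Convolutional block
    model_input = Input(shape=(pr_hyper.sequenceLength,))  # shape is a 1-tuple
    # frozen embedding initialized from the matrix `emb` (defined elsewhere in the post's code)
    x = Embedding(pr_hyper.max_words, m_hyper.embedding_dim,weights=[emb],trainable=False)(model_input)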
#     x = SpatialDropout1D(m_hyper.dropout_prob[0])(x)

    conv_kern_reg = regularizers.l2(0.0001)
    conv_bias_reg = regularizers.l2(0.0001)
    
    conv_blocks = []
    for sz in m_hyper.filter_sizes:
        conv = Convolution1D(filters=m_hyper.num_filters,
                             kernel_size=sz,
#                              padding="same",
                             activation="relu",
                             strides=1,
                             kernel_regularizer=conv_kern_reg,
                             bias_regularizer=conv_bias_reg
                            )(x)
        conv = GlobalMaxPooling1D()(conv)
        conv_blocks.append(conv)
    # merge
    x = Concatenate()(conv_blocks) if len(conv_blocks) > 1 else conv_blocks[0]
    
    x = Dense(m_hyper.hidden_dims, activation="relu")(x)
    x = Dropout(m_hyper.dropout_prob[1])(x)
    x = Dense(100, activation="relu")(x)
    x = Dropout(m_hyper.dropout_prob[1])(x)
    model_output = Dense(3, activation="softmax")(x)
    model = Model(model_input, model_output)
    model.compile(loss="categorical_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=0.00005), metrics=["accuracy"])
    model.summary()  # summary() prints the table itself and returns None
    tf.keras.utils.plot_model(model, show_shapes=True)  # optionally: to_file='multichannel.png'
    return model

Initial epochs:

【Question discussion】:

Tags: python tensorflow keras conv-neural-network text-classification

【Solution 1】:

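There are several possible reasons for this. For example, dropout layers are disabled during validation, so the validation set is scored by the full network while the training accuracy is measured with units randomly dropped. For more information, I suggest you look at this, which describes several possible causes.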
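
To make the dropout effect concrete, here is a minimal sketch in plain Python. It is a toy, not the model from the question: the linear classifier, the synthetic data, and every name in it (`dropout`, `predict`, `accuracy`, `make_data`, `WEIGHTS`) are made up for illustration. It scores the same fixed weights on the same data twice, once with dropout active (as during training) and once with dropout disabled (as during validation).

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: while training, zero each value with probability `rate`
    and rescale the survivors; in inference mode, return the values unchanged."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def predict(features, weights, rate, training, rng):
    """Toy linear classifier: dropout on the inputs, then argmax over class scores."""
    h = dropout(features, rate, training, rng)
    scores = [sum(w * x for w, x in zip(row, h)) for row in weights]
    return scores.index(max(scores))


def accuracy(data, weights, rate, training, rng):
    hits = sum(predict(x, weights, rate, training, rng) == y for x, y in data)
    return hits / len(data)


def make_data(n, rng):
    """Synthetic 3-class data: two 'signal' features per class set to 1.0,
    the remaining four features are small noise in [0, 0.3]."""
    data = []
    for _ in range(n):
        label = rng.randrange(3)
        x = [rng.uniform(0.0, 0.3) for _ in range(6)]
        x[2 * label] = x[2 * label + 1] = 1.0
        data.append((x, label))
    return data


# Hand-set weights: class c sums its own two signal features.
WEIGHTS = [[1.0 if j // 2 == c else 0.0 for j in range(6)] for c in range(3)]

rng = random.Random(0)
data = make_data(1000, rng)

# Same weights, same data; only the dropout mode differs.
acc_train_mode = accuracy(data, WEIGHTS, rate=0.5, training=True, rng=rng)
acc_eval_mode = accuracy(data, WEIGHTS, rate=0.5, training=False, rng=rng)

print(f"accuracy with dropout active (training mode): {acc_train_mode:.3f}")
print(f"accuracy with dropout disabled (eval mode):   {acc_eval_mode:.3f}")
```

With dropout disabled, the toy classifier is correct on every sample by construction; with dropout active, it sometimes loses both signal features and guesses wrong, so the training-mode number comes out lower even though nothing about the weights changed. In Keras, a like-for-like check is to call `model.evaluate` on the training data (it runs in inference mode, so dropout is off) and compare that figure with the validation accuracy.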

【Discussion】: