【Question title】: Keras multi-class classification loss is too high
【Posted】: 2021-07-17 06:57:23
【Question】:

I am training a multi-class classification model to generate text. Below is a sample of the dataset.

state | district | month | rainfall | max_temp | min_temp | max_rh | min_rh | wind_speed | advice
Orissa | Kendrapada | february | 0.0 | 34.6 | 19.4 | 88.2 | 29.6 | 12.0 | chances of foot rot disease in paddy crop; apply urea at 3 weeks after transplanting at active tillering stage for paddy;......
Jharkhand | Saraikela Kharsawan | february | 0 | 35.2 | 16.6 | 29.4 | 11.2 | 3.6 | provide straw mulch and go for intercultural operations to avoid moisture losses from soil; chance of leaf blight disease in potato crop; .......

Below is the code I use to build the model.

def create_model():
    input1 = tf.keras.layers.Input(shape=(1,), name='state')
    input2 = tf.keras.layers.Input(shape=(1,), name='district')
    input3 = tf.keras.layers.Input(shape=(1,), name='month')
    input4 = tf.keras.layers.Input(shape=(1,), name='rainfall')
    input5 = tf.keras.layers.Input(shape=(1,), name='max_temp')
    input6 = tf.keras.layers.Input(shape=(1,), name='min_temp')
    input7 = tf.keras.layers.Input(shape=(1,), name='max_rh')
    input8 = tf.keras.layers.Input(shape=(1,), name='min_rh')
    input9 = tf.keras.layers.Input(shape=(1,), name='wind_speed')
    xz= [input1, input2, input3, input4, input5, input6, input7, input8, input9]
    x1= layers.Dense(128, activation='relu')(input1)
    x2=layers.Dense(128, activation='relu')(input2)
    x3=layers.Dense(128, activation='relu')(input3)
    x4=layers.Dense(128, activation='relu')(input4)
    x5=layers.Dense(128, activation='relu')(input5)
    x6=layers.Dense(128, activation='relu')(input6)
    x7=layers.Dense(128, activation='relu')(input7)
    x8=layers.Dense(128, activation='relu')(input8)
    x9=layers.Dense(128, activation='relu')(input9)
    base_model =  layers.Add()([x1,x2, x3, x4, x5, x6, x7, x8, x9])
    first_output = layers.Dense(30, name='output_1')(base_model) 
    second_output = layers.Dense(30, name='output_2')(base_model)
    third_output = layers.Dense(30, name='output_3')(base_model)
    fourth_output = layers.Dense(30, name='output_4')(base_model)
    fifth_output = layers.Dense(30, name='output_5')(base_model)
    models = tf.keras.Model(inputs=xz,
                  outputs=[first_output, second_output, third_output, fourth_output, fifth_output])
    return models

My model compilation code:

model=create_model()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer=optimizer,
              loss={'output_1': 'categorical_crossentropy', 
                    'output_2': 'categorical_crossentropy',
                    'output_3': 'categorical_crossentropy',
                    'output_4': 'categorical_crossentropy',
                    'output_5': 'categorical_crossentropy'},
              metrics={'output_1':tf.keras.metrics.Accuracy(),
                       'output_2':tf.keras.metrics.Accuracy(),
                       'output_3':tf.keras.metrics.Accuracy(),
                       'output_4':tf.keras.metrics.Accuracy(),
                       'output_5':tf.keras.metrics.Accuracy()})

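Finally, here is the problem I am facing with the loss and accuracy: the loss is far too high and the accuracy stays at zero.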

Epoch 499/500
2/2 [==============================] - 0s 11ms/step - loss: 66362.0130 - output_1_loss: 5827.9458 - output_2_loss: 10478.4935 - output_3_loss: 16566.5957 - output_4_loss: 16831.8887 - output_5_loss: 16657.0967 - output_1_accuracy: 0.0000e+00 - output_2_accuracy: 0.0000e+00 - output_3_accuracy: 0.0000e+00 - output_4_accuracy: 0.0000e+00 - output_5_accuracy: 0.0000e+00
Epoch 500/500
2/2 [==============================] - 0s 11ms/step - loss: 66362.0130 - output_1_loss: 5827.9458 - output_2_loss: 10478.4935 - output_3_loss: 16566.5957 - output_4_loss: 16831.8887 - output_5_loss: 16657.0967 - output_1_accuracy: 0.0000e+00 - output_2_accuracy: 0.0000e+00 - output_3_accuracy: 0.0000e+00 - output_4_accuracy: 0.0000e+00 - output_5_accuracy: 0.0000e+00

Please help me and point out my mistake. I am new to this field.

Update: alternative model

model = tf.keras.Sequential([
  feature_layer,
  layers.Dense(128, activation='relu'),
  layers.Dense(128, activation='relu'),
  layers.Dropout(.1),
  layers.Dense(150),
])
opt = Adam(learning_rate=0.01)
model.compile(optimizer=opt,
              loss='mean_squared_error',
              metrics=['accuracy'])

It reshapes the [5,30]-shaped input into [150].
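As a small illustration of that reshape, assuming the advice data for one sample is a [5, 30] array of padded token ids (the values and the name `advice` below are made up), flattening and restoring could look like this in NumPy:

```python
import numpy as np

# Hypothetical data for one sample: 5 advice sequences, 30 padded token ids each.
advice = np.arange(5 * 30).reshape(5, 30)

# Flatten to a single vector of 150 values, matching a Dense(150) output.
flat = advice.reshape(150)

# The reshape is lossless: the five sequences can be recovered afterwards.
restored = flat.reshape(5, 30)

print(flat.shape, restored.shape)
```

Note that the reshape only changes the layout; with 'mean_squared_error' the model is still regressing on token id values rather than classifying them.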

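【Comments】:

  • Which classes are you trying to predict? The only possible classes I can see are state, district and month.
  • Have you tried lowering the learning rate? It might be a case of exploding gradients.
  • The classes I want to predict are the advices, which have shape [5,30]. In my code I actually split the single [5,30] column into 5 columns, each holding a tensor of shape [30].
  • Are they one-hot encoded?
  • @yudhiesh Well, no, they are not one-hot encoded. I used Tokenizer and pad_sequences from Keras text preprocessing.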
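
The last two comments point at a likely cause. 'categorical_crossentropy' expects one-hot (or probability) targets and, by default, probabilities rather than raw Dense outputs, whereas Tokenizer and pad_sequences produce integer token ids. Similarly, tf.keras.metrics.Accuracy() counts exact element-wise matches between targets and predictions, which float outputs almost never produce. The sketch below uses made-up data (the sizes and the names y_ids and logits are assumptions, not taken from the question) to show how integer ids can be paired with a sparse loss and a sparse accuracy metric:

```python
import numpy as np
import tensorflow as tf

# Made-up example: 2 positions, vocabulary of 3 token ids.
y_ids = np.array([1, 2])                               # integer ids, as from Tokenizer
logits = np.array([[0.1, 2.0, 0.3],
                   [0.2, 0.1, 3.0]], dtype="float32")  # raw Dense outputs (no softmax)

# Sparse loss: takes integer ids directly; from_logits=True applies softmax internally.
sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_sparse = float(sparse_loss(y_ids, logits))

# Equivalent dense form: one-hot targets with categorical cross-entropy.
y_onehot = tf.one_hot(y_ids, depth=3)
dense_loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
loss_dense = float(dense_loss(y_onehot, logits))

# Accuracy() checks element-wise equality, so float outputs score zero here.
exact = tf.keras.metrics.Accuracy()
exact.update_state(y_onehot, logits)
exact_value = float(exact.result())

# SparseCategoricalAccuracy() compares the argmax of each row with the integer id.
sparse_acc = tf.keras.metrics.SparseCategoricalAccuracy()
sparse_acc.update_state(y_ids, logits)
sparse_acc_value = float(sparse_acc.result())

print(loss_sparse, loss_dense, exact_value, sparse_acc_value)
```

With integer targets of shape [30] per output, each output head would also need one score per vocabulary entry at every position, which the 30-unit Dense heads in the question do not provide.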

Tags: python tensorflow machine-learning keras multiclass-classification


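【Solution 1】:

To improve the model structure, see the sample code below, which includes a "model_simple" alternative to the original network. Train both on the same input data, then vary the structure of "model_simple" to find out which structure gives the highest accuracy.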

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def create_model():
    input1 = tf.keras.layers.Input(shape=(1,), name='state')
    input2 = tf.keras.layers.Input(shape=(1,), name='district')
    input3 = tf.keras.layers.Input(shape=(1,), name='month')
    input4 = tf.keras.layers.Input(shape=(1,), name='rainfall')
    input5 = tf.keras.layers.Input(shape=(1,), name='max_temp')
    input6 = tf.keras.layers.Input(shape=(1,), name='min_temp')
    input7 = tf.keras.layers.Input(shape=(1,), name='max_rh')
    input8 = tf.keras.layers.Input(shape=(1,), name='min_rh')
    input9 = tf.keras.layers.Input(shape=(1,), name='wind_speed')
    xz= [input1,input2,input3,input4,input5,input6,input7,input8,input9]
    x1= layers.Dense(128, activation='relu')(input1)
    x2=layers.Dense(128, activation='relu')(input2)
    x3=layers.Dense(128, activation='relu')(input3)
    x4=layers.Dense(128, activation='relu')(input4)
    x5=layers.Dense(128, activation='relu')(input5)
    x6=layers.Dense(128, activation='relu')(input6)
    x7=layers.Dense(128, activation='relu')(input7)
    x8=layers.Dense(128, activation='relu')(input8)
    x9=layers.Dense(128, activation='relu')(input9)
    base_model =  layers.Add()([x1,x2, x3, x4, x5, x6, x7, x8, x9])
    first_output = layers.Dense(30,name='output_1')(base_model)
    second_output = layers.Dense(30,name='output_2')(base_model)
    third_output= layers.Dense(30,name='output_3')(base_model)
    fourth_output= layers.Dense(30,name='output_4')(base_model)
    fifth_output = layers.Dense(30,name='output_5')(base_model)
    models = tf.keras.Model(inputs=xz,
                  outputs=[first_output,second_output,third_output,fourth_output,fifth_output])
    return models

def create_model_simple():
    input1 = tf.keras.layers.Input(shape=(1,), name='state')
    input2 = tf.keras.layers.Input(shape=(1,), name='district')
    input3 = tf.keras.layers.Input(shape=(1,), name='month')
    input4 = tf.keras.layers.Input(shape=(1,), name='rainfall')
    input5 = tf.keras.layers.Input(shape=(1,), name='max_temp')
    input6 = tf.keras.layers.Input(shape=(1,), name='min_temp')
    input7 = tf.keras.layers.Input(shape=(1,), name='max_rh')
    input8 = tf.keras.layers.Input(shape=(1,), name='min_rh')
    input9 = tf.keras.layers.Input(shape=(1,), name='wind_speed')
    #xz= [input1,input2,input3,input4,input5,input6,input7,input8,input9]
    #x1=layers.Dense(128, activation='relu')(input1)
    #x2=layers.Dense(128, activation='relu')(input2)
    #x3=layers.Dense(128, activation='relu')(input3)
    #x4=layers.Dense(128, activation='relu')(input4)
    #x5=layers.Dense(128, activation='relu')(input5)
    #x6=layers.Dense(128, activation='relu')(input6)
    #x7=layers.Dense(128, activation='relu')(input7)
    #x8=layers.Dense(128, activation='relu')(input8)
    #x9=layers.Dense(128, activation='relu')(input9)
    yhdistelma=layers.concatenate([input1,input2, input3, input4, input5, input6, input7, input8, input9])
    #base_model =  layers.Add()([x1,x2, x3, x4, x5, x6, x7, x8, x9])
    first_output = layers.Dense(30,name='output_1')(yhdistelma)
    second_output = layers.Dense(30,name='output_2')(yhdistelma)
    third_output= layers.Dense(30,name='output_3')(yhdistelma)
    fourth_output= layers.Dense(30,name='output_4')(yhdistelma)
    fifth_output = layers.Dense(30,name='output_5')(yhdistelma)
    models = tf.keras.Model(inputs=[input1,input2,input3,input4,input5, input6, input7, input8, input9],
                  outputs=[first_output,second_output,third_output,fourth_output,fifth_output])
    return models

model=create_model()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer=optimizer,
              loss={'output_1': 'categorical_crossentropy',
                    'output_2': 'categorical_crossentropy',
                    'output_3': 'categorical_crossentropy',
                    'output_4': 'categorical_crossentropy',
                    'output_5': 'categorical_crossentropy'},
              metrics={'output_1':tf.keras.metrics.Accuracy(),
                       'output_2':tf.keras.metrics.Accuracy(),
                       'output_3':tf.keras.metrics.Accuracy(),
                       'output_4':tf.keras.metrics.Accuracy(),
                       'output_5':tf.keras.metrics.Accuracy()})

model.summary()

keras.utils.plot_model(model,'model_structure.png',show_dtype=True)


# Let's create a simpler model version:
model_simple=create_model_simple()

# Compile model_simple itself (the first model was already compiled above),
# using a fresh optimizer instance rather than sharing one between models.
model_simple.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
                     loss={'output_1': 'categorical_crossentropy',
                           'output_2': 'categorical_crossentropy',
                           'output_3': 'categorical_crossentropy',
                           'output_4': 'categorical_crossentropy',
                           'output_5': 'categorical_crossentropy'},
                     metrics={'output_1':tf.keras.metrics.Accuracy(),
                              'output_2':tf.keras.metrics.Accuracy(),
                              'output_3':tf.keras.metrics.Accuracy(),
                              'output_4':tf.keras.metrics.Accuracy(),
                              'output_5':tf.keras.metrics.Accuracy()})

model_simple.summary()

keras.utils.plot_model(model_simple,'model_simple_structure.png',show_dtype=True)

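...In particular, note that the main difference between your original model and the simpler one is that "Add" has been replaced with "concatenate". "Add" produces an output the same size as one of its inputs, whereas the output of "concatenate" is as wide as all of its inputs combined; this kind of difference can affect performance.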
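
The size difference can be checked directly. The following is a minimal sketch, assuming nine scalar inputs as in the question (the input names feature_0 to feature_8 are placeholders), that compares the output widths of Add and Concatenate:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Nine scalar inputs, as in the question (placeholder names).
inputs = [tf.keras.layers.Input(shape=(1,), name=f"feature_{i}") for i in range(9)]

# Original design: one Dense(128) branch per input, merged with element-wise Add.
branches = [layers.Dense(128, activation="relu")(inp) for inp in inputs]
added = layers.Add()(branches)

# Concatenating the same branches stacks them along the feature axis.
concat_branches = layers.Concatenate()(branches)

# create_model_simple() concatenates the raw inputs instead of Dense branches.
concat_inputs = layers.Concatenate()(inputs)

added_width = int(added.shape[-1])
concat_branches_width = int(concat_branches.shape[-1])
concat_inputs_width = int(concat_inputs.shape[-1])

print(added_width, concat_branches_width, concat_inputs_width)
```

Because create_model_simple() concatenates the raw inputs rather than the 128-unit branches, its merged tensor has only nine features, so that model also has far fewer parameters than the original.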

【Comments】:

  • This makes the code easier to understand. Can I add an LSTM to each output instead of a single Dense?
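
The thread does not answer this question. One possible shape for it, purely as a sketch under assumptions (a single 9-feature input, a vocabulary of 50 token ids, 64 LSTM units, and the names features and lstm_model are all made up for illustration), repeats the shared feature vector 30 times and lets a separate LSTM per output emit one score vector per token position:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

seq_len, vocab_size = 30, 50  # hypothetical sizes, not from the question

# Single input holding the nine features side by side (an assumption).
features = tf.keras.layers.Input(shape=(9,), name="features")
hidden = layers.Dense(128, activation="relu")(features)

outputs = []
for i in range(1, 6):
    # Repeat the feature vector once per token position: (batch, 30, 128).
    repeated = layers.RepeatVector(seq_len)(hidden)
    # One LSTM per output head, returning a vector for every position.
    decoded = layers.LSTM(64, return_sequences=True)(repeated)
    # Dense acts on the last axis, giving per-position scores over the vocabulary.
    outputs.append(layers.Dense(vocab_size, name=f"output_{i}")(decoded))

lstm_model = tf.keras.Model(inputs=features, outputs=outputs)

# Forward pass on dummy data to check the output shapes.
preds = lstm_model(np.zeros((2, 9), dtype="float32"))
print(len(preds), tuple(preds[0].shape))
```

Heads of this shape produce raw scores, so they would pair with a sparse categorical cross-entropy loss using from_logits=True and integer token ids as targets.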