1. Fine-tune with a very small learning rate, keeping every BERT layer trainable:

for l in bert_model.layers:

  l.trainable = True
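Why the learning rate must be very small here can be seen in a toy 1-D picture: the pretrained weights already sit near a good optimum, so a small step size stays near it, while a large one overshoots further on every step. The function and values below are purely illustrative, unrelated to the real BERT training run.

```python
def descend(w, lr, steps=100):
    """Plain gradient descent on f(w) = (w - 3)^2."""
    for _ in range(steps):
        w -= lr * 2 * (w - 3.0)   # gradient of (w - 3)^2 is 2 * (w - 3)
    return w

near = descend(3.5, 0.01)   # small LR: stays close to the optimum w = 3
far = descend(3.5, 1.1)     # large LR: overshoots and diverges
```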

 

2. Because the BERT model is huge, we can only fit batch=4 per step during training. After fine-tuning for 4 epochs, we can freeze the BERT model and train the softmax head on its own (note that in Keras a change to `trainable` only takes effect the next time the model is compiled):

for l in bert_model.layers:

  l.trainable = False
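The effect of the loop above can be sketched with plain Python stand-ins (not the real Keras `Layer` class): flipping each BERT layer's `trainable` flag moves its parameters from the trainable to the non-trainable count, leaving only the softmax head to update. The even four-way split of BERT's parameters is an invented simplification.

```python
class Layer:
    """Minimal stand-in for a Keras layer: a name, a parameter count, a flag."""
    def __init__(self, name, params):
        self.name, self.params, self.trainable = name, params, True

# Toy split of BERT's 101,677,056 parameters across four stand-in layers.
bert_layers = [Layer(f"bert_{i}", 101_677_056 // 4) for i in range(4)]
head = Layer("dense_1", 38_450)   # the softmax head: 768 * 50 weights + 50 biases

def trainable_params(layers):
    return sum(l.params for l in layers if l.trainable)

all_layers = bert_layers + [head]
phase1 = trainable_params(all_layers)   # phase 1: everything trainable

for l in bert_layers:                   # phase 2: freeze BERT
    l.trainable = False
phase2 = trainable_params(all_layers)   # only the head remains trainable
```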

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, None)         0
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, None)         0
__________________________________________________________________________________________________
model_2 (Model)                 multiple             101677056   input_1[0][0]
                                                                 input_2[0][0]
__________________________________________________________________________________________________
lambda_1 (Lambda)               (None, 768)          0           model_2[1][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 50)           38450       lambda_1[0][0]
==================================================================================================
Total params: 101,715,506
Trainable params: 38,450
Non-trainable params: 101,677,056
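The summary's numbers are easy to verify: the dense head maps BERT's 768-dimensional output vector to 50 classes, so its parameter count is the weight matrix plus the biases, and the total is that plus the frozen BERT parameters.

```python
# dense_1: a 768 -> 50 fully connected layer.
dense_params = 768 * 50 + 50      # weights + biases = 38,450
bert_params = 101_677_056          # the frozen model_2 block from the summary
total = bert_params + dense_params  # 101,715,506, matching "Total params"
```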

 

Adjust the learning rate and train the softmax head alone for a few epochs.

Training the softmax head on its own helps a lot, especially when one of the classes is a catch-all "other" class and the category boundaries are therefore uncertain.
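With BERT frozen, this head-only phase is just softmax regression on fixed feature vectors. The toy below reproduces that setup with invented sizes (4-dim features instead of 768, 3 classes instead of 50) and an illustrative learning rate: only the weight matrix W and bias b are updated, and the cross-entropy loss drops over a few passes.

```python
import math
import random

random.seed(0)
DIM, CLASSES = 4, 3
W = [[0.0] * CLASSES for _ in range(DIM)]
b = [0.0] * CLASSES

# "Frozen" features with labels: class k examples cluster around axis k.
data = []
for k in range(CLASSES):
    for _ in range(20):
        x = [random.gauss(0, 0.1) for _ in range(DIM)]
        x[k] += 1.0
        data.append((x, k))

def softmax(z):
    m = max(z)                          # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def avg_loss():
    total = 0.0
    for x, y in data:
        z = [sum(x[i] * W[i][c] for i in range(DIM)) + b[c] for c in range(CLASSES)]
        total += -math.log(softmax(z)[y])
    return total / len(data)

before = avg_loss()
lr = 0.5
for _ in range(50):                     # a few "epochs" of head-only training
    for x, y in data:
        z = [sum(x[i] * W[i][c] for i in range(DIM)) + b[c] for c in range(CLASSES)]
        p = softmax(z)
        for c in range(CLASSES):
            g = p[c] - (1.0 if c == y else 0.0)   # dL/dz_c for cross-entropy
            for i in range(DIM):
                W[i][c] -= lr * g * x[i]
            b[c] -= lr * g
after = avg_loss()
```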
