【Question Title】: How to save val_loss and val_acc in Keras
【Posted】: 2018-05-03 02:12:26
【Question】:

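I am unable to log 'val_loss' and 'val_acc' in Keras. 'loss' and 'acc' are easy, because they are always recorded in the history of model.fit.

'val_loss' is recorded if validation is enabled in fit, and val_acc is recorded if validation and accuracy monitoring are enabled. But what does this mean?

My code is model.fit(train_data, train_labels,epochs = 64,batch_size = 10,shuffle = True,validation_split = 0.2, callbacks=[history])

As you can see, I use 5-fold cross-validation and shuffle the data. In this case, how do I enable validation in fit so that 'val_loss' and 'val_acc' are recorded?

Thanks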

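【Comments】:

  • What I understand least is that this information is printed to the screen automatically every time, and it is what every user cares about most. Why doesn't Keras have a user-friendly way to store it in a file???
  • Have you looked at CSVLogger?
  • What is history in your example?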

Tags: python keras


【Solution 1】:

From the Keras documentation, we have the models.fit method:

fit(x=None, y=None, 
    batch_size=None, 
    epochs=1, 
    verbose=1, 
    callbacks=None, 
    validation_split=0.0, validation_data=None, 
    shuffle=True, 
    class_weight=None, 
    sample_weight=None, 
    initial_epoch=0, 
    steps_per_epoch=None, 
    validation_steps=None
)

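'val_loss' is recorded if validation is enabled in fit, and val_acc is recorded if validation and accuracy monitoring are enabled. - from the keras.callbacks.Callback() object, when it is passed to the callbacks parameter of the fit method above.

Instead of the History callback you used, it can be used as follows: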

    from keras.callbacks import Callback
    logs = Callback()
    model.fit(train_data, 
                train_labels,
                epochs = 64, 
                batch_size = 10,
                shuffle = True,
                validation_split = 0.2, 
                callbacks=[logs]
           ) 

"'val_loss' is recorded if validation is enabled in fit" means: when calling model.fit, you either use the validation_split parameter, or use the validation_data parameter to specify the tuple (x_val, y_val) or tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch.
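To make "validation enabled" concrete: the Keras documentation states that validation_split holds out the last fraction of the samples, taken before any shuffling, so it is a single fixed hold-out set rather than k-fold cross-validation. Below is a minimal pure-Python sketch of that slicing. It is not Keras's actual code, and `split_for_validation` is a hypothetical helper name used only for illustration.

```python
def split_for_validation(samples, labels, validation_split):
    """Mimic how validation_split carves a hold-out set off the end of the data.

    Hypothetical helper for illustration; Keras does this internally.
    """
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1")
    # Index where the training portion ends and the validation portion begins.
    split_at = int(len(samples) * (1.0 - validation_split))
    train = (samples[:split_at], labels[:split_at])
    val = (samples[split_at:], labels[split_at:])
    return train, val


# Toy data: ten samples with made-up binary labels.
data = list(range(10))
targets = [value % 2 for value in data]
(train_x, train_y), (val_x, val_y) = split_for_validation(data, targets, 0.2)
print(train_x)  # the leading 80% of the samples
print(val_x)    # the trailing 20% of the samples
```

With validation_split = 0.2 as in the question, every epoch is validated on the same trailing 20% of the data; shuffle = True only reorders the remaining training samples.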

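A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). - Keras documentation (return value of the model.fit method)

You are using the History callback in your model, as follows: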

model.fit(train_data, 
            train_labels,
            epochs = 64,
            batch_size = 10,
            shuffle = True,
            validation_split = 0.2, 
            callbacks=[history]
       )

history.history gives you a dictionary with the keys loss, acc, val_loss, and val_acc if you store the return value of model.fit in a variable, as follows:

history = model.fit(
     train_data, 
     train_labels,
     epochs = 64,
     batch_size = 10,
     shuffle = True,
     validation_split = 0.2, 
     callbacks=[history]
)
history.history

The output will look like this:

{'val_loss': [14.431451635814849,
              14.431451635814849,
              14.431451635814849,
              14.431451635814849,
              14.431451635814849,
              14.431451635814849,
              14.431451635814849,
              14.431451635814849,
              14.431451635814849,
              14.431451635814849],
 'val_acc':  [0.1046428571712403,
              0.1046428571712403,
              0.1046428571712403,
              0.1046428571712403,
              0.1046428571712403,
              0.1046428571712403,
              0.1046428571712403,
              0.1046428571712403,
              0.1046428571712403,
              0.1046428571712403],
 'loss': [14.555215610322499,
          14.555215534028553,
          14.555215548560733,
          14.555215588524229,
          14.555215592157273,
          14.555215581258137,
          14.555215575808571,
          14.55521561940511,
          14.555215563092913,
          14.555215624854679],
 'acc': [0.09696428571428571,
         0.09696428571428571,
         0.09696428571428571,
         0.09696428571428571,
         0.09696428571428571,
         0.09696428571428571,
         0.09696428571428571,
         0.09696428571428571,
         0.09696428571428571,
         0.09696428571428571]}

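To save the data, you can use CSVLogger (as suggested in the comments) or take the longer route of writing the dictionary to a CSV file yourself (as described in "writing a dictionary to a csv"):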

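from keras.callbacks import CSVLogger

# Appends one row per epoch (loss, plus val_* columns when validation is enabled)
csv_logger = CSVLogger('training.log')
model.fit(X_train, Y_train, callbacks=[csv_logger])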
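The "longer route" mentioned above can be sketched with the standard csv module. This is one possible layout (one row per epoch, one column per metric), not something prescribed by Keras. `history_to_csv` is a hypothetical helper name, and the numbers in `example_history` are made up to stand in for history.history.

```python
import csv
import io


def history_to_csv(history_dict, stream):
    """Write a Keras-style history dict as CSV, one row per epoch."""
    metric_names = sorted(history_dict)
    num_epochs = len(history_dict[metric_names[0]]) if metric_names else 0
    writer = csv.writer(stream)
    writer.writerow(["epoch"] + metric_names)
    for epoch in range(num_epochs):
        row = [epoch] + [history_dict[name][epoch] for name in metric_names]
        writer.writerow(row)


# Stand-in for history.history; the values are invented for illustration.
example_history = {
    "loss": [0.9, 0.7],
    "acc": [0.5, 0.6],
    "val_loss": [1.0, 0.8],
    "val_acc": [0.4, 0.5],
}
buffer = io.StringIO()
history_to_csv(example_history, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, open the file with newline="" (as the csv module documentation recommends) and pass the file object as `stream`.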

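【Comments】:

    【Solution 2】:

    You can use Keras's ModelCheckpoint class while recording val_loss and val_acc. The checkpoint saves the model itself; the metric values come from the History object returned by fit.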

    from keras.callbacks import ModelCheckpoint
    
    checkpointer = ModelCheckpoint(filepath='yourmodelname.hdf5', 
                                   monitor='val_loss', 
                                   verbose=1, 
                                   save_best_only=False)
    
    history = model.fit(X_train, y_train, epochs=100, validation_split=0.02, callbacks=[checkpointer])
    
    history.history.keys()
    
    # output
    # dict_keys(['val_loss', 'val_mae', 'val_acc', 'loss', 'mae', 'acc'])
    

    An important point: if you omit the validation_split argument, you will only get values for loss, mae, and acc.

    Hope this helps!

    【Comments】:

      【Solution 3】:

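      Update: the val_accuracy dictionary key no longer seems to work today. I don't know why, but I removed that code from here even though the OP asked how to log it (besides, for comparing cross-validation results, the loss is what actually matters).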
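One possible explanation, offered as an assumption rather than a diagnosis of the setup above: the key under which accuracy is logged depends on the Keras version and on the metric name passed to compile, so it may be val_acc in one environment and val_accuracy in another. A small sketch that tolerates both spellings follows. `first_available` is a hypothetical helper, and the dicts stand in for the `logs` argument a callback receives.

```python
def first_available(logs, candidate_keys, default=None):
    """Return the value of the first key in candidate_keys present in logs."""
    for key in candidate_keys:
        if key in logs:
            return logs[key]
    return default


# Stand-ins for the `logs` dict passed to on_epoch_end; values are made up.
logs_new_style = {"loss": 0.7, "val_loss": 0.8, "val_accuracy": 0.55}
logs_old_style = {"loss": 0.7, "val_loss": 0.8, "val_acc": 0.55}
logs_no_validation = {"loss": 0.7}

accuracy_keys = ("val_accuracy", "val_acc")
print(first_available(logs_new_style, accuracy_keys))
print(first_available(logs_old_style, accuracy_keys))
print(first_available(logs_no_validation, accuracy_keys))
```

Returning a default instead of indexing directly also avoids a KeyError when validation is not enabled and no val_* keys exist at all.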

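      With Python 3.7 and Tensorflow 2.0, after much searching, guessing, and repeated failure, the following worked for me. I started from someone else's script and wrote what I needed to a .json file; each training run produces one such .json file showing the validation loss for every epoch, so you can see how the model converged (or did not); accuracy is logged, but not as a performance metric.

      Note: you need to fill in yourTrainDir, yourTrainingData, yourValidationData, yourOptimizer, yourLossFunctionFromKerasOrElsewhere, yourNumberOfEpochs, and so on for this code to run: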

      import numpy as np
      import os
      import tensorflow as tf
      from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, LambdaCallback
      import json
      model.compile(
          optimizer=yourOptimizer,
          loss=yourLossFunctionFromKerasOrElsewhere()
          )
      
      # create a custom callback to enable future cross-validation efforts
      yourTrainDir = os.getcwd() + '/yourOutputFolderName/'
      uniqueID = np.random.randint(999999) # To distinguish validation runs by saved JSON name
      epochValidationLog = open(
          yourTrainDir +
          'val_log_per_epoch_' +
          '{}_'.format(uniqueID) +
          '.json',
          mode='wt',
          buffering=1
          )
      ValidationLogsCallback = LambdaCallback(
          on_epoch_end = lambda epoch,
              logs: epochValidationLog.write(
                  json.dumps(
                      {
                          'oneIndexedEpoch': epoch + 1,
                          'Validationloss': logs['val_loss']
                      }
                      ) + '\n'
                  ),
          on_train_end = lambda logs: epochValidationLog.close()
          )
      
      # set up the list of callbacks
      callbacksList = [
          ValidationLogsCallback,
          EarlyStopping(patience=40, verbose=1),
          ]
      results = model.fit(
          x=yourTrainingData,
          steps_per_epoch=len(yourTrainingData),
          validation_data=yourValidationData,
          validation_steps=len(yourValidationData),
          epochs=yourNumberOfEpochs,
          verbose=1,
          callbacks=callbacksList
          )
      

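      This produces a JSON file in the yourTrainDir folder that records the validation loss of each training epoch as its own dictionary-like entry, one per line. Note that epoch numbers start at 1, so they match tensorflow's output rather than the actual Python index.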
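Because the ID in the script above comes from np.random.randint(999999), two runs can draw the same number and write to the same file name. One way to make that far less likely is to combine a timestamp with a random UUID fragment. This is a sketch under that assumption, not part of the original answer; `make_log_path` is a hypothetical helper.

```python
import time
import uuid
from pathlib import Path


def make_log_path(directory, prefix="val_log_per_epoch_"):
    """Build a log file path that is very unlikely to collide with an earlier run."""
    timestamp = time.strftime("%Y%m%d-%H%M%S")
    random_part = uuid.uuid4().hex[:8]  # 8 hex characters from a random UUID
    return Path(directory) / f"{prefix}{timestamp}_{random_part}.json"


first_path = make_log_path("yourOutputFolderName")
second_path = make_log_path("yourOutputFolderName")
print(first_path.name)
print(first_path != second_path)
```

File names built this way no longer have the `<digits>_.json` shape, so any script that parses the ID out of the name would need to be adjusted to match.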

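      I am writing to a .JSON file, but it could be anything. Here is the code I use to analyze the generated JSON files; I could have put everything in one script, but did not.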

      import os
      from pathlib import Path
      import json
      
      currentDirectory = os.getcwd()
      outFileName = 'CVResults.json'
      outFile = open(outFileName, mode='wt')
      validationLogPaths = Path().glob('val_log_per_epoch_*.json')
      
      # Necessary list to detect short unique IDs for each training session
      stringDecimalDigits = [
          '1',
          '2',
          '3',
          '4',
          '5',
          '6',
          '7',
          '8',
          '9',
          '0'
      ]
      setStringDecimalDigits = set(stringDecimalDigits)
      trainingSessionsList = []
      
      # Load the JSON files into memory to allow reading.
      for validationLogFile in validationLogPaths:
          trainingUniqueIDCandidate = str(validationLogFile)[18:21]
      
          # Pad unique IDs with fewer than three digits with zeros at front
          thirdPotentialDigitOfUniqueID = trainingUniqueIDCandidate[2]
          if setStringDecimalDigits.isdisjoint(thirdPotentialDigitOfUniqueID):
              secondPotentialDigitOfUniqueID = trainingUniqueIDCandidate[1]
              if setStringDecimalDigits.isdisjoint(secondPotentialDigitOfUniqueID):
                  trainingUniqueID = '00' + trainingUniqueIDCandidate[:1]
              else:
                  trainingUniqueID = '0' + trainingUniqueIDCandidate[:2]
          else:
              trainingUniqueID = trainingUniqueIDCandidate
          trainingSessionsList.append((trainingUniqueID, validationLogFile))
      trainingSessionsList.sort(key=lambda x: x[0])
      
      # Analyze and export cross-validation results
      for replicate in range(len(dict(trainingSessionsList).keys())):
          validationLogFile = trainingSessionsList[replicate][1]
          fileOpenForReading = open(
              validationLogFile, mode='r', buffering=1
          )
      
          with fileOpenForReading as openedFile:
              jsonValidationData = [json.loads(line) for line in openedFile]
      
          bestEpochResultsDict = {}
          oneIndexedEpochsList = []
          validationLossesList = []
          for line in range(len(jsonValidationData)):
              tempDict = jsonValidationData[line]
              oneIndexedEpochsList.append(tempDict['oneIndexedEpoch'])
              validationLossesList.append(tempDict['Validationloss'])
          trainingStopIndex = min(
              range(len(validationLossesList)),
              key=validationLossesList.__getitem__
          )
          bestEpochResultsDict['Integer_unique_ID'] = trainingSessionsList[replicate][0]
          bestEpochResultsDict['Min_val_loss'] = validationLossesList[trainingStopIndex]
          bestEpochResultsDict['Last_train_epoch'] = oneIndexedEpochsList[trainingStopIndex]
          outFile.write(json.dumps(bestEpochResultsDict, sort_keys=True) + '\n')
      
      outFile.close()
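The fixed slice [18:21] in the loop above reads only three characters after the prefix, so IDs with more than three digits are truncated. A regular expression is one alternative. The sketch below assumes file names follow the val_log_per_epoch_<digits>_.json pattern produced by the training script; `extract_run_id` and `LOG_NAME_PATTERN` are hypothetical names.

```python
import re
from pathlib import Path

# Matches names such as val_log_per_epoch_123456_.json and captures the digits.
LOG_NAME_PATTERN = re.compile(r"val_log_per_epoch_(\d+)_\.json$")


def extract_run_id(log_path, width=6):
    """Return the zero-padded numeric ID from a log file name, or None."""
    match = LOG_NAME_PATTERN.search(Path(log_path).name)
    if match is None:
        return None
    return match.group(1).zfill(width)


print(extract_run_id("val_log_per_epoch_42_.json"))
print(extract_run_id("val_log_per_epoch_987654_.json"))
print(extract_run_id("CVResults.json"))
```

Zero-padding to a common width keeps the string sort in the original script consistent with numeric order.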
      

      The last block of code creates a file, CVAnalysis.json, that summarizes the contents of the CVResults.json generated above:

      from pathlib import Path
      import json
      import os
      import statistics
      
      outFile = open("CVAnalysis.json", mode='wt')
      CVResultsPath = sorted(Path().glob('*CVResults.json'))
      if len(CVResultsPath) > 1:
          print('\nPlease analyze only one CVResults.json file at a time.')
          userAnswer = input('\nI understand only one will be analyzed: y or n')
          if (userAnswer == 'y') or (userAnswer == 'Y'):
              print('\nAnalyzing results in file {}:'.format(str(CVResultsPath[0])))
      
      # Load the first CVResults.json file into memory to allow reading.
      CVResultsFile = CVResultsPath[0]
      fileOpenForReading = open(
          CVResultsFile, mode='r', buffering=1
      )
      
      outFile.write(
          'Analysis of cross-validation results tabulated in file {}'.format(
              os.getcwd()
          ) +
          str(CVResultsFile) +
          ':\n\n'
      )
      
      with fileOpenForReading as openedFile:
          jsonCVResultsData = [json.loads(line) for line in openedFile]
      
      minimumValidationLossesList = []
      trainedOneIndexedEpochsList = []
      for line in range(len(jsonCVResultsData)):
          tempDict = jsonCVResultsData[line]
          minimumValidationLossesList.append(tempDict['Min_val_loss'])
          trainedOneIndexedEpochsList.append(tempDict['Last_train_epoch'])
      outFile.write(
          '\nTrained validation losses: ' +
          json.dumps(minimumValidationLossesList) +
          '\n'
      )
      outFile.write(
          '\nTraining epochs required: ' +
          json.dumps(trainedOneIndexedEpochsList) +
          '\n'
      )
      outFile.write(
          '\n\nMean trained validation loss: ' +
          str(round(statistics.mean(minimumValidationLossesList), 4)) +
          '\n'
      )
      outFile.write(
          'Median of mean trained validation losses per session: ' +
          str(round(statistics.median(minimumValidationLossesList), 4)) +
          '\n'
      )
      outFile.write(
          '\n\nMean training epochs required: ' +
          str(round(statistics.mean(trainedOneIndexedEpochsList), 1)) +
          '\n'
      )
      outFile.write(
          'Median of mean training epochs required per session: ' +
          str(round(statistics.median(trainedOneIndexedEpochsList), 1)) +
          '\n'
      )
      outFile.close()
      

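      【Comments】:

      • I generalized the optimizer and loss function in the first block of code, so you need to choose them and fill them in concretely; the code will break if you try to run it as is.
      • I realize this code is very fragile; in fact, the so-called uniqueID frequently gets overwritten, and every time that happens I lose hours of validation replicates. I am not a good programmer, so please disregard the comments beyond their basic usefulness.
      • A possibly simpler way to log the results is to create class lossLogger(tf.keras.callbacks.Callback): def __init__(self, fileName): self.fileName = fileName self.json_log = open(self.fileName+'.json', mode='w+', buffering=1) def on_epoch_end(self, epoch, logs=None): self.json_log.write(json.dumps('epoch {}: '.format(epoch) + str(logs)) + '\n' ) def on_train_end(self, logs=None): self.json_log.close()
      • For several reasons, I recommend setting the validation data manually. This bug may have been fixed by now, but the concepts of validation and testing in nonlinear modeling still need a lot of theoretical refinement: github.com/tensorflow/tensorflow/issues/37840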
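The callback idea from the comments can be laid out as a regular class. The sketch below is a variant, not the commenter's exact code: it writes one JSON object per epoch (the original passes a string to json.dumps, which yields a quoted string rather than an object) and casts values to float on the assumption that logged values may be NumPy scalars. The class name `LossLogger` and the import fallback are illustrative choices.

```python
import json

try:
    from tensorflow.keras.callbacks import Callback
except ImportError:
    # Fallback so the class can be exercised without TensorFlow installed;
    # for real training the Keras base class is required.
    Callback = object


class LossLogger(Callback):
    """Append one JSON object per epoch to <file_name>.json."""

    def __init__(self, file_name):
        super().__init__()
        self.file_name = file_name
        # Line buffering flushes each epoch's record as soon as it is written.
        self.json_log = open(self.file_name + ".json", mode="w", buffering=1)

    def on_epoch_end(self, epoch, logs=None):
        record = {"epoch": epoch}
        # Cast to float because logged values may be NumPy scalars.
        record.update({key: float(value) for key, value in (logs or {}).items()})
        self.json_log.write(json.dumps(record) + "\n")

    def on_train_end(self, logs=None):
        self.json_log.close()
```

It would be passed to training as callbacks=[LossLogger('some_run_name')]; with validation enabled, each line then contains val_loss alongside loss.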