【Question title】: Accessing validation data within a custom callback
【Posted】: 2018-05-20 10:47:53
【Question】:

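I am fitting a model on a train_generator, and I want to compute custom metrics on my validation_generator through a custom callback. How can I access the validation_steps and validation_data arguments from inside a custom callback? They are not in self.params, and I can't find them in self.model either. This is what I'd like to do. Any alternative approach is welcome.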

model.fit_generator(generator=train_generator,
                    steps_per_epoch=steps_per_epoch,
                    epochs=epochs,
                    validation_data=validation_generator,
                    validation_steps=validation_steps,
                    callbacks=[CustomMetrics()])


class CustomMetrics(keras.callbacks.Callback):

    def on_epoch_end(self, epoch, logs=None):
        # Intended logic; validation_steps and validation_data are not accessible here
        for i in range(validation_steps):
            # features, labels = next(validation_data)
            # compute custom metric: f(features, labels)
            pass
        return

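Keras: 2.1.1

Update

I managed to pass my validation data to the custom callback's constructor. However, this results in the annoying message "The kernel appears to have died. It will restart automatically." I doubt this is the right way to do it. Any suggestions?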

import numpy as np
import keras
from sklearn.metrics import f1_score, precision_score, recall_score


class CustomMetrics(keras.callbacks.Callback):

    def __init__(self, validation_generator, validation_steps):
        # Let the base Callback set up its own attributes first
        super(CustomMetrics, self).__init__()
        self.validation_generator = validation_generator
        self.validation_steps = validation_steps

    def on_epoch_end(self, epoch, logs=None):
        # Per-batch scores for this epoch (reset at every epoch end)
        self.scores = {
            'recall_score': [],
            'precision_score': [],
            'f1_score': []
        }

        for batch_index in range(self.validation_steps):
            features, y_true = next(self.validation_generator)
            # Round the predicted probabilities to hard 0/1 labels
            y_pred = np.asarray(self.model.predict(features))
            y_pred = y_pred.round().astype(int)
            self.scores['recall_score'].append(recall_score(y_true[:,0], y_pred[:,0]))
            self.scores['precision_score'].append(precision_score(y_true[:,0], y_pred[:,0]))
            self.scores['f1_score'].append(f1_score(y_true[:,0], y_pred[:,0]))
        return

metrics = CustomMetrics(validation_generator, validation_steps)

model.fit_generator(generator=train_generator,
                    steps_per_epoch=steps_per_epoch,
                    epochs=epochs,
                    validation_data=validation_generator,
                    validation_steps=validation_steps,
                    shuffle=True,
                    callbacks=[metrics],
                    verbose=1)

【Comments】:

Tags: python keras metrics


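【Solution 1】:

I was looking for a solution to the same problem, and then found yours together with another one in the accepted answer here. If that second solution works, I think it is better than iterating over all the validation data again at epoch end.

The idea is to keep the target and pred placeholders in variables, and to update those variables at batch end through a custom callback.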
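
The bookkeeping half of that idea can be sketched without touching any Keras internals: a small accumulator stores each batch's targets and predictions and computes the metric once per epoch. This is only an illustration. `EpochAccumulator`, its method names and the `accuracy` helper are hypothetical, and the sketch does not show how to pull the target and prediction values out of the model, which is the part the linked answer addresses.

```python
class EpochAccumulator:
    """Collects per-batch targets and predictions (hypothetical helper)."""

    def __init__(self, metric_fn):
        self.metric_fn = metric_fn  # callable(y_true, y_pred) -> float
        self.targets = []
        self.predictions = []

    def on_batch_end(self, y_true, y_pred):
        # Store this batch's values so nothing has to be predicted again later.
        self.targets.extend(y_true)
        self.predictions.extend(y_pred)

    def on_epoch_end(self):
        # Score everything seen during the epoch, then reset for the next one.
        score = self.metric_fn(self.targets, self.predictions)
        self.targets, self.predictions = [], []
        return score


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the target."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


acc = EpochAccumulator(accuracy)
acc.on_batch_end([1, 0], [1, 1])
acc.on_batch_end([0, 1], [0, 1])
epoch_score = acc.on_epoch_end()
print(epoch_score)
```

Because the metric is computed over the pooled values, it is a true whole-epoch score rather than an average of per-batch scores.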

【Comments】:

    【Solution 2】:

    Here is how:

    import keras
    from sklearn.metrics import r2_score

    class MetricsCallback(keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs=None):
            if epoch:  # skips the first epoch (epoch == 0)
                print(self.validation_data[0])
                # validation_data holds the inputs first and the targets second
                x_test = self.validation_data[0]
                y_test = self.validation_data[1]
                predictions = self.model.predict(x_test)
                # r2_score expects (y_true, y_pred), in that order
                print('r2:', round(r2_score(y_test, predictions), 2))

    model.fit( ..., callbacks=[MetricsCallback()])
    

    Reference

    Keras 2.2.4

    【Comments】:

    • According to your GitHub reference, self.validation_data is None, and that issue has not been resolved yet.
    • @VadymB. - That is because of this: "Unfortunately, since moving from fit to flow_from_directory and fit_generator, this has erred because self.validation_data is None." I am using fit.
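
When training with fit_generator, one workaround is the one the asker's update already tries: hand the generator and the step count to the callback yourself. The evaluation loop itself is framework-independent, so here is a minimal sketch of it with stand-ins for the model and the generator. `evaluate_on_generator`, `toy_predict` and `toy_metric` are hypothetical names, not part of Keras.

```python
from itertools import cycle


def evaluate_on_generator(predict_fn, generator, steps, metric_fn):
    """Draw `steps` batches, predict on each, and score all of them together."""
    all_true, all_pred = [], []
    for _ in range(steps):
        features, y_true = next(generator)
        all_true.extend(y_true)
        all_pred.extend(predict_fn(features))
    return metric_fn(all_true, all_pred)


# Stand-ins for a real model and validation generator (assumptions for the demo).
def toy_predict(features):
    return [1 if x > 0.5 else 0 for x in features]


def toy_metric(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# An endlessly repeating source of (features, targets) batches.
batches = cycle([([0.9, 0.2], [1, 0]), ([0.7, 0.4], [0, 0])])
score = evaluate_on_generator(toy_predict, batches, steps=2, metric_fn=toy_metric)
print(score)
```

Inside a callback's on_epoch_end, `predict_fn` would presumably be `self.model.predict`, with the generator and step count stored on the callback in its constructor.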
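    【Solution 3】:

    You can iterate directly over self.validation_data to aggregate all the validation data at the end of each epoch. If you want to compute precision, recall and F1 over the whole validation dataset: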

    # Validation metrics callback: validation precision, recall and F1
    # Some of the code was adapted from https://medium.com/@thongonary/how-to-compute-f1-score-for-each-epoch-in-keras-a1acd17715a2
    class Metrics(callbacks.Callback):
    
        def on_train_begin(self, logs={}):
            self.val_f1s = []
            self.val_recalls = []
            self.val_precisions = []
    
        def on_epoch_end(self, epoch, logs):
            # 5.4.1 For each validation batch
            for batch_index in range(0, len(self.validation_data)):
                # 5.4.1.1 Get the batch target values
                temp_targ = self.validation_data[batch_index][1]
                # 5.4.1.2 Get the batch prediction values
                temp_predict = (np.asarray(self.model.predict(
                                    self.validation_data[batch_index][0]))).round()
                # 5.4.1.3 Append them to the corresponding output objects
                if(batch_index == 0):
                    val_targ = temp_targ
                    val_predict = temp_predict
                else:
                    val_targ = np.vstack((val_targ, temp_targ))
                    val_predict = np.vstack((val_predict, temp_predict))
    
            val_f1 = round(f1_score(val_targ, val_predict), 4)
            val_recall = round(recall_score(val_targ, val_predict), 4)
            val_precis = round(precision_score(val_targ, val_predict), 4)
    
            self.val_f1s.append(val_f1)
            self.val_recalls.append(val_recall)
            self.val_precisions.append(val_precis)
    
            # Add custom metrics to the logs, so that we can use them with
            # EarlyStop and csvLogger callbacks
            logs["val_f1"] = val_f1
            logs["val_recall"] = val_recall
            logs["val_precis"] = val_precis
    
            print("— val_f1: {} — val_precis: {} — val_recall {}".format(
                     val_f1, val_precis, val_recall))
            return
    
    valid_metrics = Metrics()
    

    Then you can add valid_metrics to the callbacks argument:

    your_model.fit_generator(..., callbacks = [valid_metrics])
    

    Be sure to put it at the beginning of the callbacks list in case you want other callbacks to use these measures.
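
Why the order matters can be illustrated with a toy simulation: callbacks run in list order and share one logs dictionary per epoch, so a value only becomes visible to callbacks that run after the one that wrote it. The classes and `run_epoch_end` below are hypothetical stand-ins, not Keras classes, and the 0.8 is a placeholder value.

```python
class AddMetric:
    """Stand-in for the metrics callback: writes a value into logs."""

    def on_epoch_end(self, epoch, logs):
        logs["val_f1"] = 0.8  # placeholder value for the demo


class ReadMetric:
    """Stand-in for a callback (e.g. early stopping) that reads from logs."""

    def __init__(self):
        self.seen = None

    def on_epoch_end(self, epoch, logs):
        self.seen = logs.get("val_f1")


def run_epoch_end(callbacks, epoch=0):
    logs = {}  # one shared dict, passed to every callback in list order
    for cb in callbacks:
        cb.on_epoch_end(epoch, logs)
    return logs


reader_after = ReadMetric()
run_epoch_end([AddMetric(), reader_after])   # metrics callback first

reader_before = ReadMetric()
run_epoch_end([reader_before, AddMetric()])  # metrics callback last

print(reader_after.seen, reader_before.seen)
```

Only the reader placed after the metrics callback sees the value; the one placed before it finds nothing in logs.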

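    【Comments】:

    • Is there a way to reuse the predictions on the validation data instead of recomputing them?
    • What are the prerequisites for accessing self.validation_data inside def on_epoch_end(self, batch, logs)? I always get AttributeError: 'Metrics' object has no attribute 'validation_data'.
    • @vanessaxenia You need to pass validation_data as an argument to the Metrics class.
    • Your batch_index is actually a direct index into the data, so it yields one example at a time. You need to slice to get a full batch. Also, and more critically, self.validation_data is just a list with 4 elements, so this answer does not work at all.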
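
The slicing that the last comment asks for can be sketched as follows, assuming the inputs and targets are available as two whole sequences (the way Solution 2 reads validation_data[0] and validation_data[1]). `iter_batches` is a hypothetical helper and the batch size is yours to choose; plain lists are used here, but the same slicing works on NumPy arrays.

```python
def iter_batches(inputs, targets, batch_size):
    """Yield (inputs, targets) slices of at most `batch_size` samples."""
    for start in range(0, len(inputs), batch_size):
        end = start + batch_size
        yield inputs[start:end], targets[start:end]


# Toy stand-ins for validation_data[0] and validation_data[1].
x_val = [[0.1], [0.2], [0.3], [0.4], [0.5]]
y_val = [0, 0, 1, 1, 1]

batches = list(iter_batches(x_val, y_val, batch_size=2))
print(len(batches))
```

The final batch is simply shorter when the number of samples is not a multiple of the batch size.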
    【Solution 4】:

    Verdant89 made a few mistakes and did not implement everything. The code below should work.

    import sys

    import numpy as np
    from keras import callbacks

    class Metrics(callbacks.Callback):

        def on_train_begin(self, logs=None):
            self.val_f1s = []
            self.val_recalls = []
            self.val_precisions = []

        def on_epoch_end(self, epoch, logs):
            # 5.4.1 For each validation sample
            for batch_index in range(0, len(self.validation_data[0])):
                # 5.4.1.1 Get the target values
                temp_target = self.validation_data[1][batch_index]
                # 5.4.1.2 Get the prediction values (expand_dims adds the batch axis)
                temp_predict = (np.asarray(self.model.predict(np.expand_dims(
                                    self.validation_data[0][batch_index],axis=0)))).round()
                # 5.4.1.3 Append them to the corresponding output objects
                if batch_index == 0:
                    val_target = temp_target
                    val_predict = temp_predict
                else:
                    val_target = np.vstack((val_target, temp_target))
                    val_predict = np.vstack((val_predict, temp_predict))

            tp, tn, fp, fn = self.compute_tptnfpfn(val_target, val_predict)
            val_f1 = round(self.compute_f1(tp, tn, fp, fn), 4)
            val_recall = round(self.compute_recall(tp, tn, fp, fn), 4)
            val_precis = round(self.compute_precision(tp, tn, fp, fn), 4)

            self.val_f1s.append(val_f1)
            self.val_recalls.append(val_recall)
            self.val_precisions.append(val_precis)

            # Add custom metrics to the logs, so that we can use them with
            # EarlyStop and csvLogger callbacks
            logs["val_f1"] = val_f1
            logs["val_recall"] = val_recall
            logs["val_precis"] = val_precis

            print("- val_f1: {} - val_precis: {} - val_recall {}".format(
                     val_f1, val_precis, val_recall))
            return

        def compute_tptnfpfn(self,val_target,val_predict):
            # cast to boolean
            val_target = val_target.astype('bool')
            val_predict = val_predict.astype('bool')

            tp = np.count_nonzero(val_target * val_predict)
            tn = np.count_nonzero(~val_target * ~val_predict)
            fp = np.count_nonzero(~val_target * val_predict)
            fn = np.count_nonzero(val_target * ~val_predict)

            return tp, tn, fp, fn

        def compute_f1(self,tp, tn, fp, fn):
            # epsilon guards against division by zero
            f1 = tp*1. / (tp + 0.5*(fp+fn) + sys.float_info.epsilon)
            return f1

        def compute_recall(self,tp, tn, fp, fn):
            recall = tp*1. / (tp + fn + sys.float_info.epsilon)
            return recall

        def compute_precision(self,tp, tn, fp, fn):
            precision = tp*1. / (tp + fp + sys.float_info.epsilon)
            return precision
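
As a quick sanity check of the count-based formulas, the same quantities can be computed on a tiny hand-made example in plain Python. `confusion_counts` and the sample labels are illustrative only, and the epsilon term is left out because none of the denominators is zero here.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
# Same F1 expression as compute_f1, without the epsilon
f1 = tp / (tp + 0.5 * (fp + fn))
# The usual harmonic-mean form of F1, for comparison
harmonic = 2 * precision * recall / (precision + recall)
print(tp, tn, fp, fn)
```

The two F1 expressions agree, which confirms that tp / (tp + 0.5 * (fp + fn)) is just the harmonic mean of precision and recall written in terms of counts.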
    

    【Comments】:
