How can I get accuracy from confusion matrix in tensorflow or Keras in the form of a tensor? (Answers)

【Question title】: How can I get accuracy from confusion matrix in tensorflow or Keras in the form of a tensor?
【Posted】: 2019-01-21 07:14:25
【Question description】:

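I want to compute UAR (unweighted accuracy) from the confusion matrix so that I can monitor UAR on the validation data. However, working with tensors is difficult.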

https://www.davidtvs.com/keras-custom-metrics/

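I referred to this site and tried to create my own metric in Keras. I followed the first approach described there, so that the metric can be used with ModelCheckpoint and EarlyStopping, which Keras supports.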

model.compile(loss='categorical_crossentropy',optimizer=adam, metrics=['accuracy', uar_accuracy])

However, I don't know how to define the uar_accuracy function.

            def uar_accuracy(y_true, y_pred):
                # Calculate the label from one-hot encoding
                pred_class_label = K.argmax(y_pred, axis=-1)
                true_class_label = K.argmax(y_true, axis=-1)

                cf_mat = tf.confusion_matrix(true_class_label, pred_class_label)

                diag = tf.linalg.tensor_diag_part(cf_mat)
                uar = K.mean(diag)

                return uar

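This returns the mean of the number of correctly classified samples per class. But I don't want the mean of the correct counts; I want the mean of the per-class probability of a correct prediction.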
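To make the distinction concrete, here is a small worked example in plain Python. The confusion matrix is invented for illustration (rows are true classes, columns are predicted classes), and both helper names are hypothetical, not part of Keras or TensorFlow.

```python
# Invented confusion matrix: rows = true class, columns = predicted class.
cf_mat = [
    [75, 25],  # class 0: 100 samples, 75 correct
    [15, 5],   # class 1: 20 samples, 5 correct
]


def mean_of_diagonal(matrix):
    """Mean of the correct counts per class (the quantity the metric above averages)."""
    diag = [matrix[i][i] for i in range(len(matrix))]
    return sum(diag) / len(diag)


def mean_per_class_accuracy(matrix):
    """Mean of correct / total per class, i.e. the UAR the question asks for."""
    ratios = [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]
    return sum(ratios) / len(ratios)


print(mean_of_diagonal(cf_mat))         # 40.0
print(mean_per_class_accuracy(cf_mat))  # 0.5
```

The first value depends on how many samples each class has, while the second stays between 0 and 1 regardless of class sizes.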

How can I implement it?

I implemented the following for NumPy arrays rather than tensors, using the sklearn.metrics and collections libraries.

            import collections

            import numpy as np
            from sklearn.metrics import confusion_matrix


            def get_accuracy_and_cnf_matrix(label, predict):
                uar = 0
                accuracy = []
                cnf_matrix = confusion_matrix(label, predict)
                diag = np.diagonal(cnf_matrix)
                # Count the samples per class once (assumes integer labels 0..n_classes-1)
                class_counts = collections.Counter(label)
                for index, i in enumerate(diag):
                    uar += i / class_counts[index]

                # cnf_matrix (number of correct predictions -> row-wise percentage)
                cnf_matrix = np.transpose(cnf_matrix)
                # np.int was removed from NumPy; the matrix is already integer-typed
                cnf_matrix = cnf_matrix * 100 / cnf_matrix.sum(axis=0)
                cnf_matrix = np.transpose(cnf_matrix).astype(float)
                cnf_matrix = np.around(cnf_matrix, decimals=2)

                # WAR, UAR
                test_weighted_accuracy = np.sum(label == predict) / len(label) * 100
                test_unweighted_accuracy = uar / len(cnf_matrix) * 100
                accuracy.append(test_weighted_accuracy)
                accuracy.append(test_unweighted_accuracy)

                return np.around(np.array(accuracy), decimals=2), cnf_matrix

【Question comments】:

Tags: python tensorflow keras confusion-matrix

【Solution 1】:

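You can use tf.reduce_sum to compute the sum of each row of the confusion matrix. This corresponds to the total number of data points in each class. You then divide the diagonal elements by these row sums to get the ratio of correctly predicted examples for each class.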
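Written as a formula, with C the confusion matrix (rows are true classes, columns are predicted classes) and N the number of classes. The symbols C and N are notation introduced here for illustration, not taken from the original answer:

```latex
\mathrm{UAR} = \frac{1}{N} \sum_{i=1}^{N} \frac{C_{ii}}{\sum_{j=1}^{N} C_{ij}}
```

The ratio is undefined for a class whose row sum is zero, so such classes need separate handling.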

import tensorflow as tf
from tensorflow.keras import backend as K


def non_nan_average(x):
    # Computes the average of all elements that are not NaN in a rank 1 tensor
    # (tf.math.is_nan is the same op as the older tf.debugging.is_nan alias)
    nan_mask = tf.math.is_nan(x)
    x = tf.boolean_mask(x, tf.logical_not(nan_mask))
    return K.mean(x)


def uar_accuracy(y_true, y_pred):
    # Calculate the label from one-hot encoding
    pred_class_label = K.argmax(y_pred, axis=-1)
    true_class_label = K.argmax(y_true, axis=-1)

    # Named tf.confusion_matrix in early TF 1.x releases
    cf_mat = tf.math.confusion_matrix(true_class_label, pred_class_label)

    diag = tf.linalg.tensor_diag_part(cf_mat)

    # Calculate the total number of data examples for each class
    total_per_class = tf.reduce_sum(cf_mat, axis=1)

    # "/" on two integer tensors performs true division and returns floats;
    # tf.maximum avoids dividing by zero for classes with no examples
    acc_per_class = diag / tf.maximum(1, total_per_class)
    uar = non_nan_average(acc_per_class)

    return uar
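As a sanity check for the tensor version, the same quantity can be computed in plain Python from integer class labels. This is only a sketch under assumptions: the function name `uar_reference` is hypothetical, labels are assumed to be integers, and classes that never occur in `true_labels` are skipped rather than counted as 0.

```python
from collections import Counter


def uar_reference(true_labels, pred_labels):
    """Mean per-class ratio of correct predictions, over classes present in true_labels."""
    total_per_class = Counter(true_labels)
    correct_per_class = Counter(
        t for t, p in zip(true_labels, pred_labels) if t == p
    )
    ratios = [correct_per_class[c] / n for c, n in total_per_class.items()]
    return sum(ratios) / len(ratios)


# Invented labels: class 0 has 4 samples (3 correct), class 1 has 2 samples (1 correct).
true_labels = [0, 0, 0, 0, 1, 1]
pred_labels = [0, 0, 0, 1, 1, 0]
print(uar_reference(true_labels, pred_labels))  # 0.625
```

Feeding the argmax labels of a validation batch to this function and to the metric should give the same number, as long as every class appears at least once among the true labels.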

【Comments】:

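I didn't expect it to be this simple; I had been stuck on it for a while! Thank you very much. Have a nice day.
Happy to help :-) If this answer or any other answer solved your problem, please mark it as accepted. Have a nice day too.
I have a question. To prevent NaN from occurring, I added the following code. The UAR result is then wrong. Do you know why? total_per_class = tf.reduce_sum(cf_mat, axis=1) total_per_class = tf.cast(total_per_class, dtype=tf.float32) total_per_class = total_per_class + tf.convert_to_tensor(K.epsilon())
I guess the NaN occurs because you have a row of zeros, so total_per_class is 0 for one of the classes? In that case I think you should ignore that class, because you have no data for it. The reason your code doesn't work is that you replace 0 with epsilon, and when you divide by epsilon in acc_per_class you get a huge and incorrect number.
I just realized that you are dividing 0 by epsilon, so you won't get a huge number, you'll just get 0. But when you take the average over all rows, this will still give the wrong result.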
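The effect discussed in these comments can be reproduced with a small plain-Python example. The confusion matrix is invented, with a third class that has no samples, and both function names are hypothetical. It compares dividing by `total + epsilon` with skipping empty classes.

```python
EPSILON = 1e-7  # stand-in for K.epsilon(); the exact value does not matter here

# Invented confusion matrix: class 2 has no true samples, so its row is all zeros.
cf_mat = [
    [3, 1, 0],  # class 0: 3 of 4 correct
    [1, 1, 0],  # class 1: 1 of 2 correct
    [0, 0, 0],  # class 2: no data
]


def uar_with_epsilon(matrix):
    """Adds epsilon to every row sum; an empty class contributes 0 / epsilon = 0."""
    ratios = [row[i] / (sum(row) + EPSILON) for i, row in enumerate(matrix)]
    return sum(ratios) / len(ratios)


def uar_skip_empty(matrix):
    """Averages only over classes that have at least one sample."""
    ratios = [row[i] / sum(row) for i, row in enumerate(matrix) if sum(row) > 0]
    return sum(ratios) / len(ratios)


print(round(uar_with_epsilon(cf_mat), 4))  # 0.4167
print(uar_skip_empty(cf_mat))              # 0.625
```

The epsilon variant is pulled down because the empty class still counts in the denominator of the mean. Dividing by tf.maximum(1, total_per_class) behaves the same way for an empty class, since 0 / 1 is 0 rather than NaN.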