Creating entropy as a custom loss function in tensorflow keras: answers

【Question title】: creating entropy as a custom loss function in tensorflow keras
【Posted】: 2021-02-22 04:19:00
【Question】:

I am trying to create a custom loss function in tensorflow.keras, specifically Shannon's entropy. This is the basic neural network structure:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import entropy

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train / 255.0  # scale pixel values to [0, 1]

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(128, activation=tf.nn.sigmoid),
    tf.keras.layers.Dense(10, activation=tf.nn.sigmoid)
])
# entropy_loss is one of the two attempts defined further down
model.compile(optimizer='sgd',
              loss=entropy_loss,
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=1, batch_size=512)

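I have tried two ways of computing the entropy, and neither works. The first converts y_true and y_pred to numpy, takes the error, and computes its entropy with scipy's entropy function; it fails at the conversion to numpy. The second computes the entropy with TensorFlow ops, based on this question:

how to calculate entropy on float numbers over a tensor in python keras

and it also fails with an error.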

Method 1

def entropy_loss(y_true,y_pred):

    # Create a loss function that adds the MSE loss to the mean of all squared activations of a specific layer
   
    return tf.cast(entropy(y_pred.numpy() - y_true.numpy() , base=2))
   
    # Return a function
    #return loss

The first method gives this error:

    <ipython-input-4-14c95bd6b1a3>:5 entropy_loss  *
        return tf.cast(entropy(y_pred.numpy() - y_true.numpy() , base=2))

    AttributeError: 'Tensor' object has no attribute 'numpy'

Method 2

def entropy_loss(y_true,y_pred):

    y_true=tf.cast(y_true, tf.float32)
    y_pred=tf.cast(y_pred, tf.float32)
    e=y_true-y_pred
    print(e)
    loss= entropy_1(e) 
    #return e
    # Return a function
    return loss
def entropy_1( x):
    def row_entropy(row):
        _, _, count = tf.unique_with_counts(row)
        prob = count / tf.reduce_sum(count)
        return -tf.reduce_sum(prob * tf.math.log(prob))

    value_ranges = [-10.0, 100.0]
    nbins = 50
    new_f_w_t = tf.histogram_fixed_width_bins(x, value_ranges, nbins)
    result = tf.map_fn(row_entropy, new_f_w_t,dtype=tf.float32)
    return result

This method gives the following error:

    ValueError: Trying to read from list with wrong element dtype. List has type double but expected type float for '{{node entropy_loss/map/TensorArrayV2Stack/TensorListStack}} = TensorListStack[element_dtype=DT_FLOAT, num_elements=-1](entropy_loss/map/while:3, entropy_loss/map/TensorArrayV2Stack/Const)' with input shapes: [], [0].

【Comments】:

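First, your last layer should have 10 units, since there are 10 classes. Second, please be clear about what y_pred and y_true mean: y_pred is a 10x1 vector of probabilities, while y_true is a single number from 0 to 9, so y_pred - y_true will not work. How do you want to compute the loss (mathematically)? Maybe I can help more.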
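Okay, my mistake, I copied the wrong code snippet; I meant to copy the one where the last layer has 100.
So I want to compute the entropy of the error. I think I would get the predicted label from the probabilities and then compute the entropy.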

Tags: python numpy tensorflow keras deep-learning

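【Solution 1】:

Actually, you don't need to implement it yourself. But let's work through it anyway.

Suppose you have a batch of two samples, with their labels and predictions. As you might expect, the TF entropy and the scipy entropy give the same result.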

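import numpy as np
import tensorflow as tf
from scipy.stats import entropy

# Batch of two samples: integer class labels and predicted probabilities.
a = np.array([[1], [2]], dtype=np.float64)  # np.float was removed in NumPy 1.24
b = np.array([[0.2, 0.7, 0.1], [0.2, 0.3, 0.5]], dtype=np.float64)
at = tf.convert_to_tensor(a)
bt = tf.convert_to_tensor(b)
H = tf.keras.losses.sparse_categorical_crossentropy(a, b)
print(f"TF entropy: {H}")

# scipy's entropy(pk, qk) computes the relative entropy sum(pk * log(pk / qk));
# with a one-hot pk this reduces to -log(qk[true class]).
a = [[0.2, 0.7, 0.1], [0.2, 0.3, 0.5]]
b = [[0, 1, 0], [0, 0, 1]]
H2 = entropy(b, a, axis=1)
print(f"Scipy entropy: {H2}")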

Result:
TF entropy: [0.35667494 0.69314718]
Scipy entropy: [0.35667494 0.69314718]
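
As a sanity check on why the two numbers agree, here is a minimal pure-Python sketch using the same two samples. The helper names sparse_cross_entropy and kl_divergence are made up for this illustration and are not part of TensorFlow or scipy; they only spell out the formulas -log(p[label]) and sum(p * log(p / q)).

```python
import math

def sparse_cross_entropy(label, probs):
    # Cross-entropy against an integer label: -log of the probability
    # assigned to the true class.
    return -math.log(probs[label])

def kl_divergence(p, q):
    # Relative entropy sum(p * log(p / q)); terms with p == 0 contribute 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

labels = [1, 2]
preds = [[0.2, 0.7, 0.1], [0.2, 0.3, 0.5]]
one_hot = [[0, 1, 0], [0, 0, 1]]

ce = [sparse_cross_entropy(y, p) for y, p in zip(labels, preds)]
kl = [kl_divergence(t, p) for t, p in zip(one_hot, preds)]
print(ce)  # approximately [0.3567, 0.6931]
print(kl)  # same values up to floating-point rounding
```

With a one-hot target, only the true-class term survives in the sum, so both expressions reduce to -log(0.7) and -log(0.5).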

OK, let's implement it.

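def my_entropy(y_true, y_pred):
    shape = tf.shape(y_pred)
    batch = shape[0]
    depth = tf.shape(y_pred)[1]
    # Integer labels -> shape (1, batch) -> one-hot of shape (1, batch, depth).
    y_true = tf.cast(y_true, tf.int32)
    y_true = tf.reshape(y_true, shape=(-1, batch))
    one_hot = tf.one_hot(y_true, depth=depth, dtype=tf.float32)
    y_pred = tf.cast(y_pred, dtype=tf.float32)
    # one_hot / y_pred is 1/p at the true class and 0 elsewhere.
    div = tf.divide(one_hot, y_pred)
    div = tf.reduce_sum(div, axis=0)  # drop the leading axis -> (batch, depth)
    # Keep only the 1/p entries (one per sample), then log(1/p) = -log(p).
    ind = tf.where(tf.greater(div, 0))
    values = tf.gather_nd(div, ind)
    h = tf.math.log(values)
    return h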

Test:
H3 = my_entropy(at, bt)
My entropy: [0.35667497 0.6931472]
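
To make the trick inside my_entropy easier to follow, here is a rough NumPy mirror of its steps on the same two samples. This is only an illustration of the idea (divide the one-hot target by the prediction, keep the positive entries, take the log), not the TensorFlow code itself; the variable names are chosen for this sketch.

```python
import numpy as np

labels = np.array([1, 2])
preds = np.array([[0.2, 0.7, 0.1], [0.2, 0.3, 0.5]])

# One-hot targets of shape (batch, depth).
one_hot = np.eye(preds.shape[1])[labels]

# 1/p at the true class, 0 everywhere else.
div = one_hot / preds

# Boolean indexing walks the array in row-major order, so this keeps
# exactly one value per sample, in batch order.
values = div[div > 0]

# log(1/p) = -log(p), i.e. the per-sample cross-entropy.
h = np.log(values)

# The same quantity, picked out directly by index.
direct = -np.log(preds[np.arange(len(labels)), labels])
print(h, direct)
```

Both arrays come out as roughly [0.3567, 0.6931], matching the values above. The sketch assumes every predicted probability is strictly positive, otherwise the division produces inf.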

Now you can use it as a custom loss function, like this:

import tensorflow as tf

def my_entropy(y_true, y_pred):
    shape = tf.shape(y_pred)
    batch = shape[0]
    depth = tf.shape(y_pred)[1]
    y_true = tf.cast(y_true, tf.int32)
    y_true = tf.reshape(y_true, shape=(-1, batch))
    one_hot = tf.one_hot(y_true, depth=depth, dtype=tf.float32)
    y_pred = tf.cast(y_pred, dtype=tf.float32)
    div = tf.divide(one_hot, y_pred)
    div = tf.reduce_sum(div, axis=0)
    ind = tf.where(tf.greater(div, 0))
    values = tf.gather_nd(div, ind)
    h = tf.math.log(values)
    return h


if __name__ == '__main__':
    mnist = tf.keras.datasets.mnist

    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train / 255.0

    model = tf.keras.models.Sequential([

        tf.keras.layers.Flatten(input_shape=(28, 28, 1)),

        tf.keras.layers.Dense(128, activation=tf.nn.sigmoid),
        tf.keras.layers.Dense(10, activation=tf.nn.sigmoid)
    ])
    model.compile(optimizer='sgd',
                  loss=my_entropy,
                  metrics=['accuracy'])

    model.fit(x_train, y_train, epochs=5, batch_size=2)

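【Comments】:

Thanks, although what I was looking for was to take the difference between y_true and y_pred to get the error term, and then use the Shannon entropy of that vector as my loss function, rather than using the sparse_categorical_crossentropy loss.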
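
For reference, the quantity described in this last comment could be sketched in plain Python as below. Everything here is an assumption made for illustration: the function name shannon_entropy_of_errors, the bin range [-1, 1] (errors between a one-hot target and sigmoid outputs fall in that interval), and the choice of 50 bins. Note also that binning turns the errors into integer counts, through which gradients do not flow, so this form is probably better suited to monitoring than to use as a training loss.

```python
import math
from collections import Counter

def shannon_entropy_of_errors(errors, low=-1.0, high=1.0, nbins=50, base=2):
    # Hypothetical helper: histogram the error values into fixed-width bins,
    # then compute the Shannon entropy of the resulting bin frequencies.
    width = (high - low) / nbins
    bins = [min(nbins - 1, max(0, int((e - low) / width))) for e in errors]
    counts = Counter(bins)
    n = len(bins)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())

# Error vector for one sample: one-hot target minus predicted probabilities.
target = [0, 1, 0]
pred = [0.2, 0.7, 0.1]
errors = [t - p for t, p in zip(target, pred)]

# The three errors land in three different bins, so the entropy is log2(3).
print(shannon_entropy_of_errors(errors))  # about 1.585
```

If every error falls into the same bin, the entropy is 0, which is the minimum this measure can reach.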