Loss function with derivative in TensorFlow 2

【Question title】: Loss function with derivative in TensorFlow 2
【Posted】: 2020-11-24 22:19:13
【Question】:

I am using a TF2 (2.3.0) NN to approximate the function y that solves the ODE: y' + 3y = 0
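
For reference, this is a linear first-order ODE with a known closed-form solution, which gives something to compare a trained network against. The constant C is arbitrary because the question states no initial condition, so the zero function (C = 0) is also a solution. The second line is the mean squared residual the question's loss is meant to compute, written here with N for the batch size (a symbol assumed for illustration):

```latex
y'(x) + 3\,y(x) = 0 \quad\Longrightarrow\quad y(x) = C\,e^{-3x}

\mathcal{L} = \frac{1}{N}\sum_{i=1}^{N}\bigl(y'(x_i) + 3\,y(x_i) - y_{\mathrm{true},i}\bigr)^2,
\qquad y_{\mathrm{true},i} = 0
```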

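I have defined a custom loss class and function in which I am trying to differentiate the single output with respect to the single input, so that the equation holds, provided that y_true is zero: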

from tensorflow.keras.losses import Loss
import tensorflow as tf

class CustomLossOde(Loss):
    def __init__(self, x, model, name='ode_loss'):
        super().__init__(name=name)
        self.x = x
        self.model = model

    def call(self, y_true, y_pred):

        with tf.GradientTape() as tape:
            tape.watch(self.x)
            y_p = self.model(self.x)


        dy_dx = tape.gradient(y_p, self.x)
        loss = tf.math.reduce_mean(tf.square(dy_dx + 3 * y_pred - y_true))
        return loss

But running the following NN:

import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input
from custom_loss_ode import CustomLossOde


num_samples = 1024
x_train = 4 * (tf.random.uniform((num_samples, )) - 0.5)
y_train = tf.zeros((num_samples, ))
inputs = Input(shape=(1,))
x = Dense(16, 'tanh')(inputs)
x = Dense(8, 'tanh')(x)
x = Dense(4)(x)
y = Dense(1)(x)
model = Model(inputs=inputs, outputs=y)
loss = CustomLossOde(model.input, model)
model.compile(optimizer=Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.99),loss=loss)
model.run_eagerly = True
model.fit(x_train, y_train, batch_size=16, epochs=30)

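Now I get 0 loss from the first epoch, which doesn't make any sense.

I have printed y_true and y_test from within the function and they look OK, so I suspect the problem is in the gradient, which I have not managed to print. Any help is appreciated.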

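【Comments】:

What exactly are you trying to achieve when you pass model.input as the x of your custom loss? model.input is a symbolic tensor; it does not contain any data.
@Lescurel As I said: I am trying to define a loss function that involves the derivative of the (single) network output with respect to the (single) network input. Can you explain how I can do that?

Tags: tensorflow neural-network

【Solution 1】: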

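Defining a custom loss with the high-level Keras API is a bit difficult in this case. I would write the training loop from scratch instead, since it gives finer control over what you can do.

I took inspiration from these two guides:

Basically, I used the fact that multiple tapes can interact seamlessly. I use one to compute the loss function and another to compute the gradients that the optimizer propagates.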
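
To make the interaction between the two tapes concrete, here is a minimal sketch on a toy function whose derivatives can be checked by hand. The toy "model" y = w * x * x and the names outer_tape and inner_tape are illustrative assumptions, not part of the answer's code:

```python
import tensorflow as tf

# Hypothetical toy "model": y = w * x * x, with a single trainable weight w.
w = tf.Variable(3.0)
x = tf.constant(2.0)

# The outer tape records everything below, including the inner gradient call,
# so the loss can depend on dy/dx and still be differentiated w.r.t. w.
with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        inner_tape.watch(x)            # x is a plain tensor, so watch it explicitly
        y = w * x * x
    dy_dx = inner_tape.gradient(y, x)  # 2 * w * x = 12
    loss = tf.square(dy_dx)            # 12**2 = 144

# d/dw (2 * w * x)**2 = 8 * w * x**2 = 96
dloss_dw = outer_tape.gradient(loss, w)

dy_dx_value = float(dy_dx.numpy())
loss_value = float(loss.numpy())
dloss_dw_value = float(dloss_dw.numpy())
print(dy_dx_value, loss_value, dloss_dw_value)
```

The key point is that the inner gradient is taken while the outer tape is still recording; taking it after the outer `with` block has closed would leave the outer tape unable to trace how the loss depends on the weights.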

import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input

num_samples = 1024
x_train = 4 * (tf.random.uniform((num_samples, )) - 0.5)
y_train = tf.zeros((num_samples, ))
inputs = Input(shape=(1,))
x = Dense(16, 'tanh')(inputs)
x = Dense(8, 'tanh')(x)
x = Dense(4)(x)
y = Dense(1)(x)
model = Model(inputs=inputs, outputs=y)

# using the high level tf.data API for data handling
x_train = tf.reshape(x_train,(-1,1))
dataset = tf.data.Dataset.from_tensor_slices((x_train,y_train)).batch(1)

opt = Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.99)
for step, (x,y_true) in enumerate(dataset):
    # wrap x in a tf.Variable so that the tape can compute
    # the gradient with respect to x
    x_variable = tf.Variable(x) 
    with tf.GradientTape() as model_tape:
        with tf.GradientTape() as loss_tape:
            loss_tape.watch(x_variable)
            y_pred = model(x_variable)
        dy_dx = loss_tape.gradient(y_pred, x_variable)
        loss = tf.math.reduce_mean(tf.square(dy_dx + 3 * y_pred - y_true))
    grad = model_tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grad, model.trainable_variables))
    if step%20==0:
        print(f"Step {step}: loss={loss.numpy()}")
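
The loss computation in the loop above can also be pulled into a small helper and sanity-checked against functions whose residual is known in advance. This is only a sketch: the name ode_residual_loss is hypothetical, it assumes the target is always zero (so y_true is dropped), and it uses plain callables in place of a Keras model for the check:

```python
import tensorflow as tf

def ode_residual_loss(model, x):
    """Mean squared residual of y' + 3y = 0 at points x of shape (batch, 1)."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)  # x is a tensor, not a variable, so it must be watched
        y = model(x)
    # Each y[i] depends only on x[i], so the gradient of the summed outputs
    # with respect to x is the per-sample derivative dy/dx.
    dy_dx = tape.gradient(y, x)
    return tf.reduce_mean(tf.square(dy_dx + 3.0 * y))

x = tf.constant([[0.0], [1.0]])

# y = exp(-3x) solves the ODE exactly, so the residual should vanish.
loss_exact = float(ode_residual_loss(lambda t: tf.exp(-3.0 * t), x).numpy())

# y = 2x gives residual 2 + 6x: 2 at x=0 and 8 at x=1, so the mean square is 34.
loss_linear = float(ode_residual_loss(lambda t: 2.0 * t, x).numpy())

print(loss_exact, loss_linear)
```

To train with it, the call would sit inside an outer `tf.GradientTape` whose gradient is taken with respect to `model.trainable_variables`, as in the loop above. Dropping y_true also avoids broadcasting a `(batch, 1)` prediction against a `(batch,)` target, which would produce a `(batch, batch)` tensor for batch sizes larger than 1.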

【Comments】: