【Question title】: Loss function with derivative in TensorFlow 2
【Posted】: 2020-11-24 22:19:13
【Question】:

I am using a TF2 (2.3.0) neural network to approximate the function y that solves the ODE y' + 3y = 0.
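For reference, this ODE has a closed-form solution, which is useful for checking any network trained on it. The constant C and the notation \hat{y}_\theta for the network output are labels introduced here and are not part of the question. The second line writes out the mean squared residual over N sample points x_i that a loss for this problem is meant to minimize when y_true is zero.

```latex
y' + 3y = 0 \;\Longrightarrow\; y(x) = C\,e^{-3x}, \qquad C = y(0)

\mathcal{L}(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{d\hat{y}_\theta}{dx}(x_i) + 3\,\hat{y}_\theta(x_i)\right)^{2}
```

Note that y ≡ 0 (the case C = 0) also satisfies the equation, so a residual-only loss does not by itself single out one particular solution.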

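I have defined a custom loss class and function in which I try to differentiate the single output with respect to the single input, so that the equation holds provided y_true is zero: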

from tensorflow.keras.losses import Loss
import tensorflow as tf

class CustomLossOde(Loss):
    def __init__(self, x, model, name='ode_loss'):
        super().__init__(name=name)
        self.x = x
        self.model = model

    def call(self, y_true, y_pred):

        with tf.GradientTape() as tape:
            tape.watch(self.x)
            y_p = self.model(self.x)


        dy_dx = tape.gradient(y_p, self.x)
        loss = tf.math.reduce_mean(tf.square(dy_dx + 3 * y_pred - y_true))
        return loss

But when I run the following network:

import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input
from custom_loss_ode import CustomLossOde


num_samples = 1024
x_train = 4 * (tf.random.uniform((num_samples, )) - 0.5)
y_train = tf.zeros((num_samples, ))
inputs = Input(shape=(1,))
x = Dense(16, 'tanh')(inputs)
x = Dense(8, 'tanh')(x)
x = Dense(4)(x)
y = Dense(1)(x)
model = Model(inputs=inputs, outputs=y)
loss = CustomLossOde(model.input, model)
model.compile(optimizer=Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.99),loss=loss)
model.run_eagerly = True
model.fit(x_train, y_train, batch_size=16, epochs=30)

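I get a loss of 0 from the very first epoch, which makes no sense.

I have printed y_true and y_test from inside the function and they look fine, so I suspect the problem is in the gradient, which I did not manage to print. Any help is appreciated.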

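【Question comments】:

  • What exactly are you trying to achieve by passing model.input as the x of your custom loss? model.input is a symbolic tensor; it does not contain any data.
  • @Lescurel As I said: I am trying to define a loss function that penalizes the derivative of the (single) network output with respect to the (single) network input. Can you explain how I can do that?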

Tags: tensorflow neural-network


【Solution 1】:

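In this case, it is somewhat difficult to define the custom loss with the high-level Keras API. I would instead write the training loop from scratch, since that gives finer control over what you can do.

I took inspiration from these two guides:

Basically, I rely on the fact that multiple tapes can interact seamlessly: I use one to compute the derivative needed by the loss function, and another to compute the gradients that the optimizer applies.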
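The tape interaction described above can be seen in isolation on a toy function. This is only a minimal sketch: the function y = x**3 and the names inner_tape and outer_tape are chosen here for illustration because the derivatives are known in closed form (3x² and 6x). They do not come from the answer.

```python
import tensorflow as tf

# Hypothetical toy input; y = x**3 has dy/dx = 3x^2 and d2y/dx2 = 6x.
x = tf.Variable(1.0)

with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        y = x ** 3
    # This call runs inside outer_tape's context, so the operations that
    # build dy_dx are recorded by outer_tape and can be differentiated again.
    dy_dx = inner_tape.gradient(y, x)
d2y_dx2 = outer_tape.gradient(dy_dx, x)

first = float(dy_dx.numpy())
second = float(d2y_dx2.numpy())
print(first, second)
```

The key detail is that the inner gradient is taken while the outer tape is still recording. If it were computed after the outer `with` block had closed, the outer tape would have nothing to differentiate.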

import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input

num_samples = 1024
x_train = 4 * (tf.random.uniform((num_samples, )) - 0.5)
y_train = tf.zeros((num_samples, ))
inputs = Input(shape=(1,))
x = Dense(16, 'tanh')(inputs)
x = Dense(8, 'tanh')(x)
x = Dense(4)(x)
y = Dense(1)(x)
model = Model(inputs=inputs, outputs=y)

# using the high level tf.data API for data handling
x_train = tf.reshape(x_train,(-1,1))
dataset = tf.data.Dataset.from_tensor_slices((x_train,y_train)).batch(1)

opt = Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.99)
for step, (x,y_true) in enumerate(dataset):
    # Wrap x in a tf.Variable so that the tape can compute the
    # gradient with respect to x
    x_variable = tf.Variable(x)
    with tf.GradientTape() as model_tape:
        with tf.GradientTape() as loss_tape:
            loss_tape.watch(x_variable)
            y_pred = model(x_variable)
        dy_dx = loss_tape.gradient(y_pred, x_variable)
        loss = tf.math.reduce_mean(tf.square(dy_dx + 3 * y_pred - y_true))
    grad = model_tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grad, model.trainable_variables))
    if step%20==0:
        print(f"Step {step}: loss={loss.numpy()}")
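One way to sanity-check the residual computation is to wrap it in a function and feed it a function known to solve the ODE. The sketch below is an illustration under stated assumptions, not part of the answer: the helper name ode_residual_loss is hypothetical, and the stand-in "model" is a plain callable, exp(-3x), rather than a Keras network. The explicit reshape of y_true matters once the batch size exceeds 1, because subtracting a tensor of shape (batch,) from one of shape (batch, 1) broadcasts to (batch, batch).

```python
import tensorflow as tf

def ode_residual_loss(model, x, y_true):
    """Mean squared residual of y' + 3y - y_true for x of shape (batch, 1).

    Hypothetical helper; `model` is any callable mapping (batch, 1) -> (batch, 1)
    in which each output row depends only on its own input row.
    """
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as loss_tape:
        loss_tape.watch(x)  # x is a plain tensor, so it must be watched explicitly
        y_pred = model(x)
    dy_dx = loss_tape.gradient(y_pred, x)
    # Align shapes so the subtraction stays element-wise for any batch size.
    y_true = tf.reshape(tf.cast(y_true, y_pred.dtype), tf.shape(y_pred))
    return tf.reduce_mean(tf.square(dy_dx + 3.0 * y_pred - y_true))

# exp(-3x) solves y' + 3y = 0, so its residual should be close to zero.
x_check = tf.constant([[0.0], [0.5], [1.0]])
exact_solution = lambda x: tf.exp(-3.0 * x)
residual = ode_residual_loss(exact_solution, x_check, tf.zeros((3,)))
print(float(residual.numpy()))
```

To train with such a helper, it would be called inside an outer tf.GradientTape, as in the loop above, so that the loss can still be differentiated with respect to model.trainable_variables.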
