Deep copy of a tensor in TensorFlow (Python): answers

【Question title】: Deep copy of tensor in tensorflow python
【Posted】: 2023-03-12 03:00:01
【Question description】:

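In some of my code I have created a neural network using tensorflow, and I have access to a tensor representing the output of that network. I want to make a copy of this tensor so that even if I train the neural network further, I can still access the original value of the tensor.

Following other answers and the tensorflow documentation, I have tried the tf.identity() function, but it does not seem to do what I need. Some other links suggested using tf.tile(), but that did not help either. I do not wish to use sess.run(), evaluate the tensor, and store it elsewhere.

Here is a toy example that describes what I need to do: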

import tensorflow as tf
import numpy as np

t1 = tf.placeholder(tf.float32, [None, 1])
t2 = tf.layers.dense(t1, 1, activation=tf.nn.relu)
expected_out = tf.placeholder(tf.float32, [None, 1])

loss = tf.reduce_mean(tf.square(expected_out - t2))
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

sess = tf.Session()

sess.run(tf.global_variables_initializer())

print(sess.run(t2, feed_dict={t1: np.array([1]).reshape(-1,1)}))
t3 = tf.identity(t2) # Need to make copy here
print(sess.run(t3, feed_dict={t1: np.array([1]).reshape(-1,1)}))

print("\nTraining \n")

for i in range(1000):
    sess.run(train_op, feed_dict={t1: np.array([1]).reshape(-1,1), expected_out: np.array([1]).reshape(-1,1)})

print(sess.run(t2, feed_dict={t1: np.array([1]).reshape(-1,1)}))
print(sess.run(t3, feed_dict={t1: np.array([1]).reshape(-1,1)}))

The result of the code above is that t2 and t3 have the same value.

[[1.5078927]]
[[1.5078927]]

Training

[[1.3262703]]
[[1.3262703]]

I want t3 to keep the value it had at the moment it was copied.

[[1.5078927]]
[[1.5078927]]

Training

[[1.3262703]]
[[1.5078927]]

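Thanks in advance for your help.

【Question comments】:

Tags: python tensorflow machine-learning deep-learning

【Solution 1】:

You can use a named tf.assign operation and then run only that operation via Graph.get_operation_by_name. This does not fetch the tensor's value; it only runs the assign operation on the graph. Consider the following example: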

import tensorflow as tf

a = tf.placeholder(tf.int32, shape=(2,))
w = tf.Variable([1, 2])  # Updated in the training loop.
b = tf.Variable([0, 0])  # Backup; stores intermediate result.
t = tf.assign(w, tf.math.multiply(a, w))  # Update during training.
tf.assign(b, w, name='backup')

init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    x = [2, 2]
    # Emulate training loop:
    for i in range(3):
        print('w = ', sess.run(t, feed_dict={a: x}))
    # Backup without retrieving the value (returns None).
    print('Backup now: ', end='')
    print(sess.run(tf.get_default_graph().get_operation_by_name('backup')))
    # Train a bit more:
    for i in range(3):
        print('w = ', sess.run(t, feed_dict={a: x}))
    # Check the backed-up value:
    print('Backup: ', sess.run(b))  # Is [8, 16].

So, for your example, you could do:

t3 = tf.Variable([], validate_shape=False)
tf.assign(t3, t2, validate_shape=False, name='backup')
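
To show how the two lines above could slot into the question's toy network, here is a minimal sketch. It departs from the original in a few ways: it uses tf.compat.v1 so that it can also run on a TensorFlow 2.x install, it replaces tf.layers.dense with a hand-written one-unit layer with fixed initial weights, and it gives the backup variable a fixed shape of [1, 1] instead of validate_shape=False. The function name run_backup_demo, the learning rate, and the step count are illustrative choices, not part of the original answer.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph API, as used in the question


def run_backup_demo(train_steps=200):
    """Back up the network output inside the graph, then keep training."""
    x = np.array([[1.0]], dtype=np.float32)
    graph = tf.Graph()
    with graph.as_default():
        t1 = tf1.placeholder(tf.float32, [None, 1])
        expected_out = tf1.placeholder(tf.float32, [None, 1])

        # Hand-written one-unit dense layer (assumed stand-in for tf.layers.dense).
        w = tf1.Variable([[1.5]], dtype=tf.float32)
        b = tf1.Variable([0.0], dtype=tf.float32)
        t2 = tf.nn.relu(tf.matmul(t1, w) + b)

        loss = tf.reduce_mean(tf.square(expected_out - t2))
        train_op = tf1.train.AdamOptimizer(1e-2).minimize(loss)

        # Non-trainable backup slot; the fixed [1, 1] shape is an assumption
        # of this sketch (the answer uses validate_shape=False instead).
        t3 = tf1.Variable(tf.zeros([1, 1]), trainable=False)
        tf1.assign(t3, t2, name='backup')
        init_op = tf1.global_variables_initializer()

    with tf1.Session(graph=graph) as sess:
        sess.run(init_op)
        before = sess.run(t2, feed_dict={t1: x})
        # Runs only the assign op; t1 must be fed because t2 depends on it.
        sess.run(graph.get_operation_by_name('backup'), feed_dict={t1: x})
        for _ in range(train_steps):
            sess.run(train_op, feed_dict={t1: x, expected_out: x})
        after = sess.run(t2, feed_dict={t1: x})
        saved = sess.run(t3)  # reading the variable needs no feed_dict
    return float(before[0, 0]), float(after[0, 0]), float(saved[0, 0])


if __name__ == "__main__":
    print(run_backup_demo())
```

With these initial weights the first and third returned values should both be 1.5, while the second should have moved toward the target of 1.0. Reading t3 afterwards needs no feed_dict, because the value now lives in a variable rather than being recomputed from t1.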

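【Comments】:

I don't think this quite fits my case. In your example, b does not depend on a placeholder, so you can assign it a specific value. In my example above, b (t3) should depend on the input of the placeholder t1, so I can't assign it a specific value.
Of course, when you request the backup operation you also have to pass the value of t1 in feed_dict. But that is no different from how you tried to copy before: sess.run(t3, feed_dict={t1: np.array([1]).reshape(-1,1)}). Now you would instead do: sess.run(tf.get_default_graph().get_operation_by_name('backup'), feed_dict={t1: np.array([1]).reshape(-1,1)}). Obviously you need to feed all dependencies, since you cannot back up something that has no value attached to it.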
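
The point of this exchange can be pictured with a plain-Python analogy. This is not TensorFlow code, and every name in it is hypothetical: a graph tensor behaves like a recipe that is recomputed from the current weights each time it is run, while a variable behaves like a storage slot that holds whatever was last assigned to it.

```python
# Plain-Python analogy only; none of these names are TensorFlow APIs.
weights = {"w": 1.5}  # stands in for the trainable layer weights


def t2(x):
    """Recipe: recomputed from the current weights on every call."""
    return max(0.0, weights["w"] * x)  # ReLU(w * x)


def t3(x):
    """Like tf.identity(t2): simply forwards to the same recipe."""
    return t2(x)


backup = {"value": None}  # stands in for the backup variable


def run_backup(x):
    """Like running the 'backup' op: needs an input to produce a value."""
    backup["value"] = t2(x)


run_backup(1.0)       # snapshot taken before "training"
weights["w"] = 1.25   # stand-in for a training step
print(t2(1.0), t3(1.0), backup["value"])  # prints: 1.25 1.25 1.5
```

The identity-style wrapper follows the weights, whereas the stored snapshot does not. This is also why the backup step needs an input: without one there is no value to store.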

【Solution 2】:

I think copy.deepcopy() might work. For example:

import copy 
tensor_2 = copy.deepcopy(tensor_1)

Python documentation on deepcopy: https://docs.python.org/3/library/copy.html
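
As a hedged illustration of when this suggestion can be expected to behave as hoped: the sketch below assumes TensorFlow 2.x with eager execution, which is not the graph/session setup used in the question. In eager mode the network output is already a concrete value rather than a symbolic graph node, so a copy taken before a weight update keeps the old number. The function name eager_copy_demo and the manual w.assign call (a stand-in for a training step) are illustrative assumptions.

```python
import copy

import tensorflow as tf


def eager_copy_demo():
    """Copy an eagerly computed output, change the weight, and compare."""
    # Assumes TF 2.x eager execution; this is not the question's TF 1.x setup.
    assert tf.executing_eagerly()

    w = tf.Variable([[1.5]])
    x = tf.constant([[1.0]])

    tensor_1 = tf.nn.relu(tf.matmul(x, w))  # concrete value, computed now
    tensor_2 = copy.deepcopy(tensor_1)      # copy taken before the update

    w.assign([[0.5]])                       # stand-in for a training step
    tensor_after = tf.nn.relu(tf.matmul(x, w))

    return float(tensor_2.numpy()[0, 0]), float(tensor_after.numpy()[0, 0])


if __name__ == "__main__":
    print(eager_copy_demo())
```

Here the copied value stays at 1.5 while the recomputed output becomes 0.5. Whether copy.deepcopy() helps with the symbolic tensors of the question's TF 1.x graph is a separate matter that this sketch does not address.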
