How to change model parameters in TensorFlow? Answers

【Question title】: How to change model parameters in TensorFlow?
【Posted】: 2018-03-13 16:36:23
【Question description】:

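I have a logistic regression model (lr) implemented in TensorFlow. I use this model to generate predictions:

print(s.run(preds, feed_dict={x: X[:5]}))

After that, I try to change the model parameters as follows: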

lr.w = tf.assign(lr.w, np.random.uniform(size=(inp_dim, out_dim)))
lr.b = tf.assign(lr.b, np.random.uniform(size=(out_dim,)))
s.run([lr.w, lr.b])

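Then I generate new predictions in the same way:

print(s.run(preds, feed_dict={x: X[:5]}))

Surprisingly, I get the same values as before the parameters were changed. So it looks like I did not manage to change the model parameters.

Does anybody know what I am doing wrong?

ADDED

I probably need to give more details about my "architecture". This is my implementation of logistic regression: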

class logreg:

    def __init__(self, inp_dim, out_dim, r = 1.0):
        # initialize values of model parameters
        w_val = np.random.uniform(-r, r, size = (inp_dim, out_dim))
        b_val = np.random.uniform(-r, r, size = (out_dim,))
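        # dtype must be passed by keyword: the second positional argument of tf.Variable is `trainable`
        self.w = tf.Variable(w_val, dtype=tf.float64)
        self.b = tf.Variable(b_val, dtype=tf.float64)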

    def get_model_graph(self, inp):
        return tf.nn.softmax(tf.matmul(inp, self.w) + self.b)

I use an instance of this class to define the prediction function:

x = tf.placeholder(tf.float64, [None, inp_dim])
preds = lr.get_model_graph(x)

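I try to "redefine" the prediction function by changing the values of lr.w and lr.b, and it does not work (as described above).

However, I found that the new values of the model parameters become visible after I redefine the prediction function: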

lr.w = tf.assign(lr.w, np.random.uniform(size=(inp_dim, out_dim)))
lr.b = tf.assign(lr.b, np.random.uniform(size=(out_dim,)))
s.run(lr.w)
s.run(lr.b)
preds = lr.get_model_graph(x)

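Why is that? Isn't the computational graph for "preds" bound to lr.w and lr.b, so that to redefine "preds" I only need to change the values of w and b?

【Question comments】:

Tags: tensorflow

【Solution 1】: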

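The behavior described in the question arises because the first assignment of values to the model parameters is performed before the computational graph for the predictions is defined. Since lr.w and lr.b are rebound to the outputs of tf.assign, a "preds" graph built afterwards depends on those assign ops and re-runs them every time it is evaluated.

In more detail, the following code "blocks" any further re-assignment of the model's parameters, so that they can no longer be changed: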

# instantiate the model
lr = logreg_tf(inp_dim = 4, out_dim = 3)

# create the input placeholder
x = tf.placeholder(tf.float64, [None, inp_dim])

# specify values of the parameters
w = np.array([
    [ 1.0,  2.0,  3.0],
    [ 4.0,  5.0,  6.0],
    [ 7.0,  8.0,  9.0],
    [10.0, 11.0, 12.0]
    ])
b = np.array([13.0, 14.0, 15.0])

# set the values of the model parameters
lr.w = tf.assign(lr.w, w)
lr.b = tf.assign(lr.b, b)

# run the assign ops; this also gives the variables their initial values
s = tf.Session()
s.run([lr.w, lr.b])

preds = lr.get_model_graph(x)

By contrast, the following code avoids the "blocking":

# instantiate the model
lr = logreg_tf(inp_dim = 4, out_dim = 3)

#  create the predict function
x = tf.placeholder(tf.float64, [None, inp_dim])
preds = lr.get_model_graph(x)

# specify values of the parameters
w = np.array([
    [ 1.0,  2.0,  3.0],
    [ 4.0,  5.0,  6.0],
    [ 7.0,  8.0,  9.0],
    [10.0, 11.0, 12.0]
    ])
b = np.array([13.0, 14.0, 15.0])

# set the values of the model parameters
lr.w = tf.assign(lr.w, w)
lr.b = tf.assign(lr.b, b)

# run the assign ops; this also gives the variables their initial values
s = tf.Session()
s.run([lr.w, lr.b])

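The only difference between these two code blocks is the position of the line that defines "preds".

The described behavior is explained in more detail here.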
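The effect of the line order can be illustrated with a toy model written in plain Python. This is not TensorFlow: the classes Variable, Assign and Scale are hypothetical stand-ins that only mimic deferred evaluation, under the assumption that a node built on top of an assign node re-runs that assignment whenever it is evaluated.

```python
class Variable:
    """Toy stand-in for a graph variable: holds a mutable value."""

    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class Assign:
    """Toy stand-in for an assign op: writes new_value every time it is evaluated."""

    def __init__(self, source, new_value):
        self.source = source
        # Find the underlying variable, even if source is itself an Assign node.
        self.variable = source.variable if isinstance(source, Assign) else source
        self.new_value = new_value

    def evaluate(self):
        self.source.evaluate()  # run upstream nodes first
        self.variable.value = self.new_value
        return self.new_value


class Scale:
    """Toy stand-in for preds: multiplies the captured parameter node by x."""

    def __init__(self, param):
        self.param = param  # the node object is captured at build time

    def evaluate(self, x):
        return self.param.evaluate() * x


def blocked():
    w = Variable(1.0)
    param = Assign(w, 2.0)       # like lr.w = tf.assign(lr.w, w)
    preds = Scale(param)         # preds is built on the assign node
    param = Assign(param, 5.0)   # later attempt to change the parameter
    param.evaluate()
    stored = w.value             # the variable really holds 5.0 now
    return stored, preds.evaluate(10.0)  # preds re-runs the first assignment


def not_blocked():
    w = Variable(1.0)
    preds = Scale(w)             # preds is built on the variable itself
    Assign(w, 5.0).evaluate()
    return preds.evaluate(10.0)


print(blocked())      # (5.0, 20.0)
print(not_blocked())  # 50.0
```

In the blocked case the new value is stored, but evaluating the predictions resets it to the first assigned value, which matches the symptom reported in the question.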

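【Comments】:

【Solution 2】:

You are doing something strange when assigning new values to the parameters.

Why don't you define a method in your class to reassign them: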

def assign_parameters(self, param, new_val):
    # param is the attribute name as a string, e.g. "w" or "b"
    setattr(self, param, tf.Variable(new_val, dtype=tf.float64))

The reason is that I can't see any code updating the variables that your class uses when you execute lr.get_model_graph(x).

【Comments】:

I do try to update the variables used by my class, here: lr.w = tf.assign(lr.w, np.random.uniform(size=(inp_dim, out_dim))) lr.b = tf.assign(lr.b, np.random.uniform(size=(out_dim,)))
I tried using lr.w = tf.Variable(w_val, tf.float64) instead of lr.w = tf.assign(lr.w, w_val). As a result, I got an error message: `Attempting to use uninitialized value`.
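One way to avoid both the rebinding problem and the uninitialized-variable error might be to keep lr.w and lr.b pointing at the original variables and to store the assign ops under separate names. The sketch below is not taken from the answers: the names LogReg, set_parameters, w_in, b_in, assign_w and assign_b are hypothetical, and it assumes the tensorflow.compat.v1 module is available (TensorFlow 1.14 or later, or 2.x).

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TensorFlow >= 1.14 or 2.x

tf.disable_eager_execution()  # graph mode, as in the question


class LogReg:
    def __init__(self, inp_dim, out_dim, r=1.0):
        w_val = np.random.uniform(-r, r, size=(inp_dim, out_dim))
        b_val = np.random.uniform(-r, r, size=(out_dim,))
        self.w = tf.Variable(w_val, dtype=tf.float64)
        self.b = tf.Variable(b_val, dtype=tf.float64)
        # Build the assign ops once and keep them under separate names,
        # so self.w and self.b always refer to the variables themselves.
        self.w_in = tf.placeholder(tf.float64, [inp_dim, out_dim])
        self.b_in = tf.placeholder(tf.float64, [out_dim])
        self.assign_w = tf.assign(self.w, self.w_in)
        self.assign_b = tf.assign(self.b, self.b_in)

    def get_model_graph(self, inp):
        return tf.nn.softmax(tf.matmul(inp, self.w) + self.b)

    def set_parameters(self, session, w_val, b_val):
        # Feed the new values into the existing variables.
        session.run([self.assign_w, self.assign_b],
                    feed_dict={self.w_in: w_val, self.b_in: b_val})


inp_dim, out_dim = 4, 3
lr = LogReg(inp_dim, out_dim)
x = tf.placeholder(tf.float64, [None, inp_dim])
preds = lr.get_model_graph(x)

s = tf.Session()
s.run(tf.global_variables_initializer())

X = np.random.uniform(size=(5, inp_dim))
before = s.run(preds, feed_dict={x: X})

# With all-zero parameters the softmax output is uniform over the classes.
lr.set_parameters(s, np.zeros((inp_dim, out_dim)), np.zeros(out_dim))
after = s.run(preds, feed_dict={x: X})  # same preds tensor, new parameter values
```

Because the variables are never replaced, "preds" does not have to be rebuilt, and no new variable is created that would need its own initializer.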