【Question Title】: How to monitor tensor values in Theano/Keras?
【Posted】: 2016-05-05 22:51:12
【Question Description】:

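I know this question has been asked in various forms, but I can't find any answer that I can understand and use. Forgive me if this is a basic question; I am new to these tools (Theano/Keras).

Problem to solve

Monitoring variables in a neural network (e.g. the input/forget/output gate values in an LSTM).

What I currently get

No matter at which stage I try to obtain these values, I get something like this: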

Elemwise{mul,no_inplace}.0
Elemwise{mul,no_inplace}.0
[for{cpu,scan_fn}.2, Subtensor{int64::}.0, Subtensor{int64::}.0]
[for{cpu,scan_fn}.2, Subtensor{int64::}.0, Subtensor{int64::}.0]
Subtensor{int64}.0
Subtensor{int64}.0

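Is there any way I can monitor these values (e.g. print them to stdout, write them to a file, etc.)?

Possible solution

It seems that callbacks in Keras could do the job, but that doesn't work for me either. I get the same output as above.

My guess

It seems I am making a very simple mistake.

Thank you all very much in advance.

Added

Specifically, I am trying to monitor the input/forget/output gate values in an LSTM. I found that LSTM.step() computes these values: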

def step(self, x, states):
    h_tm1 = states[0]   # hidden state of the previous time step
    c_tm1 = states[1]   # cell state from the previous time step
    B_U = states[2]     # dropout matrices for recurrent units?
    B_W = states[3]     # dropout matrices for input units?

    if self.consume_less == 'cpu':                              # just cut x into 4 pieces in columns
        x_i = x[:, :self.output_dim]
        x_f = x[:, self.output_dim: 2 * self.output_dim]
        x_c = x[:, 2 * self.output_dim: 3 * self.output_dim]
        x_o = x[:, 3 * self.output_dim:]
    else:
        x_i = K.dot(x * B_W[0], self.W_i) + self.b_i
        x_f = K.dot(x * B_W[1], self.W_f) + self.b_f
        x_c = K.dot(x * B_W[2], self.W_c) + self.b_c
        x_o = K.dot(x * B_W[3], self.W_o) + self.b_o

    i = self.inner_activation(x_i + K.dot(h_tm1 * B_U[0], self.U_i))
    f = self.inner_activation(x_f + K.dot(h_tm1 * B_U[1], self.U_f))
    c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1 * B_U[2], self.U_c))
    o = self.inner_activation(x_o + K.dot(h_tm1 * B_U[3], self.U_o))

    with open("test_visualization.txt", "a") as myfile:
        myfile.write(str(i)+"\n")

    h = o * self.activation(c)
    return h, [h, c]

As shown in the code above, I tried to write the value of i to a file, but it only gave me output like this:

Elemwise{mul,no_inplace}.0
[for{cpu,scan_fn}.2, Subtensor{int64::}.0, Subtensor{int64::}.0]
Subtensor{int64}.0

So I tried i.eval() and i.get_value(), but neither gave me values.

.eval() gave me this:

theano.gof.fg.MissingInputError: An input of the graph, used to compute Subtensor{::, :int64:}(<TensorType(float32, matrix)>, Constant{10}), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.

and .get_value() gave me this:

AttributeError: 'TensorVariable' object has no attribute 'get_value'

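So I traced back through the call chain (which line calls which function, and so on) and tried to get the values at every step I found, but to no avail.

I feel like I have fallen into some basic pitfall.

【Question Discussion】:

How are you obtaining these values? Please include your code; it seems you are printing the symbolic variables rather than their values.
Thank you very much for the quick reply, @MatiasValdenegro. I have updated the question above with the code and the error messages.

Tags: callback monitoring theano keras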

【Solution 1】:

I use the solution described in the Keras FAQ:

http://keras.io/getting-started/faq/#how-can-i-visualize-the-output-of-an-intermediate-layer

In detail:

from keras import backend as K

intermediate_tensor_function = K.function([model.layers[0].input],[model.layers[layer_of_interest].output])
intermediate_tensor = intermediate_tensor_function([thisInput])[0]

This yields:

array([[ 3.,  17.]], dtype=float32)

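However, I would like to use the functional API, and there I can't seem to get the actual tensor, only its symbolic representation. For example: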

model.layers[1].output

yields:

<tf.Tensor 'add:0' shape=(?, 2) dtype=float32>

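I am missing something here about how Keras and TensorFlow interact, but I'm not sure what. Any insight is much appreciated.

【Discussion】:

【Solution 2】:

One solution is to create a version of your network that is truncated at the LSTM layer whose gate values you want to monitor, and then replace the original layer with a custom layer whose step function is modified to return not only the hidden layer values but also the gate values.

For instance, say you want to access the gate values of a GRU. Create a custom layer GRU2 that inherits everything from the GRU class, but adapt the step function so that it returns a concatenation of the states you want to monitor, and then takes only the part containing the previous hidden layer activations when computing the next activations. That is: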

def step(self, x, states):

    # get prev hidden layer from input that is concatenation of
    # prev hidden layer + reset gate + update gate
    x = x[:self.output_dim, :]


    ###############################################
    # This is the original code from the GRU layer
    #

    h_tm1 = states[0]  # previous memory
    B_U = states[1]  # dropout matrices for recurrent units
    B_W = states[2]

    if self.consume_less == 'gpu':

        matrix_x = K.dot(x * B_W[0], self.W) + self.b
        matrix_inner = K.dot(h_tm1 * B_U[0], self.U[:, :2 * self.output_dim])

        x_z = matrix_x[:, :self.output_dim]
        x_r = matrix_x[:, self.output_dim: 2 * self.output_dim]
        inner_z = matrix_inner[:, :self.output_dim]
        inner_r = matrix_inner[:, self.output_dim: 2 * self.output_dim]

        z = self.inner_activation(x_z + inner_z)
        r = self.inner_activation(x_r + inner_r)

        x_h = matrix_x[:, 2 * self.output_dim:]
        inner_h = K.dot(r * h_tm1 * B_U[0], self.U[:, 2 * self.output_dim:])
        hh = self.activation(x_h + inner_h)
    else:
        if self.consume_less == 'cpu':
            x_z = x[:, :self.output_dim]
            x_r = x[:, self.output_dim: 2 * self.output_dim]
            x_h = x[:, 2 * self.output_dim:]
        elif self.consume_less == 'mem':
            x_z = K.dot(x * B_W[0], self.W_z) + self.b_z
            x_r = K.dot(x * B_W[1], self.W_r) + self.b_r
            x_h = K.dot(x * B_W[2], self.W_h) + self.b_h
        else:
            raise Exception('Unknown `consume_less` mode.')
        z = self.inner_activation(x_z + K.dot(h_tm1 * B_U[0], self.U_z))
        r = self.inner_activation(x_r + K.dot(h_tm1 * B_U[1], self.U_r))

        hh = self.activation(x_h + K.dot(r * h_tm1 * B_U[2], self.U_h))
    h = z * h_tm1 + (1 - z) * hh

    #
    # End of original code
    ###########################################################


    # concatenate states you want to monitor, in this case the
    # hidden layer activations and gates z and r
    all = K.concatenate([h, z, r])

    # return everything
    return all, [h]

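(Note that the only lines I added are at the beginning and at the end of the function.)

If you then run your network with GRU2 as the last layer instead of GRU (with return_sequences = True for the GRU2 layer), you can call predict on your network, which will give you all hidden layer and gate values.

The same approach should work for an LSTM, although you may have to puzzle a bit to figure out how to store all the outputs you want in one vector and retrieve them again afterwards.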
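Retrieving the individual pieces from the concatenated output can be done with plain NumPy slicing. The following is only a sketch: the helper split_monitored and the dummy array standing in for the result of predict are hypothetical, and it assumes the pieces were concatenated along the last axis in the order h, z, r, each with width output_dim.

```python
import numpy as np


def split_monitored(outputs, names, output_dim):
    """Split the last axis of `outputs` into equally sized, named blocks.

    Hypothetical helper: assumes the blocks were concatenated in the
    order given by `names`, each block being `output_dim` wide.
    """
    expected = len(names) * output_dim
    if outputs.shape[-1] != expected:
        raise ValueError(
            "last axis is %d, expected %d" % (outputs.shape[-1], expected))
    return {name: outputs[..., k * output_dim:(k + 1) * output_dim]
            for k, name in enumerate(names)}


# Dummy stand-in for what predict would return with return_sequences=True:
# shape (batch, timesteps, 3 * output_dim).
batch, timesteps, output_dim = 2, 5, 4
fake_prediction = np.arange(
    batch * timesteps * 3 * output_dim, dtype="float32"
).reshape(batch, timesteps, 3 * output_dim)

parts = split_monitored(fake_prediction, ["h", "z", "r"], output_dim)
print(parts["z"].shape)  # (2, 5, 4)
```

For an LSTM the same helper could be called with a longer list of names (for example h, i, f, o), provided the custom step function concatenates them in that order.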

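Hope that helps!

【Discussion】:

【Solution 3】:

You can use Theano's printing module to print values during execution (rather than during definition, which is what you are doing and is why you get the abstract definitions instead of the values).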
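The difference between definition time and execution time can be illustrated without Theano at all. The toy class below is not Theano's API; Sym and its methods are made up purely for illustration. It mimics the behaviour seen in the question: converting a node to a string gives only its description, and a number appears only when the node is evaluated with values for all of its inputs.

```python
# Toy model of a symbolic graph. NOT Theano's API, only an illustration.
class Sym:
    """A named node that describes a computation instead of holding a value."""

    def __init__(self, name, fn=None, parents=()):
        self.name = name
        self.fn = fn            # how to compute this node from its parents
        self.parents = parents  # upstream nodes (empty for graph inputs)

    def __mul__(self, other):
        return Sym("mul(%s, %s)" % (self.name, other.name),
                   fn=lambda a, b: a * b, parents=(self, other))

    def __str__(self):
        # At definition time there is nothing to show but the description.
        return self.name

    def eval(self, inputs):
        """Compute a value, given a dict mapping input nodes to numbers."""
        if self.fn is None:
            if self not in inputs:
                raise KeyError("missing input: " + self.name)
            return inputs[self]
        return self.fn(*(p.eval(inputs) for p in self.parents))


x = Sym("x")
gate = Sym("gate")
i = gate * x

print(str(i))                       # definition time: mul(gate, x)
print(i.eval({gate: 0.5, x: 4.0}))  # execution time: 2.0
```

Calling i.eval({x: 4.0}) without a value for gate raises an error in this toy, which is loosely analogous to the MissingInputError in the question.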

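Print

Simply use the Print function. Don't forget to use the output of Print to continue your graph; otherwise the output will be disconnected, the Print will most likely be removed during optimization, and you will see nothing.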

from keras import backend as K
from theano.printing import Print

def someLossFunction(x, ref):
  loss = K.square(x - ref)
  loss = Print('Loss tensor (before sum)')(loss)
  loss = K.sum(loss)
  loss = Print('Loss scalar (after sum)')(loss)
  return loss

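Plot

A little bonus you might enjoy.

The Print class has a global_fn parameter that overrides the default printing callback. You can supply your own function and access the data directly, for example to build a plot.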

from keras import backend as K
from theano.printing import Print
import matplotlib.pyplot as plt

curve = []

# the callback function
def myPlottingFn(printObj, data):
    global curve
    # Store scalar data
    curve.append(data)

    # Plot it
    fig, ax = plt.subplots()
    ax.plot(curve, label=printObj.message)
    ax.legend(loc='best')
    plt.show()

def someLossFunction(x, ref):
  loss = K.sum(K.square(x - ref))
  # Callback is defined line below
  loss = Print('Loss scalar (after sum)', global_fn=myPlottingFn)(loss)
  return loss

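By the way, the string you pass to Print('...') is stored in the print object under the attribute name message (see the function myPlottingFn). This is useful for building multi-curve plots automatically.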
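One way to use the message attribute for several curves is to collect the data in a dictionary keyed by message. This is a minimal sketch with hypothetical names (curves, collect_by_message); the SimpleNamespace objects merely stand in for the print objects that would be passed to the callback during execution, so the snippet runs without Theano.

```python
from collections import defaultdict
from types import SimpleNamespace

# One list of values per message, i.e. one curve per Print node.
curves = defaultdict(list)


def collect_by_message(print_obj, data):
    """Callback with the same (print_obj, data) signature as myPlottingFn."""
    curves[print_obj.message].append(float(data))


# Stand-ins for two print objects (assumption: only `.message` is needed).
first = SimpleNamespace(message="Loss A")
second = SimpleNamespace(message="Loss B")

for step in range(3):
    collect_by_message(first, 1.0 / (step + 1))
    collect_by_message(second, 10.0 - step)

print(sorted(curves))  # ['Loss A', 'Loss B']
```

Plotting then amounts to one ax.plot(values, label=message) call per entry in curves.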

【Discussion】: