In both cases you should get the same output. I'll illustrate this with a toy example below:
> 1. Set up the network's input and parameters:
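# Requires TensorFlow 1.x (uses tf.contrib and tf.Session)
import numpy.testing as npt
import tensorflow as tf

# Set RNN params
batch_size = 2
time_steps = 10
vector_size = 5
# Create a random input
dataset = tf.random_normal((batch_size, time_steps, vector_size), dtype=tf.float32, seed=42)
# Input tensor to the RNN; a Variable keeps the sampled values fixed across sess.run calls
X = tf.Variable(dataset, dtype=tf.float32)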
> 2. Feed the whole time series to the LSTM: [batch_size, time_steps, vector_size]
# Use a fixed (non-random) initializer so both networks start with identical weights.
with tf.variable_scope('rnn_full', initializer=tf.initializers.ones()):
    basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=10)
    output_f, state_f = tf.nn.dynamic_rnn(basic_cell, X, dtype=tf.float32)
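> 3. Call the LSTM in a loop, once per time step (time_steps iterations), to build the time series. Each call takes one input of shape [batch_size, vector_size], and the state it returns becomes the initial state of the next call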
# Unstack the inputs across time_steps: a list of [batch_size, vector_size] tensors
unstack_X = tf.unstack(X, axis=1)
outputs = []
with tf.variable_scope('rnn_unstacked', initializer=tf.initializers.ones()):
    basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=10)
    # The initial state has to be zero to match the default used for the full-sequence run
    init_state = basic_cell.zero_state(batch_size, dtype=tf.float32)
    # Call the same cell once per time step (time_steps calls in total)
    for i in range(len(unstack_X)):
        # Each call sees a length-1 sequence of shape [batch_size, 1, vector_size]
        output, state = tf.nn.dynamic_rnn(basic_cell, tf.expand_dims(unstack_X[i], 1), dtype=tf.float32, initial_state=init_state)
        # Carry the returned state into the next call
        init_state = state
        outputs.append(output)
# Transform the output to [batch_size, time_steps, num_units]
output_r = tf.transpose(tf.squeeze(tf.stack(outputs), axis=2), [1, 0, 2])
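
The stack/squeeze/transpose step is easy to get wrong, so here is a small NumPy sketch of the same shape manipulation. The array contents and the names `step_outputs`, `batch_major` and `concatenated` are made up for illustration; only the shapes mirror the TensorFlow code.

```python
import numpy as np

batch_size, time_steps, num_units = 2, 10, 10

# One fake per-step output of shape [batch_size, 1, num_units] for each time step,
# filled with the step index so the ordering can be checked afterwards.
step_outputs = [np.full((batch_size, 1, num_units), t, dtype=np.float32)
                for t in range(time_steps)]

stacked = np.stack(step_outputs)                  # [time_steps, batch_size, 1, num_units]
squeezed = np.squeeze(stacked, axis=2)            # [time_steps, batch_size, num_units]
batch_major = np.transpose(squeezed, (1, 0, 2))   # [batch_size, time_steps, num_units]

# Concatenating along the time axis gives the same array in a single call.
concatenated = np.concatenate(step_outputs, axis=1)
```

Squeezing only the length-1 time axis, rather than every size-1 axis, keeps the result correct even when batch_size is 1. In TensorFlow, `tf.concat(outputs, axis=1)` should likewise produce the batch-major tensor directly.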
> 4. Check the outputs
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out_f, st_f = sess.run([output_f, state_f])
    out_r, st_r = sess.run([output_r, state])
    # Outputs and final states of both variants should agree
    npt.assert_almost_equal(out_f, out_r)
    npt.assert_almost_equal(st_f, st_r)
Both the states and the outputs match.
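
To see why the two variants agree, it may help to strip away TensorFlow and write the recurrence by hand. The following is a minimal NumPy sketch, not the library's implementation: it assumes a vanilla RNN cell of the form h_t = tanh([x_t, h_{t-1}] W + b), with all-ones weights and a zero bias chosen to resemble the fixed initializer in the toy example. The helper names `rnn_step` and `run_rnn` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
batch_size, time_steps, vector_size, num_units = 2, 10, 5, 10

X = rng.standard_normal((batch_size, time_steps, vector_size))
# Assumed parameters: all-ones kernel and zero bias (illustrative only).
W = np.ones((vector_size + num_units, num_units))
b = np.zeros(num_units)


def rnn_step(x_t, h_prev):
    """One vanilla RNN step: h_t = tanh([x_t, h_prev] @ W + b)."""
    return np.tanh(np.concatenate([x_t, h_prev], axis=1) @ W + b)


def run_rnn(inputs, h0):
    """Run the cell over inputs of shape [batch, steps, features]."""
    h = h0
    outputs = []
    for t in range(inputs.shape[1]):
        h = rnn_step(inputs[:, t, :], h)
        outputs.append(h)
    # Returns all outputs [batch, steps, num_units] and the final state.
    return np.stack(outputs, axis=1), h


h0 = np.zeros((batch_size, num_units))

# Case 1: the whole sequence in one call.
out_full, state_full = run_rnn(X, h0)

# Case 2: one time step per call, feeding the returned state back in.
state = h0
chunks = []
for t in range(time_steps):
    out_t, state = run_rnn(X[:, t:t + 1, :], state)
    chunks.append(out_t)
out_steps = np.concatenate(chunks, axis=1)

np.testing.assert_allclose(out_full, out_steps)
np.testing.assert_allclose(state_full, state)
```

Each step depends only on the current input and the previous state, so splitting the sequence changes nothing as long as the state is carried over and the same weights are used in every call.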