【Question title】: Effect of setting sequence_length on the returned state in dynamic_rnn
【Posted】: 2018-05-09 21:15:33
【Question】:

Suppose I have an LSTM network that classifies time series of length 10. The standard way to feed a time series to the LSTM is to build a [batch size X 10 X vector size] array and pass it in:

self.rnn_t, self.new_state = tf.nn.dynamic_rnn( \
        inputs=self.X, cell=self.lstm_cell, dtype=tf.float32, initial_state=self.state_in)

With the sequence_length parameter, I can specify the length of each time series.
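For reference, the TensorFlow 1.x documentation describes sequence_length as being used to copy the state through and zero out the outputs once a batch element is past its sequence length. The following is a minimal NumPy sketch of that behavior, not TensorFlow code: the plain tanh cell, the function name `run_rnn_with_lengths`, and the weight names are assumptions made for illustration.

```python
import numpy as np

def run_rnn_with_lengths(inputs, h0, W_x, W_h, b, sequence_length):
    """Hypothetical tanh RNN over inputs of shape [batch, time, features].

    Past a batch element's length, the output is zeroed and the state is
    copied through unchanged.
    """
    h = h0
    outputs = []
    for t in range(inputs.shape[1]):
        h_new = np.tanh(inputs[:, t, :] @ W_x + h @ W_h + b)
        active = (t < sequence_length)[:, None]       # [batch, 1] boolean mask
        outputs.append(np.where(active, h_new, 0.0))  # zero outputs past the end
        h = np.where(active, h_new, h)                # copy state through
    return np.stack(outputs, axis=1), h

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 10, 5))        # [batch, time, features]
W_x = rng.normal(size=(5, 4))
W_h = rng.normal(size=(4, 4))
b = np.zeros(4)
h0 = np.zeros((2, 4))
lengths = np.array([10, 6])            # second series is only 6 steps long

outputs, final_state = run_rnn_with_lengths(X, h0, W_x, W_h, b, lengths)
```

In this sketch the final state of the second series equals its output at step index 5, the last valid step, and its outputs from index 6 onward are all zero.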

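My question: for the scenario above, if I call dynamic_rnn 10 times with a tensor of size [batch size X 1 X vector size], each time taking the matching index of the time series and passing the state returned by the previous call as initial_state, will I end up with the same result, for both the outputs and the state? Or not?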

【Comments】:

    Tags: python tensorflow lstm


    【Answer 1】:

    You should get the same output in both cases. I'll illustrate this with a toy example below:

    > 1. Set up the inputs and parameters of the network:

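    import numpy.testing as npt
    import tensorflow as tf  # TensorFlow 1.x API (tf.contrib, tf.Session)

    # Set RNN params
    batch_size = 2
    time_steps = 10
    vector_size = 5

    # Create a random input
    dataset = tf.random_normal((batch_size, time_steps, vector_size), dtype=tf.float32, seed=42)

    # Input tensor to the RNN. Wrapping it in a Variable samples the random
    # input once at initialization, so it stays fixed across sess.run calls.
    X = tf.Variable(dataset, dtype=tf.float32)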
    

    > 2. Feed the whole time series to the RNN in one call (a BasicRNNCell stands in for the LSTM here): [batch_size, time_steps, vector_size]

    # The two variable scopes create separate weights, so use a fixed
    # (non-random) initializer to give both cells identical values.
    with tf.variable_scope('rnn_full', initializer=tf.initializers.ones()):
        basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=10)
        output_f, state_f = tf.nn.dynamic_rnn(basic_cell, X, dtype=tf.float32)
    

    > 3. Call the RNN in a loop time_steps times to rebuild the time series, where each call takes one input of shape [batch_size, vector_size] and the returned state is passed on as the next initial state

    # Unstack the inputs across time_steps
    unstack_X = tf.unstack(X, axis=1)

    outputs = []
    with tf.variable_scope('rnn_unstacked', initializer=tf.initializers.ones()):
        basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=10)

        # The first call has to start from the zero state
        init_state = basic_cell.zero_state(batch_size, dtype=tf.float32)

        # Call the same cell once per time step (N = time_steps calls)
        for i in range(len(unstack_X)):
            output, state = tf.nn.dynamic_rnn(
                basic_cell, tf.expand_dims(unstack_X[i], 1),
                dtype=tf.float32, initial_state=init_state)
            # Feed the returned state into the next call
            init_state = state
            outputs.append(output)
        # Stack to [time_steps, batch_size, 1, num_units], drop the length-1
        # time axis, then transpose to [batch_size, time_steps, num_units]
        output_r = tf.transpose(tf.squeeze(tf.stack(outputs), axis=[2]), [1, 0, 2])
    

    > 4. Check the outputs

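    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        out_f, st_f = sess.run([output_f, state_f])
        # `state` is the state returned by the last single-step call
        out_r, st_r = sess.run([output_r, state])

        # Each assertion raises AssertionError if the arrays differ
        npt.assert_almost_equal(out_f, out_r)
        npt.assert_almost_equal(st_f, st_r)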
    

    Both the states and the outputs match.
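The reason the two approaches agree is that each recurrent step depends only on the current input and the previous state, so splitting the sequence into chunks and carrying the state forward performs the same computations in the same order. Here is a minimal NumPy sketch of that argument, assuming a plain tanh RNN cell; the helper names `rnn_step` and `run_rnn` and the weight names are illustrative and not part of TensorFlow.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One step of a plain tanh RNN: h_t = tanh(x_t W_x + h_prev W_h + b)."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

def run_rnn(inputs, h0, W_x, W_h, b):
    """Run over inputs of shape [batch, time, features].

    Returns (outputs, final_state), mirroring the two return values of
    dynamic_rnn for a cell whose output is its state.
    """
    h = h0
    outputs = []
    for t in range(inputs.shape[1]):
        h = rnn_step(inputs[:, t, :], h, W_x, W_h, b)
        outputs.append(h)
    return np.stack(outputs, axis=1), h

rng = np.random.default_rng(42)
batch_size, time_steps, vector_size, num_units = 2, 10, 5, 10
X = rng.normal(size=(batch_size, time_steps, vector_size))
W_x = rng.normal(size=(vector_size, num_units))
W_h = rng.normal(size=(num_units, num_units))
b = np.zeros(num_units)
h0 = np.zeros((batch_size, num_units))

# One call over the whole sequence.
out_full, state_full = run_rnn(X, h0, W_x, W_h, b)

# Ten calls of one time step each, feeding the returned state back in.
state = h0
chunks = []
for t in range(time_steps):
    out_t, state = run_rnn(X[:, t:t + 1, :], state, W_x, W_h, b)
    chunks.append(out_t)
out_chunked = np.concatenate(chunks, axis=1)

outputs_match = np.allclose(out_full, out_chunked)
states_match = np.allclose(state_full, state)
```

The same reasoning should carry over to an LSTM cell, where the carried state is the pair of cell and hidden states rather than a single tensor.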

    【Comments】:
