Pad variable length sequences in TensorFlow: answers

【Question title】: Pad variable length sequences in tensorflow
【Posted】: 2016-12-18 00:15:05
【Question description】:

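I am trying to feed the output of a CNN into an RNN in TensorFlow.

The CNN processes 10 images and outputs a tensor of shape (1, 230, 2048), where 230 is the total number of sequences across all images and 2048 is the length of each sequence.

I keep track of the number of sequences for each image in a vector. For example:

[1, 9, 25, 29, 31, 10, 23, 29, 37, 36]

I can get the maximum number of sequences, which in this case is 37.

The question is how to pad the (1, 230, 2048) tensor at different positions so that every image is represented by the same number of sequences (37 in this example).

The final tensor should have shape (1, 370, 2048).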
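As a quick check of the numbers in the question, here is a minimal sketch in plain Python. It only does the bookkeeping (no TensorFlow), and the names `seq_lens`, `offsets` and `pad_rows` are illustrative, not from the original post.

```python
# Sequence counts per image, taken from the question.
seq_lens = [1, 9, 25, 29, 31, 10, 23, 29, 37, 36]

max_len = max(seq_lens)                 # longest image: 37 sequences
total_rows = sum(seq_lens)              # rows in the unpadded tensor: 230
padded_rows = max_len * len(seq_lens)   # rows after padding: 370

# Start row of each image inside the unpadded tensor.
offsets = [sum(seq_lens[:i]) for i in range(len(seq_lens))]

# Number of zero rows to append after each image's block.
pad_rows = [max_len - n for n in seq_lens]

print(max_len, total_rows, padded_rows)
print(offsets)
print(pad_rows)
```

Each image's rows start at its entry in `offsets`, and the padding is different for each image, which is why the zeros have to be inserted at several positions rather than at the end of the tensor.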

Thanks

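【Question discussion】:

Tags: tensorflow

【Solution 1】:

I wrote a short piece of code to solve it. This is a small example in which 6 images have different numbers of sequences (for clarity, I inserted blank lines in the tensor to separate the images).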

import tensorflow as tf  # written for the TensorFlow 1.x graph API (tf.Session)

vec = tf.constant([[1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1],

        [2, 2, 2, 2, 2, 2, 2, 2],
        [2, 2, 2, 2, 2, 2, 2, 2],

        [3, 3, 3, 3, 3, 3, 3, 3],

        [4, 4, 4, 4, 4, 4, 4, 4],
        [4, 4, 4, 4, 4, 4, 4, 4],
        [4, 4, 4, 4, 4, 4, 4, 4],

        [5, 5, 5, 5, 5, 5, 5, 5],
        [5, 5, 5, 5, 5, 5, 5, 5],

        [6, 6, 6, 6, 6, 6, 6, 6]], dtype=tf.float32)

seqLens = [4, 2, 1, 3, 2, 1]  # number of sequences (rows) per image
maxLen = max(seqLens)

NFeatures = 8
BatchSize = 6

padded = []
for n in range(BatchSize):
    # Rows [offset, offset + seqLens[n]) of vec belong to image n.
    offset = sum(seqLens[0:n])
    indices = tf.reshape(tf.range(offset, seqLens[n] + offset), [seqLens[n], 1])
    res = tf.gather_nd(vec, [indices])
    # Flatten, append zeros up to maxLen * NFeatures values, then reshape back.
    res_as_vector = tf.reshape(res, [-1])
    zero_padding = tf.zeros([NFeatures * maxLen] - tf.shape(res_as_vector), dtype=res.dtype)
    # From TensorFlow 1.0 on, tf.concat takes (values, axis); older releases took (axis, values).
    a_padded = tf.concat([res_as_vector, zero_padding], 0)
    padded.append(tf.reshape(a_padded, [maxLen, NFeatures]))

# Stack the padded blocks of all images along the row axis.
Inputs2 = tf.concat(padded, 0)

# No variables are created, so no initializer needs to be run.
sess = tf.Session()
print(sess.run(Inputs2))

The output should look like this:

[[ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.]
 [ 2.  2.  2.  2.  2.  2.  2.  2.]
 [ 2.  2.  2.  2.  2.  2.  2.  2.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 3.  3.  3.  3.  3.  3.  3.  3.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 4.  4.  4.  4.  4.  4.  4.  4.]
 [ 4.  4.  4.  4.  4.  4.  4.  4.]
 [ 4.  4.  4.  4.  4.  4.  4.  4.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 5.  5.  5.  5.  5.  5.  5.  5.]
 [ 5.  5.  5.  5.  5.  5.  5.  5.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 6.  6.  6.  6.  6.  6.  6.  6.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.]]

【Discussion】:

【Solution 2】:

Take a look at tf.pad. You pass it a list of pairs, one pair per dimension.
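To illustrate the "one pair per dimension" convention, here is a minimal sketch that pads a single image's block. The tensor `block` and its size are made up for the example; only `tf.pad` itself comes from the answer.

```python
import tensorflow as tf

# One image's block: 2 sequences (rows) of 8 features each.
block = tf.ones([2, 8], dtype=tf.float32)

# One [before, after] pair per dimension:
#   rows:     0 before, 2 after  -> 2 + 2 = 4 rows
#   features: 0 before, 0 after  -> still 8 columns
padded = tf.pad(block, [[0, 2], [0, 0]])

# The static shape is known without running a session: 4 rows, 8 columns.
print(padded.shape)
```

By default the added entries are zeros, so this appends two all-zero rows to the block.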

【Discussion】:

The problem is that I need to pad regions inside the tensor, not the tensor as a whole.
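One possible way to reconcile this comment with the tf.pad suggestion is to split the tensor into per-image blocks first, pad each block, and concatenate the results. The following is a sketch under assumptions, not the method from either answer: it assumes TensorFlow 1.0 or later, where `tf.split` accepts a list of sizes and `tf.concat` takes `(values, axis)`, and the names `seq_lens`, `blocks` and `padded_blocks` are illustrative.

```python
import tensorflow as tf

seq_lens = [4, 2, 1, 3, 2, 1]   # rows per image, as in Solution 1
max_len = max(seq_lens)
n_features = 8

# Stand-in for the CNN output: sum(seq_lens) rows of features.
vec = tf.ones([sum(seq_lens), n_features], dtype=tf.float32)

# Cut the tensor into one block per image along the row axis.
blocks = tf.split(vec, seq_lens, axis=0)

# Append zero rows to each block so every block has max_len rows.
padded_blocks = [
    tf.pad(block, [[0, max_len - length], [0, 0]])
    for block, length in zip(blocks, seq_lens)
]

# Join the equally sized blocks back into one tensor.
result = tf.concat(padded_blocks, axis=0)
print(result.shape)
```

Because every block now has `max_len` rows, the result can be reshaped to `[len(seq_lens), max_len, n_features]` with `tf.reshape` if the RNN expects one batch entry per image.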