【Question title】: TensorFlow random_shuffle_queue is closed and has insufficient elements
【Posted】: 2015-12-02 18:15:36
【Question】:

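I am reading a batch of images from tfrecords, following the idea here (the files were converted with this).

My images are CIFAR images of shape [32, 32, 3]. As you can see in the log, the shapes look normal when the images are read and passed along (batch_size=100).

As far as I can tell, the two most notable problems in the log are:

  1. A shape of 12288, and I don't know where it comes from. All of my tensors have shape [32, 32, 3] or [None, 3072].
  2. Running out of samples.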

Compute status: Out of range: RandomSuffleQueue '_2_input/shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)

How can I fix this?

Log:

1- image shape is  TensorShape([Dimension(3072)])
1.1- images batch shape is  TensorShape([Dimension(100), Dimension(3072)])
2- images shape is  TensorShape([Dimension(100), Dimension(3072)])

W tensorflow/core/kernels/queue_ops.cc:79] Invalid argument: Shape mismatch in tuple component 0. Expected [3072], got [12288]
W tensorflow/core/common_runtime/executor.cc:1027] 0x7fa72abc89a0 Compute status: Invalid argument: Shape mismatch in tuple component 0. Expected [3072], got [12288]
     [[Node: input/shuffle_batch/random_shuffle_queue_enqueue = QueueEnqueue[Tcomponents=[DT_FLOAT, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input/shuffle_batch/random_shuffle_queue, input/sub, input/Cast_1)]]
W tensorflow/core/kernels/queue_ops.cc:79] Invalid argument: Shape mismatch in tuple component 0. Expected [3072], got [12288]
W tensorflow/core/common_runtime/executor.cc:1027] 0x7fa72ab9d080 Compute status: Invalid argument: Shape mismatch in tuple component 0. Expected [3072], got [12288]
     [[Node: input/shuffle_batch/random_shuffle_queue_enqueue = QueueEnqueue[Tcomponents=[DT_FLOAT, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input/shuffle_batch/random_shuffle_queue, input/sub, input/Cast_1)]]
W tensorflow/core/kernels/queue_ops.cc:79] Invalid argument: Shape mismatch in tuple component 0. Expected [3072], got [12288]
W tensorflow/core/common_runtime/executor.cc:1027] 0x7fa7285e55a0 Compute status: Invalid argument: Shape mismatch in tuple component 0. Expected [3072], got [12288]
     [[Node: input/shuffle_batch/random_shuffle_queue_enqueue = QueueEnqueue[Tcomponents=[DT_FLOAT, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input/shuffle_batch/random_shuffle_queue, input/sub, input/Cast_1)]]
W tensorflow/core/kernels/queue_ops.cc:79] Invalid argument: Shape mismatch in tuple component 0. Expected [3072], got [12288]
W tensorflow/core/common_runtime/executor.cc:1027] 0x7fa72aadb080 Compute status: Invalid argument: Shape mismatch in tuple component 0. Expected [3072], got [12288]
     [[Node: input/shuffle_batch/random_shuffle_queue_enqueue = QueueEnqueue[Tcomponents=[DT_FLOAT, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input/shuffle_batch/random_shuffle_queue, input/sub, input/Cast_1)]]
W tensorflow/core/common_runtime/executor.cc:1027] 0x7fa72ad499a0 Compute status: Out of range: RandomSuffleQueue '_2_input/shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)
     [[Node: input/shuffle_batch = QueueDequeueMany[component_types=[DT_FLOAT, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input/shuffle_batch/random_shuffle_queue, input/shuffle_batch/n)]]
Traceback (most recent call last):
  File "/Users/HANEL/Documents/my_cifar_train.py", line 110, in <module>
    tf.app.run()
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/default/_app.py", line 11, in run
    sys.exit(main(sys.argv))
  File "/Users/HANEL/my_cifar_train.py", line 107, in main
    train()
  File "/Users/HANEL/my_cifar_train.py", line 76, in train
    _, loss_value = sess.run([train_op, loss])
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 345, in run
    results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 419, in _do_run
    e.code)
tensorflow.python.framework.errors.OutOfRangeError: RandomSuffleQueue '_2_input/shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)
     [[Node: input/shuffle_batch = QueueDequeueMany[component_types=[DT_FLOAT, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input/shuffle_batch/random_shuffle_queue, input/shuffle_batch/n)]]
Caused by op u'input/shuffle_batch', defined at:
  File "/Users/HANEL/my_cifar_train.py", line 110, in <module>
    tf.app.run()
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/default/_app.py", line 11, in run
    sys.exit(main(sys.argv))
  File "/Users/HANEL/my_cifar_train.py", line 107, in main
    train()
  File "/Users/HANEL/my_cifar_train.py", line 39, in train
    images, labels = my_input.inputs()
  File "/Users/HANEL/my_input.py", line 157, in inputs
    min_after_dequeue=200)
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 453, in shuffle_batch
    return queue.dequeue_many(batch_size, name=name)
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 245, in dequeue_many
    self._queue_ref, n, self._dtypes, name=name)
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 319, in _queue_dequeue_many
    timeout_ms=timeout_ms, name=name)
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
    op_def=op_def)
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/HANEL/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 988, in __init__
    self._traceback = _extract_stack()

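【Question comments】:

  • Hi @mrry, yes, I'll send it to you, but I found the second problem: I set training_iterations to 20, which is less than 100 (batch_size), and that caused the insufficient elements. My guess for the first problem is my machine's thread count, which is 4 threads, and 12288 = 4 * 3072.
  • The most likely problem is that the size passed to set_shape() doesn't match the true size of the tensor produced by decode_raw. Perhaps something went wrong earlier in the pipeline. To find the true shape, you can do the following: image_shape = tf.shape(image); ...; sess.run(image_shape).
  • Looking at more of the input code, it looks like you convert the images to np.int32 arrays before writing them to the TFRecord file: images_only = [np.asarray(image[1], np.int32) for image in images]. However, you read them back as tf.uint8 values, which means you get four times as many values, and 4 * 3072 = 12288.
  • @mrry Thank you very much, it works.
  • @mrry Turn your comment into an answer to earn more points. You're addicted to points, aren't you? Keep in mind that some StackOverflow users don't read comments. Also, I don't recall Google searches returning results based only on comments.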
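
The byte arithmetic described in the comments can be checked without TensorFlow. The following is a minimal sketch using only the standard library; `pixels` is a made-up stand-in for one flattened CIFAR image, not the asker's data, and `struct` is used here in place of NumPy and `decode_raw` purely for illustration.

```python
import struct

NUM_VALUES = 32 * 32 * 3  # 3072 values in one flattened CIFAR image

# Hypothetical stand-in for one image: 3072 pixel values, all zero.
pixels = [0] * NUM_VALUES

# Writing the values as 32-bit integers stores 4 bytes per value.
raw_bytes = struct.pack("<%di" % NUM_VALUES, *pixels)

# Reading the same bytes back as 8-bit values gives one value per byte.
as_uint8 = struct.unpack("<%dB" % len(raw_bytes), raw_bytes)
# Reading them back as 32-bit integers recovers the original count.
as_int32 = struct.unpack("<%di" % (len(raw_bytes) // 4), raw_bytes)

print(len(as_uint8))  # 12288
print(len(as_int32))  # 3072
```

The dtype used when decoding has to match the dtype used when writing; otherwise the element count changes by the ratio of the two item sizes.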

Tags: python tensorflow


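【Answer 1】:

I had a similar problem. After digging around online, it turns out that if you use a num_epochs parameter, you have to initialize all local variables, so your code should end up looking like this: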

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # do your stuff here

    coord.request_stop()
    coord.join(threads)

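If you post more code, maybe I can dig into it further. In the meantime, HTH.

【Comments】:

  • Thanks! sess.run(tf.local_variables_initializer()) was it!
  • Glad it helped. Since this is a very common problem, could the OP mark the answer as correct for the benefit of other readers? Thanks.
【Answer 2】:

You may be handling the parsed TFRecord examples incorrectly, e.g. trying to reshape a tensor to an incompatible size. You can debug with tf_record_iterator to confirm that the data you are reading is stored the way you think it is: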

import tensorflow as tf
import numpy as np

tfrecords_filename = '/path/to/some.tfrecord'
record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)

for string_record in record_iterator:
    # Parse the next example
    example = tf.train.Example()
    example.ParseFromString(string_record)

    # Get the features you stored (change to match your tfrecord writing code)
    height = int(example.features.feature['height']
                                 .int64_list
                                 .value[0])

    width = int(example.features.feature['width']
                                .int64_list
                                .value[0])

    img_string = (example.features.feature['image_raw']
                                  .bytes_list
                                  .value[0])
    # Convert to a numpy array (change dtype to the datatype you stored).
    # np.frombuffer replaces np.fromstring, which is deprecated for binary data.
    img_1d = np.frombuffer(img_string, dtype=np.float32)
    # Print the image shape; does it match your expectations?
    print(img_1d.shape)

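【Comments】:

  • Phew, this was a nasty error, but your answer solved it. I had a tf.py_func node returning the wrong type, but TF only showed the misleading error message about insufficient elements.
【Answer 3】:

I ran into exactly the same problem today. I later found that the cause was the input data file I had downloaded from a "well-known dataset" (e.g. https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data): it has some blank lines at the end of the file. Remove the blank lines and the error goes away!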
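
One way to clean such a file before feeding it to a reader is sketched below. The helper name `drop_blank_lines` and the sample string are hypothetical; the sketch assumes the file is small enough to hold in memory and that blank lines carry no meaning in the format.

```python
def drop_blank_lines(text):
    """Return text with empty or whitespace-only lines removed."""
    kept = [line for line in text.splitlines() if line.strip()]
    # Keep a single trailing newline if any content remains.
    return ("\n".join(kept) + "\n") if kept else ""


# Hypothetical sample mimicking the tail of a CSV file with trailing blank lines.
sample = "5.9,3.0,5.1,1.8,Iris-virginica\n\n\n"
cleaned = drop_blank_lines(sample)
print(repr(cleaned))
```

To apply it to a real file, read the file contents into a string, pass them through the helper, and write the result back out.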

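【Comments】:

  • Exactly! You saved my day! Thanks! I removed the blank lines at the end of the file and the problem was solved!
【Answer 4】:

This can also be caused by a wrong TFRecord file name that points to a file that doesn't exist at all. Before doing any other checks, make sure the correct file path is specified.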
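
A cheap guard is to verify the paths before building the input pipeline. This is a minimal sketch using only the standard library; `missing_files` and the example path are hypothetical names chosen for illustration.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


# Hypothetical file list; replace with the names passed to the filename queue.
filenames = ["/path/to/train.tfrecords"]
missing = missing_files(filenames)
if missing:
    print("Missing input files:", missing)
```

Failing early with an explicit message like this is easier to diagnose than the queue's "insufficient elements" error.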

【Comments】:

【Answer 5】:

To summarize the comments,

    Compute status: Out of range: RandomSuffleQueue '_2_input/shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)
    

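is caused by insufficient data in the queue. This usually happens when you assume you have enough data for N iterations, while in fact you only have enough for M iterations, where M < N.

One suggestion for working out how much data you actually have is to count how many times you can read from the queue before it throws an OutOfRangeError exception.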
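
As a complement to counting reads at runtime, the upper bound on the number of iterations can be estimated up front. The sketch below uses a hypothetical helper `full_batches` and assumes every record is read successfully and that a final partial batch is not returned.

```python
def full_batches(num_examples, num_epochs, batch_size):
    """Number of complete batches available before the input runs out."""
    return (num_examples * num_epochs) // batch_size


# Example figures: 8000 samples, 2 epochs, batches of 100.
print(full_batches(8000, 2, 100))  # 160
```

If the training loop requests more iterations than this bound, the queue is closed once the data is exhausted and the next dequeue fails.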

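【Comments】:

  • I ran into a similar problem. However, I don't think my data is insufficient. I have a dataset of 8000 samples, and I set num_epochs of the filename_queue to 2, so 16000 samples should be enqueued. My batch_size is set to 100, so it should run for 160 iterations. Yet I still get this "out of range" warning, even though I put the iteration code in a try ... except block that catches the OutOfRange exception. What could be the cause?
【Answer 6】:

I had the same problem and none of the previous answers seemed to solve it, so I'll chime in too.

For me, the problem turned out to be the feature spec I passed to parse_single_example. For whatever reason (because I was using float_list in my tfrecords file?), I needed to either specify the array length in the feature spec or use tf.VarLenFeature, i.e.: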

    feature_structure = {'features': tf.FixedLenFeature([FEATURE_SIZE], tf.float32),
               'outputs': tf.FixedLenFeature([OUTPUT_SIZE], tf.float32)}
    d_features = tf.parse_single_example(serialized_example, features=feature_structure)
    

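Without this, I kept getting the "random_shuffle_queue is closed and has insufficient elements" error, which I guess is because there was no data in my parsed examples.

【Comments】:

  • How did you solve it in the end?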