【Question title】: Passing two queues to TensorFlow training
【Posted】: 2017-03-30 06:02:46
【Question】:

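I am trying to create a train op based on the CIFAR10 example from TensorFlow, which uses tf.RandomShuffleQueue, with my labels taken from the file names as described in (Accessing filename from file queue in Tensor Flow). How can I make this code work?

This is the code I am trying to run, where path is a directory containing many files: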

filenames = [os.path.join(path, f) for f in os.listdir(path)][1:]
file_fifo = tf.train.string_input_producer(filenames,
                                           shuffle=False,
                                           capacity=len(filenames))
reader = tf.WholeFileReader()
key, value = reader.read(file_fifo)
image = tf.image.decode_png(value, channels=3, dtype=tf.uint8)
image.set_shape([config.image_height, config.image_width, config.image_depth])
image = tf.cast(image, tf.float32)
image = tf.divide(image, 255.0)
labels = [int(os.path.basename(f).split('_')[-1].split('.')[0]) for f in filenames]
label_fifo = tf.FIFOQueue(len(filenames), tf.int32, shapes=[[]])
label_enqueue = label_fifo.enqueue_many([tf.constant(labels)])
label = label_fifo.dequeue()
bq = tf.RandomShuffleQueue(capacity=16 * batch_size,
                           min_after_dequeue=8 * batch,
                           dtypes=[tf.float32, tf.int32])
batch_enqueue_op = bq.enqueue([image, label_enqueue])
runner = tf.train.queue_runner.QueueRunner(bq, [batch_enqueue_op] * num_threads)
tf.train.add_queue_runner(runner)

# Read 'batch' labels + images from the example queue.
images, labels = batch_queue.dequeue_many(FLAGS.batch_size)
labels = tf.reshape(labels, [FLAGS.batch_size, 1])

I get the obvious errors, since I know my code does not make much sense. I have two FIFO queues, file_fifo and label_fifo, but I don't know how to feed my label_fifo into the tf.RandomShuffleQueue.
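
One way to sidestep the second queue, in the spirit of the linked question, is to derive each label from the same path string that identifies its image, so the two cannot drift apart. The sketch below isolates the parsing expression from the code above in plain Python. The helper name `label_from_path` and the example file names are hypothetical, and it assumes every file name ends in `_<integer>.<extension>`.

```python
import os


def label_from_path(path):
    """Return the integer between the last underscore and the first dot after it."""
    return int(os.path.basename(path).split('_')[-1].split('.')[0])


# Hypothetical file names, for illustration only.
paths = ['data/cat_0001_3.png', 'data/dog_0002_7.png']

# Each label is computed from its own path, so image and label stay paired.
pairs = [(p, label_from_path(p)) for p in paths]
```

In the TensorFlow 1.x queue API, `tf.train.slice_input_producer` takes a list of tensors and yields matching slices, which may be a simpler way to keep a file name and its label together than running two separate queues.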

Can anyone help? Thank you :-)

【Comments】:

    Tags: tensorflow


    【Solution 1】:

    I changed the code to:

    filenames = [os.path.join(FLAGS.data_path, f) for f in os.listdir(FLAGS.data_path)][1:]
    np.random.shuffle(filenames)
    file_fifo = tf.train.string_input_producer(filenames, shuffle=False, capacity=len(filenames))
    reader = tf.WholeFileReader()
    key, value = reader.read(file_fifo)
    image = tf.image.decode_png(value, channels=3, dtype=tf.uint8)
    image.set_shape([config.image_height, config.image_width, config.image_depth])
    image = tf.cast(image, tf.float32)
    image = tf.divide(image, 255.0)
    
    labels = [int(os.path.basename(f).split('_')[-1].split('.')[0]) for f in filenames]
    label_fifo = tf.FIFOQueue(len(filenames), tf.int32, shapes=[[]])
    label_enqueue = label_fifo.enqueue_many([tf.constant(labels)])
    label = label_fifo.dequeue()
    
    if is_train:
        images, label_batch = tf.train.shuffle_batch([image, label],
                                                     batch_size=FLAGS.batch_size,
                                                     num_threads=FLAGS.num_threads,
                                                     capacity=16 * FLAGS.batch_size,
                                                     min_after_dequeue=8 * FLAGS.batch_size)
    labels = tf.reshape(label_batch, [FLAGS.batch_size, 1])
    

    For training, I have:

    class _LoggerHook(tf.train.SessionRunHook):
        """Logs loss and runtime."""
    
        def begin(self):
            self._step = -1
    
        def before_run(self, run_context):
            self._step += 1
            self._start_time = time.time()
            if self._step % int(config.train_examples / FLAGS.batch_size) == 0 or self._step == 0:
                run_context.session.run(label_enqueue)
            return tf.train.SessionRunArgs({'loss': loss, 'net': net})
    

    My training loop is:

    with tf.train.MonitoredTrainingSession(
            checkpoint_dir=FLAGS.train_path,
            hooks=[tf.train.StopAtStepHook(last_step=FLAGS.max_steps), tf.train.NanTensorHook(loss), _LoggerHook()],
            config=tf.ConfigProto(log_device_placement=FLAGS.log_device_placement)) as mon_sess:
            while not mon_sess.should_stop():
                mon_sess.run(train_op)
    

    Training starts, but it runs only the first step and then hangs, probably because it is waiting on some queue operation.
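
    One possible explanation for the hang, offered as a simplified model rather than a diagnosis: a shuffle queue will not dequeue a batch unless at least `min_after_dequeue` elements would remain afterwards, while the hook refills `label_fifo` only once per epoch's worth of steps. The plain-Python sketch below counts how many batches a single fill of labels can serve and compares that with the step at which the hook would refill. The function names and the sizes are hypothetical. It assumes `config.train_examples` equals the number of files, treats labels as the only bottleneck, and ignores threading.

```python
def batches_before_stall(num_labels, batch_size, min_after_dequeue):
    """Batches a shuffle queue can serve from one fill of labels.

    A dequeue of batch_size is allowed only while at least
    min_after_dequeue elements would remain in the queue afterwards.
    """
    usable = num_labels - min_after_dequeue
    return max(0, usable // batch_size)


def first_refill_step(num_labels, batch_size):
    """First step after step 0 at which the hook enqueues labels again."""
    return num_labels // batch_size


# Hypothetical sizes: 1000 training files, batches of 50,
# min_after_dequeue = 8 * batch_size as in the code above.
served = batches_before_stall(1000, 50, 8 * 50)
refill = first_refill_step(1000, 50)
print(served, refill)  # 12 20
```

    Under these assumptions the queue would stall after 12 batches, well before the refill at step 20, and with fewer than 9 * batch_size labels it could not serve even the first batch. If this is the cause, letting the labels repeat alongside the images, rather than refilling them from a hook, would close the gap.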

    【Comments】:
