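[Posted]: 2019-05-22 13:42:24
[Question]:
Recently I have been trying to learn how to use TensorFlow on multiple GPUs to speed up training. I found an official tutorial on training a classification model on the CIFAR-10 dataset. However, that tutorial reads images through a queue. Out of curiosity, how can I use multiple GPUs while feeding values to the Session instead? I am having trouble working out how to feed different values from the same dataset to different GPUs. Thanks, everyone! The following code is part of the official tutorial.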
images, labels = cifar10.distorted_inputs()
batch_queue = tf.contrib.slim.prefetch_queue.prefetch_queue(
    [images, labels], capacity=2 * FLAGS.num_gpus)
# Calculate the gradients for each model tower.
tower_grads = []
with tf.variable_scope(tf.get_variable_scope()):
    for i in xrange(FLAGS.num_gpus):
        with tf.device('/gpu:%d' % i):
            with tf.name_scope('%s_%d' % (cifar10.TOWER_NAME, i)) as scope:
                # Dequeues one batch for the GPU
                image_batch, label_batch = batch_queue.dequeue()
                # Calculate the loss for one tower of the CIFAR model. This function
                # constructs the entire CIFAR model but shares the variables across
                # all towers.
                loss = tower_loss(scope, image_batch, label_batch)
                # Reuse variables for the next tower.
                tf.get_variable_scope().reuse_variables()
                # Retain the summaries from the final tower.
                summaries = tf.get_collection(tf.GraphKeys.SUMMARIES, scope)
                # Calculate the gradients for the batch of data on this CIFAR tower.
                grads = opt.compute_gradients(loss)
                # Keep track of the gradients across all towers.
                tower_grads.append(grads)
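To make the feeding question concrete, here is a minimal sketch of one way the data side could be handled. It assumes that each tower gets its own pair of placeholders (for example `tf.placeholder` tensors created inside the per-GPU loop in place of `batch_queue.dequeue()`), and that one large batch is cut into equal shards, one per GPU. The helper names `split_batch` and `build_feed_dict` are hypothetical and do not come from the tutorial. Plain strings stand in for the placeholders so the snippet runs without TensorFlow.

```python
# Sketch only: the per-tower placeholder layout is an assumption,
# not code taken from the official tutorial.

def split_batch(batch, num_towers):
    """Split a list of examples into num_towers equal, contiguous shards."""
    if num_towers <= 0:
        raise ValueError("num_towers must be positive")
    if len(batch) % num_towers != 0:
        raise ValueError("batch size must be divisible by the number of towers")
    shard_size = len(batch) // num_towers
    return [batch[i * shard_size:(i + 1) * shard_size]
            for i in range(num_towers)]


def build_feed_dict(tower_inputs, images, labels):
    """Map each tower's (image_key, label_key) pair to its own data shard.

    tower_inputs: list of (image_placeholder, label_placeholder), one per GPU.
    In a TF 1.x graph these would be placeholder tensors; any hashable
    object works for this demonstration.
    """
    num_towers = len(tower_inputs)
    image_shards = split_batch(images, num_towers)
    label_shards = split_batch(labels, num_towers)
    feed_dict = {}
    for (image_ph, label_ph), image_shard, label_shard in zip(
            tower_inputs, image_shards, label_shards):
        feed_dict[image_ph] = image_shard
        feed_dict[label_ph] = label_shard
    return feed_dict


# Demonstration with string stand-ins for the placeholders of two towers.
tower_inputs = [("images_gpu0", "labels_gpu0"), ("images_gpu1", "labels_gpu1")]
images = ["img0", "img1", "img2", "img3"]
labels = [0, 1, 2, 3]
feed_dict = build_feed_dict(tower_inputs, images, labels)
print(feed_dict["images_gpu1"])  # ['img2', 'img3']
```

In a real TF 1.x graph the resulting dictionary would be passed as `feed_dict` to `sess.run` on the training op, so every GPU receives a different slice of the same batch in a single step. An alternative under the same assumptions is to feed one large placeholder and divide it inside the graph with `tf.split` along the batch axis.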
[Discussion]:
Tags: python tensorflow distributed