【Question title】: iteratorFromStringHandle device placement cpu/gpu conflict
【Posted】: 2019-08-05 09:04:34
【Question】:

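When restoring a meta graph from disk, TensorFlow complains that it is trying to create an iterator on the GPU from a handle defined on the CPU.

I am trying to create a graph that uses a tf.data pipeline with a string-handle placeholder to define the iterator (so that I can swap datasets). I can successfully create a graph that appears to work on the GPU. However, after I restore the graph from disk, I get an error when trying to bind the dataset handle to the iterator (I think):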

Attempted create an iterator on device "...GPU:0" from handle defined on device "CPU:0" [[{{node IteratorFromStringHandleV2}} = IteratorFromStringHandleV2output_shapes=[....], output_types=[...], _device="...GPU:0"]]

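I have tried explicitly defining where I want objects placed using with tf.device("/GPU:0"): guards, specifically where I create the dataset iterator, but this produces a different error: "Cannot assign a device for operation TensorSliceDataset: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available"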

I found a similar issue here: When use Dataset API, got device placement error with tensorflow >= 1.11

I am using tf-1.12 (unfortunately, I cannot use a later version).

# this is the code which creates the graph

import tensorflow as tf
import numpy as np

def _bytestring_feature(byteStringList):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=byteStringList))

def _int64_feature(intList):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=intList))

def _float_feature(floatList):
    return tf.train.Feature(float_list=tf.train.FloatList(value=floatList))

def toTFrecord(tfrec_filewriter, img, label):
    # tfrec_filewriter is unused here; the caller writes the returned Example itself
    feature={
        'image': _bytestring_feature([img.tobytes()]),
        'class': _int64_feature([label])
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

# generate data and save it to disk:

print('generating data')
nPartitions=5  # number of file partitions
for p in range(nPartitions):
    filename='./tfrec_'+'{:02d}-{}.tfrec'.format(p,nPartitions)
    with tf.python_io.TFRecordWriter(filename) as outFile:
        # generate some data for this partition
        for i in range(10):
            example=toTFrecord(outFile, (p*100+i)*np.ones((32,32), np.float32), (p*100+i))
            outFile.write(example.SerializeToString())
print('...complete')

# make the network
handle=tf.placeholder(tf.string, shape=[], name='handle')
with tf.device("/GPU:0"):
    iter=tf.data.Iterator.from_string_handle(
        handle,
        (tf.float32, tf.int64),
        (tf.TensorShape([tf.Dimension(None), tf.Dimension(32), tf.Dimension(32)]),
         tf.TensorShape([tf.Dimension(None)])))
    img,label=iter.get_next()
    network=tf.layers.conv2d(
        inputs=tf.reshape(img, [-1, tf.shape(img)[1], tf.shape(img)[2], 1]),
        filters=4, kernel_size=[3,3], dilation_rate=[1,1],
        padding='same', activation=None, name='networkConv')

with tf.Session(config=tf.ConfigProto(log_device_placement=True, allow_soft_placement=False)) as sess:
    sess.run(tf.global_variables_initializer())

    saver=tf.train.Saver(keep_checkpoint_every_n_hours=0.5, max_to_keep=1000)
    tf.add_to_collection('network', network)
    tf.add_to_collection('handle', handle)
    saver.save(sess, './demoSession')
#......
# and this is a separate process which restores the graph for training:

import tensorflow as tf
import numpy as np
import glob

def readTFrecord(example):
    features={
        'image': tf.io.FixedLenFeature([], tf.string),
        'class': tf.io.FixedLenFeature([], tf.int64)
    }
    example=tf.parse_example(example, features)
    return tf.reshape(tf.decode_raw(example['image'], tf.float32), [-1, 32, 32]), example['class']

filenames=glob.glob('./tfrec*.tfrec')
ds=tf.data.TFRecordDataset(filenames)
ds=ds.shuffle(5000).batch(4).prefetch(4).map(readTFrecord, num_parallel_calls=2)

with tf.Session(config=tf.ConfigProto(log_device_placement=True, allow_soft_placement=False)) as sess:
    new_saver=tf.train.import_meta_graph('demoSession.meta', clear_devices=False)
    new_saver.restore(sess, 'demoSession')
    network=tf.get_collection('network')[0]
    handle=tf.get_collection('handle')[0]

    #with tf.device("/GPU:0"):
    dsIterator=ds.make_initializable_iterator()
    dsHandle=sess.run(dsIterator.string_handle())

    sess.run(dsIterator.initializer)

    out=sess.run(network, feed_dict={handle:dsHandle})
    print(out.shape)

I expected it to work, Mr. Bond. Unfortunately, it says it cannot:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Attempted create an iterator on device "/job:localhost/replica:0/task:0/device:GPU:0" from handle defined on device "/job:localhost/replica:0/task:0/device:CPU:0" [[{{node IteratorFromStringHandleV2}} = IteratorFromStringHandleV2output_shapes=[[?,32,32], [?]], output_types=[DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
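
The message names two devices: the first is where the IteratorFromStringHandleV2 op was placed, the second is where the iterator that produced the handle lives. As a reading aid only, here is a small sketch that pulls both out of the error text. The helper name iterator_device_mismatch is hypothetical and the regular expression assumes the wording shown in the traceback above.

```python
import re

def iterator_device_mismatch(message):
    """Return (op_device, handle_device) parsed from the error text, or None."""
    # Assumption: the message follows the wording seen in the traceback above.
    pattern = (
        r'on device "(?P<op>[^"]+)" '
        r'from handle defined on device "(?P<handle>[^"]+)"'
    )
    match = re.search(pattern, message)
    if match is None:
        return None
    return match.group("op"), match.group("handle")

message = (
    'Attempted create an iterator on device '
    '"/job:localhost/replica:0/task:0/device:GPU:0" from handle defined on '
    'device "/job:localhost/replica:0/task:0/device:CPU:0"'
)
op_device, handle_device = iterator_device_mismatch(message)
print(op_device.rsplit("/", 1)[-1])      # device the from_string_handle op runs on
print(handle_device.rsplit("/", 1)[-1])  # device that owns the dataset iterator
```

Here the op sits on the GPU while the handle comes from a CPU iterator, which matches the with tf.device("/GPU:0"): block wrapped around from_string_handle in the graph-building script.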

【Comments】:

    Tags: tensorflow iterator


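    【Solution 1】:

    It seems the following needs to be added:

    iter=tf.data.Iterator.from_string_handle(...)
    saveable_obj = tf.contrib.data.make_saveable_from_iterator(iter)
    ...
    tf.add_to_collection(tf.GraphKeys.SAVEABLE_OBJECTS, saveable_obj)

    My initial tests seem to work :-D

    Edit: Actually, this gets past the error I described above, but it raises another error when I try to create a new saved state, so I suspect it is not the real answer =/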
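
A possible alternative, offered as a sketch and not as a confirmed fix from the thread: the two errors in the question suggest that tf.data dataset ops only have CPU kernels, and that IteratorFromStringHandleV2 must run on the same device as the iterator that produced the handle. Under that reading, the iterator ops can stay on the CPU while only the network is pinned to the GPU. The code below makes several assumptions that are not in the original post: it runs in one process without save/restore, it uses synthetic in-memory data, it replaces the conv layer with tf.reduce_mean, it falls back to the CPU when no GPU is visible (the compute_device name is hypothetical), and it switches to tf.compat.v1 when run on a newer TensorFlow.

```python
import numpy as np
import tensorflow as tf

# Assumption: on newer TensorFlow the TF1-style API lives in tf.compat.v1.
# On tf-1.12 this branch is skipped and the plain `tf` module is used.
if hasattr(tf.compat, "v1"):
    tf = tf.compat.v1
    tf.disable_eager_execution()

# Hypothetical choice: use the GPU for compute only if one is visible.
compute_device = "/GPU:0" if tf.test.is_gpu_available() else "/CPU:0"

graph = tf.Graph()
with graph.as_default():
    handle = tf.placeholder(tf.string, shape=[], name="handle")

    # Iterator ops stay on the CPU, where the dataset iterator resource lives.
    with tf.device("/CPU:0"):
        iterator = tf.data.Iterator.from_string_handle(
            handle,
            (tf.float32, tf.int64),
            (tf.TensorShape([None, 32, 32]), tf.TensorShape([None])),
        )
        img, label = iterator.get_next()

    # Only the compute part is pinned to the accelerator.
    with tf.device(compute_device):
        img4d = tf.reshape(img, [-1, 32, 32, 1])
        network = tf.reduce_mean(img4d, axis=[1, 2, 3], name="network")

    # A dataset built separately, standing in for the restoring process.
    with tf.device("/CPU:0"):
        images = np.ones((8, 32, 32), np.float32)
        labels = np.arange(8, dtype=np.int64)
        ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)
        ds_iterator = ds.make_initializable_iterator()
        ds_handle_op = ds_iterator.string_handle()

config = tf.ConfigProto(allow_soft_placement=False)
with tf.Session(graph=graph, config=config) as sess:
    ds_handle = sess.run(ds_handle_op)
    sess.run(ds_iterator.initializer)
    out = sess.run(network, feed_dict={handle: ds_handle})
    print(out.shape)  # one mean value per image in the batch of 4
```

Because device annotations are stored in the meta graph, a change like this would have to be made in the script that builds and saves the graph. Whether it also avoids the later save error mentioned in the edit above is untested here.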

    【Discussion】:
