【Posted】: 2018-04-09 12:42:44
【Question】:
Hello,
I want to read some TFRecord data.
This works, but only for fixed-length data. Now I want to do the same for variable-length data using VarLenFeature.
import tensorflow as tf

def load_tfrecord_fixed(serialized_example):
    # Context features: scalars stored once per record.
    context_features = {
        'length': tf.FixedLenFeature([], dtype=tf.int64),
        'type': tf.FixedLenFeature([], dtype=tf.string)
    }
    # Sequence features: one int64 scalar per step of the sequence.
    sequence_features = {
        "values": tf.FixedLenSequenceFeature([], dtype=tf.int64)
    }
    context_parsed, sequence_parsed = tf.parse_single_sequence_example(
        serialized=serialized_example,
        context_features=context_features,
        sequence_features=sequence_features
    )
    return context_parsed, sequence_parsed
and
tf.reset_default_graph()
with tf.Session() as sess:
    filenames = [fp.name]
    dataset = tf.data.TFRecordDataset(filenames)
    dataset = dataset.map(load_tfrecord_fixed)
    dataset = dataset.repeat()
    dataset = dataset.batch(2)
    iterator = dataset.make_initializable_iterator()
    next_element = iterator.get_next()
    a = sess.run(iterator.initializer)
    for i in range(3):
        a = sess.run(next_element)
        print(a)
Result:
({'length': array([3, 3], dtype=int64), 'type': array([b'FIXED_length', b'FIXED_length'], dtype=object)}, {'values': array([[82, 2, 2],
[42, 5, 1]], dtype=int64)})
({'length': array([3, 3], dtype=int64), 'type': array([b'FIXED_length', b'FIXED_length'], dtype=object)}, {'values': array([[2, 3, 1],
[1, 2, 3]], dtype=int64)})
({'length': array([3, 3], dtype=int64), 'type': array([b'FIXED_length', b'FIXED_length'], dtype=object)}, {'values': array([[ 1, 100, 200],
[123, 12, 12]], dtype=int64)})
This is the map function I am trying to use, but in the end it gives me an error :'(
def load_tfrecord_variable(serialized_example):
    context_features = {
        'length': tf.FixedLenFeature([], dtype=tf.int64),
        'batch_size': tf.FixedLenFeature([], dtype=tf.int64),
        'type': tf.FixedLenFeature([], dtype=tf.string)
    }
    sequence_features = {
        "values": tf.VarLenFeature(tf.int64)
    }
    context_parsed, sequence_parsed = tf.parse_single_sequence_example(
        serialized=serialized_example,
        context_features=context_features,
        sequence_features=sequence_features
    )
    # return context_parsed, sequence_parsed (which is sparse)
    # return context_parsed, sequence_parsed
    batched_data = tf.train.batch(
        tensors=[sequence_parsed['values']],
        batch_size=2,
        dynamic_pad=True
    )
    # make dense data
    dense_data = tf.sparse_tensor_to_dense(batched_data)
    return context_parsed, dense_data
Error:
OutOfRangeError: Attempted to repeat an empty dataset infinitely.
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[], [], [], [?,?,?]], output_types=[DT_INT64, DT_INT64, DT_STRING, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"](Iterator)]]
During handling of the above exception, another exception occurred:
So can you help me? Also, I am using the TensorFlow nightly build. I don't think I am missing much...
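【Comments】:
- Don't use tf.train.batch. If you have a VarLenFeature, you can use Dataset.padded_batch to batch and pad the sequences.
- @MaosiChen When I use padded_batch, I get this error: "If shallow structure is a sequence, input must also be a sequence. Input has type: ."
  dataset = dataset.padded_batch(4, padded_shapes=([None]))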
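To illustrate what the first comment suggests, here is a minimal plain-Python sketch of per-batch padding: each batch is padded to the length of its longest sequence. It does not use TensorFlow and is not how Dataset.padded_batch is implemented. The helper name `pad_batches`, its `pad_value` parameter, and the sample sequences are hypothetical and only stand in for the "values" feature.

```python
def pad_batches(sequences, batch_size, pad_value=0):
    """Group variable-length lists into batches, padding each batch
    to the length of its longest member (hypothetical helper)."""
    batches = []
    for start in range(0, len(sequences), batch_size):
        chunk = sequences[start:start + batch_size]
        # The padded width is decided per batch, not globally.
        width = max(len(seq) for seq in chunk)
        batches.append(
            [list(seq) + [pad_value] * (width - len(seq)) for seq in chunk]
        )
    return batches


# Made-up variable-length sequences, not taken from the question's file.
examples = [[82, 2], [42, 5, 1], [7], [1, 2, 3, 4]]
batches = pad_batches(examples, batch_size=2)
print(batches)
```

In tf.data itself, padded_shapes is generally expected to mirror the nested structure of the dataset elements (here, a tuple of two dicts), so a bare [None] may not match what the map function returns. That is an assumption about the cause of the error quoted above, not something confirmed in the thread.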
Tags: python-3.x tensorflow tfrecord