【Question Title】: Tensorflow error when restoring trained model for eval, invalid shapes issue?
【Posted】: 2017-06-18 22:18:01
【Question】:

I get an error when I try to restore a trained model for evaluation, but only when evaluating on the test set. The error is:

InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]

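Note that lhs shape = [2325,11] and rhs shape = [4891,11] correspond to the 2325 images in the test set and the 4891 images in the training set, and 11 is the one-hot encoding of the 11 classes, so these shapes probably correspond to the labels. When I evaluate on the training set, the dimensions match and no error occurs. Any help would be appreciated!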

Full stack trace below:

Traceback (most recent call last):
  File "eval.py", line 75, in <module>
    main()
  File "eval.py", line 70, in main
    acc_annotation, acc_retrieval = evaluate(partition="test")
  File "eval.py", line 34, in evaluate
    restorer.restore(sess, tf.train.latest_checkpoint(SAVED_MODEL_DIR))
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1388, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
     [[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@input/Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](input/Variable_1, save/RestoreV2_5)]]

Caused by op u'save/Assign_5', defined at:
  File "eval.py", line 75, in <module>
    main()
  File "eval.py", line 70, in main
    acc_annotation, acc_retrieval = evaluate(partition="test")
  File "eval.py", line 25, in evaluate
    restorer = tf.train.Saver()  # For saving the model
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1000, in __init__
    self.build()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1030, in build
    restore_sequentially=self._restore_sequentially)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 624, in build
    restore_sequentially, reshape)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 373, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 130, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
     [[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@input/Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](input/Variable_1, save/RestoreV2_5)]]

Update

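I just looked at the tensor shapes in the checkpoint file, and it looks like the Saver is even saving the model inputs. I need to either restructure my training code or figure out how to exclude the model inputs (labels and images) from the checkpoint: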

('tensor_name: ', 'conv2-layer/bias/Adam_1')
(512,)
('tensor_name: ', 'input/Variable_1')
(4891, 11)
('tensor_name: ', 'conv2-layer/weights_1/Adam')
(5, 1, 64, 512)
('tensor_name: ', 'conv1-layer/weights_1')
(5, 23, 1, 64)
('tensor_name: ', 'conv2-layer/weights_1')
(5, 1, 64, 512)
('tensor_name: ', 'conv2-layer/weights_1/Adam_1')
(5, 1, 64, 512)
('tensor_name: ', 'input/Variable')
(4891, 100, 23, 1)
('tensor_name: ', 'conv1-layer/weights_1/Adam_1')
(5, 23, 1, 64)
('tensor_name: ', 'conv1-layer/bias/Adam')
(64,)
('tensor_name: ', 'beta2_power')
()
('tensor_name: ', 'conv2-layer/bias/Adam')
(512,)
('tensor_name: ', 'conv1-layer/bias/Adam_1')
(64,)
('tensor_name: ', 'conv2-layer/bias')
(512,)
('tensor_name: ', 'conv1-layer/bias')
(64,)
('tensor_name: ', 'beta1_power')
()
('tensor_name: ', 'conv1-layer/weights_1/Adam')
(5, 23, 1, 64)
('tensor_name: ', 'Variable')
()
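The listing above can be checked mechanically. The following is a minimal sketch in plain Python, not code from the original post: it compares the shapes stored in the checkpoint against the shapes the evaluation graph expects and reports the variables that cannot be restored. The helper name `find_shape_mismatches` is hypothetical, only a subset of the listed tensors is included, and the test-graph shape for `input/Variable` is an assumption inferred from the 2325 test images. In later TensorFlow 1.x releases the checkpoint side of this comparison could be obtained from `tf.train.list_variables`, which returns `(name, shape)` pairs.

```python
def find_shape_mismatches(checkpoint_shapes, graph_shapes):
    """Return {name: (checkpoint_shape, graph_shape)} for variables whose shapes differ."""
    mismatches = {}
    for name, graph_shape in graph_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is not None and tuple(ckpt_shape) != tuple(graph_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(graph_shape))
    return mismatches


# Shapes copied from the checkpoint listing above (subset, for brevity).
checkpoint_shapes = {
    "input/Variable": (4891, 100, 23, 1),
    "input/Variable_1": (4891, 11),
    "conv1-layer/weights_1": (5, 23, 1, 64),
    "conv1-layer/bias": (64,),
    "conv2-layer/weights_1": (5, 1, 64, 512),
    "conv2-layer/bias": (512,),
}

# Shapes the graph would have when built for the test partition.
# Assumption: the image variable follows the test-set size, like the labels do.
graph_shapes = dict(checkpoint_shapes)
graph_shapes["input/Variable"] = (2325, 100, 23, 1)
graph_shapes["input/Variable_1"] = (2325, 11)

mismatches = find_shape_mismatches(checkpoint_shapes, graph_shapes)
for name in sorted(mismatches):
    ckpt_shape, graph_shape = mismatches[name]
    print("%s: checkpoint %s vs graph %s" % (name, ckpt_shape, graph_shape))

# Everything else is safe to restore. In TF 1.x, the matching variable objects
# could be passed to tf.train.Saver(var_list=...) to skip the input variables.
restorable = sorted(set(graph_shapes) - set(mismatches))
```

Only the two `input/...` entries differ, which matches the `input/Variable_1` node named in the stack trace; the convolution weights and biases do not depend on the dataset size.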

【Comments】:

    Tags: python tensorflow neural-network


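    【Solution 1】:

    It would be useful to see the code that defines your model, but from what I can tell you have probably defined your inputs as tf.Variable. Variables are values that the optimizer is allowed to change in order to minimize the loss function. They are the learned weights of the model, which is why TensorFlow saves them so they can be restored later.

    You should use tf.placeholder to feed the input data into the graph.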
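As a rough illustration of that suggestion, here is a minimal sketch of feeding inputs through placeholders. It is not the asker's model: the single dense layer, the names `build_model` and `run_once`, and the tiny batch sizes are all assumptions made for the example, and it uses `tensorflow.compat.v1` so the TF 1.x graph API also runs on TensorFlow 2 installations. Because the placeholders leave the batch dimension as `None` and are not variables, the checkpoint holds only the weights, so it can be restored for a dataset of a different size.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF1-style graphs and sessions

NUM_CLASSES = 11


def build_model():
    """Build a stand-in model whose inputs are placeholders, not variables."""
    with tf.name_scope("input"):
        # The batch dimension is None, so any number of examples can be fed.
        images = tf.placeholder(tf.float32, [None, 100, 23, 1], name="images")
        labels = tf.placeholder(tf.float32, [None, NUM_CLASSES], name="labels")
    flat = tf.reshape(images, [-1, 100 * 23])
    weights = tf.get_variable("weights", [100 * 23, NUM_CLASSES],
                              initializer=tf.zeros_initializer())
    bias = tf.get_variable("bias", [NUM_CLASSES],
                           initializer=tf.zeros_initializer())
    logits = tf.matmul(flat, weights) + bias
    correct = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    return images, labels, accuracy


def run_once(num_examples, ckpt_path, restore):
    """Save (restore=False) or restore (restore=True) and return the accuracy."""
    tf.reset_default_graph()
    images, labels, accuracy = build_model()
    saver = tf.train.Saver()  # only sees `weights` and `bias`
    batch_x = np.zeros([num_examples, 100, 23, 1], dtype=np.float32)
    batch_y = np.zeros([num_examples, NUM_CLASSES], dtype=np.float32)
    batch_y[:, 0] = 1.0  # every dummy example belongs to class 0
    with tf.Session() as sess:
        if restore:
            saver.restore(sess, ckpt_path)
        else:
            sess.run(tf.global_variables_initializer())
            saver.save(sess, ckpt_path)
        return sess.run(accuracy, feed_dict={images: batch_x, labels: batch_y})


ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")
train_acc = run_once(8, ckpt_path, restore=False)  # stand-in for the 4891 training images
test_acc = run_once(5, ckpt_path, restore=True)    # stand-in for the 2325 test images
```

The second call restores the checkpoint written by the first even though the number of examples differs, since nothing in the saved variables depends on the dataset size.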
