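【Posted】: 2020-05-06 11:23:52
【Question】:
I am trying to run an object detection model with the TensorFlow Object Detection API. My goal is to use object detection to solve captchas. I followed a tutorial.
System configuration: virtual machine on Azure; GPU: NVIDIA Tesla K80; memory: 56; TensorFlow version: 1.14.
My model runs, but it stops after 16 iterations. Up to that iteration it runs fine and the loss decreases, but then it raises an error. I am using faster_RCNN_resnet_inception_v2_atrous_coco. I set every path required to run the model. I feed the input with a batch size of 2, and then it gives a resource exhausted error.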
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, 2 root error(s) found.
(0) Invalid argument: ConcatOp : Dimensions of inputs should match: shape[0] = [1,284,1024,3] vs. shape[1]= [1,296,1024,3]
[[node concat (defined at /root/workspace/models/research/object_detection/legacy/trainer.py:191) ]]
[[gradients/FirstStageFeatureExtractor/InceptionResnetV2/InceptionResnetV2/Repeat/block35_3/Conv2d_1x1/BiasAdd_grad/BiasAddGrad/_6083]]
(1) Invalid argument: ConcatOp : Dimensions of inputs should match: shape[0] = [1,284,1024,3] vs. shape[1]= [1,296,1024,3]
[[node concat (defined at /root/workspace/models/research/object_detection/legacy/trainer.py:191) ]]
0 successful operations.
0 derived errors ignored.
Errors may have originated from an input operation.
Input Source operations connected to node concat:
Preprocessor_1/sub (defined at /root/workspace/models/research/object_detection/models/faster_rcnn_inception_resnet_v2_feature_extractor.py:77)
Input Source operations connected to node concat:
Preprocessor_1/sub (defined at /root/workspace/models/research/object_detection/models/faster_rcnn_inception_resnet_v2_feature_extractor.py:77)
Original stack trace for 'concat':
File "legacy/train.py", line 185, in <module>
tf.app.run()
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "legacy/train.py", line 181, in main
graph_hook_fn=graph_rewriter_fn)
File "/root/workspace/models/research/object_detection/legacy/trainer.py", line 297, in train
clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
File "/root/workspace/models/research/slim/deployment/model_deploy.py", line 194, in create_clones
outputs = model_fn(*args, **kwargs)
File "/root/workspace/models/research/object_detection/legacy/trainer.py", line 191, in _create_losses
images = tf.concat(preprocessed_images, 0)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1299, in concat
return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1256, in concat_v2
"ConcatV2", values=values, axis=axis, name=name)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
op_def=op_def)
File "/opt/sft/miniconda3/envs/ravi_gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
I0120 12:23:47.601684 140495582246720 coordinator.py:224] Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, 2 root error(s) found.
(0) Invalid argument: ConcatOp : Dimensions of inputs should match: shape[0] = [1,284,1024,3] vs. shape[1]= [1,296,1024,3]
[[node concat (defined at /root/workspace/models/research/object_detection/legacy/trainer.py:191) ]]
[[gradients/FirstStageFeatureExtractor/InceptionResnetV2/InceptionResnetV2/Repeat/block35_3/Conv2d_1x1/BiasAdd_grad/BiasAddGrad/_6083]]
(1) Invalid argument: ConcatOp : Dimensions of inputs should match: shape[0] = [1,284,1024,3] vs. shape[1]= [1,296,1024,3]
[[node concat (defined at /root/workspace/models/research/object_detection/legacy/trainer.py:191) ]]
0 successful operations.
0 derived errors ignored.
【Discussion】:
-
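One of the images in your dataset has a different shape from the others (see the error message). Your program runs fine until it tries to process that image, and then it fails. Either preprocess your dataset so that all images have the same shape, or add a preprocessing step to the pipeline that resizes all images to a common shape.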
-
@Ravi kant Gautam, can you confirm whether the comment above resolved the error? If not, could you share reproducible code so that I can try to help?
-
I did not implement the suggestion above. I just changed my model from faster_RCNN to a single-shot detector, and that removed the error.
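The shape mismatch described in the first comment can be illustrated with a small sketch. It assumes the pipeline uses an aspect-ratio-preserving resizer with `min_dimension=600` and `max_dimension=1024` (the values commonly found in the Faster R-CNN sample configs; the question does not show its config), and the two captcha sizes are hypothetical. The function `resized_shape` is an illustrative approximation, not the API's own code. With a batch size of 2, two images of different aspect ratio end up with different heights and cannot be concatenated, whereas padding to the maximum dimension (or a fixed-shape resize, which single-shot detector configs typically use) gives every image the same shape.

```python
def resized_shape(height, width, min_dimension=600, max_dimension=1024,
                  pad_to_max_dimension=False):
    """Approximate (height, width) after an aspect-ratio-preserving resize.

    Illustrative only: an assumption about how the resizer behaves,
    not code copied from the Object Detection API.
    """
    # First try to scale so that the shorter side reaches min_dimension.
    large_scale = min_dimension / min(height, width)
    new_h, new_w = round(height * large_scale), round(width * large_scale)
    # If the longer side would exceed max_dimension, scale to it instead.
    if max(new_h, new_w) > max_dimension:
        small_scale = max_dimension / max(height, width)
        new_h, new_w = round(height * small_scale), round(width * small_scale)
    # Padding makes every image square with side max_dimension.
    if pad_to_max_dimension:
        return (max_dimension, max_dimension)
    return (new_h, new_w)


# Hypothetical captcha sizes as (height, width); not taken from the question.
images = [(100, 360), (100, 346)]

unpadded = [resized_shape(h, w) for h, w in images]
padded = [resized_shape(h, w, pad_to_max_dimension=True) for h, w in images]

print(unpadded)  # heights differ, so batching these two images would fail
print(padded)    # identical shapes, so they can be concatenated into a batch
```

If this is indeed the cause, the options would be to set `pad_to_max_dimension: true` on the `keep_aspect_ratio_resizer` in the pipeline config, to switch to a `fixed_shape_resizer`, or to train with a batch size of 1.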
Tags: python tensorflow gpu faster-rcnn