尝试使用这段代码后:
!python object_detection/model_main_tf2.py \
--pipeline_config_path={pipeline_file} \
--model_dir='object_detection/training' \
--checkpoint_dir='object_detection/training' \
--alsologtostderr
我最终设法得到了一些不同于零的结果(也许检查点发生了变化)。结果:
2020-09-20 10:11:00.899773: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0920 10:11:02.925597 140679676843904 model_lib_v2.py:925] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: None
I0920 10:11:02.925838 140679676843904 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0920 10:11:02.925928 140679676843904 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0920 10:11:02.926012 140679676843904 config_util.py:552] Maybe overwriting eval_num_epochs: 1
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0920 10:11:02.926136 140679676843904 model_lib_v2.py:940] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
2020-09-20 10:11:02.934837: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-20 10:11:02.971958: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:02.972512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.73GiB deviceMemoryBandwidth: 298.08GiB/s
2020-09-20 10:11:02.972554: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-20 10:11:02.974022: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-20 10:11:02.975587: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-20 10:11:02.975923: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-20 10:11:02.980298: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-20 10:11:02.981636: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-20 10:11:02.985271: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-09-20 10:11:02.985387: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:02.985948: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:02.986434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-20 10:11:02.991809: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2200000000 Hz
2020-09-20 10:11:02.991994: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x12e1480 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-20 10:11:02.992021: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-20 10:11:03.103553: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:03.104214: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x12e1640 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-20 10:11:03.104245: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2020-09-20 10:11:03.104426: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:03.104982: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.73GiB deviceMemoryBandwidth: 298.08GiB/s
2020-09-20 10:11:03.105024: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-20 10:11:03.105075: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-20 10:11:03.105097: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-20 10:11:03.105121: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-20 10:11:03.105140: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-20 10:11:03.105158: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-20 10:11:03.105181: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-09-20 10:11:03.105258: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:03.105829: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:03.106297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-20 10:11:03.106340: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-20 10:11:03.719107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-20 10:11:03.719168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-09-20 10:11:03.719181: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-09-20 10:11:03.719388: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:03.720044: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-20 10:11:03.720576: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2020-09-20 10:11:03.720625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13962 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0920 10:11:03.765423 140679676843904 dataset_builder.py:83] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_deterministic`.
W0920 10:11:03.768426 140679676843904 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_deterministic`.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W0920 10:11:03.799010 140679676843904 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W0920 10:11:07.302673 140679676843904 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/object_detection/inputs.py:259: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0920 10:11:08.378596 140679676843904 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/object_detection/inputs.py:259: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Waiting for new checkpoint at object_detection/training
I0920 10:11:10.765872 140679676843904 checkpoint_utils.py:125] Waiting for new checkpoint at object_detection/training
INFO:tensorflow:Found new checkpoint at object_detection/training/ckpt-7
I0920 10:11:10.769416 140679676843904 checkpoint_utils.py:134] Found new checkpoint at object_detection/training/ckpt-7
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/object_detection/model_lib_v2.py:702: set_learning_phase (from tensorflow.python.keras.backend) is deprecated and will be removed after 2020-10-11.
Instructions for updating:
Simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.
W0920 10:11:10.791586 140679676843904 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/object_detection/model_lib_v2.py:702: set_learning_phase (from tensorflow.python.keras.backend) is deprecated and will be removed after 2020-10-11.
Instructions for updating:
Simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/object_detection/eval_util.py:878: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0920 10:11:51.835231 140679676843904 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/object_detection/eval_util.py:878: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
2020-09-20 10:11:56.921874: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-20 10:11:57.287284: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
INFO:tensorflow:Finished eval step 0
I0920 10:11:59.687197 140679676843904 model_lib_v2.py:799] Finished eval step 0
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/object_detection/utils/visualization_utils.py:617: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
options available in V2.
- tf.py_function takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
- tf.numpy_function maintains the semantics of the deprecated tf.py_func
(it is not differentiable, and manipulates numpy arrays). It drops the
stateful argument making all functions stateful.
W0920 10:11:59.819776 140679676843904 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/object_detection/utils/visualization_utils.py:617: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
options available in V2.
- tf.py_function takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
- tf.numpy_function maintains the semantics of the deprecated tf.py_func
(it is not differentiable, and manipulates numpy arrays). It drops the
stateful argument making all functions stateful.
INFO:tensorflow:Performing evaluation on 43 images.
I0920 10:12:04.423586 140679676843904 coco_evaluation.py:282] Performing evaluation on 43 images.
creating index...
index created!
INFO:tensorflow:Loading and preparing annotation results...
I0920 10:12:04.423980 140679676843904 coco_tools.py:116] Loading and preparing annotation results...
INFO:tensorflow:DONE (t=0.00s)
I0920 10:12:04.426340 140679676843904 coco_tools.py:138] DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.17s).
Accumulating evaluation results...
DONE (t=0.04s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.728
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.999
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.806
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.728
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.784
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.784
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.784
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.784
INFO:tensorflow:Eval metrics at step 5000
I0920 10:12:04.649628 140679676843904 model_lib_v2.py:853] Eval metrics at step 5000
INFO:tensorflow: + DetectionBoxes_Precision/mAP: 0.727510
I0920 10:12:04.672720 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Precision/mAP: 0.727510
INFO:tensorflow: + DetectionBoxes_Precision/mAP@.50IOU: 0.999325
I0920 10:12:04.675546 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Precision/mAP@.50IOU: 0.999325
INFO:tensorflow: + DetectionBoxes_Precision/mAP@.75IOU: 0.806042
I0920 10:12:04.676922 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Precision/mAP@.75IOU: 0.806042
INFO:tensorflow: + DetectionBoxes_Precision/mAP (small): -1.000000
I0920 10:12:04.678196 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Precision/mAP (small): -1.000000
INFO:tensorflow: + DetectionBoxes_Precision/mAP (medium): -1.000000
I0920 10:12:04.679378 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Precision/mAP (medium): -1.000000
INFO:tensorflow: + DetectionBoxes_Precision/mAP (large): 0.727510
I0920 10:12:04.680430 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Precision/mAP (large): 0.727510
INFO:tensorflow: + DetectionBoxes_Recall/AR@1: 0.783721
I0920 10:12:04.681638 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Recall/AR@1: 0.783721
INFO:tensorflow: + DetectionBoxes_Recall/AR@10: 0.783721
I0920 10:12:04.682775 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Recall/AR@10: 0.783721
INFO:tensorflow: + DetectionBoxes_Recall/AR@100: 0.783721
I0920 10:12:04.683973 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Recall/AR@100: 0.783721
INFO:tensorflow: + DetectionBoxes_Recall/AR@100 (small): -1.000000
I0920 10:12:04.685043 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Recall/AR@100 (small): -1.000000
INFO:tensorflow: + DetectionBoxes_Recall/AR@100 (medium): -1.000000
I0920 10:12:04.686104 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Recall/AR@100 (medium): -1.000000
INFO:tensorflow: + DetectionBoxes_Recall/AR@100 (large): 0.783721
I0920 10:12:04.687381 140679676843904 model_lib_v2.py:856] + DetectionBoxes_Recall/AR@100 (large): 0.783721
INFO:tensorflow: + Loss/localization_loss: 0.089791
I0920 10:12:04.688443 140679676843904 model_lib_v2.py:856] + Loss/localization_loss: 0.089791
INFO:tensorflow: + Loss/classification_loss: 0.336598
I0920 10:12:04.689506 140679676843904 model_lib_v2.py:856] + Loss/classification_loss: 0.336598
INFO:tensorflow: + Loss/regularization_loss: 0.117549
I0920 10:12:04.690544 140679676843904 model_lib_v2.py:856] + Loss/regularization_loss: 0.117549
INFO:tensorflow: + Loss/total_loss: 0.543938
I0920 10:12:04.691550 140679676843904 model_lib_v2.py:856] + Loss/total_loss: 0.543938
INFO:tensorflow:Waiting for new checkpoint at object_detection/training
I0920 10:16:10.791572 140679676843904 checkpoint_utils.py:125] Waiting for new checkpoint at object_detection/training
Traceback (most recent call last):
File "object_detection/model_main_tf2.py", line 114, in <module>
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "object_detection/model_main_tf2.py", line 89, in main
wait_interval=300, timeout=FLAGS.eval_timeout)
File "/usr/local/lib/python3.6/dist-packages/object_detection/model_lib_v2.py", line 966, in eval_continuously
checkpoint_dir, timeout=timeout, min_interval_secs=wait_interval):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/checkpoint_utils.py", line 184, in checkpoints_iterator
checkpoint_dir, checkpoint_path, timeout=timeout)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/checkpoint_utils.py", line 132, in wait_for_new_checkpoint
time.sleep(seconds_to_sleep)
KeyboardInterrupt
但是,张量板尚未使用这些指标(AP、AR 等)进行更新。可能我做错了什么。我仍然没有在准确性方面取得任何进展,所以如果有人知道一些事情,那将非常有帮助。谢谢!