【发布时间】:2020-04-15 13:14:35
【问题描述】:
我在我的 win10 上使用 tensorflow 2.1.0。 cuda和cudnn的版本是:
# Name Version Build Channel
cudnn 7.6.5 cuda10.0_0
我想用 tf.keras.preprocessing.image.ImageDataGenerator() 尤其是 fit_generator() 方法来实现数据增强。但它返回了错误:
2020-04-15 15:06:28.571927: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-15 15:06:29.806016: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-04-15 15:06:29.806503: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Traceback (most recent call last):
File "E:/Studium/Machine Learning/check2.py", line 112, in <module>
workers=4)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 1297, in fit_generator
steps_name='steps_per_epoch')
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training_generator.py", line 265, in model_iteration
batch_outs = batch_function(*batch_data)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 973, in train_on_batch
class_weight=class_weight, reset_metrics=reset_metrics)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 264, in train_on_batch
output_loss_metrics=model._output_loss_metrics)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 311, in train_on_batch
output_loss_metrics=output_loss_metrics))
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 252, in _process_single_batch
training=training))
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 127, in _model_loss
outs = model(inputs, **kwargs)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 891, in __call__
outputs = self.call(cast_inputs, *args, **kwargs)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\sequential.py", line 256, in call
return super(Sequential, self).call(inputs, training=training, mask=mask)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 708, in call
convert_kwargs_to_constants=base_layer_utils.call_context().saving)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 860, in _run_internal_graph
output_tensors = layer(computed_tensors, **kwargs)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 891, in __call__
outputs = self.call(cast_inputs, *args, **kwargs)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\layers\convolutional.py", line 197, in call
outputs = self._convolution_op(inputs, self.kernel)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 1134, in __call__
return self.conv_op(inp, filter)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 639, in __call__
return self.call(inp, filter)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 238, in __call__
name=self.name)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 2010, in conv2d
name=name)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 1031, in conv2d
data_format=data_format, dilations=dilations, name=name, ctx=_ctx)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 1130, in conv2d_eager_fallback
ctx=_ctx, name=name)
File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
如果我不使用该命令 fit_generator() 那么一切正常。有谁知道问题出在哪里?或者我怎样才能用 fit() 方法实现这个?
【问题讨论】:
-
嗨@PokeLu,你能为这个错误提供一个最小的可重现代码吗?
-
@TF_Support 我已经更改了代码。事实证明 fit() 方法可以做同样的事情。但是当我尝试从'keras.io/examples/cifar10_cnn'运行代码时,出现了同样的问题。
-
嗨@PokeLu,你能提供你用于培训的代码吗?
-
@Veeru 我通过避免使用 fit_generator 解决了这个问题。相反,我使用 fit() 然后不再发生错误。
标签: python python-3.x tensorflow keras deep-learning