Speed up the initial TensorFlow startup: answers

【Question title】: Speed up the initial TensorFlow startup
【Posted】: 2020-11-18 06:13:30
【Question】:

Every time I run Python code that uses TensorFlow (CPU), for example:

import keras

I see:

2020-10-30 15:27:20.518894: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-10-30 15:27:20.518894: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-10-30 15:27:23.713077: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-10-30 15:27:23.713077: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-10-30 15:27:23.716077: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: User1-PC
2020-10-30 15:27:23.716077: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: User1-PC
2020-10-30 15:27:23.729078: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x10cad0c0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-30 15:27:23.729078: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Using TensorFlow backend.

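Added up, the waiting comes to about 10 seconds.

Is there a way to speed this up? In particular, when I use TensorFlow for inference (not training), I don't want to wait 10 seconds for the engine every time it starts.

Note: of course, once my code is ready I will keep the process that uses TensorFlow running permanently and use some form of inter-process communication, so that the whole program never has to restart.

My question is mainly about the prototyping stage, when you have to restart the program often: waiting 10 or 15 seconds on every script launch is very inconvenient while prototyping.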

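【Comments】:

Can you use a Jupyter notebook? You say the problem is in prototyping... When I'm prototyping and want to quickly re-run a few lines of code or a whole class, a Jupyter notebook works great and is much faster than re-running the file.
Another trick I use for quick tests is a wrapper that caches the network's results to a file. That way, if my test code keeps using the same data and I'm only working on the other code that consumes the network output, the import statement is skipped. I've used this for language-embedding and image-embedding extraction. Finally, I also generate data with np.random.rand(...) in place of running tensorflow, to test all the post-network processing.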
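
A minimal sketch of the caching wrapper described in the comment above. The commenter did not post code, so the names `cache_to_file` and `compute_embeddings` and the file name `embeddings.pkl` are hypothetical. The slow `import tensorflow` is placed inside the wrapped function, so a cache hit never triggers it.

```python
import functools
import os
import pickle


def cache_to_file(cache_path):
    """Decorator: store the wrapped function's result in cache_path and reuse it.

    Deliberately simple: the cache ignores the function's arguments, so delete
    the file whenever the inputs or the model change.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if os.path.exists(cache_path):
                with open(cache_path, "rb") as f:
                    return pickle.load(f)  # cache hit: fn never runs
            result = fn(*args, **kwargs)
            with open(cache_path, "wb") as f:
                pickle.dump(result, f)
            return result
        return wrapper
    return decorator


@cache_to_file("embeddings.pkl")  # hypothetical cache file
def compute_embeddings(model_path, inputs):
    """Hypothetical slow path: TensorFlow is imported only if this body runs."""
    import tensorflow as tf  # deferred import, skipped on a cache hit
    model = tf.keras.models.load_model(model_path)
    return model.predict(inputs)
```

On the first run the script pays the full startup cost and writes the file. Later runs that only change the downstream code read the file and never import TensorFlow.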

Tags: python tensorflow keras

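【Answer 1】:

For your inference problem, you probably want a longer-lived process that you can request inference results from, maybe over HTTP, gRPC, XML-RPC, named pipes, reading files from a directory...?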
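
As a rough illustration of the longer-lived process idea, here is a sketch using XML-RPC from the Python standard library, one of the transports listed above. The answer itself gives no code, so `load_model`, `start_inference_server`, and `remote_predict` are hypothetical names, and the "model" is a placeholder that doubles its inputs so the sketch runs without TensorFlow.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def load_model():
    """Hypothetical stand-in for the slow part of startup.

    A real server would import TensorFlow and load the model here, once.
    """
    def predict(values):
        return [2.0 * v for v in values]  # placeholder for model inference
    return predict


def start_inference_server(host="127.0.0.1", port=0):
    """Start an XML-RPC server in a background thread; return (server, port).

    port=0 asks the operating system for any free port.
    """
    predict = load_model()  # startup cost is paid once, here
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(predict, "predict")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


def remote_predict(port, values, host="127.0.0.1"):
    """Client side: a short-lived script calls this and never imports the model."""
    with ServerProxy(f"http://{host}:{port}") as proxy:
        return proxy.predict(values)
```

In practice the server would run as its own process and the prototype scripts would only contain the client call, so restarting a script does not reload TensorFlow.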

Failing that, get a faster machine or disk. On my machine, starting a new Python process and importing Keras takes about 2 seconds:

$ pip install tensorflow
Collecting tensorflow
  Downloading tensorflow-2.3.1-cp38-cp38-macosx_10_14_x86_64.whl (165.2 MB)
[...]
Successfully installed absl-py-0.11.0 astunparse-1.6.3 cachetools-4.1.1 chardet-3.0.4 gast-0.3.3 google-auth-1.23.0 google-auth-oauthlib-0.4.2 google-pasta-0.2.0 grpcio-1.33.2 h5py-2.10.0 idna-2.10 keras-preprocessing-1.1.2 markdown-3.3.3 numpy-1.18.5 oauthlib-3.1.0 opt-einsum-3.3.0 packaging-20.4 protobuf-3.13.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 requests-2.24.0 requests-oauthlib-1.3.0 rsa-4.6 tensorboard-2.3.0 tensorboard-plugin-wit-1.7.0 tensorflow-2.3.1 tensorflow-estimator-2.3.0 termcolor-1.1.0 werkzeug-1.0.1 wrapt-1.12.1
$ time python -c 'import tensorflow.keras as keras'

________________________________________________________
Executed in    2.02 secs   fish           external
   usr time    2.85 secs  118.00 micros    2.85 secs
   sys time    0.62 secs  946.00 micros    0.62 secs
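
For readers whose shell has no built-in `time`, the same measurement can be sketched in Python. The helper name `time_fresh_import` is an assumption, not part of the answer. It launches a new interpreter each time, so the result includes interpreter startup, as the shell measurement above does.

```python
import subprocess
import sys
import time


def time_fresh_import(module_name, python=sys.executable):
    """Wall-clock seconds for a new interpreter to start and import module_name."""
    start = time.perf_counter()
    subprocess.run([python, "-c", f"import {module_name}"], check=True)
    return time.perf_counter() - start
```

For example, `time_fresh_import("tensorflow.keras")` should give a figure comparable to the one above on the same machine. To see which submodules dominate, Python 3.7+ also accepts `python -X importtime -c 'import tensorflow'`, which prints per-module import times to stderr.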

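【Comments】:

"For your inference problem, you probably want a longer-lived process that you can request inference results from, maybe...": yes, I will do that of course, but during the "prototyping" stage this initial 10-second startup is a bit inconvenient.
You could try making sure you are using the CPU-only build of tensorflow. That might keep it from doing all the CUDA checks.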
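
One way to try the CPU-only suggestion is to install the `tensorflow-cpu` package from PyPI instead of `tensorflow`. A second option, sketched below, is to hide the GPUs through environment variables before TensorFlow is imported. Whether either one shortens the startup is not confirmed in this thread, and hiding the GPUs may not skip the library probing seen in the log. The helper name `force_cpu_only` is hypothetical.

```python
import os


def force_cpu_only(env=os.environ):
    """Set environment variables that tell TensorFlow to ignore GPUs.

    Must be called before `import tensorflow`; the values are read at import
    or initialization time.
    """
    env["CUDA_VISIBLE_DEVICES"] = "-1"  # expose no CUDA devices to this process
    env["TF_CPP_MIN_LOG_LEVEL"] = "2"   # hide INFO and WARNING log lines
    return env
```

Call `force_cpu_only()` at the very top of the script, then import Keras. Note that `TF_CPP_MIN_LOG_LEVEL` only silences the messages; it does not by itself remove the work that produces them.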