【Question Title】: Error building TensorFlow with CUDA support on Windows with Bazel
【Posted】: 2017-10-19 16:24:55
【Question】:

I am trying to compile TensorFlow with CUDA support on 64-bit Windows 10 using Bazel. My system is set up as follows:

  • Windows 10 64-bit
  • Nvidia GeForce 1050 with CUDA compute capability 6.1
  • CUDA Toolkit v8.0 -> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0
  • cuDNN v6.0 -> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0
  • bazel 0.7.0 (renamed to bazel.exe) -> C:\Users\eliam\bazel\0.7.0
  • MSYS2 64-bit
  • TensorFlow master branch -> C:\Users\eliam\tensorflow

I have also set these environment variables:

BAZEL_PYTHON=C:/Users/eliam/Miniconda3
BAZEL_SH=C:/msys64/usr/bin/bash.exe
BAZEL_VC=C:/Program Files (x86)/Microsoft Visual Studio/2017/BuildTools/VC
BAZEL_VS=C:/Program Files (x86)/Microsoft Visual Studio 14.0
CUDA_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0
CUDA_TOOLKIT_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0
LD_LIBRARY_PATH=C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0/lib/x64
PYTHON_BIN_PATH=C:/Users/eliam/Miniconda3/python.exe
PYTHON_PATH=C:/Users/eliam/Miniconda3/python.exe
PYTHONPATH=C:/Users/eliam/Miniconda3/python.exe
PYTHON_LIB_PATH=C:/Users/eliam/Miniconda3/lib/site-packages
PATH=C:\Users\eliam\bazel\0.7.0;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include;%PATH%

Bazel was set up following all the steps required by its website (https://docs.bazel.build/versions/master/install-windows.html).

MSYS2 was set up following all the steps required by its website (http://www.msys2.org/).

I ran configure.py successfully:

python ./configure.py
You have bazel 0.7.0 installed.
Do you wish to build TensorFlow with XLA JIT support? [y/N]:
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]:
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]:
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]:


Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]:


Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0]:


Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2]


Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:


Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished

After that, I set some additional environment variables with the following command:

set BUILD_OPTS='--cpu=x64_windows_msvc --host_cpu=x64_windows_msvc --copt=/w --verbose_failures --experimental_ui --config=cuda'

in order to prevent this error:

$ bazel build -c opt --config=cuda --verbose_failures --subcommands //tensorflow/cc:tutorials_example_trainer
..............
WARNING: The lower priority option '-c opt' does not override the previous value '-c opt'.
____Loading package: tensorflow/cc
____Loading package: @local_config_cuda//crosstool
____Loading package: @local_config_xcode//
ERROR: No toolchain found for cpu 'x64_windows'. Valid cpus are: [
  k8,
  piii,
  arm,
  darwin,
  ppc,
].
____Elapsed time: 10.196s

Then I started the Bazel build with the following command:

bazel build -c opt $BUILD_OPTS //tensorflow/tools/pip_package:build_pip_package
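
For the flags stored in BUILD_OPTS to take effect, each one has to reach bazel as a separate argument. The sketch below shows one way to make that explicit in Python. It is not part of the original setup, and the helper name bazel_command is hypothetical. It only builds the argument list; actually running it (for example with subprocess.run) assumes bazel is on PATH.

```python
import shlex

# Option string copied from the BUILD_OPTS value above.
BUILD_OPTS = ("--cpu=x64_windows_msvc --host_cpu=x64_windows_msvc "
              "--copt=/w --verbose_failures --experimental_ui --config=cuda")
TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def bazel_command(opts, target):
    """Hypothetical helper: split the option string into separate arguments."""
    return ["bazel", "build", "-c", "opt"] + shlex.split(opts) + [target]


cmd = bazel_command(BUILD_OPTS, TARGET)
# Print one argument per line so a missing or merged flag is easy to spot.
for arg in cmd:
    print(arg)
```

Passing a list like this to subprocess.run avoids depending on how a particular shell quotes or expands the variable.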

This is where the problem starts. Here is a link to the full log.

Any idea why?

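【Comments】:

  • --cpu=x64_windows_msvc and "ERROR: No toolchain found for cpu 'x64_windows'" seem pretty self-explanatory to me.
  • As I said in the post, that error was resolved with set BUILD_OPTS='--cpu=x64_windows_msvc --host_cpu=x64_windows_msvc --copt=/w --verbose_failures --experimental_ui --config=cuda'. The error I don't understand is the one referred to in the last line of the post (the link to the full log).
  • @talonmies Could you explain what you mean?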

Tags: windows tensorflow bazel tensorflow-gpu


【Solution 1】:

The important part of the log is this:

ERROR: C:/msys64/home/eliam/tensorflow/tensorflow/stream_executor/BUILD:52:1: C++ compilation of rule '//tensorflow/stream_executor:cuda_platform' failed (Exit 2).
tensorflow/stream_executor/cuda/cuda_platform.cc(48): error C3861: 'strcasecmp': identifier not found
tensorflow/stream_executor/cuda/cuda_platform.cc(50): error C3861: 'strcasecmp': identifier not found
tensorflow/stream_executor/cuda/cuda_platform.cc(52): error C3861: 'strcasecmp': identifier not found
Target //tensorflow/cc:tutorials_example_trainer failed to build

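tensorflow/stream_executor/cuda/cuda_platform.cc(48) contains strcmp.

The compiler complains about strcasecmp, so something must be #define'ing strcmp to strcasecmp. In any case, can you run the build with --verbose_failures? That will show the commands Bazel is executing, which may hint at what is going on.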
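
As background, strcasecmp is a POSIX function, which is consistent with MSVC reporting it as an unknown identifier. When the full log is long, it can help to summarize which identifiers the compiler could not find and where. The following is a minimal sketch in Python, not a Bazel or TensorFlow tool. The function name missing_identifiers and the regular expression are my own assumptions about the log format, based only on the lines quoted above.

```python
import re
from collections import Counter

# Matches MSVC diagnostics of the form:
#   path/file.cc(48): error C3861: 'name': identifier not found
C3861 = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+)\): error C3861: "
    r"'(?P<name>[^']+)': identifier not found"
)


def missing_identifiers(log_text):
    """Count C3861 errors per (source file, identifier) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = C3861.match(line.strip())
        if match:
            counts[(match.group("file"), match.group("name"))] += 1
    return counts


# Excerpt copied from the log quoted above.
LOG = """\
ERROR: C:/msys64/home/eliam/tensorflow/tensorflow/stream_executor/BUILD:52:1: C++ compilation of rule '//tensorflow/stream_executor:cuda_platform' failed (Exit 2).
tensorflow/stream_executor/cuda/cuda_platform.cc(48): error C3861: 'strcasecmp': identifier not found
tensorflow/stream_executor/cuda/cuda_platform.cc(50): error C3861: 'strcasecmp': identifier not found
tensorflow/stream_executor/cuda/cuda_platform.cc(52): error C3861: 'strcasecmp': identifier not found
Target //tensorflow/cc:tutorials_example_trainer failed to build
"""

counts = missing_identifiers(LOG)
for (path, name), n in sorted(counts.items()):
    print(f"{name}: {n} occurrence(s) in {path}")
```

On the excerpt above this reports a single identifier, strcasecmp, in a single file, which narrows the search to whatever header or macro introduces that name on Windows.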

Also, I see this in your environment variables:

BAZEL_VC=C:/Program Files (x86)/Microsoft Visual Studio/2017/BuildTools/VC
BAZEL_VS=C:/Program Files (x86)/Microsoft Visual Studio 14.0

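You only need to set one of them. I suggest keeping BAZEL_VC, since it points to the newer compiler. I admit I don't know what happens when both envvars are defined, or whether Bazel prefers one over the other. But I do know that defining just one of them works fine.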
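
A quick way to check for this situation before starting a build is sketched below in Python. It is only an illustration of the advice above: the function names are hypothetical, and the script does not know how Bazel itself resolves the two variables.

```python
import os

# The two Visual Studio locator variables discussed above.
VS_VARS = ("BAZEL_VC", "BAZEL_VS")


def vs_vars_set(env):
    """Return the names of the locator variables that have a non-empty value."""
    return [name for name in VS_VARS if env.get(name)]


def describe(env):
    """Summarize the state of BAZEL_VC / BAZEL_VS in a mapping of env vars."""
    found = vs_vars_set(env)
    if len(found) > 1:
        return "Both BAZEL_VC and BAZEL_VS are set; consider keeping only BAZEL_VC."
    if found:
        return "Only " + found[0] + " is set."
    return "Neither BAZEL_VC nor BAZEL_VS is set."


# Check the environment of the current process.
print(describe(os.environ))
```

With the values from the question, describe would report that both variables are set, which is the case the answer recommends avoiding.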

【Comments】:
