【Question title】: Compiling tensorflow/models/rnn/translate:translate with local numpy
【Posted】: 2017-08-25 10:58:34
【Question body】:

I am trying to run the neural machine translation demo on a GPU. The GPU example from the TensorFlow getting-started page works:

$ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
$ bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu

This produces the expected output.

But when I try to compile the translation demo:

bazel build -c opt --config=cuda --verbose_failures //tensorflow/models/rnn/translate:translate

the build fails:

...
____Loading package: @jpeg_archive//
____Loading package: @png_archive//
____Loading package: @re2//
____Loading complete.  Analyzing...
____Found 1 target...
____Building...
____[0 / 2] BazelWorkspaceStatusAction stable-status.txt
____[25 / 324] Executing genrule @six_archive//:copy_six
____[237 / 1,193] Executing genrule @png_archive//:configure [for host]
____[242 / 1,193] Executing genrule //third_party/gpus/cuda:cuda_check
____[361 / 1,193] Executing genrule //google/protobuf:protobuf_python_internal_copied_filegroup_genrule
____From Executing genrule @png_archive//:configure:
____From Executing genrule @png_archive//:configure [for host]:
____From Executing genrule @jpeg_archive//:configure:
____From Executing genrule @jpeg_archive//:configure [for host]:
____[677 / 1,193] Compiling tensorflow/core/kernels/argmax_op.cc
____From Compiling tensorflow/python/client/tf_session_helper.cc:
tensorflow/python/client/tf_session_helper.cc: In function 'tensorflow::Status tensorflow::{anonymous}::TF_StringTensor_GetPtrAndLen(const TF_Tensor*, tensorflow::int64, tensorflow::int64, const char**, tensorflow::uint64*)':
tensorflow/python/client/tf_session_helper.cc:248:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   if (offset >= (limit - data_start) || !p || (*len > (limit - p))) {
              ^
tensorflow/python/client/tf_session_helper.cc:248:53: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   if (offset >= (limit - data_start) || !p || (*len > (limit - p))) {
                                                     ^
tensorflow/python/client/tf_session_helper.cc: In function 'tensorflow::Status tensorflow::{anonymous}::TF_Tensor_to_PyObject(TF_Tensor*, PyObject**)':
tensorflow/python/client/tf_session_helper.cc:311:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   if (PyArray_NBYTES(py_array) != TF_TensorByteSize(tensor)) {
                                ^
tensorflow/python/client/tf_session_helper.cc: In function 'void tensorflow::TF_Run_wrapper(TF_Session*, const FeedVector&, const NameVector&, const NameVector&, tensorflow::Status*, tensorflow::PyObjectVector*)':
tensorflow/python/client/tf_session_helper.cc:416:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int i = 0; i < inputs.size(); ++i) {
                     ^
tensorflow/python/client/tf_session_helper.cc:430:41: error: 'PyArray_SHAPE' was not declared in this scope
       dims.push_back(PyArray_SHAPE(array)[i]);
                                         ^
tensorflow/python/client/tf_session_helper.cc:513:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int i = 0; i < output_names.size(); ++i) {
                     ^
ERROR: /home/mifs/fs439/bin/tensorflow/tensorflow/python/BUILD:710:1: C++ compilation of rule '//tensorflow/python:tf_session_helper' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command
...

The relevant error appears to be:

tensorflow/python/client/tf_session_helper.cc:430:41: error: 'PyArray_SHAPE' was not declared in this scope
       dims.push_back(PyArray_SHAPE(array)[i]);

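This is probably because the system-wide numpy installation is old and does not know PyArray_SHAPE. I do not have admin rights to update it globally, but I have installed a newer numpy in $HOME/.local/lib/python2.7/site-packages/ using pip install --user. If I add its include path to the corresponding rule in tensorflow/tensorflow/python/BUILD like this: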

tf_cuda_library(
    name = "tf_session_helper",
    srcs = ["client/tf_session_helper.cc"],
    hdrs = ["client/tf_session_helper.h"],
    copts = numpy_macosx_include_dir + ["-I<path-to-local-numpy>"] + ["-I/usr/include/python2.7"],
    deps = [
        ":construction_fails_op",
        ":test_kernel_label_op_kernel",
        "//tensorflow/core",
        "//tensorflow/core:direct_session",
        "//tensorflow/core:kernels",
        "//tensorflow/core:lib",
        "//tensorflow/core:protos_cc",
    ],
)

Bazel complains:

ERROR: <tensorflow-dir>/tensorflow/python/BUILD:710:1: in cc_library rule //tensorflow/python:tf_session_helper: The include path '/home/mifs/fs439/.local/lib/python2.7/site-packages/numpy/core/include' references a path outside of the execution root..
ERROR: <tensorflow-dir>/tensorflow/python/BUILD:710:1: in cc_library rule //tensorflow/python:tf_session_helper: The include path '/home/mifs/fs439/.local/lib/python2.7/site-packages/numpy/core/include' references a path outside of the execution root..
ERROR: Analysis of target '//tensorflow/models/rnn/translate:translate' failed; build aborted.
____Elapsed time: 18.559s

How can I tell TensorFlow to use the local numpy version?

(gcc 4.9.3, bleeding-edge tensorflow + bazel, local numpy 1.10.1, Ubuntu 12.04)
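One way to test the guess that the system-wide numpy headers simply do not declare PyArray_SHAPE is to search each candidate include directory for the symbol. The sketch below is illustrative only: the helper name headers_declaring is made up here, it is written for Python 3, and the directories passed to it are whichever include paths you want to compare (for an installed numpy, numpy.get_include() reports the directory).

```python
import os


def headers_declaring(include_dir, symbol="PyArray_SHAPE"):
    """Return sorted relative paths of .h files under include_dir that mention symbol.

    Hypothetical helper for illustration. It is a plain text search,
    so a match inside a comment also counts.
    """
    hits = []
    for root, _dirs, files in os.walk(include_dir):
        for name in files:
            if not name.endswith(".h"):
                continue
            path = os.path.join(root, name)
            # Headers may contain stray non-UTF-8 bytes; skip over them.
            with open(path, "r", errors="ignore") as handle:
                if symbol in handle.read():
                    hits.append(os.path.relpath(path, include_dir))
    return sorted(hits)
```

An empty result for the system include directory together with a non-empty result for the copy under $HOME/.local would support the diagnosis above.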

Edit:

When I follow the linked instructions, as suggested by syncd, I get:

ERROR: /home/mifs/fs439/bin/tensorflow/tensorflow/python/BUILD:710:1: undeclared inclusion(s) in rule '//tensorflow/python:tf_session_helper':
this rule is missing dependency declarations for the following files included by 'tensorflow/python/client/tf_session_helper.cc':
  'third_party/numpy/arrayobject.h'
  'third_party/numpy/ndarrayobject.h'
  'third_party/numpy/ndarraytypes.h'
  'third_party/numpy/npy_common.h'
  'third_party/numpy/numpyconfig.h'
  'third_party/numpy/_numpyconfig.h'
  'third_party/numpy/npy_endian.h'
  'third_party/numpy/npy_cpu.h'
  'third_party/numpy/utils.h'
  'third_party/numpy/_neighborhood_iterator_imp.h'
  'third_party/numpy/npy_1_7_deprecated_api.h'
  'third_party/numpy/old_defines.h'
  'third_party/numpy/__multiarray_api.h'
  'third_party/numpy/npy_interrupt.h'.
Target //tensorflow/models/rnn/translate:translate failed to build

Various attempts to add these headers to hdrs in the tf_cuda_library rule did not help:

hdrs = ["client/tf_session_helper.h"] + glob([
    "**/arrayobject.h",
    "numpy/*.h",
    "**/numpy/*.h",
]),

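【Comments】:

  • How are you specifying the path to the local numpy? I believe an absolute path should work, but I may be wrong.
  • I am using an absolute path. Using a relative path with a symlink to numpy inside the workspace leads to many "this rule is missing dependency declarations for the following files included by ..." errors for the tf_cuda_library rule, each followed by a list of numpy *.h files.

Tags: tensorflow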


【Answer 1】:

A possible solution is proposed here.

Create a symlink in the tensorflow/third_party directory that points to $HOME/.local/lib/python2.7/site-packages/numpy/core/include/numpy, and add -Ithird_party to tensorflow/python/BUILD and tensorflow/tensorflow.bzl.
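The symlink step can be sketched as follows. This is a minimal illustration, not part of the linked instructions: the helper name link_numpy_headers and its arguments are hypothetical, and the -Ithird_party edits to the BUILD and .bzl files still have to be made by hand. Note that the question's edit reports "undeclared inclusion(s)" errors with this approach, so the link alone may not be enough.

```python
import os


def link_numpy_headers(workspace, numpy_include, link_name="numpy"):
    """Create <workspace>/third_party/<link_name> as a symlink to numpy_include.

    Hypothetical helper: 'workspace' is the TensorFlow checkout and
    'numpy_include' is the .../core/include/numpy directory of the local numpy.
    """
    third_party = os.path.join(workspace, "third_party")
    if not os.path.isdir(third_party):
        raise ValueError("no third_party directory in %s" % workspace)
    link = os.path.join(third_party, link_name)
    if os.path.lexists(link):
        # Refuse to overwrite anything that is already there.
        raise ValueError("%s already exists" % link)
    os.symlink(os.path.abspath(numpy_include), link)
    return link
```

With the link named numpy, headers resolve as third_party/numpy/arrayobject.h and so on, which matches the paths shown in the error listing in the question.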

【Comments】:

  • With this solution I get "this rule is missing dependency declarations"; see my updated post.

【Answer 2】:

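If you are getting the "undeclared inclusion(s) in rule" error, here is a very nasty workaround:

1.) Pick a path from your gcc's built-in include directories (i.e. one of the cxx_builtin_include_directory paths in bazel-workspace/tools/cpp/CROSSTOOL; these should match the paths reported by g++ -v bla.cc)

2.) Say you picked the directory dir. Create a directory named "bla" inside dir.

3.) In dir/bla, create a symlink to your numpy include directory (ln -s .../core/include/numpy .)

4.) Add "dir/bla" to tensorflow/python/BUILD and tensorflow/tensorflow.bzl, as described in the link.

5.) Feel guilty, but be happy that it compiles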
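For step 1, the candidate directories can be read out of the CROSSTOOL file instead of being picked by eye. The sketch below assumes the file is text in which each entry appears on its own line as cxx_builtin_include_directory: "<path>"; the function name builtin_include_dirs is hypothetical, and the result should still be compared with what g++ -v reports.

```python
import re

# Matches lines of the assumed form:  cxx_builtin_include_directory: "/some/path"
_BUILTIN_DIR = re.compile(
    r'^\s*cxx_builtin_include_directory\s*:\s*"([^"]+)"', re.MULTILINE
)


def builtin_include_dirs(crosstool_text):
    """Return the cxx_builtin_include_directory values in file order, without duplicates.

    Lines where the key is preceded by other text (for example a '#'
    comment marker) do not match and are skipped.
    """
    seen = []
    for path in _BUILTIN_DIR.findall(crosstool_text):
        if path not in seen:
            seen.append(path)
    return seen
```

Any one of the returned directories can serve as dir in steps 2 to 4, provided you have write access to it.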
