【Posted】:2017-08-25 10:58:34
【Question】:
I am trying to run the neural machine translation demo on a GPU. The GPU example from the tensorflow getting-started page works:
$ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
$ bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
produces the expected output.
But when I try to compile the translation demo:
bazel build -c opt --config=cuda --verbose_failures //tensorflow/models/rnn/translate:translate
it fails:
...
____Loading package: @jpeg_archive//
____Loading package: @png_archive//
____Loading package: @re2//
____Loading complete. Analyzing...
____Found 1 target...
____Building...
____[0 / 2] BazelWorkspaceStatusAction stable-status.txt
____[25 / 324] Executing genrule @six_archive//:copy_six
____[237 / 1,193] Executing genrule @png_archive//:configure [for host]
____[242 / 1,193] Executing genrule //third_party/gpus/cuda:cuda_check
____[361 / 1,193] Executing genrule //google/protobuf:protobuf_python_internal_copied_filegroup_genrule
____From Executing genrule @png_archive//:configure:
____From Executing genrule @png_archive//:configure [for host]:
____From Executing genrule @jpeg_archive//:configure:
____From Executing genrule @jpeg_archive//:configure [for host]:
____[677 / 1,193] Compiling tensorflow/core/kernels/argmax_op.cc
____From Compiling tensorflow/python/client/tf_session_helper.cc:
tensorflow/python/client/tf_session_helper.cc: In function 'tensorflow::Status tensorflow::{anonymous}::TF_StringTensor_GetPtrAndLen(const TF_Tensor*, tensorflow::int64, tensorflow::int64, const char**, tensorflow::uint64*)':
tensorflow/python/client/tf_session_helper.cc:248:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (offset >= (limit - data_start) || !p || (*len > (limit - p))) {
^
tensorflow/python/client/tf_session_helper.cc:248:53: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (offset >= (limit - data_start) || !p || (*len > (limit - p))) {
^
tensorflow/python/client/tf_session_helper.cc: In function 'tensorflow::Status tensorflow::{anonymous}::TF_Tensor_to_PyObject(TF_Tensor*, PyObject**)':
tensorflow/python/client/tf_session_helper.cc:311:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (PyArray_NBYTES(py_array) != TF_TensorByteSize(tensor)) {
^
tensorflow/python/client/tf_session_helper.cc: In function 'void tensorflow::TF_Run_wrapper(TF_Session*, const FeedVector&, const NameVector&, const NameVector&, tensorflow::Status*, tensorflow::PyObjectVector*)':
tensorflow/python/client/tf_session_helper.cc:416:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < inputs.size(); ++i) {
^
tensorflow/python/client/tf_session_helper.cc:430:41: error: 'PyArray_SHAPE' was not declared in this scope
dims.push_back(PyArray_SHAPE(array)[i]);
^
tensorflow/python/client/tf_session_helper.cc:513:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < output_names.size(); ++i) {
^
ERROR: /home/mifs/fs439/bin/tensorflow/tensorflow/python/BUILD:710:1: C++ compilation of rule '//tensorflow/python:tf_session_helper' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command
...
The failure is probably caused by this error:
tensorflow/python/client/tf_session_helper.cc:430:41: error: 'PyArray_SHAPE' was not declared in this scope
dims.push_back(PyArray_SHAPE(array)[i]);
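This is probably because the global numpy installation is older and does not know PyArray_SHAPE. I do not have admin rights to update it globally, but I have installed a newer numpy in $HOME/.local/lib/python2.7/site-packages/ using pip install --user. If I add the path to the corresponding rule in tensorflow/tensorflow/python/BUILD like this: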
tf_cuda_library(
name = "tf_session_helper",
srcs = ["client/tf_session_helper.cc"],
hdrs = ["client/tf_session_helper.h"],
copts = numpy_macosx_include_dir + ["-I<path-to-local-numpy>"] + ["-I/usr/include/python2.7"],
deps = [
":construction_fails_op",
":test_kernel_label_op_kernel",
"//tensorflow/core",
"//tensorflow/core:direct_session",
"//tensorflow/core:kernels",
"//tensorflow/core:lib",
"//tensorflow/core:protos_cc",
],
)
it complains:
ERROR: <tensorflow-dir>/tensorflow/python/BUILD:710:1: in cc_library rule //tensorflow/python:tf_session_helper: The include path '/home/mifs/fs439/.local/lib/python2.7/site-packages/numpy/core/include' references a path outside of the execution root..
ERROR: <tensorflow-dir>/tensorflow/python/BUILD:710:1: in cc_library rule //tensorflow/python:tf_session_helper: The include path '/home/mifs/fs439/.local/lib/python2.7/site-packages/numpy/core/include' references a path outside of the execution root..
ERROR: Analysis of target '//tensorflow/models/rnn/translate:translate' failed; build aborted.
____Elapsed time: 18.559s
How can I tell tensorflow to use the local numpy version?
(gcc 4.9.3, bleeding-edge tensorflow + bazel, local numpy 1.10.1, ubuntu 12.04)
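To check the suspicion above that an older numpy is being picked up, a small script along these lines can report which numpy a given Python interpreter resolves and where its C headers live. This is only a sketch: the helper names (`parse_version`, `has_pyarray_shape`, `describe_numpy`) are made up for illustration, and the 1.7 threshold is an assumption based on the NumPy C-API documentation, which lists PyArray_SHAPE as new in version 1.7. It says nothing about which headers bazel itself ends up using.

```python
import re

# Assumption: PyArray_SHAPE first appeared in the NumPy 1.7 C API.
MIN_VERSION = (1, 7)


def parse_version(version):
    """Return the leading (major, minor) pair of a string such as '1.10.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def has_pyarray_shape(version):
    """True if this numpy version is new enough to declare PyArray_SHAPE."""
    return parse_version(version) >= MIN_VERSION


def describe_numpy():
    """Describe the numpy that this interpreter imports, or None if absent."""
    try:
        import numpy
    except ImportError:
        return None
    return {
        "version": numpy.__version__,
        "module": numpy.__file__,
        # Directory that contains numpy/arrayobject.h and related headers.
        "include_dir": numpy.get_include(),
        "has_pyarray_shape": has_pyarray_shape(numpy.__version__),
    }


if __name__ == "__main__":
    info = describe_numpy()
    if info is None:
        print("numpy is not importable from this interpreter")
    else:
        for key in sorted(info):
            print("%s: %s" % (key, info[key]))
```

Running it with the same interpreter that the build was configured with shows whether the system-wide numpy or the one under $HOME/.local is found first.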
Edit:
When I follow the instructions here, as suggested by syncd, I get
ERROR: /home/mifs/fs439/bin/tensorflow/tensorflow/python/BUILD:710:1: undeclared inclusion(s) in rule '//tensorflow/python:tf_session_helper':
this rule is missing dependency declarations for the following files included by 'tensorflow/python/client/tf_session_helper.cc':
'third_party/numpy/arrayobject.h'
'third_party/numpy/ndarrayobject.h'
'third_party/numpy/ndarraytypes.h'
'third_party/numpy/npy_common.h'
'third_party/numpy/numpyconfig.h'
'third_party/numpy/_numpyconfig.h'
'third_party/numpy/npy_endian.h'
'third_party/numpy/npy_cpu.h'
'third_party/numpy/utils.h'
'third_party/numpy/_neighborhood_iterator_imp.h'
'third_party/numpy/npy_1_7_deprecated_api.h'
'third_party/numpy/old_defines.h'
'third_party/numpy/__multiarray_api.h'
'third_party/numpy/npy_interrupt.h'.
Target //tensorflow/models/rnn/translate:translate failed to build
Various attempts to add them to hdrs in the tf_cuda_library rule did not help:
hdrs = ["client/tf_session_helper.h"] + glob([
"**/arrayobject.h",
"numpy/*.h",
"**/numpy/*.h",
]),
【Comments】:
-
How do you specify the path to the local numpy? I believe an absolute path should work, but I could be wrong.
-
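I am using an absolute path. Using a relative path through a symlink to numpy inside the workspace leads to many "this rule is missing dependency declarations for the following files included by ..." errors for the tf_cuda_library rule, followed by a list of numpy *.h files.
Tags: tensorflow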