【Question title】: ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory even if libcublas.so.9.0 exists in path
【Posted】: 2018-07-09 14:31:05
【Question description】:

My machine info:

  • nvcc --version:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2017 NVIDIA Corporation
    Built on Fri_Sep__1_21:08:03_CDT_2017
    Cuda compilation tools, release 9.0, V9.0.176

  • CUDA driver version

    • Version - 9.2
    • File - nvidia-diag-driver-local-repo-rhel7-396.26-1.0-1.x86_64.rpm
  • cat /etc/redhat-release: CentOS Linux release 7.5.1804 (Core)

cat .bashrc includes the following:

PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
CUDA_HOME=$CUDA_HOME:/usr/local/cuda

After this, importing torch or torchvision works fine. But importing tensorflow fails.

My tensorflow versions are as follows:

  • tensorboard==1.8.0
  • tensorflow-gpu==1.8.0

I get the following error:

>>> import tensorflow
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/local/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/local/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/local/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/local/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

But /usr/local/cuda/lib64 contains the following:

  • libcublas_device.a
  • libcublas.so
  • libcublas.so.9.0
  • libcublas.so.9.0.176
  • libcublas_static.a

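I can't figure out what the problem is. Could permissions have something to do with it? I later changed the owner of the files above to the current user and their permissions to 755. I still get the same error.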

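【Question comments】:

  • Does LD_LIBRARY_PATH point to /usr/local/cuda/lib64, or not?
  • @Patwie /usr/local/cuda is a symlink to /datadrive/abhisek/cuda-9.0/. I changed it anyway. echo $LD_LIBRARY_PATH: /usr/local/cuda/lib64. But the same error.
  • Is libcublas.so a symlink to libcublas.so.9.0, and libcublas.so.9.0 a symlink to libcublas.so.9.0.176? You did not copy, paste, and overwrite these "*.so" files? And does LD_LIBRARY_PATH=datadrive/abhisek/cuda-9.0/lib64 python reproduce the same error with TensorFlow (in other words: is .bashrc really being executed)?
  • I did not overwrite or copy-paste anything. When installing CUDA, it asks where to put the installation files and where to create the link. The only link is /usr/local/cuda, pointing to datadrive/abhisek/cuda-9.0. .bashrc is being executed, because I restarted the session after the change. In other words, echo $LD_LIBRARY_PATH: :/usr/local/cuda/lib64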
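The symlink and permission questions raised in the comments above can be checked from Python. This is only a minimal sketch: `describe_library` is a hypothetical helper name, and the directory and file names are the ones listed in the question, so adjust them if your layout differs.

```python
import os


def describe_library(path):
    """Collect basic facts about one shared-library path.

    Hypothetical helper for illustration; it only inspects the filesystem
    and does not try to load the library.
    """
    return {
        "path": path,
        "exists": os.path.exists(path),          # False for a dangling symlink
        "is_symlink": os.path.islink(path),
        "resolves_to": os.path.realpath(path),   # final target after following links
        "readable": os.access(path, os.R_OK),    # readable by the current user
    }


if __name__ == "__main__":
    lib_dir = "/usr/local/cuda/lib64"  # directory named in the question
    for name in ("libcublas.so", "libcublas.so.9.0", "libcublas.so.9.0.176"):
        print(describe_library(os.path.join(lib_dir, name)))
```

If `exists` is False while `is_symlink` is True, the link is dangling. If everything looks healthy here, the files themselves are probably not the issue and the loader's search path is the next thing to inspect.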

Tags: tensorflow


【Solution 1】:

Just add the above path to your LD_LIBRARY_PATH as:

 export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
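One way to test this suggestion without relying on .bashrc is to start a child Python process with the directory prepended explicitly. The sketch below is illustrative only: `env_with_library_dir` and `try_import_in_child` are hypothetical helper names, and the keyword arguments are chosen to also work on the Python 3.6 shown in the traceback.

```python
import os
import subprocess
import sys


def env_with_library_dir(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + current if current else "")
    return env


def try_import_in_child(module, env):
    """Import `module` in a fresh interpreter started with `env`.

    A fresh process is used because the dynamic loader reads
    LD_LIBRARY_PATH when the process starts.
    Returns (succeeded, stderr_text).
    """
    result = subprocess.run(
        [sys.executable, "-c", "import " + module],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode == 0, result.stderr


if __name__ == "__main__":
    env = env_with_library_dir("/usr/local/cuda/lib64")  # path from the answer
    ok, err = try_import_in_child("tensorflow", env)
    print("import succeeded" if ok else err)
```

If the import succeeds here but fails in a normal session, the shell startup files are probably not exporting the variable to the process that runs Python.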

【Comments】:

  • /usr/local/cuda is a symlink to /datadrive/abhisek/cuda-9.0. I changed it anyway, still no luck. Same error!
  • Do you see the correct output when calling os.environ['LD_LIBRARY_PATH'] in Python?
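The check suggested in the last comment can be taken one step further by asking the loader directly for the library that the traceback names. A minimal sketch, assuming a Linux system; `loader_search_dirs` and `can_load` are hypothetical helper names, while `ctypes.CDLL` is the standard-library call that wraps `dlopen`.

```python
import ctypes
import os


def loader_search_dirs(env=None):
    """List the directories in LD_LIBRARY_PATH as this process sees them.

    Empty entries (which the glibc loader treats as the current directory)
    are dropped here to keep the output readable.
    """
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(":") if d]


def can_load(soname):
    """Try to open a shared library by name; return (ok, error_message)."""
    try:
        ctypes.CDLL(soname)
        return True, ""
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    print("LD_LIBRARY_PATH entries:", loader_search_dirs())
    print("libcublas.so.9.0:", can_load("libcublas.so.9.0"))
```

If the directory list is empty inside Python even though `echo $LD_LIBRARY_PATH` prints a value in the shell, the variable is likely set but not exported, or Python is being started from a different environment.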