【Question title】: tensorflow-gpu fails to run on Linux
【Posted】: 2018-09-15 07:59:18
【Question】:

I have installed CUDA and cuDNN on Ubuntu 16.04.

CUDA version: 9.0 (with driver version 390.87)

cuDNN version: 7.2 for CUDA 9.0

import tensorflow as tf

works fine, but

tf.Session() 

raises the following error.

2018-09-15 16:43:23.281375: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-09-15 16:43:23.281431: E tensorflow/core/common_runtime/direct_session.cc:158] Internal: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/imhgchoi/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1494, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/imhgchoi/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 626, in __init__
    self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

The error message suggests that I installed the wrong version of the CUDA driver, but I'm lost. I'm not sure what steps to take to fix this.
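For anyone reading the error literally: "driver version is insufficient for CUDA runtime version" means the installed driver is older than the minimum the loaded CUDA runtime expects. A rough sketch of that comparison is below. The table values are the Linux minimums as I recall them from NVIDIA's CUDA release notes, so treat them as an assumption and check the notes for your exact toolkit build; `MIN_LINUX_DRIVER` and `driver_is_sufficient` are made-up names for illustration.

```python
# Minimum Linux driver per CUDA toolkit, as recalled from NVIDIA's CUDA
# release notes (assumption: verify against the notes for your exact build).
MIN_LINUX_DRIVER = {
    "9.0": (384, 81),
    "9.1": (390, 46),
    "9.2": (396, 26),
    "10.0": (410, 48),
}


def parse_version(text):
    """Turn '390.87' into (390, 87) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(driver_version, cuda_runtime_version):
    """Return True if the driver meets the minimum for the given CUDA runtime."""
    minimum = MIN_LINUX_DRIVER[cuda_runtime_version]
    return parse_version(driver_version) >= minimum


if __name__ == "__main__":
    # 390.87 is the driver version reported in the question.
    for runtime in ("9.0", "9.2"):
        ok = driver_is_sufficient("390.87", runtime)
        print("driver 390.87 vs CUDA", runtime, "->", "ok" if ok else "too old")
```

If those minimums are right, driver 390.87 should be new enough for CUDA 9.0, so it may be worth checking whether the TensorFlow binary is actually loading a newer CUDA runtime than the 9.0 toolkit installed system-wide (this is a guess, not something the logs confirm).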


After adding the environment variables

This only added new errors..

2018-09-15 17:13:39.684390: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-09-15 17:13:39.767963: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-09-15 17:13:39.768481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties: 
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.506
pciBusID: 0000:09:00.0
totalMemory: 3.94GiB freeMemory: 3.41GiB
2018-09-15 17:13:39.768502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-09-15 17:13:39.768635: E tensorflow/core/common_runtime/direct_session.cc:158] Internal: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

【Question comments】:

    Tags: tensorflow


    【Answer 1】:

    Maybe your environment variables are causing the problem. Try this:

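    Add the lines below to the end of your ~/.bashrc file, open a new terminal, start a Python session there, then import tensorflow and see whether it works (tensorflow-gpu must already be installed; it is distributed through pip, not apt):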

    vim ~/.bashrc
    

    Add these at the end of the file and restart the terminal:

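    export CUDA_HOME="/usr/local/cuda-9.0"
    # Prepend instead of overwriting, so existing library paths are kept
    export LD_LIBRARY_PATH="${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
    export PATH="${CUDA_HOME}/bin:${PATH}"
    # DYLD_LIBRARY_PATH is only read on macOS; it has no effect on Ubuntu
    export DYLD_LIBRARY_PATH="${CUDA_HOME}/lib"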
    

    Edit 1

    Make sure "/usr/local/cuda-9.0" is the directory where you installed CUDA.
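To confirm that the variables actually took effect in a new terminal, something like the following sketch can help. It only lists `libcudart.so*` files found in the directories on `LD_LIBRARY_PATH`; `find_cudart` is a hypothetical helper name, and the assumption that a CUDA 9.0 build of TensorFlow wants `libcudart.so.9.0` should be checked against your own install.

```python
import glob
import os


def find_cudart(ld_library_path):
    """Return libcudart shared objects found in the directories of a path string."""
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        if directory:  # skip empty entries such as a trailing ':'
            hits.extend(sorted(glob.glob(os.path.join(directory, "libcudart.so*"))))
    return hits


if __name__ == "__main__":
    print("CUDA_HOME =", os.environ.get("CUDA_HOME"))
    found = find_cudart(os.environ.get("LD_LIBRARY_PATH", ""))
    if not found:
        print("no libcudart found on LD_LIBRARY_PATH")
    for path in found:
        print("found", path)
```

If nothing is listed, the exports were probably not picked up by the shell that started Python, or the CUDA directory differs from the one in the exports.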

    【Comments】:

    • This added more errors.. :( I have updated the errors in my post
    • What does nvcc --version return in your terminal? @HyeongGyuFroilanChoi
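A small sketch for collecting the two version numbers asked about in the comment above: it reads the toolkit release from `nvcc --version` and the driver version from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. The helper names are made up for illustration, and the regular expression assumes the usual "release X.Y" wording in nvcc's output.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract the toolkit release (e.g. '9.0') from `nvcc --version` text."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def run_tool(command):
    """Run a command and return its stdout, or None if it is missing or fails."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # works on Python 3.6, as used in the question
    )
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    nvcc_output = run_tool(["nvcc", "--version"])
    driver = run_tool(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"]
    )
    release = parse_nvcc_release(nvcc_output) if nvcc_output else None
    print("toolkit release:", release or "unavailable")
    print("driver version:", driver or "unavailable")
```

Note that nvcc reports the toolkit on `PATH`, which is not necessarily the CUDA runtime that the TensorFlow binary loads.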