【Question Title】: Blas GEMV launch failed: m=3, n=2 [Op:MatMul]
【Posted】: 2018-12-13 01:13:10
【Question】:

When I run the code below, I get this error:

E tensorflow/stream_executor/cuda/cuda_blas.cc:654] failed to run cuBLAS routine cublasSgemv_v2: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "modelAndLayer.py", line 16, in <module>
y_pred=model(X)
File "/home/cxsbg/anaconda3/envs/dl36/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/engine/base_layer.py", line 314, in __call__
output = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/home/cxsbg/anaconda3/envs/dl36/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 717, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "modelAndLayer.py", line 10, in call
output=self.dense(input)
File "/home/cxsbg/anaconda3/envs/dl36/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/engine/base_layer.py", line 314, in __call__
output = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/home/cxsbg/anaconda3/envs/dl36/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 717, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/home/cxsbg/anaconda3/envs/dl36/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 163, in call
outputs = gen_math_ops.mat_mul(inputs, self.kernel)
File "/home/cxsbg/anaconda3/envs/dl36/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 4305, in mat_mul
_six.raise_from(_core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError: Blas GEMV launch failed:  m=3, n=2 [Op:MatMul]

My graphics card is an RTX 2080 and the driver is v410. CUDA is v9.0 and cuDNN is v7. tensorflow-gpu is v1.8 (I tried both v1.8 and v1.12). Python is v3.6 (I tried both v3.6 and v2.7). The system is Ubuntu 16.04 (I also tried Windows 10).

The problem always occurs with tensorflow-gpu, but the same code works with the CPU build of TensorFlow.

Here is the code (a simple linear model):

import tensorflow as tf
tf.enable_eager_execution()
X=tf.constant([[1.,2.,3,],[4.,5.,6.]])
Y=tf.constant([[10.],[20.]])
class Linear(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense=tf.keras.layers.Dense(units=1,kernel_initializer=tf.zeros_initializer(),bias_initializer=tf.zeros_initializer())
    def call(self,input):
        output=self.dense(input)
        return output
model=Linear()
optimizer=tf.train.GradientDescentOptimizer(learning_rate=1e-3)
for i in range(1000):
    with tf.GradientTape() as tape:
        y_pred=model(X)
        loss=tf.reduce_mean(tf.square(y_pred-Y))
    grads=tape.gradient(loss,model.variables)
    optimizer.apply_gradients(zip(grads,model.variables))
print(model.variables)
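
Since the script works on the CPU build, it can help to know what a correct run computes. The following is a minimal plain-Python sketch of the same training loop, assuming that Dense(units=1) computes y = X·w + b with zero-initialized w and b. It is not TensorFlow's implementation, and the helper names (predict, mse, train) are hypothetical. With a single output unit the layer's MatMul reduces to a matrix-vector product, which is presumably why the failing cuBLAS routine is GEMV.

```python
# Plain-Python reference for the linear model above (no TensorFlow, no GPU).
# Assumption: Dense(units=1) computes y = X.w + b, with w and b starting at zero.

X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
Y = [10.0, 20.0]
LEARNING_RATE = 1e-3


def predict(w, b, rows):
    """Matrix-vector product plus bias: one output per input row."""
    return [sum(wi * xi for wi, xi in zip(w, row)) + b for row in rows]


def mse(pred, target):
    """Mean squared error, mirroring tf.reduce_mean(tf.square(y_pred - Y))."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def train(steps=1000):
    """Run plain gradient descent and return the fitted (w, b)."""
    w, b = [0.0, 0.0, 0.0], 0.0
    n = len(X)
    for _ in range(steps):
        errors = [p - t for p, t in zip(predict(w, b, X), Y)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = [2.0 / n * sum(e * row[j] for e, row in zip(errors, X))
                  for j in range(len(w))]
        grad_b = 2.0 / n * sum(errors)
        w = [wi - LEARNING_RATE * g for wi, g in zip(w, grad_w)]
        b -= LEARNING_RATE * grad_b
    return w, b


initial_loss = mse(predict([0.0, 0.0, 0.0], 0.0, X), Y)
w, b = train()
final_loss = mse(predict(w, b, X), Y)
print("initial loss:", initial_loss)
print("final loss:", final_loss)
print("weights:", w, "bias:", b)
```

If the GPU build is working, print(model.variables) in the original script should show weights and a bias close to the values printed here, up to float32 rounding.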

【Comments】:

  • This is a bug report. You should file an issue on the TensorFlow GitHub repository.

Tags: python tensorflow gpu blas


【Solution 1】:

I think the error is caused by tf.enable_eager_execution(), because I tested it many times. Thanks to the author of which-version-of-cuda-can-work-with-rtx-2080. The error was fixed when I switched to CUDA 9.2.
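
To confirm which CUDA toolkit is actually installed after switching, something like the following sketch may help. It assumes nvcc is on PATH and that its version text contains a "release X.Y" field. The function names are hypothetical, and the 9.2 threshold comes only from this answer. Note that nvcc reports the toolkit found on PATH, which is not necessarily the CUDA runtime that the installed TensorFlow build loads.

```python
import re
import shutil
import subprocess


def parse_cuda_release(nvcc_output):
    """Return (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Ask nvcc for its release; returns None when nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return parse_cuda_release(result.stdout)


def meets_minimum(release, minimum=(9, 2)):
    """True if a detected release is at least `minimum` (9.2 per this answer)."""
    return release is not None and release >= minimum


release = installed_cuda_release()
print("CUDA toolkit release:", release)
print("At least 9.2:", meets_minimum(release))
```

A result of None only means nvcc was not found. It says nothing about whether TensorFlow can reach a CUDA runtime through other library paths.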
