【问题标题】:Theano: GPU Usage on a New Jetson TX1Theano:新 Jetson TX1 上的 GPU 使用情况
【发布时间】:2017-01-25 22:10:34
【问题描述】:

(下面问题中的错误消息很长。TL;DR,这里是具体问题:为什么这个测试代码不能在 TX1 的 GPU 上执行,我需要做什么才能让它这样做?)

我刚刚刷机并安装了带有 JetPack 2.3 的全新 Nvidia Jetson TX1。我正在尝试将 Theano 安装在 TX1 上,以便能够使用板载 GPU,用于进一步的机器学习和神经网络应用程序。

但是,我似乎无法让 GPU 本身工作。

Theano 的安装取自 here:

sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libblas-dev git
pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git --user  # Need Theano 0.8(not yet released) or more recent

安装的Theano版本是0.9.0.dev2,python版本是2.7.12。

我使用了here的测试脚本:

from theano import function, config, shared, tensor
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, tensor.Elemwise) and
              ('Gpu' not in type(x.op).__name__)
              for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

按照建议运行时:

THEANO_FLAGS=device=cuda0 python gpu_tutorial1.py

我收到以下响应,其中充满了错误、警告和在 CPU 而不是 GPU 上的执行:

ERROR (theano.gpuarray): pygpu was configured but could not be imported
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 21, in <module>
    import pygpu
ImportError: No module named pygpu
WARNING (theano.gof.cmodule): OPTIMIZATION WARNING: Theano was not able to find the default g++ parameters. This is needed to tune the compilation to your specific CPU. This can slow down the execution of Theano functions. Please submit the following lines to Theano's mailing list so that we can fix this problem:
 ['# 1 "<stdin>"\n', '# 1 "<built-in>"\n', '# 1 "<command-line>"\n', '# 1 "/usr/include/stdc-predef.h" 1 3 4\n', '# 1 "<command-line>" 2\n', '# 1 "<stdin>"\n', 'Using built-in specs.\n', 'COLLECT_GCC=/usr/bin/g++\n', 'Target: aarch64-linux-gnu\n', "Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 --with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu\n", 'Thread model: posix\n', 'gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2) \n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n", ' /usr/lib/gcc/aarch64-linux-gnu/5/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -fstack-protector-strong -Wformat -Wformat-security\n', 'ignoring nonexistent directory "/usr/local/include/aarch64-linux-gnu"\n', 'ignoring nonexistent directory "/usr/lib/gcc/aarch64-linux-gnu/5/../../../../aarch64-linux-gnu/include"\n', '#include "..." search starts here:\n', '#include <...> search starts here:\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include\n', ' /usr/local/include\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include-fixed\n', ' /usr/include/aarch64-linux-gnu\n', ' /usr/include\n', 'End of search list.\n', 'COMPILER_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/\n', 'LIBRARY_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../../lib/:/lib/aarch64-linux-gnu/:/lib/../lib/:/usr/lib/aarch64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../:/lib/:/usr/lib/\n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n"]
[Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
Looping 1000 times took 12.736936 seconds
Result is [ 1.23178032  1.61879341  1.52278065 ...,  2.20771815  2.29967753
  1.62323285]
Used the cpu

当我将设备标志更改为“gpu”时:

THEANO_FLAGS=device=gpu python gpu_tutorial1.py

情况有所改善,至少找到了 NVIDIA Tegra X1,尽管它最终没有被使用:

Using gpu device 0: NVIDIA Tegra X1 (CNMeM is disabled, cuDNN 5105)
WARNING (theano.gof.cmodule): OPTIMIZATION WARNING: Theano was not able to find the default g++ parameters. This is needed to tune the compilation to your specific CPU. This can slow down the execution of Theano functions. Please submit the following lines to Theano's mailing list so that we can fix this problem:
 ['# 1 "<stdin>"\n', '# 1 "<built-in>"\n', '# 1 "<command-line>"\n', '# 1 "/usr/include/stdc-predef.h" 1 3 4\n', '# 1 "<command-line>" 2\n', '# 1 "<stdin>"\n', 'Using built-in specs.\n', 'COLLECT_GCC=/usr/bin/g++\n', 'Target: aarch64-linux-gnu\n', "Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 --with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu\n", 'Thread model: posix\n', 'gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2) \n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n", ' /usr/lib/gcc/aarch64-linux-gnu/5/cc1 -E -quiet -v -imultiarch aarch64-linux-gnu - -mlittle-endian -mabi=lp64 -fstack-protector-strong -Wformat -Wformat-security\n', 'ignoring nonexistent directory "/usr/local/include/aarch64-linux-gnu"\n', 'ignoring nonexistent directory "/usr/lib/gcc/aarch64-linux-gnu/5/../../../../aarch64-linux-gnu/include"\n', '#include "..." search starts here:\n', '#include <...> search starts here:\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include\n', ' /usr/local/include\n', ' /usr/lib/gcc/aarch64-linux-gnu/5/include-fixed\n', ' /usr/include/aarch64-linux-gnu\n', ' /usr/include\n', 'End of search list.\n', 'COMPILER_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/\n', 'LIBRARY_PATH=/usr/lib/gcc/aarch64-linux-gnu/5/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../../lib/:/lib/aarch64-linux-gnu/:/lib/../lib/:/usr/lib/aarch64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/aarch64-linux-gnu/5/../../../:/lib/:/usr/lib/\n', "COLLECT_GCC_OPTIONS='-E' '-v' '-shared-libgcc' '-mlittle-endian' '-mabi=lp64'\n"]
[Elemwise{exp,no_inplace}(<TensorType(float64, vector)>)]
Looping 1000 times took 12.820628 seconds
Result is [ 1.23178032  1.61879341  1.52278065 ...,  2.20771815  2.29967753
  1.62323285]
Used the cpu

我确实计划将警告行发送到 Theano 邮件列表,但该警告似乎与我目前的主要问题无关:为什么此测试代码不在 TX1 的 GPU 上执行,我需要什么怎么做才能做到这一点?

【问题讨论】:

    标签: theano theano-cuda


    【解决方案1】:

    事实证明,该站点上推荐的 CLI 调用不正确。正确的调用是:

    THEANO_FLAGS='device=gpu,floatX=float32' python gpu_tutorial1.py
    

    这足以在 GPU 上以令人满意的加速执行(在输出中明显并报告),并摆脱那种长得令人毛骨悚然的错误警告。

    将这两个标志放在一个 .theanorc 文件中也足够了,并且简化了调用。

    【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2017-04-30
    • 2019-11-19
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2012-05-29
    相关资源
    最近更新 更多