【Posted】: 2021-01-19 16:05:00
【Question】:
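I'm new to Docker, so the question below may be a bit naive, but I'm stuck and need help.
I'm trying to reproduce some research results. The authors simply released code along with a specification of how to build a Docker image to reproduce their results. The relevant bits are copied below:
I believe I installed Docker correctly: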
$ docker --version
Docker version 19.03.13, build 4484c46d9d
$ sudo docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
However, when I try to check whether my nvidia-docker installation succeeded, I get the following error:
$ sudo docker run --gpus all --rm nvidia/cuda:10.1-base nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: nvml error: driver not loaded\\\\n\\\"\"": unknown.
The key error appears to be:
nvidia-container-cli: initialization error: nvml error: driver not loaded
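To confirm from the host whether the NVIDIA kernel driver is actually loaded, a minimal sketch along these lines might help. It assumes a Linux host where loaded kernel modules are listed in /proc/modules and where the driver registers a module named nvidia; the helper name nvidia_driver_loaded is hypothetical and not part of Docker or the NVIDIA tooling.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a module named "nvidia" appears in the
# given module list. Defaults to /proc/modules, the kernel's list of loaded
# modules on Linux; the optional argument exists only to make testing easy.
nvidia_driver_loaded() {
    modules_file="${1:-/proc/modules}"
    [ -r "$modules_file" ] && grep -q '^nvidia ' "$modules_file"
}

if nvidia_driver_loaded; then
    echo "NVIDIA kernel driver is loaded"
else
    echo "NVIDIA kernel driver is not loaded (consistent with the nvml error above)"
fi
```

On a machine without an NVIDIA GPU the second message is the expected one, since there is no device for the driver to bind to.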
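I don't have a GPU locally, and I've found conflicting information about whether CUDA needs to be installed before NVIDIA Docker. For example, this NVIDIA moderator says: "A proper nvidia docker plugin installation starts with a proper CUDA install on the base machine."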
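My questions are as follows:
- Can I install NVIDIA Docker without installing CUDA?
- If so, what is the source of this error, and how do I fix it?
- If not, how can I build this Docker image to reproduce the results?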
【Comments】:
Tags: docker nvidia nvidia-docker