In this howto we will get CUDA working in Docker - and, as a bonus, add TensorFlow on top! Please note that you'll need the following prerequisites:
GNU/Linux x86_64 with kernel version > 3.10
Docker >= 1.9 (official docker-engine, docker-ce or docker-ee only)
NVIDIA GPU with architecture > Fermi (2.1)
NVIDIA drivers >= 340.29 with binary nvidia-modprobe
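A quick way to sanity-check the kernel and Docker prerequisites before starting (a minimal sketch; the Docker and GPU checks are guarded in case those tools aren't installed yet):

```shell
# Check kernel version (must be > 3.10)
uname -r

# Check Docker version (must be >= 1.9); guarded in case Docker is missing
command -v docker >/dev/null && docker --version || echo "docker not found"

# Check for an NVIDIA GPU on the PCI bus; guarded in case lspci is missing
lspci | grep -i nvidia || echo "no NVIDIA device found via lspci"
```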
We will install the NVIDIA drivers in this tutorial, so you only need the right kernel and Docker versions already installed; we're using an Ubuntu 15.05 x64 machine here. For CUDA you'll need a Fermi 2.1 card (or better); for TensorFlow, a card with CUDA compute capability >= 3.0.
Which graphics card model do I own?
lspci | grep VGA
sudo lshw -C video
Example output:
product: GF108 [GeForce GT 430] vendor: NVIDIA Corporation
Look up whether your card works with CUDA / Fermi 2.1, e.g. on https://developer.nvidia.com/cuda-gpus
GeForce GT 430 - Compute: 2.1
Ok, that one works!
I got additional information from: https://www.geforce.com/hardware/desktop-gpus/geforce-gt-430/specifications
CUDA and Docker?
You can find out more about that topic on https://github.com/NVIDIA/nvidia-docker
Getting it to work will be the next step:
Download the right CUDA / NVIDIA driver
from http://www.nvidia.com/object/unix.html
I chose Linux x86_64/AMD64/EM64T, latest Long Lived Branch version 375.66 - but please check the file's description to make sure your graphics card is supported!
After the download, install the driver:
chmod +x NVIDIA-Linux-x86_64-375.66.run
sudo ./NVIDIA-Linux-x86_64-375.66.run
The installer will ask for permission; accept it. If it reports that the nouveau driver needs to be disabled, accept that as well - it will generate a blacklist file and exit the setup. Afterwards, run
sudo update-initramfs -u
and reboot your server. Then, rerun the setup with
sudo ./NVIDIA-Linux-x86_64-375.66.run
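For reference, the blacklist file the installer generates usually looks like the fragment below (the exact path and contents can vary by driver version; commonly it lands in /etc/modprobe.d/):

```
# generated by nvidia-installer
blacklist nouveau
options nouveau modeset=0
```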
You can check the installation with
nvidia-smi
and get an output similar to this one:
Mon Jul 24 09:03:47 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 430      Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   40C    P0    N/A /  N/A |      0MiB /  963MiB  |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
which means that it worked!
Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
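Before testing with a container, you can check whether nvidia-docker-plugin is up: it serves a small REST API on port 3476. The endpoint path below is an assumption based on the nvidia-docker 1.0 wiki, and the call is guarded so it degrades gracefully if the plugin isn't reachable:

```shell
# Query the nvidia-docker-plugin REST API (default port 3476);
# falls back to a message if the plugin is not reachable
curl -s http://localhost:3476/v1.0/gpu/info || echo "plugin not reachable on :3476"
```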
Test nvidia-smi from Docker
nvidia-docker run --rm nvidia/cuda nvidia-smi
should output:
Using default tag: latest
latest: Pulling from nvidia/cuda
e0a742c2abfd: Pull complete
486cb8339a27: Pull complete
dc6f0d824617: Pull complete
4f7a5649a30e: Pull complete
672363445ad2: Pull complete
ba1240a1e18b: Pull complete
e875cd2ab63c: Pull complete
e87b2e3b4b38: Pull complete
17f7df84dc83: Pull complete
6c05bfef6324: Pull complete
Digest: sha256:c8c492ec656ecd4472891cd01d61ed3628d195459d967f833d83ffc3770a9d80
Status: Downloaded newer image for nvidia/cuda:latest
Mon Jul 24 07:07:12 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 430      Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   40C    P8    N/A /  N/A |      0MiB /  963MiB  |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
Yep, you got it working in Docker!
Running an interactive CUDA session isolating the first GPU
NV_GPU=0 nvidia-docker run -ti --rm nvidia/cuda
Enter our first Hello World program
echo '#include <stdio.h>

// Kernel execution with __global__: empty function at this point
__global__ void kernel(void) {
    // printf("Hello, Cuda!\n");
}

int main(void) {
    // Kernel execution with <<<1,1>>>
    kernel<<<1,1>>>();
    printf("Hello, World!\n");
    return 0;
}' > helloWorld.cu
Compile it within the Docker container
nvcc helloWorld.cu -o helloWorld
Execute it...
./helloWorld
and you get...
Hello, World!
Congrats, you got it working!
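To make the GPU actually compute something, the same pattern can be extended to a small vector addition. This is a minimal sketch: `vectorAdd.cu` and the `add` kernel are names chosen here for illustration, `-arch=sm_21` targets the Fermi 2.1 card from above, and the compile/run step is skipped if nvcc is not on the PATH:

```shell
# Write a small CUDA vector-addition example (same pattern as the hello world)
cat > vectorAdd.cu << 'EOF'
#include <stdio.h>

#define N 8

// Each thread adds one pair of elements
__global__ void add(const int *a, const int *b, int *c) {
    int i = threadIdx.x;
    if (i < N) c[i] = a[i] + b[i];
}

int main(void) {
    int a[N], b[N], c[N];
    int *da, *db, *dc;
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2 * i; }

    // Allocate device memory and copy the inputs over
    cudaMalloc(&da, N * sizeof(int));
    cudaMalloc(&db, N * sizeof(int));
    cudaMalloc(&dc, N * sizeof(int));
    cudaMemcpy(da, a, N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(db, b, N * sizeof(int), cudaMemcpyHostToDevice);

    // One block, N threads: one thread per element
    add<<<1, N>>>(da, db, dc);

    // Copy the result back and print it
    cudaMemcpy(c, dc, N * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < N; i++) printf("%d + %d = %d\n", a[i], b[i], c[i]);

    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}
EOF

# Compile and run (only if nvcc is available, e.g. inside the CUDA container)
if command -v nvcc >/dev/null; then
    nvcc -arch=sm_21 vectorAdd.cu -o vectorAdd && ./vectorAdd
fi
```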
Encore, Tensorflow
Getting TensorFlow to work is straightforward:
nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu
It will output something like:
Copy/paste this URL into your browser when you connect for the first time, to login with a token: http://localhost:8888/?token=d747247b33023883c1a929bc97d9a115e8b2dd0db9437620
You should do that 🙂
Then enter the 1_hello_tensorflow notebook and run the first sample:
from __future__ import print_function

import tensorflow as tf

with tf.Session():
    input1 = tf.constant([1.0, 1.0, 1.0, 1.0])
    input2 = tf.constant([2.0, 2.0, 2.0, 2.0])
    output = tf.add(input1, input2)
    result = output.eval()
    print("result: ", result)
by selecting it and clicking the >| (run cell, select below) button.
This worked for me:
result: [ 3. 3. 3. 3.]
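The numbers are easy to sanity-check: the cell just adds two 4-vectors elementwise. The same arithmetic in plain Python, without TensorFlow (purely illustrative):

```shell
# Elementwise addition of the two constant vectors, no TensorFlow needed
python3 -c '
input1 = [1.0, 1.0, 1.0, 1.0]
input2 = [2.0, 2.0, 2.0, 2.0]
result = [a + b for a, b in zip(input1, input2)]
print("result:", result)
'
# prints: result: [3.0, 3.0, 3.0, 3.0]
```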
however... sadly, it was not the GPU computing the results, as shown by the Docker CLI:
Kernel started: 2bc4c3b0-61f3-4ec8-b95b-88ed06379d85
[I 07:31:45.544 NotebookApp] Adapting to protocol v5.1 for kernel 2bc4c3b0-61f3-4ec8-b95b-88ed06379d85
2017-07-24 07:32:17.780122: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-24 07:32:17.837112: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-07-24 07:32:17.837440: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GT 430
major: 2 minor: 1 memoryClockRate (GHz) 1.4
pciBusID 0000:01:00.0
Total memory: 963.19MiB
Free memory: 954.56MiB
2017-07-24 07:32:17.837498: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-07-24 07:32:17.837522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y
2017-07-24 07:32:17.837549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] Ignoring visible gpu device (device: 0, name: GeForce GT 430, pci bus id: 0000:01:00.0) with Cuda compute capability 2.1. The minimum required Cuda capability is 3.0.
So, TensorFlow only supports devices with CUDA compute capability >= 3.0 🙁 - but it still works, as it falls back to the CPU (however, not as fast as it could be :/)
Information taken from:
https://github.com/NVIDIA/nvidia-docker
https://developer.nvidia.com/cuda-gpus
https://hub.docker.com/r/tensorflow/tensorflow/