In this howto we will get CUDA working in Docker - and, as a bonus, add TensorFlow on top! Please note that you'll need the following prerequisites:
GNU/Linux x86_64 with kernel version > 3.10
Docker >= 1.9 (official docker-engine, docker-ce or docker-ee only)
NVIDIA GPU with architecture > Fermi (2.1)
NVIDIA drivers >= 340.29 with binary nvidia-modprobe
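A quick way to sanity-check the kernel and Docker prerequisites before starting (a minimal sketch; the Docker and GPU checks are guarded in case those tools aren't installed yet):

```shell
# Check kernel version (must be > 3.10)
uname -r

# Check Docker version (must be >= 1.9); guarded in case Docker is missing
command -v docker >/dev/null && docker --version || echo "docker not found"

# Check for an NVIDIA GPU on the PCI bus; guarded in case lspci is missing
lspci | grep -i nvidia || echo "no NVIDIA device found via lspci"
```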
We will install the NVIDIA drivers in this tutorial, so you only need the right kernel and Docker versions already installed; we're using an Ubuntu 15.05 x64 machine here. For CUDA you'll need a Fermi 2.1 card (or better); for TensorFlow, a card with CUDA compute capability >= 3.0.
Which graphics card model do I own?
lspci | grep VGA
sudo lshw -C video
Example output:
product: GF108 [GeForce GT 430] vendor: NVIDIA Corporation
Look up whether your card works with CUDA / Fermi 2.1, e.g. on https://developer.nvidia.com/cuda-gpus
GeForce GT 430 - Compute: 2.1
Ok, that one works!
I got additional information from: https://www.geforce.com/hardware/desktop-gpus/geforce-gt-430/specifications
CUDA and Docker?
You can find out more about that topic on https://github.com/NVIDIA/nvidia-docker
Getting it to work will be the next step:
Download the right CUDA / NVIDIA driver
from http://www.nvidia.com/object/unix.html
I chose Linux x86_64/AMD64/EM64T, latest Long Lived Branch version 375.66 - but please check the file's description to make sure your graphics card is supported!
After the download, install the driver:
chmod +x NVIDIA-Linux-x86_64-375.66.run
sudo ./NVIDIA-Linux-x86_64-375.66.run
The installer will ask for permission; accept it. If it reports that the nouveau driver needs to be disabled, accept that as well - it will generate a blacklist file and exit the setup. Afterwards, run
sudo update-initramfs -u
and reboot your server. Then, rerun the setup with
sudo ./NVIDIA-Linux-x86_64-375.66.run
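For reference, the blacklist file the installer generates usually looks like the fragment below (the exact path and contents can vary by driver version; commonly it lands in /etc/modprobe.d/):

```
# generated by nvidia-installer
blacklist nouveau
options nouveau modeset=0
```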
You can check the installation with
nvidia-smi
and get an output similar to this one:
Mon Jul 24 09:03:47 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 430      Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   40C    P0    N/A /  N/A |      0MiB /  963MiB  |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
which means that it worked!
Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
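Before testing with a container, you can check whether nvidia-docker-plugin is up: it serves a small REST API on port 3476. The endpoint path below is an assumption based on the nvidia-docker 1.0 wiki, and the call is guarded so it degrades gracefully if the plugin isn't reachable:

```shell
# Query the nvidia-docker-plugin REST API (default port 3476);
# falls back to a message if the plugin is not reachable
curl -s http://localhost:3476/v1.0/gpu/info || echo "plugin not reachable on :3476"
```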
Test nvidia-smi from Docker
nvidia-docker run --rm nvidia/cuda nvidia-smi
should output:
Using default tag: latest
latest: Pulling from nvidia/cuda
e0a742c2abfd: Pull complete
486cb8339a27: Pull complete
dc6f0d824617: Pull complete
4f7a5649a30e: Pull complete
672363445ad2: Pull complete
ba1240a1e18b: Pull complete
e875cd2ab63c: Pull complete
e87b2e3b4b38: Pull complete
17f7df84dc83: Pull complete
6c05bfef6324: Pull complete
Digest: sha256:c8c492ec656ecd4472891cd01d61ed3628d195459d967f833d83ffc3770a9d80
Status: Downloaded newer image for nvidia/cuda:latest
Mon Jul 24 07:07:12 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 430      Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   40C    P8    N/A /  N/A |      0MiB /  963MiB  |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
Yep, you got it working in Docker!
Running an interactive CUDA session isolating the first GPU
NV_GPU=0 nvidia-docker run -ti --rm nvidia/cuda
Enter our first Hello World program
echo '#include <stdio.h>

// Kernel execution with __global__: empty function at this point
__global__ void kernel(void) {
    // printf("Hello, Cuda!\n");
}

int main(void) {
    // Kernel execution with <<<1,1>>>
    kernel<<<1,1>>>();
    printf("Hello, World!\n");
    return 0;
}' > helloWorld.cu
Compile it within the Docker container
nvcc helloWorld.cu -o helloWorld
Execute it...
./helloWorld
and you get...
Hello, World!
Congrats, you got it working!
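To make the GPU actually compute something, the same pattern can be extended to a small vector addition. This is a minimal sketch: `vectorAdd.cu` and the `add` kernel are names chosen here for illustration, `-arch=sm_21` targets the Fermi 2.1 card from above, and the compile/run step is skipped if nvcc is not on the PATH:

```shell
# Write a small CUDA vector-addition example (same pattern as the hello world)
cat > vectorAdd.cu << 'EOF'
#include <stdio.h>

#define N 8

// Each thread adds one pair of elements
__global__ void add(const int *a, const int *b, int *c) {
    int i = threadIdx.x;
    if (i < N) c[i] = a[i] + b[i];
}

int main(void) {
    int a[N], b[N], c[N];
    int *da, *db, *dc;
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2 * i; }

    // Allocate device memory and copy the inputs over
    cudaMalloc(&da, N * sizeof(int));
    cudaMalloc(&db, N * sizeof(int));
    cudaMalloc(&dc, N * sizeof(int));
    cudaMemcpy(da, a, N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(db, b, N * sizeof(int), cudaMemcpyHostToDevice);

    // One block, N threads: one thread per element
    add<<<1, N>>>(da, db, dc);

    // Copy the result back and print it
    cudaMemcpy(c, dc, N * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < N; i++) printf("%d + %d = %d\n", a[i], b[i], c[i]);

    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}
EOF

# Compile and run (only if nvcc is available, e.g. inside the CUDA container)
if command -v nvcc >/dev/null; then
    nvcc -arch=sm_21 vectorAdd.cu -o vectorAdd && ./vectorAdd
fi
```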
Encore, Tensorflow
Getting TensorFlow to work is straightforward:
nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu
It will output something like:
Copy/paste this URL into your browser when you connect for the first time, to login with a token: http://localhost:8888/?token=d747247b33023883c1a929bc97d9a115e8b2dd0db9437620
You should do that 🙂
Then enter the 1_hello_tensorflow notebook and run the first sample:
from __future__ import print_function

import tensorflow as tf

with tf.Session():
    input1 = tf.constant([1.0, 1.0, 1.0, 1.0])
    input2 = tf.constant([2.0, 2.0, 2.0, 2.0])
    output = tf.add(input1, input2)
    result = output.eval()
    print("result: ", result)
by selecting it and clicking the >| (run cell, select below) button.
This worked for me:
result: [ 3. 3. 3. 3.]
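The numbers are easy to sanity-check: the cell just adds two 4-vectors elementwise. The same arithmetic in plain Python, without TensorFlow (purely illustrative):

```shell
# Elementwise addition of the two constant vectors, no TensorFlow needed
python3 -c '
input1 = [1.0, 1.0, 1.0, 1.0]
input2 = [2.0, 2.0, 2.0, 2.0]
result = [a + b for a, b in zip(input1, input2)]
print("result:", result)
'
# prints: result: [3.0, 3.0, 3.0, 3.0]
```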
however... sadly, it was not the GPU computing the results, as shown by the Docker CLI:
Kernel started: 2bc4c3b0-61f3-4ec8-b95b-88ed06379d85
[I 07:31:45.544 NotebookApp] Adapting to protocol v5.1 for kernel 2bc4c3b0-61f3-4ec8-b95b-88ed06379d85
2017-07-24 07:32:17.780122: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-24 07:32:17.837112: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-07-24 07:32:17.837440: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GT 430
major: 2 minor: 1 memoryClockRate (GHz) 1.4
pciBusID 0000:01:00.0
Total memory: 963.19MiB
Free memory: 954.56MiB
2017-07-24 07:32:17.837498: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-07-24 07:32:17.837522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y
2017-07-24 07:32:17.837549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] Ignoring visible gpu device (device: 0, name: GeForce GT 430, pci bus id: 0000:01:00.0) with Cuda compute capability 2.1. The minimum required Cuda capability is 3.0.
So, TensorFlow only supports devices with CUDA compute capability >= 3.0 🙁 - but it still works, as it falls back to the CPU (however, not as fast as it could be :/)
Information taken from:
https://github.com/NVIDIA/nvidia-docker
https://developer.nvidia.com/cuda-gpus
https://hub.docker.com/r/tensorflow/tensorflow/