Run a GPU-accelerated application in a Docker container on a cloud server

On cloud servers with GPUs, Docker containers let you manage GPU-accelerated applications flexibly, without setting up a separate environment for each one.

A containerized environment lets you:

- use resources efficiently: you can run several applications on one server even if they need different environments;
- avoid CUDA Toolkit version conflicts between your applications.

Selectel offers ready-to-use images with preinstalled GPU drivers and Docker for running GPU-accelerated applications in containers:

- Ubuntu 24.04 LTS 64-bit GPU Driver 535 Docker;
- Ubuntu 24.04 LTS 64-bit GPU Driver 580 Docker;
- Ubuntu 22.04 LTS 64-bit GPU Driver 535 Docker;
- Ubuntu 22.04 LTS 64-bit GPU Driver 580 Docker.

Requirements for the cloud server

The cloud server must have:

- a configuration with a GPU;
- a source image with preinstalled GPU drivers and Docker;
- a network volume or local disk larger than 40 GB.

Run a GPU-accelerated application in a Docker container on a server

1. Run the pytorch-cuda sample in a Docker container.
2. Create a custom Docker image with CUDA.

1. Run the pytorch-cuda sample in a Docker container

Run PyTorch inside a Docker container with GPU support.

Open the CLI.

Make sure the GPU on the server is working correctly:

nvidia-smi

The output shows the NVIDIA-SMI version, the driver version, and the CUDA version. The CUDA version is the latest one that the current driver supports; it is not necessarily installed on the system. For example:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000000:00:06.0 Off |                    0 |
| N/A   41C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
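If you want to read the driver and CUDA versions from a script rather than by eye, the banner line can be parsed. This is a minimal sketch, assuming the banner keeps the layout shown in the example above; `parse_banner` and `SAMPLE_BANNER` are hypothetical names used only for illustration.

```python
import re

# Banner line copied from the example nvidia-smi output above.
SAMPLE_BANNER = (
    "| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15"
    "      CUDA Version: 12.4     |"
)


def parse_banner(text):
    """Return (driver_version, cuda_version) found in nvidia-smi output, or None."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text
    )
    if match is None:
        # The banner is missing or has a layout this sketch does not expect.
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(parse_banner(SAMPLE_BANNER))
```

On the server, the text to parse could come from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`.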

Run a container from the NVIDIA container registry (nvcr.io):
```
sudo docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:<pytorch_version>-py3 bash
```
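Specify <pytorch_version>: the version in the image tag. In this registry, tags follow the container release in YY.MM form (for example, 24.05-py3) rather than the PyTorch library version.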

Make sure that the CUDA Toolkit is installed in the container and the GPU is available for computation. In the Python interpreter, run:

import torch

print("CUDA Available: ", torch.cuda.is_available())
print("Number of GPUs: ", torch.cuda.device_count())

Example output:

CUDA Available:  True
Number of GPUs:  1
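The check above only reports that a device is visible. A slightly stronger test is to run a small computation on it. The sketch below is one possible way to do that; `gpu_smoke_test` is a hypothetical helper, and the guarded import is an assumption made so that the script also reports cleanly when PyTorch is not installed.

```python
import importlib.util


def gpu_smoke_test(size=1024):
    """Multiply two random matrices on the GPU, if one is available, and report."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False, "cuda_available": False}

    import torch

    report = {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        device = torch.device("cuda:0")
        a = torch.rand(size, size, device=device)
        b = torch.rand(size, size, device=device)
        product = a @ b
        torch.cuda.synchronize()  # wait until the GPU kernel has finished
        report["device_name"] = torch.cuda.get_device_name(0)
        report["result_device"] = str(product.device)
    return report


if __name__ == "__main__":
    print(gpu_smoke_test())
```

If `cuda_available` is `False` inside the container, a common cause is that the container was started without the `--gpus all` option.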

Make sure that CUDA Runtime is installed in the container. The PyTorch build in this example requires version 12.1:

conda list | grep cud

Example output:

libcudnn9-cuda-12         9.1.1.17                      0    nvidia
cuda-cudart               12.1.105                      0    nvidia
cuda-cupti                12.1.105                      0    nvidia
cuda-libraries            12.1.0                        0    nvidia
cuda-nvrtc                12.1.105                      0    nvidia
cuda-nvtx                 12.1.105                      0    nvidia
cuda-opencl               12.3.101                      0    nvidia
cuda-runtime              12.1.0                        0    nvidia

The container ships its own CUDA Runtime, so you do not need to install it on the server OS.
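The version check on the `conda list` output can also be scripted. This is a sketch under the assumption that each line has the name and version in its first two whitespace-separated columns, as in the example output; `parse_conda_list`, `has_runtime`, and `SAMPLE` are hypothetical names.

```python
# A few lines copied from the example `conda list | grep cud` output above.
SAMPLE = """\
libcudnn9-cuda-12         9.1.1.17                      0    nvidia
cuda-cudart               12.1.105                      0    nvidia
cuda-runtime              12.1.0                        0    nvidia
"""


def parse_conda_list(text):
    """Map package name to version for each non-comment line of the output."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines, and lines without a version column.
        if len(fields) < 2 or line.lstrip().startswith("#"):
            continue
        versions[fields[0]] = fields[1]
    return versions


def has_runtime(versions, wanted="12.1"):
    """True if cuda-runtime is present and matches the wanted major.minor."""
    version = versions.get("cuda-runtime", "")
    return version.split(".")[:2] == wanted.split(".")


if __name__ == "__main__":
    print(has_runtime(parse_conda_list(SAMPLE)))
```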

2. Create a custom Docker image with CUDA

Run the ready-to-use container:

docker run --gpus all -it --rm nvcr.io/nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04

The container will have compatible versions of CUDA Toolkit, CUDA Runtime, and libcudnn preinstalled:

cuda-cudart-12-8                12.8.90-1                   amd64        CUDA Runtime native Libraries
cuda-nvcc-12-8                  12.8.93-1                   amd64        CUDA nvcc
cuda-toolkit-config-common      12.8.90-1                   all          Common config package for CUDA Toolkit.
libcudnn9-cuda-12               9.8.0.87-1                  amd64        cuDNN runtime libraries for CUDA 12.8

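Install Python 3, pip, and TensorFlow. The pip config command lets pip install packages into the system Python environment, which Ubuntu 24.04 blocks by default: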

apt update && apt -y install python3 python3-pip
python3 -m pip config set global.break-system-packages true
python3 -m pip install tensorflow

Make sure that the GPU is available in the Docker container:

python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000]))); print('Available GPUs: ', tf.config.list_physical_devices('GPU'))"

Example output:

I0000 00:00:1743408862.613883     910 gpu_device.cc:2019] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4287 MB memory:  -> device: 0, name: NVIDIA RTX A2000, pci bus id: 0000:00:06.0, compute capability: 8.6
tf.Tensor(-1418.5072, shape=(), dtype=float32)
Available GPUs:  [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Exit the shell without stopping the container: press Ctrl + P, and then Ctrl + Q.

Check that the container is running:

docker ps -a

Example output:

CONTAINER ID   IMAGE                                                COMMAND                  CREATED          STATUS                      PORTS     NAMES
20d557a37bdd   nvcr.io/nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04   "/opt/nvidia/nvidia_…"   24 minutes ago   Up 24 minutes                         nifty_shtern

In the CONTAINER ID column, copy the ID of the container you started from the nvcr.io/nvidia/cuda image.
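The ID lookup can be automated as well. The sketch below assumes the listing was produced with `docker ps -a --format "{{.ID}}|{{.Image}}|{{.Status}}"`, which prints one `|`-separated row per container; `find_running_container` and `SAMPLE` are hypothetical names, and the sample row is built from the example output above.

```python
# One row in the assumed "ID|Image|Status" layout, based on the example above.
SAMPLE = (
    "20d557a37bdd|nvcr.io/nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04"
    "|Up 24 minutes\n"
)


def find_running_container(ps_output, image):
    """Return the ID of the first running container created from image, or None."""
    for line in ps_output.splitlines():
        parts = line.split("|")
        if len(parts) != 3:
            continue  # not a row in the expected layout
        container_id, container_image, status = parts
        # Running containers have a status that starts with "Up".
        if container_image == image and status.startswith("Up"):
            return container_id
    return None


if __name__ == "__main__":
    print(
        find_running_container(
            SAMPLE, "nvcr.io/nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04"
        )
    )
```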

Create the image:
```
docker commit <container_id> <image_tag>
```
Specify:
- <container_id>: the container ID you copied from the docker ps output;
- <image_tag>: the tag for the new image.
If the image is created successfully, its hash is displayed. Example output:
```
sha256:a7ff970295e5dd37ef441fcf0462752715c95cece2729ddcc774a8aaa0773bce
```
Create and run a custom container from the image:
```
docker run --gpus all --rm -it <image_tag> bash
```
Specify <image_tag>: the image tag you set in the docker commit command.

The --rm flag removes the container when you exit its bash shell.
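If you launch containers from a script, it can help to assemble the `docker run` arguments in one place. This is a minimal sketch, not a complete launcher: `docker_run_argv` is a hypothetical helper, `my-cuda-image` is a placeholder tag, and the `--gpus all` default is taken from the earlier `docker run` commands in this guide.

```python
import shlex


def docker_run_argv(image, command="bash", gpus="all", remove=True):
    """Build the argument list for an interactive docker run call."""
    argv = ["docker", "run"]
    if gpus:
        argv += ["--gpus", gpus]  # pass the server's GPUs through to the container
    if remove:
        argv.append("--rm")  # remove the container after its shell exits
    argv += ["-it", image, command]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_run_argv("my-cuda-image")))
```

The list can be passed to `subprocess.run` once the printed command looks right.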