Run a GPU-accelerated application in a Docker container on a server
Docker containers can be used on GPU cloud servers to flexibly manage GPU-accelerated applications without additional environment customization.
A containerized environment allows you to:
- use resources optimally — you can run multiple applications that require different environments on a single server;
- avoid CUDA Toolkit version conflicts between your applications.
Selectel provides ready-made images with Docker and the latest minor versions of CUDA Toolkit 11 and 12 for running GPU-accelerated applications in containerized environments:
- Ubuntu 24.04 LTS 64-bit CUDA 11.8 Docker;
- Ubuntu 24.04 LTS 64-bit CUDA 12.8 Docker;
- Ubuntu 22.04 LTS 64-bit CUDA 11.8 Docker;
- Ubuntu 22.04 LTS 64-bit CUDA 12.8 Docker.
Cloud server requirements
The cloud server must:
- have a configuration with a GPU;
- be created from an image with pre-installed GPU drivers, CUDA Toolkit, and Docker;
- have a network or local disk larger than 40 GB.
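You can quickly confirm these prerequisites from the server CLI. A minimal check, assuming the GPU drivers, CUDA Toolkit, and Docker come from one of the ready-made images listed above:
nvidia-smi -L       # the GPU is visible to the driver
nvcc --version      # CUDA Toolkit version (nvcc may be in /usr/local/cuda/bin if it is not on PATH)
docker --version    # Docker is installed
df -h /             # free space on the server disk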
Run a GPU-accelerated application in a Docker container on a server
1. Run the pytorch-cuda sample in a Docker container
In this example, PyTorch is run in a Docker container and uses the GPU through the CUDA Toolkit.
1. Open the CLI.
2. Make sure the GPU on the server is working correctly:
nvidia-smi
The response lists the NVIDIA-SMI version, the driver version, and the highest CUDA version supported by the current driver (this version is reported by the driver and is not necessarily the CUDA version installed on the system). For example:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla T4 Off | 00000000:00:06.0 Off | 0 |
| N/A 41C P8 10W / 70W | 0MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
3. Start the container from the NVIDIA Container Registry:
sudo docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:<pytorch_version>-py3 bash
Specify <pytorch_version> — the PyTorch container version.
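For example, to start a container with a specific release (the tag below is only an example; check the NVIDIA NGC catalog for the tags that are actually available):
sudo docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:24.08-py3 bash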
4. Make sure the CUDA Toolkit is installed in the container and the GPU is available for computation. Start the Python interpreter in the container and run:
import torch
print("CUDA Available: ", torch.cuda.is_available())
print("Number of GPUs: ", torch.cuda.device_count())
Example output:
CUDA Available: True
Number of GPUs: 1
5. Exit the Python interpreter and make sure the container has CUDA Runtime 12.1 installed, which is required by the current version of PyTorch:
conda list | grep cud
Example output:
libcudnn9-cuda-12 9.1.1.17 0 nvidia
cuda-cudart 12.1.105 0 nvidia
cuda-cupti 12.1.105 0 nvidia
cuda-libraries 12.1.0 0 nvidia
cuda-nvrtc 12.1.105 0 nvidia
cuda-nvtx 12.1.105 0 nvidia
cuda-opencl 12.3.101 0 nvidia
cuda-runtime 12.1.0 0 nvidia
The CUDA Runtime does not need to be installed on the server OS.
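As an additional check, you can run a small computation on the GPU from the container shell (a minimal sketch; the tensor size is arbitrary):
python3 -c "import torch; x = torch.rand(1024, 1024, device='cuda'); print('Computed on:', (x @ x).device)"
If the GPU is usable, the command prints the CUDA device, for example Computed on: cuda:0.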
2. Create your own Docker image with CUDA
1. Start a container from the ready-made image:
docker run --gpus all -it --rm nvcr.io/nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04
Compatible versions of the CUDA Toolkit, CUDA Runtime, and libcudnn are pre-installed in the container:
cuda-cudart-12-8 12.8.90-1 amd64 CUDA Runtime native Libraries
cuda-nvcc-12-8 12.8.93-1 amd64 CUDA nvcc
cuda-toolkit-config-common 12.8.90-1 all Common config package for CUDA Toolkit.
libcudnn9-cuda-12 9.8.0.87-1 amd64 cuDNN runtime libraries for CUDA 12.8
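You can verify these versions inside the container (a minimal check using dpkg):
dpkg -l | grep -E "cuda-cudart|cuda-nvcc|cuda-toolkit|libcudnn"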
2. Install Python 3 and TensorFlow:
apt update && apt -y install python3 python3-pip
python3 -m pip config set global.break-system-packages true
python3 -m pip install tensorflow
3. Make sure the GPU is available in the Docker container:
python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000]))); print('Available GPUs:', tf.config.list_physical_devices('GPU'))"
Example output:
I0000 00:00:1743408862.613883 910 gpu_device.cc:2019] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4287 MB memory: -> device: 0, name: NVIDIA RTX A2000, pci bus id: 0000:00:06.0, compute capability: 8.6
tf.Tensor(-1418.5072, shape=(), dtype=float32)
Available GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
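You can also confirm that the GPU is exposed to the container at the driver level. nvidia-smi is available inside the container because the NVIDIA container runtime mounts it when the container is started with --gpus all:
nvidia-smi -L
The command lists the GPUs visible inside the container.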
4. Exit the shell without interrupting the container: press Ctrl + P and then Ctrl + Q.
5. Verify that the container is running:
docker ps -a
Example output:
CONTAINER ID   IMAGE                                                COMMAND                  CREATED          STATUS          PORTS   NAMES
20d557a37bdd   nvcr.io/nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04   "/opt/nvidia/nvidia_…"   24 minutes ago   Up 24 minutes           nifty_shtern
From the CONTAINER ID column, copy the ID of the container you started in step 1.
6. Create an image:
docker commit <container_id> <image_tag>
Specify:
<container_id> — the ID of the container you copied in step 5;
<image_tag> — the image tag.
If the image has been created, its hash will be output. Example output:
sha256:a7ff970295e5dd37ef441fcf0462752715c95cece2729ddcc774a8aaa0773bce
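For example, using the container ID from the output above and a hypothetical tag:
docker commit 20d557a37bdd cuda-tensorflow:12.8
You can then check that the image exists with docker images cuda-tensorflow.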
7. Create and run your own container from the image:
docker run --gpus all --rm -it <image_tag> bash
Specify <image_tag> — the tag of the image you created in step 6.
Here, --rm is a flag that deletes the container after you exit its bash shell, and --gpus all makes the server GPUs available inside the container.
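To confirm the GPU is available in your own container, you can repeat the TensorFlow check from step 3 non-interactively (a minimal sketch; replace <image_tag> with the tag you chose in step 6):
docker run --gpus all --rm <image_tag> python3 -c "import tensorflow as tf; print('Available GPUs:', tf.config.list_physical_devices('GPU'))"
If everything is set up correctly, the output lists at least one PhysicalDevice with device_type='GPU'.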