Create a Managed Kubernetes cluster with GPUs
GPU node groups are not available in Managed Kubernetes clusters on dedicated servers.
You can add GPUs (graphics processing units) to a Managed Kubernetes cluster on cloud servers — either when you create the cluster or when you add a node group on a cloud server.
To see GPU availability across regions, see the GPU availability matrix for Managed Kubernetes.
Nodes with GPUs can use pre-installed drivers, or you can install the drivers yourself. Automatic cluster scaling is not available for GPU node groups without pre-installed drivers.
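If you disable pre-installed drivers, the node group needs the NVIDIA driver stack and a device plugin before pods can request GPUs. Below is a minimal sketch of a device plugin DaemonSet; the image tag and tolerations are assumptions — in practice, apply the manifest published in the upstream NVIDIA/k8s-device-plugin repository.

```yaml
# Hedged sketch of the NVIDIA device plugin DaemonSet (not the official manifest).
# The image tag is an assumption; use the release from the upstream
# NVIDIA/k8s-device-plugin repository that matches your cluster version.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  template:
    metadata:
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
        # GPU nodes are often tainted; adjust to your provider's taints.
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      containers:
        - name: nvidia-device-plugin-ctr
          image: nvcr.io/nvidia/k8s-device-plugin:v0.14.5
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            # The plugin registers GPUs with the kubelet through this socket dir.
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
```

After the DaemonSet is running, GPU capacity should appear on the nodes (`kubectl describe node` shows `nvidia.com/gpu` under Capacity).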
Create a cluster with GPUs
Follow the instructions in Create a Managed Kubernetes cluster on a cloud server.
Select:
- configuration — a fixed configuration of a node group with a GPU;
- GPU drivers — by default, the GPU Drivers toggle switch is enabled and the cluster uses pre-installed drivers. To install GPU drivers yourself, turn off the GPU Drivers toggle switch.
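Once the cluster is up and the drivers are running, a pod requests GPUs through the `nvidia.com/gpu` resource. A minimal sketch is below; the pod name and CUDA image are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test   # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      # The CUDA base image is illustrative; pick one compatible with the
      # driver version on your nodes.
      image: nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # request one GPU; GPUs cannot be overcommitted
```

If the pod schedules and its logs show `nvidia-smi` output listing the GPU, the drivers and device plugin are working.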
Available GPUs
You can view the current GPU list in the control panel: in the top menu, click Products → Managed Kubernetes → Create Cluster → Node Group stage → Cloud Server → Node Configuration → Fixed with GPU.
To see GPU availability across regions, see the GPU availability matrix for Managed Kubernetes.
NVIDIA® A100 40Gb
Offers maximum performance for AI, HPC and data processing. Suitable for deep learning, scientific research and data analytics.
Based on Ampere® architecture, up to 1.5 TB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 8 GPUs × 40 GB are available, with vCPUs from 6 to 48, RAM from 87 to 704 GB.
NVIDIA® Tesla T4
Suitable for machine learning and deep learning, inference, graphics, and video rendering. Works with most AI frameworks and is compatible with all common types of neural networks.
Based on Turing® architecture, up to 300 GB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 4 to 24, RAM from 32 to 320 GB.
NVIDIA® A30
Suitable for AI inference, HPC, language processing, conversational AI, and recommender systems.
Based on Ampere® architecture, up to 933 GB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 16 to 48, RAM from 64 to 320 GB.
NVIDIA® A2
An entry-level GPU. Suitable for simple inference, video and graphics, Edge AI (edge computing), Edge video, mobile cloud gaming.
Based on Ampere® architecture, up to 200 GB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 12 to 48, RAM from 32 to 320 GB.
NVIDIA® GTX 1080
A high-performance, energy-efficient GPU built with FinFET technology and GDDR5X memory. Dynamic load balancing divides tasks so resources don't sit idle. Maximizes performance for display, VR, ultra-high-resolution settings, and data processing.
Based on Pascal® architecture, up to 320 GB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 8 GPUs × 8 GB are available, with vCPUs from 8 to 28, RAM from 24 to 96 GB.
NVIDIA® RTX 2080 Ti
High-performance GPU for demanding graphics tasks. Suitable for high-resolution video processing, 3D modeling, rendering, and photo processing. Also suitable for training neural networks, performing complex AI computations, and processing large amounts of data.
Based on Turing® architecture, up to 616 GB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 11 GB are available, with vCPUs from 2 to 48, RAM from 32 to 320 GB.
NVIDIA® RTX 4090
The highest-performing GPU in the GeForce series. Suitable for professional design and 3D modeling, video, rendering, ML tasks (model training and inference), LLMs, and scientific and engineering computing (e.g., climate modeling or bioinformatics).
Based on Ada Lovelace® architecture, up to 1008 GB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 24 GB are available, with vCPUs from 4 to 64, RAM from 16 to 356 GB.
NVIDIA® A2000
Power-efficient GPU for compact workstations. Suitable for AI, graphics and video rendering.
Based on Ampere® architecture, up to 288 GB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 6 GB are available, with vCPUs from 6 to 24, RAM from 16 to 320 GB.
NVIDIA® A5000
A versatile GPU, suitable for any task within its performance limits.
Based on Ampere® architecture, up to 768 GB/s throughput. See NVIDIA® documentation for detailed specifications.
In fixed Managed Kubernetes cluster configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 8 to 48, RAM from 32 to 320 GB.
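To pin a workload to a specific GPU model from the list above, add a nodeSelector matching a label carried by the node group. The label key and value below are hypothetical — check the actual labels with `kubectl get nodes --show-labels` and substitute your own.

```yaml
# Pod spec fragment. The "gpu.model" label key/value is hypothetical;
# replace it with the labels your node groups actually expose.
spec:
  nodeSelector:
    gpu.model: a100-40gb
  containers:
    - name: train
      # Illustrative image; use one matching your workload and driver version.
      image: nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04
      resources:
        limits:
          nvidia.com/gpu: 1
```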