Create a Managed Kubernetes cluster with GPUs

For your information

GPU support is not available for Managed Kubernetes clusters on a dedicated server.

GPUs (graphics processing units) can be added to a Managed Kubernetes cluster on a cloud server when you create the cluster or when you add a node group on a cloud server.

To check GPU availability by region, see the GPU availability matrix for Managed Kubernetes.

On GPU nodes, you can use pre-installed drivers or install the drivers yourself. Automatic cluster scaling is not available for node groups with GPUs without drivers.
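If you create a node group without pre-installed drivers, one common approach is to install the NVIDIA GPU Operator with Helm, which deploys the drivers and the device plugin onto GPU nodes. This is a sketch of that approach, not the provider's official procedure; verify it against your cluster setup before relying on it:

```shell
# Assumption: kubectl and helm are already configured for this cluster,
# and the node group was created with the "GPU drivers" toggle turned off.

# Add the official NVIDIA Helm repository.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update

# Install the GPU Operator; it deploys the driver, container toolkit,
# and device plugin on every node with an NVIDIA GPU.
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator \
  --create-namespace

# Once the operator pods are ready, the GPU appears as a schedulable
# resource on the node (replace <gpu-node-name> with your node's name).
kubectl describe node <gpu-node-name> | grep nvidia.com/gpu
```

With the GPU drivers toggle left on, the drivers are pre-installed and this step is unnecessary.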

Create a cluster with GPUs

Follow the instructions in Create a Managed Kubernetes cluster on a cloud server.

Select:

  • configuration — a fixed node group configuration with a GPU;
  • GPU drivers — the GPU drivers toggle is enabled by default, so the cluster uses pre-installed drivers. To install GPU drivers yourself, turn the GPU drivers toggle off.
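Once the cluster is up, a workload claims a GPU through the standard Kubernetes `nvidia.com/gpu` resource. A minimal sketch (the pod name and image tag are illustrative, and it assumes drivers and the device plugin are already present on the node):

```yaml
# gpu-test.yaml — a one-off pod that runs nvidia-smi to confirm
# the GPU is visible from inside a container.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04  # example CUDA base image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # request one whole GPU; fractional values are not allowed
```

Apply it with `kubectl apply -f gpu-test.yaml` and check `kubectl logs gpu-test`; the output should list the node's GPU.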

Available GPUs

GPU                                     | Memory       | CUDA cores | Tensor cores
NVIDIA® A100                            | 40 GB HBM2   | 6912       | 432
NVIDIA® Tesla T4                        | 16 GB GDDR6  | 2560       | 320
NVIDIA® A30                             | 24 GB HBM2   | 3584       | 224
NVIDIA® A2 (updated analog of Tesla T4) | 16 GB GDDR6  | 1280       | 40
NVIDIA® GTX 1080                        | 8 GB GDDR5X  | 2560       | —
NVIDIA® RTX 2080 Ti                     | 11 GB GDDR6  | 4352       | 544
NVIDIA® RTX 4090                        | 24 GB GDDR6X | 16384      | 512
NVIDIA® A2000 (RTX 3060 analog)         | 6 GB GDDR6   | 3328       | 104
NVIDIA® A5000 (RTX 3080 analog)         | 24 GB GDDR6  | 8192       | 256

To see the current list of GPUs, in the control panel go to Cloud platform → Kubernetes, click Create a cluster, and in the Node configuration block select Fixed with GPU.


NVIDIA® A100

Offers maximum performance for AI, HPC, and data processing. Suitable for deep learning, scientific research, and data analytics.

Based on the Ampere® architecture, with up to 2 TB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 8 GPUs × 40 GB are available, with vCPUs from 6 to 48, RAM from 87 to 704 GB.

NVIDIA® Tesla T4

Suitable for machine learning and deep learning, inference, graphics, and video rendering. Works with most AI frameworks and is compatible with all types of neural networks.

Based on the Turing® architecture, with up to 300 GB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 4 to 24, RAM from 32 to 320 GB.

NVIDIA® A30

Suitable for AI inference, HPC, language processing, conversational AI, and recommender systems.

Based on the Ampere® architecture, with up to 933 GB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 16 to 48, RAM from 64 to 320 GB.

NVIDIA® A2

An entry-level GPU. Suitable for simple inference, video and graphics workloads, Edge AI (edge computing), edge video, and mobile cloud gaming.

Based on the Ampere® architecture, with up to 200 GB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 12 to 48, RAM from 32 to 320 GB.

NVIDIA® GTX 1080

A high-performance and energy-efficient GPU built on FinFET technology with GDDR5X memory. Dynamic load balancing divides tasks so resources don't sit idle. Maximizes performance for display, VR, ultra-high-resolution settings, and data processing.

Based on the Pascal® architecture, with up to 320 GB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 8 GPUs × 8 GB are available, with vCPUs from 8 to 28, RAM from 24 to 96 GB.

NVIDIA® RTX 2080 Ti

High-performance GPU for demanding graphics tasks. Suitable for high-resolution video processing, 3D modeling, rendering, and photo processing. Also suitable for training neural networks, performing complex AI computations, and processing large amounts of data.

Based on the Turing® architecture, with up to 616 GB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 11 GB are available, with vCPUs from 2 to 48, RAM from 32 to 320 GB.

NVIDIA® RTX 4090

The highest-performing GPU in the GeForce series. Suitable for professional design and 3D modeling, video, rendering, ML tasks (model training and inference), LLMs, and scientific and engineering computing (e.g., climate modeling or bioinformatics).

Based on the Ada Lovelace® architecture, with up to 1008 GB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 24 GB are available, with vCPUs from 4 to 64, RAM from 16 to 356 GB.

NVIDIA® A2000

A power-efficient GPU for compact workstations. Suitable for AI, graphics, and video rendering.

Based on the Ampere® architecture, with up to 288 GB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 4 GPUs × 6 GB are available, with vCPUs from 6 to 24, RAM from 16 to 320 GB.

NVIDIA® A5000

A versatile GPU, suitable for any task within its performance limits.

Based on the Ampere® architecture, with up to 768 GB/s memory bandwidth. See the detailed specifications in the NVIDIA® documentation.

In fixed Managed Kubernetes cluster configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 8 to 48, RAM from 32 to 320 GB.