Graphics Processing Units (GPU)

You can add a GPU (graphics processing unit) to a cloud server when creating a server or to an existing server.

GPUs are used as dedicated PCI devices inside a cloud server.

GPUs are available in fixed GPU lineup configurations and in custom configurations.

GPU lineup configurations and custom configurations with GPUs can be used with a local or network boot volume.

For cloud servers with a local disk, you can use only the following GPUs:

  • NVIDIA® A100 40 GB;
  • NVIDIA® A100 80 GB;
  • NVIDIA® A30;
  • NVIDIA® RTX 4090 48 GB;
  • NVIDIA® A5000;
  • NVIDIA® RTX 6000 Ada;
  • NVIDIA® RTX 6000 Pro;
  • NVIDIA® H200.
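
The restriction above can be expressed as a simple membership check, for example when validating a server request in your own provisioning scripts. This is only a minimal sketch: `LOCAL_DISK_GPUS` and `supports_local_disk` are hypothetical names, not part of any provider API, and the model strings are normalized spellings of the list above.

```python
# Hypothetical helper: the model names mirror the list in this document.
LOCAL_DISK_GPUS = {
    "A100 40 GB",
    "A100 80 GB",
    "A30",
    "RTX 4090 48 GB",
    "A5000",
    "RTX 6000 Ada",
    "RTX 6000 Pro",
    "H200",
}


def supports_local_disk(gpu_model: str) -> bool:
    """Return True if the GPU model may be used with a local disk."""
    # Drop the vendor prefix and the trademark sign before the lookup.
    normalized = gpu_model.replace("NVIDIA", "").replace("®", "").strip()
    return normalized in LOCAL_DISK_GPUS
```

Calling `supports_local_disk("NVIDIA® H200")` returns `True`, while a model that is not in the list, such as `"NVIDIA® Tesla T4"`, returns `False`.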

If you need a server with a pre-configured set of machine learning and data analysis tools and libraries, use the AI Marketplace.

Available GPUs

| GPU | Memory | CUDA Cores | Tensor Cores |
|---|---|---|---|
| NVIDIA® A100 40 GB; NVIDIA® A100 40 GB NVLink (upon request) | 40 GB HBM2 | 6912 | 432 |
| NVIDIA® A100 80 GB | 80 GB HBM2 | 6912 | 432 |
| NVIDIA® Tesla T4 | 16 GB GDDR6 | 2560 | 320 |
| NVIDIA® A30 | 24 GB HBM2 | 3804 | 224 |
| NVIDIA® A2 (updated analog of NVIDIA® Tesla T4) | 16 GB GDDR6 | 1280 | 40 |
| NVIDIA® GTX 1080 | 8 GB GDDR5X | 2560 | - |
| NVIDIA® RTX 2080 Ti | 11 GB GDDR6 | 4352 | 544 |
| NVIDIA® RTX 4090 24 GB | 24 GB GDDR6X | 16384 | 512 |
| NVIDIA® RTX 4090 48 GB | 48 GB GDDR6X | 16384 | 512 |
| NVIDIA® RTX 6000 Ada (analog of L40) | 48 GB GDDR6X | 18176 | 568 |
| NVIDIA® A2000 (analog of RTX 3060) | 6 GB GDDR6 | 3328 | 104 |
| NVIDIA® A5000 (analog of RTX 3080) | 24 GB GDDR6 | 8192 | 256 |
| NVIDIA® H100 | 80 GB HBM3 | 16896 | 528 |
| NVIDIA® H200 | 141 GB HBM3e | 16896 | 528 |
| NVIDIA® L4 | 24 GB GDDR6 | 20480 | 640 |
| NVIDIA® RTX 6000 Pro | 96 GB GDDR7 | 18432 | 576 |
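
When choosing a GPU by memory size, the table can be treated as a lookup structure. The sketch below is illustrative only: it copies the memory column for a subset of the models listed above, and the names `GPU_MEMORY_GB` and `gpus_with_at_least` are hypothetical.

```python
# Memory sizes in GB, copied from the table above (illustrative subset).
GPU_MEMORY_GB = {
    "Tesla T4": 16,
    "RTX 2080 Ti": 11,
    "A30": 24,
    "A100 40 GB": 40,
    "A100 80 GB": 80,
    "H100": 80,
    "H200": 141,
}


def gpus_with_at_least(min_memory_gb: int) -> list[str]:
    """Return models with at least min_memory_gb of memory, smallest first."""
    matches = [
        (memory, name)
        for name, memory in GPU_MEMORY_GB.items()
        if memory >= min_memory_gb
    ]
    # Sort by memory size, then by name for a stable order.
    return [name for _, name in sorted(matches)]
```

For example, `gpus_with_at_least(80)` returns `["A100 80 GB", "H100", "H200"]`.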

You can view the current list of GPUs in the control panel: from the top menu, click Products → Cloud Servers → click Create Server.

You can check GPU availability in regions in the GPU for Cloud Servers availability matrix.

NVIDIA® A100 40 GB

Offers maximum performance for AI, HPC, and data processing. Suitable for deep learning, scientific research, and data analytics.

Based on the Ampere® architecture, with 40 GB of HBM2 memory and bandwidth up to 1.5 TB/s. See the detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 8 GPU × 40 GB are available, with 6 to 48 vCPUs and 87 to 700 GB of RAM.

In custom configurations, 1 to 8 GPU × 40 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
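
The custom configuration limits can be checked before submitting a request. The following is a minimal sketch under stated assumptions: `CustomLimits` and `check_config` are hypothetical names, the ranges are the ones given above for the A100 40 GB, and 1 GB is treated as 1024 MB. It checks each value against its range only; it does not know which combinations the platform actually offers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomLimits:
    """Inclusive (min, max) ranges for one custom configuration."""
    gpus: tuple[int, int]
    vcpus: tuple[int, int]
    ram_mb: tuple[int, int]


# Ranges for custom configurations with A100 40 GB, taken from the text above.
# Assumption: 1 GB is treated as 1024 MB.
A100_40GB_CUSTOM = CustomLimits(gpus=(1, 8), vcpus=(2, 32), ram_mb=(512, 256 * 1024))


def check_config(limits: CustomLimits, gpus: int, vcpus: int, ram_mb: int) -> list[str]:
    """Return a list of problems; an empty list means every value is in range."""
    requested = {"gpus": gpus, "vcpus": vcpus, "ram_mb": ram_mb}
    problems = []
    for field, value in requested.items():
        low, high = getattr(limits, field)
        if not low <= value <= high:
            problems.append(f"{field}={value} is outside {low}..{high}")
    return problems
```

A request for 2 GPUs, 16 vCPUs, and 64 GB of RAM passes, while a request for 9 GPUs is reported as out of range.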

You can combine two NVIDIA® A100 40 GB GPUs using NVLink technology.

NVLink transfers data between connected GPUs faster than the PCIe interface. GPUs connected via NVLink make more GPU memory available to a workload and increase server performance for complex calculations, such as training large language models.

NVLink works with NVIDIA® A100 40 GB GPUs, which are based on the Ampere® architecture, with 40 GB of HBM2 memory and bandwidth up to 1.5 TB/s. See the detailed specifications of the NVIDIA® A100 40 GB and the description of NVLink technology in the NVIDIA® documentation.

NVIDIA® A100 40 GB NVLink is available upon request: create a ticket.
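
Once a server with NVLink-connected GPUs is running, one way to confirm the link is to inspect the topology matrix printed by `nvidia-smi topo -m`. The sketch below assumes the usual matrix layout, in which NVLink connections appear as cells such as `NV12` and PCIe connections as labels such as `PHB` or `SYS`; the function names are hypothetical, and the exact output format may differ between driver versions.

```python
import re
import subprocess

# Matches NVLink cells such as "NV1" or "NV12" in the topology matrix.
_NVLINK_CELL = re.compile(r"^NV\d+$")


def has_nvlink(topo_output: str) -> bool:
    """Return True if any GPU row of the matrix contains an NVLink cell."""
    for line in topo_output.splitlines():
        cells = line.split()
        # Data rows start with a GPU label such as "GPU0". The header row
        # also starts with one, but its cells never match the NV pattern.
        if cells and cells[0].startswith("GPU"):
            if any(_NVLINK_CELL.match(cell) for cell in cells[1:]):
                return True
    return False


def read_topology() -> str:
    """Run `nvidia-smi topo -m` on the server and return its text output."""
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the server, `has_nvlink(read_topology())` reports whether any pair of GPUs is connected through NVLink. Keeping the parser separate from the command call lets you test it against saved output.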

NVIDIA® A100 80 GB

Offers maximum performance for AI, HPC, and data processing, as well as a large memory capacity for resource-intensive tasks. Suitable for deep learning, scientific research, and data analytics.

Based on the Ampere® architecture, with 80 GB of HBM2 memory and bandwidth up to 1.5 TB/s. See the detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 8 GPU × 80 GB are available, with 12 to 96 vCPUs, 128 to 1000 GB of RAM, and 128 GB to 6.88 TB of local disk.

In custom configurations, 1 to 8 GPU × 80 GB are available, with 12 to 192 vCPUs, 64 GB to 1000 GB of RAM, and 256 GB to 3.36 TB of local disk.

NVIDIA® Tesla T4

Suitable for machine learning and deep learning, inference, graphics work, and video rendering. Works with most AI frameworks and is compatible with all types of neural networks.

Based on the Turing® architecture, with 16 GB of GDDR6 memory and bandwidth up to 300 GB/s. See the detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 4 GPU × 16 GB are available, with 4 to 24 vCPUs and 32 to 320 GB of RAM.

In custom configurations, 1 to 4 GPU × 16 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.

NVIDIA® A30

Suitable for AI inference, HPC, language processing, conversational AI, and recommendation systems.

Based on the Ampere® architecture, with 24 GB of HBM2 memory and bandwidth up to 933 GB/s. See the detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 2 GPU × 24 GB are available, with 16 to 48 vCPUs and 64 to 320 GB of RAM.

In custom configurations, 1 to 2 GPU × 24 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.

NVIDIA® A2

Entry-level GPU. Suitable for simple inference, video and graphics, Edge AI (edge computing), edge video, and mobile cloud gaming.

Based on the Ampere® architecture, with 16 GB of GDDR6 memory and bandwidth up to 200 GB/s. See the detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 4 GPU × 16 GB are available, with 12 to 48 vCPUs and 32 to 320 GB of RAM.

In custom configurations, 1 to 4 GPU × 16 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.

NVIDIA® GTX 1080

A high-performance, energy-efficient GPU built with FinFET technology and GDDR5X memory. Dynamic load balancing distributes tasks so that resources do not sit idle. Suitable for display output, VR, ultra-high-resolution settings, and data processing.

Based on the Pascal® architecture, with 8 GB of GDDR5X memory and a bandwidth of up to 320 GB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 8 GPU × 8 GB are available, with 8 to 28 vCPUs and 24 to 96 GB of RAM.

In custom configurations, 1 to 8 GPU × 8 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.

NVIDIA® RTX 2080 Ti

A high-performance GPU for complex graphical tasks. Suitable for high-resolution video processing, 3D modeling, rendering, and photo editing. Also suitable for training neural networks, performing complex calculations in the field of artificial intelligence, and processing large volumes of data.

Based on the Turing® architecture, with 11 GB of GDDR6 memory and a bandwidth of up to 616 GB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 4 GPU × 11 GB are available, with 2 to 48 vCPUs and 32 to 320 GB of RAM.

In custom configurations, 1 to 4 GPU × 11 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.

NVIDIA® RTX 4090 24 GB

The highest-performance GPU in the GeForce series. Suitable for professional design and 3D modeling, video work, rendering, ML tasks (model training and inference), working with large language models (LLM), and scientific and engineering calculations (for example, in climate modeling or bioinformatics).

Based on the Ada Lovelace® architecture, with 24 GB of GDDR6X memory and a bandwidth of up to 1008 GB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 4 GPU × 24 GB are available, with 4 to 64 vCPUs and 16 to 356 GB of RAM.

In custom configurations, 1 to 4 GPU × 24 GB are available, with 2 to 32 vCPUs and 4 to 256 GB of RAM.

NVIDIA® RTX 4090 48 GB

A GPU similar to the NVIDIA® RTX 4090 24 GB, but with increased memory capacity.

Suitable for professional design and 3D modeling, video production, rendering, ML tasks (training and inference), working with large language models (LLM), and scientific/engineering computations (e.g., climate modeling or bioinformatics).

Based on the Ada Lovelace® architecture, with 48 GB of GDDR6X memory and a bandwidth of up to 1008 GB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 8 GPU × 48 GB are available, with 12 to 192 vCPUs, 64 to 896 GB of RAM, and 64 to 800 GB of local disk.

In custom configurations, 1 to 8 GPU × 48 GB are available, with 12 to 192 vCPUs, 64 to 896 GB of RAM, and 50 to 800 GB of local disk.

NVIDIA® RTX 6000 Ada

A professional GPU for compute and graphics workloads. Suitable for ML tasks, rendering, scientific computing, and high-performance visualization.

Based on the Ada Lovelace® architecture, with 48 GB of GDDR6X memory and a bandwidth of up to 960 GB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 4 GPU × 48 GB are available, with 12 to 96 vCPUs, 64 to 450 GB of RAM, and 64 GB to 2 TB of local disk.

In custom configurations, 1 to 4 GPU × 48 GB are available, with 12 to 96 vCPUs, 64 to 450 GB of RAM, and 64 GB to 3.52 TB of local disk.

NVIDIA® A2000

An energy-efficient GPU for compact workstations. Suitable for AI, graphics, and video rendering.

Based on the Ampere® architecture, with 6 GB of GDDR6 memory and a bandwidth of up to 288 GB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 4 GPU × 6 GB are available, with 6 to 24 vCPUs and 16 to 320 GB of RAM.

In custom configurations, 1 to 4 GPU × 6 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.

NVIDIA® A5000

A versatile GPU suitable for any task within its performance range.

Based on the Ampere® architecture, with 24 GB of GDDR6 memory and a bandwidth of up to 768 GB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 4 GPU × 24 GB are available, with 8 to 48 vCPUs, 16 to 450 GB of RAM, and 100 GB to 2 TB of local disk.

In custom configurations, 1 to 2 GPU × 24 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.

NVIDIA® H100

A powerful GPU suitable for AI, HPC, and scalable computing.

Based on the Hopper™ architecture, with 80 GB of HBM3 memory and a bandwidth of up to 3 TB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 2 GPU × 80 GB are available, with 12 to 48 vCPUs and 128 to 256 GB of RAM.

In custom configurations, 1 to 2 GPU × 80 GB are available, with 2 to 48 vCPUs and 2 GB to 256 GB of RAM.

NVIDIA® H200

A professional GPU for accelerating generative AI, HPC, large language model (LLM) inference, model fine-tuning, and image/video generation.

Based on the Hopper™ architecture, with 141 GB of HBM3e memory and a bandwidth of up to 4.8 TB/s. See detailed specifications in the NVIDIA® documentation.

In fixed and custom GPU lineup configurations, 1 to 8 GPU × 141 GB are available, with 12 to 192 vCPUs, 120 GB to 1 TB of RAM, and 256 GB to 3 TB of local disk.

NVIDIA® L4

A versatile GPU for accelerating AI/ML workloads, video processing, streaming, and VDI. Suitable for running modern large language models (LLM) and multimodal models.

Based on the Ada Lovelace® architecture, with 24 GB of GDDR6 memory and a bandwidth of up to 300 GB/s. See detailed specifications in the NVIDIA® documentation.

In fixed GPU lineup configurations, 1 to 8 GPU × 24 GB are available, with 8 to 128 vCPUs and 32 GB to 512 GB of RAM.

In custom configurations, 1 to 8 GPU × 24 GB are available, with 8 to 256 vCPUs and 64 GB to 640 GB of RAM.

NVIDIA® RTX 6000 Pro

A professional GPU for accelerating generative AI, large language model (LLM) inference, model fine-tuning, image and video generation, 3D rendering, and video processing.

Based on the Blackwell® architecture, with 96 GB of GDDR7 memory and bandwidth up to 1.6 TB/s.

See the detailed specifications in the NVIDIA® documentation.

In fixed and custom GPU lineup configurations, 1 to 8 GPU × 96 GB are available, with 16 to 256 vCPUs, 120 GB to 1 TB of RAM, and 256 GB to 3 TB of local disk.

Create a cloud server with GPU

Follow the Create a cloud server instructions.

When creating a server, select:

  • source: pre-configured GPU-optimized images are marked in the version list as GPU optimized. These images contain the drivers required to work with GPUs. If you choose a different source, you must install the drivers on the server yourself for the NVIDIA® GPUs to operate stably;
  • configuration: a fixed or custom GPU lineup configuration with at least 2 vCPUs.
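
After the server is created, you may want to confirm that the drivers are installed and the GPUs are visible. The sketch below is one possible check, not an official procedure: it calls `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`, which prints one line per GPU, and parses the result. The function names are hypothetical.

```python
import subprocess


def parse_gpu_list(csv_text: str) -> list[tuple[str, str]]:
    """Parse 'name, memory' lines into (name, memory) tuples."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a comma in the name cannot break parsing.
        name, memory = line.rsplit(",", 1)
        gpus.append((name.strip(), memory.strip()))
    return gpus


def list_gpus() -> list[tuple[str, str]]:
    """Query the driver for GPU names and total memory.

    Raises FileNotFoundError if nvidia-smi is not installed, and
    subprocess.CalledProcessError if the command fails.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_list(result.stdout)
```

If `list_gpus()` raises an error or returns fewer entries than the configuration you selected, the drivers are probably missing or not loaded.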