Graphics Processing Units (GPU)
You can add a GPU (graphics processing unit) to a cloud server when creating a server or to an existing server.
GPUs are used as dedicated PCI devices inside a cloud server.
GPUs are available in fixed GPU lineup configurations and in custom configurations.
Both fixed GPU lineup configurations and custom configurations with GPUs can be used with a local or network boot volume.
For cloud servers with a local disk, you can use:
- NVIDIA® A100 40 GB;
- NVIDIA® A100 80 GB;
- NVIDIA® A30;
- NVIDIA® RTX 4090 48 GB;
- NVIDIA® A5000;
- NVIDIA® RTX 6000 Ada;
- NVIDIA® RTX 6000 Pro;
- NVIDIA® H200.
If you need a server with a pre-configured set of machine learning and data analysis tools and libraries, use the AI Marketplace.
Available GPUs
You can view the current list of GPUs in the control panel: in the top menu, click Products → Cloud Servers → Create Server.
To check GPU availability by region, see the GPU for Cloud Servers availability matrix.
NVIDIA® A100 40 GB
Offers maximum performance for AI, HPC, and data processing. Suitable for deep learning, scientific research, and data analytics.
Based on the Ampere® architecture, with 40 GB of HBM2 memory and a bandwidth of up to 1.5 TB/s. See the detailed specifications of the NVIDIA® A100 40 GB in the NVIDIA® documentation.
In fixed GPU lineup configurations, 1 to 8 GPUs × 40 GB are available, with 6 to 48 vCPUs and 87 to 700 GB of RAM.
In custom configurations, 1 to 8 GPUs × 40 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
NVIDIA® A100 40 GB NVLink
You can combine two NVIDIA® A100 40 GB GPUs using NVLink technology.
NVLink transfers data between connected GPUs faster than the PCIe interface. GPUs connected via NVLink can use more combined memory and increase server performance for complex calculations, such as training large language models.
NVLink works with NVIDIA® A100 40 GB GPUs, which are based on the Ampere® architecture, with 40 GB of HBM2 memory and a bandwidth of up to 1.5 TB/s. See the detailed specifications of the NVIDIA® A100 40 GB and a description of NVLink technology in the NVIDIA® documentation.
NVIDIA® A100 40 GB NVLink GPUs are available on request: create a ticket.
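Once a server with NVLink-connected GPUs is running, it can be useful to confirm that the links are active. The sketch below is a minimal illustration rather than an official procedure: it counts active links per GPU in the text printed by `nvidia-smi nvlink --status`. The output layout (a `GPU N:` line followed by indented `Link N: <speed>` lines) is an assumption based on typical driver versions, and `SAMPLE` is hypothetical data, not output captured from a real server.

```python
import re

# Assumed line shapes in `nvidia-smi nvlink --status` output.
GPU_RE = re.compile(r"^GPU\s+(\d+):")
LINK_RE = re.compile(r"^\s*Link\s+(\d+):\s+(.*\S)\s*$")

# Hypothetical sample used only to illustrate the parser.
SAMPLE = """\
GPU 0: NVIDIA A100-PCIE-40GB (UUID: GPU-00000000)
     Link 0: 25 GB/s
     Link 1: 25 GB/s
     Link 2: <inactive>
GPU 1: NVIDIA A100-PCIE-40GB (UUID: GPU-11111111)
     Link 0: 25 GB/s
"""


def active_links(status_text: str) -> dict[int, int]:
    """Count, per GPU index, the links that report a speed in GB/s."""
    counts: dict[int, int] = {}
    current = None
    for line in status_text.splitlines():
        gpu = GPU_RE.match(line)
        if gpu:
            current = int(gpu.group(1))
            counts[current] = 0
            continue
        link = LINK_RE.match(line)
        # A link is treated as active only if its status contains a speed.
        if link and current is not None and "GB/s" in link.group(2):
            counts[current] += 1
    return counts


if __name__ == "__main__":
    for gpu_index, links in active_links(SAMPLE).items():
        print(f"GPU {gpu_index}: {links} active NVLink link(s)")
```

On a real server, you would pass the captured command output to `active_links` instead of `SAMPLE`. A count of zero for every GPU suggests that the GPUs communicate over PCIe only.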
NVIDIA® A100 80 GB
Offers maximum performance for AI, HPC, and data processing, as well as a large memory capacity for resource-intensive tasks. Suitable for deep learning, scientific research, and data analytics.
Based on the Ampere® architecture, with 80 GB of HBM2 memory and a bandwidth of up to 1.5 TB/s. See the detailed specifications of the NVIDIA® A100 80 GB in the NVIDIA® documentation.
In fixed GPU lineup configurations, 1 to 8 GPUs × 80 GB are available, with 12 to 96 vCPUs, 128 to 1000 GB of RAM, and 128 GB to 6.88 TB of local disk.
In custom configurations, 1 to 8 GPUs × 80 GB are available, with 12 to 192 vCPUs, 64 GB to 1000 GB of RAM, and 256 GB to 3.36 TB of local disk.
NVIDIA® Tesla T4
Suitable for machine learning and deep learning, inference, graphics processing, and video rendering. It works with most AI frameworks and is compatible with all types of neural networks.
Based on the Turing® architecture, with 16 GB of GDDR6 memory and a bandwidth of up to 300 GB/s. See the detailed specifications of the NVIDIA® Tesla T4 in the NVIDIA® documentation.
In fixed GPU lineup configurations, 1 to 4 GPUs × 16 GB are available, with 4 to 24 vCPUs and 32 to 320 GB of RAM.
In custom configurations, 1 to 4 GPUs × 16 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
NVIDIA® A30
Suitable for AI inference, HPC, language processing, conversational AI, and recommendation systems.
Based on the Ampere® architecture, with 24 GB of HBM2 memory and a bandwidth of up to 933 GB/s. See the detailed specifications of the NVIDIA® A30 in the NVIDIA® documentation.
In fixed GPU lineup configurations, 1 to 2 GPUs × 24 GB are available, with 16 to 48 vCPUs and 64 to 320 GB of RAM.
In custom configurations, 1 to 2 GPUs × 24 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
NVIDIA® A2
Entry-level GPU. Suitable for simple inference, video and graphics, Edge AI (edge computing), edge video, and mobile cloud gaming.
Based on the Ampere® architecture, with 16 GB of GDDR6 memory and a bandwidth of up to 200 GB/s. See the detailed specifications of the NVIDIA® A2 in the NVIDIA® documentation.
In fixed GPU lineup configurations, 1 to 4 GPUs × 16 GB are available, with 12 to 48 vCPUs and 32 to 320 GB of RAM.
In custom configurations, 1 to 4 GPUs × 16 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
NVIDIA® GTX 1080
A high-performance and energy-efficient GPU. The solution is implemented using FinFET technology and GDDR5X memory. Dynamic load balancing helps distribute tasks so that resources do not sit idle. Offers maximum performance for display, VR, ultra-high-resolution settings, and data processing.
Based on the Pascal® architecture, with 8 GB of GDDR5X memory and a bandwidth of up to 320 GB/s. See the detailed specifications of the NVIDIA® GTX 1080 in the NVIDIA® documentation.
In fixed GPU lineup configurations, 1 to 8 GPUs × 8 GB are available, with 8 to 28 vCPUs and 24 to 96 GB of RAM.
In custom configurations, 1 to 8 GPUs × 8 GB are available, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
NVIDIA® RTX 2080 Ti
A high-performance GPU for complex graphics tasks. Suitable for:
- high-resolution video processing;
- 3D model creation;
- rendering and photo processing;
- neural network training;
- complex artificial intelligence calculations;
- large-scale data processing.
Based on the Turing® architecture, with 11 GB of GDDR6 memory and a bandwidth of up to 616 GB/s. See the detailed specifications of the NVIDIA® RTX 2080 Ti in the NVIDIA® documentation.
Fixed GPU lineup configurations offer 1 to 4 GPUs × 11 GB, with 2 to 48 vCPUs and 32 to 320 GB of RAM.
Custom configurations offer 1 to 4 GPUs × 11 GB, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
NVIDIA® RTX 4090 24 GB
A high-performance GeForce series GPU suitable for:
- professional design and 3D modeling;
- video editing and rendering;
- ML tasks (model training and inference);
- working with large language models (LLMs);
- scientific and engineering calculations (e.g., climate modeling or bioinformatics).
Based on the Ada Lovelace® architecture, with 24 GB of GDDR6X memory and a bandwidth of up to 1008 GB/s. See the detailed specifications of the NVIDIA® RTX 4090 in the NVIDIA® documentation.
Fixed GPU lineup configurations offer 1 to 4 GPUs × 24 GB, with 4 to 64 vCPUs and 16 to 356 GB of RAM.
Custom configurations offer 1 to 4 GPUs × 24 GB, with 2 to 32 vCPUs and 4 to 256 GB of RAM.
NVIDIA® RTX 4090 48 GB
A high-performance GeForce series GPU with more memory than the NVIDIA® RTX 4090 24 GB, suitable for:
- professional design and 3D modeling;
- video editing and rendering;
- ML tasks (model training and inference);
- working with large language models (LLMs);
- scientific and engineering calculations (e.g., climate modeling or bioinformatics).
Based on the Ada Lovelace® architecture, with 48 GB of GDDR6X memory and a bandwidth of up to 1008 GB/s. See the detailed specifications of the NVIDIA® RTX 4090 in the NVIDIA® documentation.
Fixed GPU lineup configurations offer 1 to 8 GPUs × 48 GB, with 12 to 192 vCPUs, 64 to 896 GB of RAM, and a local disk from 64 to 800 GB.
Custom configurations offer 1 to 8 GPUs × 48 GB, with 12 to 192 vCPUs, 64 to 896 GB of RAM, and a local disk from 50 to 800 GB.
NVIDIA® RTX 6000 Ada
A professional GPU for compute-intensive and graphics-intensive workloads. Suitable for ML tasks, rendering, scientific computing, and high-performance visualization.
Based on the Ada Lovelace® architecture, with 48 GB of GDDR6 memory and a bandwidth of up to 960 GB/s. See the detailed specifications of the NVIDIA® RTX 6000 Ada in the NVIDIA® documentation.
Fixed GPU lineup configurations offer 1 to 4 GPUs × 48 GB, with 12 to 96 vCPUs, 64 to 450 GB of RAM, and a local disk from 64 GB to 2 TB.
Custom configurations offer 1 to 4 GPUs × 48 GB, with 12 to 96 vCPUs, 64 to 450 GB of RAM, and a local disk from 64 GB to 3.52 TB.
NVIDIA® A2000
An energy-efficient GPU for compact workstations. Suitable for AI, graphics, and video rendering.
Based on the Ampere® architecture, with 6 GB of GDDR6 memory and a bandwidth of up to 288 GB/s. See the detailed specifications of the NVIDIA® A2000 in the NVIDIA® documentation.
Fixed GPU lineup configurations offer 1 to 4 GPUs × 6 GB, with 6 to 24 vCPUs and 16 to 320 GB of RAM.
Custom configurations offer 1 to 4 GPUs × 6 GB, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
NVIDIA® A5000
A versatile GPU suitable for any tasks within its performance range.
Based on the Ampere® architecture, with 24 GB of GDDR6 memory and a bandwidth of up to 768 GB/s. See the detailed specifications of the NVIDIA® A5000 in the NVIDIA® documentation.
Fixed GPU lineup configurations offer 1 to 4 GPUs × 24 GB, with 8 to 48 vCPUs, 16 to 450 GB of RAM, and a local disk from 100 GB to 2 TB.
Custom configurations offer 1 to 2 GPUs × 24 GB, with 2 to 32 vCPUs and 512 MB to 256 GB of RAM.
NVIDIA® H100
A powerful GPU suitable for AI, HPC, and scalable computing.
Based on the Hopper™ architecture, with 80 GB of HBM3 memory and a bandwidth of up to 3 TB/s. See the detailed specifications of the NVIDIA® H100 in the NVIDIA® documentation.
Fixed GPU lineup configurations offer 1 to 2 GPUs × 80 GB, with 12 to 48 vCPUs and 128 to 256 GB of RAM.
Custom configurations offer 1 to 2 GPUs × 80 GB, with 2 to 48 vCPUs and 2 GB to 256 GB of RAM.
NVIDIA® H200
A professional GPU suitable for:
- accelerating generative AI;
- high-performance computing (HPC);
- inference of large language models (LLMs);
- model fine-tuning;
- image and video generation.
Based on the Hopper™ architecture, with 141 GB of HBM3 memory and a bandwidth of up to 4.8 TB/s. See the detailed specifications of the NVIDIA® H200 in the NVIDIA® documentation.
Both fixed GPU lineup configurations and custom configurations offer 1 to 8 GPUs × 141 GB, with 12 to 192 vCPUs, 120 GB to 1 TB of RAM, and a local disk from 256 GB to 3 TB.
NVIDIA® L4
A versatile GPU for accelerating AI/ML workloads, video processing, streaming, and VDI. Suitable for running modern large language models (LLMs) and multimodal models.
Based on the Ada Lovelace® architecture, with 24 GB of GDDR6 memory and a bandwidth of up to 300 GB/s. See the detailed specifications of the NVIDIA® L4 in the NVIDIA® documentation.
Fixed GPU lineup configurations offer 1 to 8 GPUs × 24 GB, with 8 to 128 vCPUs and 32 to 512 GB of RAM.
Custom configurations offer 1 to 8 GPUs × 24 GB, with 8 to 256 vCPUs and 64 to 640 GB of RAM.
NVIDIA® RTX 6000 Pro
A professional GPU for:
- accelerating generative AI;
- inference of large language models (LLMs);
- model fine-tuning;
- image and video generation;
- 3D rendering and video processing.
Based on the Blackwell® architecture, with 96 GB of GDDR7 memory and a bandwidth of up to 1.6 TB/s. See the detailed specifications of the NVIDIA® RTX 6000 Pro in the NVIDIA® documentation.
Both fixed GPU lineup configurations and custom configurations offer 1 to 8 GPUs × 96 GB, with 16 to 256 vCPUs, 120 GB to 1 TB of RAM, and a local disk from 256 GB to 3 TB.
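The custom configuration ranges listed in this section can be treated as simple inclusive bounds. The sketch below is an illustration only, not part of the control panel or any API: it checks a planned configuration against the ranges quoted on this page for three GPU models. The names `Limits`, `CUSTOM_LIMITS`, and `check_request` are hypothetical, and the table is a subset copied by hand, so the control panel remains the source of truth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Inclusive (min, max) bounds for one custom configuration."""
    gpus: tuple
    vcpus: tuple
    ram_gb: tuple


# Custom configuration ranges quoted on this page (subset, for illustration).
CUSTOM_LIMITS = {
    "A100 40 GB": Limits(gpus=(1, 8), vcpus=(2, 32), ram_gb=(0.5, 256)),
    "H100": Limits(gpus=(1, 2), vcpus=(2, 48), ram_gb=(2, 256)),
    "L4": Limits(gpus=(1, 8), vcpus=(8, 256), ram_gb=(64, 640)),
}


def check_request(model: str, gpus: int, vcpus: int, ram_gb: float) -> list:
    """Return a list of problems; an empty list means the request fits."""
    limits = CUSTOM_LIMITS.get(model)
    if limits is None:
        return [f"unknown GPU model: {model}"]
    problems = []
    for name, value, (low, high) in (
        ("gpus", gpus, limits.gpus),
        ("vcpus", vcpus, limits.vcpus),
        ("ram_gb", ram_gb, limits.ram_gb),
    ):
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}..{high}")
    return problems


if __name__ == "__main__":
    print(check_request("H100", gpus=2, vcpus=48, ram_gb=256))
    print(check_request("L4", gpus=1, vcpus=4, ram_gb=32))
```

Returning a list of problems rather than a boolean makes it easy to show every out-of-range value at once before you open the Create Server form.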
Create a cloud server with GPU
Follow the Create a cloud server instructions.
When creating a server, select:
- source: a GPU-optimized off-the-shelf image, marked as GPU optimized in the version list. These images include the drivers required for GPU operation. If you choose another source, you must install the drivers on the server manually for stable NVIDIA® GPU operation;
- configuration: a fixed GPU lineup configuration or a custom configuration with at least 2 vCPUs.
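After the server starts, you may want to confirm that the driver is installed and that the GPU is visible. The following is a minimal sketch, assuming Python 3.9 or later is available on the server: it calls `nvidia-smi` with its `--query-gpu` and `--format=csv,noheader` options and parses the result. The helper names `parse_gpu_rows` and `list_gpus` are hypothetical, and the exact model and version strings depend on the GPU and driver.

```python
import csv
import shutil
import subprocess

# Query the name, total memory, and driver version of every visible GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,driver_version",
    "--format=csv,noheader",
]


def parse_gpu_rows(text: str) -> list:
    """Parse the CSV rows printed by the query above into dictionaries."""
    rows = []
    for fields in csv.reader(text.splitlines(), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, memory, driver = (field.strip() for field in fields)
        rows.append({"name": name, "memory": memory, "driver": driver})
    return rows


def list_gpus() -> list:
    """Return visible GPUs, or an empty list if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    return parse_gpu_rows(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    gpus = list_gpus()
    if not gpus:
        print("No GPU visible: check that the NVIDIA driver is installed")
    for gpu in gpus:
        print(f"{gpu['name']}: {gpu['memory']}, driver {gpu['driver']}")
```

An empty result on a server created from a source other than a GPU-optimized image usually means the drivers still need to be installed manually.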