Create a cloud server with GPU
GPUs (graphics processing units) can be added to a cloud server either at server creation or to an existing server.
GPUs are attached to the cloud server as dedicated PCI devices.
GPUs are available:
- in fixed GPU Line configurations;
- in arbitrary configurations.
GPU Line and arbitrary GPU configurations can be used with a local or network boot disk. For cloud servers with a local disk, only the NVIDIA® A100 40Gb and NVIDIA® A30 GPUs in the ru-7a pool segment are available.
If you need a server with a set of preconfigured tools and libraries for machine learning and data analytics, try AI-marketplace.
Create a server with GPU
Follow the instructions in Create a cloud server.
Select:
- source — a ready-made GPU-optimized image: check the GPU optimized images box to filter GPU-optimized OS images. These images contain the drivers needed to work with GPUs. If you choose a different source, you will need to install the drivers on the server yourself for stable NVIDIA® GPU operation;
- configuration — a fixed GPU Line configuration or an arbitrary configuration with at least 2 vCPUs.
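If the server is created from a source without preinstalled GPU drivers, they have to be installed manually. A minimal sketch for an Ubuntu-based image (package names and the driver installation path vary by distribution and release; this is one assumed route, not the only supported one):

```shell
# Verify the GPU is visible to the server as a PCI device:
lspci | grep -i nvidia

# Install the recommended NVIDIA driver on Ubuntu via ubuntu-drivers
# (other distributions use different packages):
sudo apt update
sudo apt install -y ubuntu-drivers-common
sudo ubuntu-drivers install
sudo reboot

# After the reboot, confirm the driver sees the GPU:
nvidia-smi
```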
Add a GPU to an existing cloud server
If the cloud server has an arbitrary configuration, GPUs can be added to it.
For cloud servers with local disk, only NVIDIA® A100 40Gb or NVIDIA® A30 can be added in the ru-7a pool segment.
- In the control panel, from the top menu, click Products and select Cloud servers.
- Open the server page → the Configuration tab.
- Click Change configuration.
- Make sure that Arbitrary configuration is selected in the Configuration change block.
- Click Add GPU. If the server has 1 vCPU, the value will automatically change to 2 vCPUs.
- Select the GPU type.
- Specify the number of GPUs.
- Click Save and reboot.
- If the server was not created from a ready-made GPU-optimized image, install the NVIDIA® drivers on the server yourself for stable GPU operation.
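After the server reboots, you can check that the added GPUs are visible. A small sketch, assuming the NVIDIA® drivers are already installed on the server:

```shell
# List all GPUs the driver can see, with their model and memory size:
nvidia-smi --query-gpu=index,name,memory.total --format=csv

# Alternatively, count the NVIDIA PCI devices directly:
lspci | grep -ci nvidia
```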
Available GPUs
To see the current list of GPUs, open the control panel: from the top menu, click Products → Cloud servers → click Create a server.
To see GPU availability by region, see the GPU for cloud servers availability matrix.
NVIDIA® A100 40Gb
Offers maximum performance for AI, HPC and data processing. Suitable for deep learning, scientific research and data analytics.
Powered by the Ampere® architecture, with 40GB HBM2 memory and up to 1.5TB/s bandwidth. See detailed specifications in the NVIDIA® documentation.
Fixed GPU Line configurations are available from 1 to 8 GPUs × 40 GB, with vCPUs from 6 to 48, RAM from 87 to 700 GB.
In arbitrary configurations, 1 to 8 GPUs × 40 GB are available, with vCPUs from 2 to 32, RAM from 512 MB to 256 GB.
NVIDIA® A100 40Gb NVLink
Two NVIDIA® A100 40Gb GPUs can be combined using NVLink technology.
NVLink provides a faster GPU-to-GPU interconnect than PCIe. GPUs interconnected by NVLink can pool more memory and improve server performance for complex computations, such as training large language models.
NVLink works with NVIDIA® A100 40Gb GPUs based on the Ampere® architecture, with 40GB HBM2 memory and up to 1.5TB/s bandwidth. See the detailed NVIDIA® A100 40Gb specifications and the NVLink technology description in the NVIDIA® documentation.
NVIDIA® A100 40Gb NVLink is available upon request: file a ticket.
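Once an NVLink configuration is provisioned and the drivers are installed, the link can be verified from inside the server with standard nvidia-smi subcommands (a sketch; output details depend on the driver version):

```shell
# Show NVLink status (link state and speed) for each GPU:
nvidia-smi nvlink --status

# Print the interconnect topology matrix; NV# entries between GPU
# pairs indicate active NVLink connections, PHB/PIX indicate PCIe:
nvidia-smi topo -m
```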
NVIDIA® A100 80Gb
Offers maximum performance for AI, HPC, and data processing, as well as large memory capacity for compute-intensive tasks. Suitable for deep learning, scientific research, and data analytics.
Based on the Ampere® architecture, with 80GB HBM2e memory and up to 1.9TB/s bandwidth. See detailed specifications in the NVIDIA® documentation.
Fixed GPU Line configurations are available from 1 to 8 GPUs × 80 GB, with vCPUs from 12 to 96, RAM from 128 to 1000 GB.
NVIDIA® Tesla T4
Suitable for machine learning and deep learning, inference, graphics, and video rendering. Works with most AI frameworks and is compatible with all types of neural networks.
Based on Turing® architecture, with 16GB GDDR6 memory and up to 300GB/s bandwidth. See detailed specifications in NVIDIA® documentation.
In fixed GPU Line configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 4 to 24, RAM from 32 to 320 GB.
In arbitrary configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 2 to 32, RAM from 512 MB to 256 GB.
NVIDIA® A30
Suitable for AI inference, HPC, language processing, conversational AI, and recommender systems.
Based on Ampere® architecture, with 24GB HBM2 memory and up to 933GB/s of bandwidth. See detailed specifications at NVIDIA® documentation.
In fixed GPU Line configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 16 to 48, RAM from 64 to 320 GB.
In arbitrary configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 2 to 32, RAM from 512 MB to 256 GB.
NVIDIA® A2
An entry-level GPU. Suitable for simple inference, video and graphics, Edge AI (edge computing), Edge video, mobile cloud gaming.
Based on Ampere® architecture, with 16GB GDDR6 memory and up to 200GB/s bandwidth. See detailed specifications in NVIDIA® documentation.
In fixed GPU Line configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 12 to 48, RAM from 32 to 320 GB.
In arbitrary configurations, 1 to 4 GPUs × 16 GB are available, with vCPUs from 2 to 32, RAM from 512 MB to 256 GB.
NVIDIA® GTX 1080
A high-performance and energy-efficient GPU built with FinFET technology and GDDR5X memory. Dynamic load balancing divides tasks so resources don't sit idle. Maximizes performance for display, VR, ultra-high-resolution settings, and data processing.
Based on Pascal® architecture, with 8GB GDDR5X memory and up to 320GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.
In fixed GPU Line configurations, 1 to 8 GPUs × 8 GB are available, with vCPUs from 8 to 28, RAM from 24 to 96 GB.
In arbitrary configurations, 1 to 8 GPUs × 8 GB are available, with vCPUs from 2 to 32, RAM from 512 MB to 256 GB.
NVIDIA® RTX 2080 Ti
High-performance GPU for demanding graphics tasks. Suitable for high-resolution video processing, 3D modeling, rendering, and photo processing. Also suitable for training neural networks, performing complex AI computations, and processing large amounts of data.
Based on Turing® architecture, with 11GB GDDR6 memory and up to 616GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.
In fixed GPU Line configurations, 1 to 4 GPUs × 11 GB are available, with vCPUs from 2 to 48, RAM from 32 to 320 GB.
In arbitrary configurations, 1 to 4 GPUs × 11 GB are available, with vCPUs from 2 to 32, RAM from 512 MB to 256 GB.
NVIDIA® RTX 4090
The highest-performing GPU in the GeForce series. Suitable for professional design and 3D modeling, video, rendering, ML tasks (model training and inference), LLMs, and scientific and engineering computing (e.g., climate modeling or bioinformatics).
Based on Ada Lovelace® architecture, with 24GB GDDR6X memory and up to 1008GB/s of bandwidth. View detailed specifications in NVIDIA® documentation.
In fixed GPU Line configurations, 1 to 4 GPUs × 24 GB are available, with vCPUs from 4 to 64, RAM from 16 to 356 GB.
In arbitrary configurations, 1 to 4 GPUs × 24 GB are available, with vCPUs from 2 to 32, RAM from 4 to 256 GB.
NVIDIA® A2000
Power-efficient GPU for compact workstations. Suitable for AI, graphics and video rendering.
Based on Ampere® architecture, with 6GB GDDR6 memory and up to 288GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.
In fixed GPU Line configurations, 1 to 4 GPUs × 6 GB are available, with vCPUs from 6 to 24, RAM from 16 to 320 GB.
In arbitrary configurations, 1 to 4 GPUs × 6 GB are available, with vCPUs from 2 to 32, RAM from 512 MB to 256 GB.
NVIDIA® A5000
A versatile GPU, suitable for any task within its performance limits.
Based on Ampere® architecture, with 24GB GDDR6 memory and up to 768GB/s of bandwidth. See detailed specifications in NVIDIA® documentation.
In fixed GPU Line configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 8 to 48, RAM from 32 to 320 GB.
In arbitrary configurations, 1 to 2 GPUs × 24 GB are available, with vCPUs from 2 to 32, RAM from 512 MB to 256 GB.