Aim Virtual Machine

Aim Virtual Machine is a preconfigured cloud server with an out-of-the-box tool for tracking experiments in machine learning (ML). The tool is aimed at ML engineers who need to compare models, analyze metrics, and manage experiments.

The image from which the server is deployed contains:

  • Aim - a tool for logging, visualizing, and tracking experiments in ML;

  • Docker - a platform for running containerized applications;

  • drivers required to work with graphics processing units (GPUs).

Tasks to be solved

  • logging of experiments - recording metrics, hyperparameters, and artifacts throughout the model training process;

  • visualization of experimental results and data comparison via the Aim UI web interface.
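
As an illustration of the logging task, here is a minimal sketch that uses the Aim Python SDK (`Run` and `run.track`). The helper names `log_training` and `open_aim_run`, the experiment name, and the loss values are hypothetical. By default Aim writes to a local `.aim` repository; where the repository must live for the runs to appear in the Aim UI of this server is not covered here and is left as an assumption.

```python
def log_training(run, hparams, losses):
    """Record hyperparameters once, then one loss value per training step."""
    run["hparams"] = hparams  # stored as run parameters
    for step, loss in enumerate(losses):
        # context separates series that share a name, e.g. train vs. validation
        run.track(loss, name="loss", step=step, context={"subset": "train"})


def open_aim_run(experiment="demo"):
    """Create an Aim run (hypothetical helper; requires the `aim` package)."""
    from aim import Run  # imported lazily so log_training has no hard dependency
    return Run(experiment=experiment)
```

A call such as `log_training(open_aim_run(), {"learning_rate": 0.001}, [0.9, 0.7, 0.5])` would record three loss points for one run.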

Minimum resource requirements

Number of vCPUs: 1
RAM: 2 GB
Boot disk: 30 GB
GPU availability: Not required

Create a cloud server with Aim

  1. In the dashboard, from the top menu, click Products and select AI Marketplace.

  2. Click Create Server.

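  3. Fill in the blocks described in the sections below: name and location, source, GPU, configuration, disks, network, access, additional settings, and automation.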

  4. Check the price of the server.

  5. Click Create Server.

Name and location

  1. Enter the name of the server.

  2. Select the region and pool segment where the server will be created. The list of available GPUs depends on the pool segment. Once the server is created, you cannot change the region and pool segment.

Source

Select the Aim VM (Ubuntu 22.04 LTS 64-bit) image.

Optional: GPU

  1. Click Add GPU.

  2. Select the type of GPU. When selecting a GPU, consider the requirements for ML models and the tools you are using. Refer to the Graphics Processing Units (GPUs) manual for GPU specifications and descriptions.

  3. Specify the number of GPUs.

After the server is created, you can change the type and number of GPUs or delete GPUs. For more information, see Change Cloud Server Configuration.

Configuration

  1. Specify the number of vCPUs.

  2. Specify the size of the RAM.

Once the server is created, you can change the configuration.

Disks

  1. Select the type of boot disk. GPUs are not available if you use a local disk as the boot disk.

  2. Specify the disk size in GB or TB. The maximum size for all network disks is 10,240 GB (10 TB), and the maximum size for a local disk is 1,256 GB (1 TB).

  3. If you selected the SSD Universal v2 disk type, specify the total number of read and write operations in IOPS. After you create the disk, you can increase or decrease the number of IOPS. There is no limit on how many times you can change it.

  4. Optional: to add additional server disks:

    4.1 Click Add Disk.

    4.2 Select the disk type.

    4.3 Specify the disk size in GB or TB. The maximum size for all network disks is 10,240 GB (10 TB), and the maximum size for a local disk is 1,256 GB (1 TB).

    4.4 If you selected the SSD Universal v2 disk type, specify the total number of read and write operations in IOPS. After you create the disk, you can increase or decrease the number of IOPS. There is no limit on how many times you can change it.

    Once the server is created, you can disconnect additional disks from it or connect new ones.

Network

You can add a server to a new subnet or to an existing subnet. The subnet can be:

  • private without access from the Internet. You cannot connect to the server from the Internet, including via SSH or RDP;
  • private with one public IP address. A static public IP address is connected to the private address of the server via a cloud router. The server will be accessible from the Internet through this public IP address;
  • public, in which all addresses are accessible from the Internet.

  1. To add an existing private subnet:

    1.1 In the Subnet field, select an existing subnet.

    1.2 Optional: Change the default private IP address of the server.

  2. To add a new private subnet:

    2.1 In the Subnet field, select the Private subnet type.

    2.2 Optional: Change the CIDR of the subnet.

    2.3 Optional: Enable the DHCP toggle switch. Learn more about DHCP in the Selectel blog article DHCP Protocol Principles.

    2.4 Optional: Change the IP address of the default gateway.

    2.5 Optional: Change the network where the subnet will be created. You can select an existing network or create a new one. If you create a new network, enter a name for it.
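
If you change the CIDR, the private IP address of the server, or the default gateway, the three values have to agree with each other. The following is a minimal sketch of such a check using Python's standard `ipaddress` module; the function name `check_subnet_settings` and the addresses in the usage note are hypothetical, and the control panel may apply additional rules of its own.

```python
import ipaddress


def check_subnet_settings(cidr, server_ip, gateway_ip):
    """Return a list of problems; an empty list means the values are consistent."""
    # strict=True raises ValueError if the CIDR has host bits set (e.g. 192.168.0.5/24)
    network = ipaddress.ip_network(cidr, strict=True)
    problems = []
    if not network.is_private:
        problems.append(f"{network} is not a private address range")
    for label, value in (("server", server_ip), ("gateway", gateway_ip)):
        address = ipaddress.ip_address(value)
        if address not in network:
            problems.append(f"{label} address {address} is outside {network}")
        elif address in (network.network_address, network.broadcast_address):
            problems.append(f"{label} address {address} is reserved in {network}")
    if ipaddress.ip_address(server_ip) == ipaddress.ip_address(gateway_ip):
        problems.append("server and gateway use the same address")
    return problems
```

For example, `check_subnet_settings("192.168.0.0/24", "192.168.0.10", "192.168.0.1")` returns an empty list, while a server address of `192.168.1.10` is reported as being outside the subnet.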

Optional: Access

  1. In the Password field for "root":

    1.1 Copy the password of the root user - a user with unlimited rights to all actions on the system.

    1.2 Save the password in a safe place and do not share it publicly.

  2. To connect to the server securely, place an SSH key for the project on the server:

    2.1 If the SSH key is not added to the cloud platform, click , enter the key name, insert the public key in OpenSSH format, and click Add.

    2.2 If an SSH key is added to the cloud platform, select the existing key in the SSH Key field.
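
Before pasting a public key, it can help to confirm that it really is a single line in OpenSSH format: a key type, base64 data, and an optional comment, where the decoded data starts with the same key type. The sketch below checks only that structure with the standard library; `parse_openssh_public_key` is a hypothetical helper name, and it does not verify the key material itself.

```python
import base64
import struct


def parse_openssh_public_key(line):
    """Return (key_type, comment) or raise ValueError if the line is malformed."""
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64-data> [comment]'")
    key_type, encoded = parts[0], parts[1]
    comment = parts[2] if len(parts) == 3 else ""
    try:
        blob = base64.b64decode(encoded, validate=True)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("key data is not valid base64") from exc
    if len(blob) < 4:
        raise ValueError("key data is too short")
    # The blob begins with a 4-byte big-endian length followed by the key type name.
    (length,) = struct.unpack(">I", blob[:4])
    embedded_type = blob[4:4 + length]
    if embedded_type != key_type.encode("ascii"):
        raise ValueError("key type does not match the encoded key data")
    return key_type, comment
```

A private key, or a public key in another format such as PEM, would fail this check and should be converted before it is added.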

Optional: Additional settings

  1. To create an interruptible server, check the Interruptible Server checkbox.

  2. If you plan to create multiple servers and want to improve the fault tolerance of your infrastructure, add the server to a placement group:

    2.1 To create a new group, click , enter a group name, and select a policy for placing servers on different hosts:

    • preferably - the system will try to place servers on different hosts. If there is no suitable host when creating a server, it will be created on the same host;

    • mandatory - servers in the group must be located on different hosts. If there is no suitable host when creating a server, the server will not be created.

    2.2 If a group has been created, select the placement group in the Placement Group field.

  3. To add additional information or filter servers in the list, add server tags. A tag with the name of the image is automatically added. To add a new tag, enter a tag in the Tags field.
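
The difference between the two placement policies can be shown with a toy model. This is only an illustration of the behavior described above, not the platform's actual scheduler; the function `choose_host` and the host names are hypothetical.

```python
def choose_host(free_hosts, used_hosts, policy):
    """Pick a host for a new server in a placement group.

    free_hosts: hosts that have capacity, in order of preference
    used_hosts: hosts that already run a server from the same group
    policy:     "preferably" or "mandatory"
    Returns the chosen host, or None if the server cannot be created.
    """
    candidates = [host for host in free_hosts if host not in used_hosts]
    if candidates:
        return candidates[0]  # a different host is available under either policy
    if policy == "preferably" and free_hosts:
        return free_hosts[0]  # fall back to a host the group already uses
    return None  # "mandatory": no suitable host, so the server is not created
```

With a single free host that already runs a server from the group, `choose_host(["host-a"], {"host-a"}, "preferably")` returns `"host-a"`, while the same call with `"mandatory"` returns `None`.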

Optional: Automation

  1. To add a script that will be executed by the cloud-init agent when the operating system first starts up, in the User data field:

    • open the Text tab and paste the text of the script;
    • or open the File tab and upload the file with the script.

    Examples of scripts and supported formats can be found in the User data instructions.
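
As one possible example, cloud-init runs user data that begins with `#!` as a script, so a Python script can be used on this Ubuntu image. The sketch below prints the resources the server actually received, which can be compared with the minimum requirements; on Ubuntu the output of such scripts typically ends up in /var/log/cloud-init-output.log. The function names and the report format are hypothetical, and the formats accepted in the User data field should be confirmed in the User data instructions.

```python
#!/usr/bin/env python3
"""First-boot report: prints the vCPU, RAM, and boot disk size seen by the OS."""
import os
import shutil

GIB = 1024 ** 3


def resource_report(root="/"):
    """Collect the resources visible to the operating system (Linux)."""
    ram_bytes = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    disk = shutil.disk_usage(root)
    return {
        "vcpus": os.cpu_count(),
        "ram_gib": round(ram_bytes / GIB, 1),
        "disk_gib": round(disk.total / GIB, 1),
    }


def format_report(report):
    """Render the report as a single log line."""
    return "first-boot resources: " + ", ".join(f"{k}={v}" for k, v in report.items())


if __name__ == "__main__":
    print(format_report(resource_report()))
```

The reported RAM and disk sizes are usually slightly lower than the ordered values because part of the memory is reserved and part of the disk is used by the file system.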

Run Aim

  1. In the dashboard, from the top menu, click Products and select AI Marketplace.

  2. In the Aim Virtual Machine card, click Go to GUI.

  3. Enter the login: admin.

  4. Enter the password: the UUID of the server. To copy it in the control panel, click Products in the top menu, select AI Marketplace, and then select Copy UUID in the server menu.

  5. Click Sign In.