General information about the ML platform

Selectel's ML platform is a ready-made infrastructure for ML development workflows: training, deployment of ML models, and so on. The infrastructure consists of software and hardware components that are already configured and ready for use.

Any available cloud server configuration can be used for the ML platform components. Once the platform is connected, you can extend it with your own software components. The following tools have been tested:

  • ClearML;
  • Kubeflow. For more information on installing Kubeflow, see the Install Kubeflow instructions.

Selectel does not impose any additional restrictions on managing the ML platform cluster.

Platform components

By default, the ML platform consists of:

  • hardware components:
    • cloud platform - the base for Managed Kubernetes with NVIDIA® GPUs (Tesla T4, A2, A30, A100, A2000, A5000, GTX 1080, RTX 2080 Ti);
  • software components:
    • Managed Kubernetes clusters with pre-configuration;
    • a domain for access to the Managed Kubernetes cluster;
    • Keycloak SSO - authorization in the platform's internal services;
    • Prom Stack - monitoring of platform components;
    • Forecastle - the platform home page;
    • S3 - storage for datasets and experiment data (see the sketch after this list);
    • Container Registry - storage for container images.
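
For example, datasets can be uploaded to the S3 storage with any S3-compatible client. Below is a minimal sketch using boto3; the endpoint URL, bucket name, object key, and credentials are placeholders that you would replace with the values issued for your platform:

```python
import boto3

# S3-compatible client; the endpoint, bucket, and credentials below are
# placeholders - replace them with the values issued for your ML platform
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.com",
    aws_access_key_id="<ACCESS_KEY>",
    aws_secret_access_key="<SECRET_KEY>",
)

# upload a local dataset file into the bucket used for experiment data
s3.upload_file("train.csv", "datasets", "experiments/train.csv")
```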

In Managed Kubernetes clusters:

  • GPU drivers are installed;
  • nodes are annotated;
  • the GPU resources required for computation are added (see the sketch after this list);
  • the network is configured, including Traefik Kubernetes Ingress.
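
To verify that the cluster nodes expose GPU resources, you can list node allocatables with the official Kubernetes Python client. This is a minimal sketch, assuming you have a local kubeconfig for the Managed Kubernetes cluster; the `nvidia.com/gpu.product` label follows the NVIDIA GPU feature discovery naming and is an assumption that may differ in your cluster:

```python
from kubernetes import client, config

# uses the local kubeconfig for the Managed Kubernetes cluster
config.load_kube_config()

v1 = client.CoreV1Api()
for node in v1.list_node().items:
    allocatable = node.status.allocatable or {}
    labels = node.metadata.labels or {}
    # nvidia.com/gpu appears among allocatable resources when the GPU
    # drivers and device plugin are set up on the node
    print(
        node.metadata.name,
        "GPUs:", allocatable.get("nvidia.com/gpu", "0"),
        "model:", labels.get("nvidia.com/gpu.product", "n/a"),
    )
```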

When the ClearML platform is installed in the cluster, it is managed directly through the ClearML SDK installed in your own IDE (see the sketch after the list below). ClearML uses cluster nodes to run ML experiments. The ClearML architecture allows for a variety of component layouts:

  • a single Managed Kubernetes cluster for all ML tasks;
  • several Managed Kubernetes clusters, each for its own task (inference and training);
  • a dedicated server connected as a compute node for ML experiments.
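
For example, an experiment can be registered and handed off to the cluster from the local IDE through the ClearML SDK. This is a minimal sketch, assuming ClearML is already deployed in the cluster, the local `clearml.conf` points to its server, and a hypothetical agent queue named `default` is served by the cluster's GPU nodes:

```python
from clearml import Task

# register the run with the ClearML server deployed in the cluster
task = Task.init(project_name="demo-project", task_name="first-experiment")

# log hyperparameters so they show up in the ClearML UI
task.connect({"lr": 1e-3, "epochs": 5})

# hand execution off to the agent queue served by the cluster's GPU nodes;
# the local process stops here and the experiment continues remotely
task.execute_remotely(queue_name="default", exit_process=True)

# ... training code below this point runs on the node that picks up the task
```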

Connect the platform

  1. In the Control Panel, on the top menu, click Products and select ML Platform.
  2. Click Create Test Application.
  3. Select a data type.
  4. Specify the amount of data in GB or MB.
  5. Optional: To help us recommend appropriate ways to connect to the ML platform, specify the data source. For example: Selectel, on-premise, or another cloud provider.
  6. Optional: To allow us to consider your specific data security requirements during the test, check the box Have additional data security requirements in the test. Describe the requirements in the Comments field of the application.
  7. Specify the model size in GB or MB.
  8. Specify the number of people who will be using the platform at the same time.
  9. Select the desired GPU model or check the No GPU model requirement checkbox. You can see the GPU specifications in the Available GPUs subsection of the Create a cloud server with a GPU instruction.
  10. Enter the contact details of a technical specialist. They are required to clarify the technical details of the test.
  11. Optional: Enter comments for the application. For example, specify the desired tools, components, or data security requirements for the test.
  12. Click Submit Application. A ticket requesting an ML platform test will be generated automatically.
  13. Wait for a Selectel employee to respond to your ticket. They will contact you with the details of creating the ML platform.

Cost

The cost of the ML platform is calculated after the application is processed and the configuration is selected. It consists only of the cost of the platform components: the Managed Kubernetes cluster, S3, and Container Registry.