General information about the ML platform

Selectel's ML platform is a ready-made infrastructure for ML development workflows: training, deployment of ML models, and so on. The infrastructure consists of software and hardware components that are already configured and ready for use.

Any available cloud server configuration can be used for the ML platform components. Once the platform is connected, you can extend it with your own software components. The following tools have been tested:

  • ClearML;
  • Kubeflow. For more information on installing Kubeflow, see the Install Kubeflow instructions.

Selectel does not impose any additional restrictions on managing the ML platform cluster.

Platform components

By default, the ML platform consists of:

  • hardware components:
    • cloud platform - the base for Managed Kubernetes with NVIDIA® GPUs (Tesla T4, A2, A30, A100, A2000, A5000, GTX 1080, RTX 2080 Ti);
  • software components:
    • Managed Kubernetes clusters with pre-configuration;
    • a domain for access to the Managed Kubernetes cluster;
    • Keycloak SSO - authorization in the platform's internal services;
    • Prom Stack - monitoring of platform components;
    • Forecastle - the platform home page;
    • S3 - storage for datasets and experiment data (see the sketch after this list);
    • Container Registry - storage for container images.
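
For example, datasets can be uploaded to the S3 storage with any S3-compatible client. Below is a minimal sketch using boto3; the endpoint URL, bucket name, object key, and credentials are placeholders that you would replace with the values issued for your platform:

```python
import boto3

# S3-compatible client; the endpoint, bucket, and credentials below are
# placeholders - replace them with the values issued for your ML platform
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.com",
    aws_access_key_id="<ACCESS_KEY>",
    aws_secret_access_key="<SECRET_KEY>",
)

# upload a local dataset file into the bucket used for experiment data
s3.upload_file("train.csv", "datasets", "experiments/train.csv")
```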

In Managed Kubernetes clusters:

  • GPU drivers are installed;
  • nodes are annotated;
  • the GPU resources required for computation are added (see the sketch after this list);
  • the network is configured, including Traefik Kubernetes Ingress.
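
To verify that the cluster nodes expose GPU resources, you can list node allocatables with the official Kubernetes Python client. This is a minimal sketch, assuming you have a local kubeconfig for the Managed Kubernetes cluster; the `nvidia.com/gpu.product` label follows the NVIDIA GPU feature discovery naming and is an assumption that may differ in your cluster:

```python
from kubernetes import client, config

# uses the local kubeconfig for the Managed Kubernetes cluster
config.load_kube_config()

v1 = client.CoreV1Api()
for node in v1.list_node().items:
    allocatable = node.status.allocatable or {}
    labels = node.metadata.labels or {}
    # nvidia.com/gpu appears among allocatable resources when the GPU
    # drivers and device plugin are set up on the node
    print(
        node.metadata.name,
        "GPUs:", allocatable.get("nvidia.com/gpu", "0"),
        "model:", labels.get("nvidia.com/gpu.product", "n/a"),
    )
```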

When the ClearML platform is installed in the cluster, it is managed directly through the ClearML SDK installed in your own IDE (see the sketch after the list below). ClearML uses cluster nodes to run ML experiments. The ClearML architecture allows for a variety of component layouts:

  • a single Managed Kubernetes cluster for all ML tasks;
  • several Managed Kubernetes clusters, each for its own task (inference and training);
  • a dedicated server connected as a compute node for ML experiments.
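
For example, an experiment can be registered and handed off to the cluster from the local IDE through the ClearML SDK. This is a minimal sketch, assuming ClearML is already deployed in the cluster, the local `clearml.conf` points to its server, and a hypothetical agent queue named `default` is served by the cluster's GPU nodes:

```python
from clearml import Task

# register the run with the ClearML server deployed in the cluster
task = Task.init(project_name="demo-project", task_name="first-experiment")

# log hyperparameters so they show up in the ClearML UI
task.connect({"lr": 1e-3, "epochs": 5})

# hand execution off to the agent queue served by the cluster's GPU nodes;
# the local process stops here and the experiment continues remotely
task.execute_remotely(queue_name="default", exit_process=True)

# ... training code below this point runs on the node that picks up the task
```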

Connect the platform

  1. In the Control Panel, on the top menu, click Products and select ML Platform.
  2. Click Create Test Application.
  3. Select a data type.
  4. Specify the amount of data in GB or MB.
  5. Optional: To help us recommend appropriate ways to connect to the ML platform, specify the data source. For example: Selectel, on-premise, or another cloud provider.
  6. Optional: To allow us to consider your specific data security requirements during the test, check the box Have additional data security requirements in the test. Describe the requirements in the Comments field of the application.
  7. Specify the model size in GB or MB.
  8. Specify the number of people who will be using the platform at the same time.
  9. Select the desired GPU model or check the No GPU model requirement checkbox. You can see the GPU specifications in the Available GPUs subsection of the Create a cloud server with a GPU instruction.
  10. Enter the contact details of a technical specialist. They are required to clarify the technical details of the test.
  11. Optional: Enter comments for the application. For example, specify the desired tools, components, or data security requirements for the test.
  12. Click Submit Application. A ticket requesting an ML platform test will be generated automatically.
  13. Wait for a Selectel employee to respond to your ticket. They will contact you with the details of creating the ML platform.

Cost

The cost of the ML platform is calculated after the application is processed and the configuration is selected. It consists only of the cost of the platform components: the Managed Kubernetes cluster, S3, and Container Registry.