Connect Selectel Object Storage to ClearML

You can connect Selectel object storage to ClearML to store datasets, results, and experiment artifacts.

  1. Open the ClearML configuration file, clearml.conf, and edit the api block and the aws and development blocks inside sdk. Use the same host in every setting:

    api {
        ...
        files_server: "s3://s3.<pool>.storage.selcloud.ru:443/<container_name>"
        ...
    }
    ...
    sdk {
        ...
        aws {
            s3 {
                host: "s3.<pool>.storage.selcloud.ru:443"
                region: "ru-1"
                key: "<access_key>"
                secret: "<secret_key>"
                use_credentials_chain: false
                credentials: [{
                    bucket: "<container_name>"
                    secure: true
                }]
            }
            boto3 {
                pool_connections: 512
                max_multipart_concurrency: 16
            }
        }
        ...
        development {
            ...
            default_output_uri: "s3://s3.<pool>.storage.selcloud.ru:443/<container_name>/<path>"
            ...
        }
        ...
    }
    ...

    Specify:

    • <container_name> — name of the container in the object storage where datasets and artifacts will be stored. To find the name, in the control panel top menu, click Products → Object Storage → Containers;
    • <access_key> — Access Key ID from the S3 key issued to the user;
    • <secret_key> — Secret Access Key from the S3 key issued to the user;
    • <path> — prefix in the object storage;
    • <pool> — pool where the object storage is located (e.g., ru-1).
  2. To upload datasets to ClearML Server, run a Python script.

    An example script that uploads a single dataset:

    # Create a dataset using the Dataset class
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_name="<dataset_name>",
        dataset_project="<project_name>",
        output_uri="s3://s3.<pool>.storage.selcloud.ru:443/<container_name>/<path>",
    )

    # Add files to the dataset
    dataset.add_files(
        path="<local_path_to_dataset>",
    )

    # Upload the dataset files to the storage set in output_uri
    dataset.upload()

    # Finalize the dataset to commit the changes
    dataset.finalize()

    Specify:

    • <dataset_name> — dataset name, displayed in the WebApp;
    • <project_name> — project name, displayed in the WebApp;
    • <container_name>/<path> — container name and prefix in the object storage, as in step 1;
    • <local_path_to_dataset> — path to the dataset on the local machine.
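
To upload several datasets with the same settings, the single-dataset script above can be wrapped in a loop. The following is a minimal sketch, not part of the ClearML or Selectel tooling. The helper names selectel_s3_uri, plan_uploads, and upload_all are hypothetical, and the sketch assumes that each subfolder of a local root directory is one dataset stored under its own prefix. It uses only the Dataset calls already shown in step 2.

```python
from pathlib import Path


def selectel_s3_uri(pool, container, path=""):
    """Build an s3:// URI in the s3.<pool>.storage.selcloud.ru:443 form from step 1."""
    base = f"s3://s3.{pool}.storage.selcloud.ru:443/{container}"
    path = path.strip("/")
    return f"{base}/{path}" if path else base


def plan_uploads(root, base_uri):
    """Return (dataset_name, local_path, output_uri) for each subfolder of root.

    Assumption: one subfolder is one dataset; plain files in root are skipped.
    """
    plan = []
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            output_uri = f"{base_uri.rstrip('/')}/{folder.name}"
            plan.append((folder.name, str(folder), output_uri))
    return plan


def upload_all(plan, project):
    """Create, upload, and finalize one ClearML dataset per plan entry."""
    # Imported here so the planning helpers work without clearml installed.
    from clearml import Dataset

    for name, local_path, output_uri in plan:
        dataset = Dataset.create(
            dataset_name=name,
            dataset_project=project,
            output_uri=output_uri,
        )
        dataset.add_files(path=local_path)
        dataset.upload()
        dataset.finalize()
```

A possible call, with the placeholders from steps 1 and 2, is upload_all(plan_uploads("<local_root>", selectel_s3_uri("<pool>", "<container_name>", "<path>")), "<project_name>"), where <local_root> is a hypothetical folder that contains the dataset subfolders. Printing the plan before calling upload_all makes it easy to check the target URIs without touching the storage.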