Connect Selectel Object Storage to ClearML

Selectel object storage can be connected to ClearML to store datasets, results, and experiment artifacts.

  1. Open the ClearML configuration file clearml.conf and edit the api, aws, and development blocks:

    api {
        ...
        files_server: s3://s3.<pool>.storage.selcloud.ru:443/<container_name>
        ...
    }
    ...
    sdk {
        ...
        aws {
            s3 {
                host: "s3.storage.selcloud.ru:443"
                region: "ru-1"
                key: "<access_key>"
                secret: "<secret_key>"
                use_credentials_chain: false
                credentials: [{
                    bucket: "<container_name>"
                    secure: true
                }]
            }
            boto3 {
                pool_connections: 512
                max_multipart_concurrency: 16
            }
        }
        ...
        development {
            ...
            default_output_uri: "s3://s3.<pool>.storage.selcloud.ru:443/<container_name>/<path>"
            ...
        }
        ...
    }
    ...

    Specify:

    • <container_name>: name of the object storage container where datasets and artifacts will be stored. To find the name, go to the control panel and click Products → Object Storage → Containers in the top menu;
    • <access_key>: Access Key ID from the S3 key issued to the user;
    • <secret_key>: Secret Access Key from the S3 key issued to the user;
    • <path>: prefix in the object storage;
    • <pool>: pool where the object storage is located (e.g., ru-1).
  2. To load datasets into ClearML Server, run a Python script.

    An example script to load a single dataset:

    # Create a dataset using the Dataset class
    from clearml import Dataset
    dataset = Dataset.create(
        dataset_name="<dataset_name>",
        dataset_project="<project_name>",
        output_uri="s3://s3.storage.selcloud.ru:443/<container_name>/<path>",
    )

    # Add files to the dataset
    dataset.add_files(
        path="<local_path_to_dataset>",
    )

    # Upload the dataset to ClearML Server
    dataset.upload()

    # Commit the changes to the dataset
    dataset.finalize()

    Specify:

    • <dataset_name>: dataset name, displayed in the WebApp;
    • <project_name>: project name, displayed in the WebApp;
    • <container_name>/<path>: container name and prefix in the object storage from step 1;
    • <local_path_to_dataset>: path to the dataset on the local machine.
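
The two steps above can be combined into a small helper script. The following is a minimal sketch, not part of the original instructions: the function names build_output_uri and upload_dataset are hypothetical, and the URI layout assumes the s3.<pool>.storage.selcloud.ru:443 form shown in the configuration from step 1. It composes the output URI from the pool, container name, and prefix, checks that the local dataset path exists, and then runs the same Dataset calls as the example script.

```python
from pathlib import Path


def build_output_uri(pool: str, container_name: str, path: str = "") -> str:
    """Compose an S3 URI in the form used in clearml.conf in step 1.

    Hypothetical helper: the host pattern is an assumption taken from
    the files_server and default_output_uri values in step 1.
    """
    if not pool or not container_name:
        raise ValueError("pool and container_name must be non-empty")
    uri = f"s3://s3.{pool}.storage.selcloud.ru:443/{container_name}"
    prefix = path.strip("/")  # drop leading/trailing slashes from the prefix
    return f"{uri}/{prefix}" if prefix else uri


def upload_dataset(dataset_name: str, project_name: str,
                   local_path: str, output_uri: str) -> str:
    """Create, upload, and finalize a dataset; return the dataset ID."""
    source = Path(local_path)
    if not source.exists():
        # Fail early, before any request is sent to ClearML Server.
        raise FileNotFoundError(f"Dataset path not found: {source}")

    # Imported here so the path check above works even if clearml
    # is not installed in the current environment.
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_name=dataset_name,
        dataset_project=project_name,
        output_uri=output_uri,
    )
    dataset.add_files(path=str(source))
    dataset.upload()
    dataset.finalize()
    return dataset.id
```

For example, build_output_uri("ru-1", "my-container", "datasets") returns s3://s3.ru-1.storage.selcloud.ru:443/my-container/datasets, which can then be passed to upload_dataset as output_uri. Running upload_dataset requires the clearml package and a clearml.conf configured as in step 1.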