Connect Selectel Object Storage to ClearML

You can connect Selectel object storage to ClearML and use it to store datasets, experiment results, and artifacts.

  1. Open the ClearML configuration file clearml.conf. Change the api, aws, and development blocks:

    api {
        ...
        files_server: s3://s3.<pool>.storage.selcloud.ru:443/<container_name>
        ...
    }
    ...
    sdk {
        ...
        aws {
            s3 {
                host: "s3.storage.selcloud.ru:443"
                region: "ru-1"
                key: "<access_key>"
                secret: "<secret_key>"
                use_credentials_chain: false
                credentials: [{
                    bucket: "<container_name>"
                    secure: true
                }]
            }
            boto3 {
                pool_connections: 512
                max_multipart_concurrency: 16
            }
        }
        ...
        development {
            ...
            # Default destination for task artifacts and models
            default_output_uri: "s3://s3.<pool>.storage.selcloud.ru:443/<container_name>/<path>"
            ...
        }
        ...
    }
    ...

    Specify:

    • <container_name>: name of the container in object storage where datasets and artifacts will be stored. You can find the name in the control panel under Object Storage → Containers tab;
    • <access_key>: Access Key ID of the S3 key issued to the user;
    • <secret_key>: Secret Access Key of the S3 key issued to the user;
    • <path>: prefix in object storage;
    • <pool>: pool where the object storage is located (for example, ru-1).
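The files_server and default_output_uri values follow the same URI pattern, so they can be assembled from the placeholder values. A minimal Python sketch, assuming the URI layout used in the configuration in this step; the helper name `selectel_s3_uri` and the sample values are hypothetical and not part of ClearML or Selectel tooling:

```python
def selectel_s3_uri(pool: str, container_name: str, path: str = "") -> str:
    """Build an s3:// URI in the form used for files_server and default_output_uri."""
    if not pool or not container_name:
        raise ValueError("pool and container_name must be non-empty")
    # Base URI: endpoint of the pool, port 443, then the container name
    uri = f"s3://s3.{pool}.storage.selcloud.ru:443/{container_name}"
    # Optional prefix inside the container; surrounding slashes are dropped
    prefix = path.strip("/")
    return f"{uri}/{prefix}" if prefix else uri


# Sample values are illustrative only
print(selectel_s3_uri("ru-1", "my-container"))
print(selectel_s3_uri("ru-1", "my-container", "clearml/artifacts"))
```

Building both values from one helper keeps the container name and pool consistent between the api and development blocks.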
  2. To upload datasets to ClearML Server, run a Python script.

    An example script that uploads a single dataset:

    # Create a dataset using the Dataset class
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_name="<dataset_name>",
        dataset_project="<project_name>",
        output_uri="s3://s3.storage.selcloud.ru:443/<container_name>/<path>",
    )

    # Add files to the dataset
    dataset.add_files(
        path="<local_path_to_dataset>",
    )

    # Upload the dataset to ClearML Server
    dataset.upload()

    # Commit the changes to the dataset
    dataset.finalize()

    Specify:

    • <dataset_name>: dataset name, displayed in the WebApp;
    • <project_name>: project name, displayed in the WebApp;
    • <container_name>/<path>: container name and prefix in object storage from step 1;
    • <local_path_to_dataset>: path to the dataset on the local machine.
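The single-dataset script can be extended to several datasets by repeating the same create, add_files, upload, finalize sequence in a loop. A minimal sketch under that assumption; the function name `upload_datasets` and its `dataset_cls` parameter are hypothetical helpers, not part of the ClearML API, and only the `Dataset` calls already used in the script above are relied on:

```python
def upload_datasets(datasets, project_name, output_uri, dataset_cls=None):
    """Create, upload, and finalize one ClearML dataset per (name, local path) pair.

    datasets is a mapping of dataset name to local path.
    dataset_cls defaults to clearml.Dataset; it is a parameter only so the
    function can be tried out without a running ClearML Server.
    """
    if dataset_cls is None:
        # Imported lazily so the module loads even where clearml is not installed
        from clearml import Dataset as dataset_cls

    uploaded = []
    for name, local_path in datasets.items():
        # Same sequence as in the single-dataset script
        dataset = dataset_cls.create(
            dataset_name=name,
            dataset_project=project_name,
            output_uri=output_uri,
        )
        dataset.add_files(path=str(local_path))
        dataset.upload()
        dataset.finalize()
        uploaded.append(name)
    return uploaded
```

A call might look like `upload_datasets({"<dataset_name>": "<local_path_to_dataset>"}, "<project_name>", "s3://s3.storage.selcloud.ru:443/<container_name>/<path>")`, with the placeholders replaced as described above.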