Connect Selectel Object Storage to ClearML

You can connect Selectel object storage to ClearML to store datasets, results, and experiment artifacts.

  1. Open the ClearML configuration file clearml.conf. Modify the api block and the aws and development blocks inside sdk:

    api {
        ...
        files_server: s3://s3.<pool>.storage.selcloud.ru:443/<container_name>
        ...
    }
    ...
    sdk {
        ...
        aws {
            s3 {
                host: "s3.<pool>.storage.selcloud.ru:443"
                region: "ru-1"
                key: "<access_key>"
                secret: "<secret_key>"
                use_credentials_chain: false
                credentials: [{
                    bucket: "<container_name>"
                    secure: true
                }]
            }
            boto3 {
                pool_connections: 512
                max_multipart_concurrency: 16
            }
        }
        ...
        development {
            ...
            default_output_uri: "s3://s3.<pool>.storage.selcloud.ru:443/<container_name>/<path>"
            ...
        }
        ...
    }
    ...

    Specify:

    • <container_name>: the name of the object storage container where datasets and artifacts will be stored. You can find the name in the control panel under Object Storage → Containers;
    • <access_key>: the Access Key ID from the S3 key issued to the user;
    • <secret_key>: the Secret Access Key from the S3 key issued to the user;
    • <path>: a prefix in the object storage;
    • <pool>: the pool where the object storage is located (for example, ru-1).
  2. To upload datasets and register them in ClearML Server, run a Python script.

    Example script that uploads a single dataset:

    # Create a dataset via the Dataset class
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_name="<dataset_name>",
        dataset_project="<project_name>",
        output_uri="s3://s3.<pool>.storage.selcloud.ru:443/<container_name>/<path>",
    )

    # Add files to the dataset
    dataset.add_files(
        path="<local_path_to_dataset>",
    )

    # Upload the dataset files to the storage set in output_uri
    dataset.upload()

    # Commit the changes and close the dataset version
    dataset.finalize()

    Specify:

    • <dataset_name>: the dataset name displayed in the ClearML WebApp;
    • <project_name>: the project name displayed in the ClearML WebApp;
    • <container_name>/<path>: the container and prefix in the object storage from step 1;
    • <local_path_to_dataset>: the path to the dataset on the local machine.
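
The two steps above can be tied together in a small helper script. The following is a minimal sketch, not an official Selectel or ClearML utility. It assumes the URI format s3://s3.<pool>.storage.selcloud.ru:443/<container_name>/<path> from step 1. The function names selectel_s3_uri and upload_dataset, the dataset_cls parameter, and the sample values are hypothetical and exist only for illustration. The clearml package is imported lazily, so the URI helper can be used without it.

```python
from typing import Any, Optional


def selectel_s3_uri(pool: str, container: str, path: str = "") -> str:
    """Build an s3:// URI in the form used in clearml.conf in step 1."""
    base = f"s3://s3.{pool}.storage.selcloud.ru:443/{container}"
    prefix = path.strip("/")  # avoid doubled or trailing slashes
    return f"{base}/{prefix}" if prefix else base


def upload_dataset(
    name: str,
    project: str,
    local_path: str,
    output_uri: str,
    dataset_cls: Optional[Any] = None,
) -> Any:
    """Create, upload, and finalize a dataset, then return the dataset object.

    dataset_cls is a hypothetical hook for passing a stand-in class in tests;
    by default the real clearml.Dataset class is used.
    """
    if dataset_cls is None:
        # Imported here so selectel_s3_uri works without clearml installed.
        from clearml import Dataset
        dataset_cls = Dataset

    dataset = dataset_cls.create(
        dataset_name=name,
        dataset_project=project,
        output_uri=output_uri,
    )
    dataset.add_files(path=local_path)  # register local files in the dataset
    dataset.upload()                    # send files to the storage in output_uri
    dataset.finalize()                  # close this dataset version
    return dataset


if __name__ == "__main__":
    # Hypothetical values: replace with your own pool, container, and prefix.
    uri = selectel_s3_uri("ru-1", "my-container", "datasets")
    print(uri)
    # With clearml installed and clearml.conf set up as in step 1:
    # upload_dataset("<dataset_name>", "<project_name>", "<local_path_to_dataset>", uri)
```

Building the URI in one place keeps the value passed to output_uri consistent with files_server and default_output_uri in clearml.conf, so the host and container are the ones the configured credentials refer to.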