Connect Selectel S3 to ClearML

In ClearML, you can connect Selectel S3 to store datasets, results, and experiment artifacts.

  1. Open the ClearML configuration file clearml.conf and update the api, aws, and development blocks:

    api {
        ...
        files_server: s3://s3.<pool>.storage.selcloud.ru:443/<container_name>
        ...
    }
    ...
    sdk {
        ...
        aws {
            s3 {
                host: "s3.storage.selcloud.ru:443"
                region: "ru-1"
                key: "<access_key>"
                secret: "<secret_key>"
                use_credentials_chain: false
                credentials: [{
                    bucket: "<container_name>"
                    secure: true
                }]
            }
            boto3 {
                pool_connections: 512
                max_multipart_concurrency: 16
            }
        }
        ...
        # In clearml.conf the development block is nested inside sdk
        development {
            ...
            default_output_uri: "s3://s3.<pool>.storage.selcloud.ru:443/<container_name>/<path>"
            ...
        }
        ...
    }
    ...

    Specify:

    • <container_name> — the name of the S3 bucket where datasets and artifacts will be stored. You can find the name in the control panel: in the top menu, click Products → S3 → Buckets;
    • <access_key> — the Access Key ID from an S3 key issued to the user;
    • <secret_key> — the Secret Access Key from an S3 key issued to the user;
    • <path> — the prefix in S3;
    • <pool> — the pool where S3 is located (for example, ru-1).
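
The same endpoint, bucket, and prefix appear in both files_server and default_output_uri, so it can help to assemble the URI in one place before pasting it into clearml.conf. The sketch below is only an illustration: build_selectel_uri is a hypothetical helper, not part of ClearML, and it assumes the s3.<pool>.storage.selcloud.ru:443 pattern shown in the configuration above.

```python
def build_selectel_uri(pool: str, container_name: str, path: str = "") -> str:
    """Assemble an S3 URI in the form used in clearml.conf above.

    Hypothetical helper for illustration; not part of ClearML.
    """
    if not pool or not container_name:
        raise ValueError("pool and container_name must be non-empty")
    # Endpoint pattern taken from the files_server example in step 1.
    uri = f"s3://s3.{pool}.storage.selcloud.ru:443/{container_name}"
    # Drop leading and trailing slashes so the prefix joins cleanly.
    prefix = path.strip("/")
    return f"{uri}/{prefix}" if prefix else uri


if __name__ == "__main__":
    # "my-bucket" and "clearml/artifacts" are placeholder values.
    print(build_selectel_uri("ru-1", "my-bucket", "clearml/artifacts"))
```

Called without a path, the helper returns the bucket-level URI, which matches the form expected by files_server; with a path, it matches the form used for default_output_uri.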
  2. To upload datasets to ClearML Server, run a Python script.

    Sample script for uploading a single dataset:

    # Create dataset via Dataset class
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_name="<dataset_name>",
        dataset_project="<project_name>",
        output_uri="s3://s3.storage.selcloud.ru:443/<container_name>/<path>",
    )

    # Add files to dataset
    dataset.add_files(
        path="<local_path_to_dataset>",
    )

    # Upload dataset to ClearML Server
    dataset.upload()

    # Commit dataset changes
    dataset.finalize()

    Specify:

    • <dataset_name> — the name of the dataset as it will be displayed in the WebApp;
    • <project_name> — the name of the project as it will be displayed in the WebApp;
    • <container_name>/<path> — the S3 prefix from step 1;
    • <local_path_to_dataset> — the path to the dataset on the local machine.
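
Once a dataset has been finalized, it can be fetched on any machine that uses the same clearml.conf. The following is a minimal sketch, assuming the clearml package is installed and the names match those used in the upload script. fetch_dataset is a hypothetical wrapper introduced here for illustration; Dataset.get and get_local_copy are ClearML SDK calls.

```python
def fetch_dataset(dataset_name: str, dataset_project: str) -> str:
    """Download a finalized dataset and return the path to its local copy."""
    # Imported lazily so this module can be loaded where clearml is absent.
    from clearml import Dataset

    dataset = Dataset.get(
        dataset_name=dataset_name,
        dataset_project=dataset_project,
    )
    # Returns the folder holding a cached, read-only copy of the files.
    return dataset.get_local_copy()


if __name__ == "__main__":
    # Replace the placeholders with the values used when uploading.
    print(fetch_dataset("<dataset_name>", "<project_name>"))
```

If the download fails with an access error, the key, secret, and bucket values in the aws block from step 1 are a reasonable first thing to check.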