Foundation Models Catalog Payment Model and Prices

Balance

Cloud platform resources are paid for from either the single balance or the cloud platform balance, depending on the balance type of your account.

You can pay for resources with different types of funds: main funds and bonuses.

Top up your balance before paying.

Payment Model

The cloud platform uses a pay-as-you-go payment model. Every hour, the balance is debited for the cloud platform resources used during the previous hour.

Charges for inference service resources are calculated per project. Each project contains a group of resources: GPUs, vCPUs, RAM, and disk size.

The cost of the resource group is recalculated every clock hour (for example, 13:00-14:00).

All Foundation Models Catalog resources on the cloud platform are quota resources. For quota resources, you are charged for the maximum consumption within each hour.

note

For example, at 13:25 an inference service with two inference instances is created. Each inference instance has the configuration 24 vCPUs, 160 GB RAM, 1 GPU, 300 GB disk. At 13:40 the inference service is scaled down to one inference instance. The bill for the hour 13:00-14:00 counts the consumption of 48 vCPUs, 320 GB RAM, 2 GPUs, and 600 GB disk. The bill for the hour 14:00-15:00 counts only 24 vCPUs, 160 GB RAM, 1 GPU, and 300 GB disk, provided that the number of inference instances was not increased and no other inference services were created.

If the number of inference services in a project, or the number of inference instances in an inference service, increases, the resource charges change immediately.

note

For example, at 13:25 an inference service with one inference instance is created. The inference instance has the configuration 24 vCPUs, 160 GB RAM, 1 GPU, 300 GB disk. At 13:40 the inference service is scaled up from one inference instance to two. For the hour 13:00-14:00, 48 vCPUs, 320 GB RAM, 2 GPUs, and 600 GB disk are charged.
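The two examples above can be reproduced with a short calculation. The following is a minimal sketch in Python, not the platform's billing code: the names `Resources`, `billed_for_hour`, and `at` are hypothetical, the date is arbitrary, and the sketch assumes that the maximum is taken separately for each resource type and that the usage list is sorted by time.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Resources:
    """Total resources a project holds at one moment (hypothetical type)."""
    vcpu: int = 0
    ram_gb: int = 0
    gpu: int = 0
    disk_gb: int = 0

    def times(self, n: int) -> "Resources":
        """Resources of n identical inference instances."""
        return Resources(self.vcpu * n, self.ram_gb * n, self.gpu * n, self.disk_gb * n)


def billed_for_hour(snapshots, hour_start, hour_end):
    """Return the per-resource maximum held during [hour_start, hour_end).

    snapshots is a time-sorted list of (changed_at, Resources) pairs; each
    pair gives the project's total usage from changed_at onwards.
    """
    carried = Resources()  # usage already in effect when the hour begins
    candidates = []
    for changed_at, usage in snapshots:
        if changed_at <= hour_start:
            carried = usage
        elif changed_at < hour_end:
            candidates.append(usage)
    candidates.append(carried)
    return Resources(
        vcpu=max(u.vcpu for u in candidates),
        ram_gb=max(u.ram_gb for u in candidates),
        gpu=max(u.gpu for u in candidates),
        disk_gb=max(u.disk_gb for u in candidates),
    )


def at(hour, minute=0):
    """Timestamp on an arbitrary example day."""
    return datetime(2024, 1, 1, hour, minute)


instance = Resources(vcpu=24, ram_gb=160, gpu=1, disk_gb=300)

# Two instances created at 13:25, scaled down to one at 13:40.
scale_down = [(at(13, 25), instance.times(2)), (at(13, 40), instance.times(1))]
# One instance created at 13:25, scaled up to two at 13:40.
scale_up = [(at(13, 25), instance.times(1)), (at(13, 40), instance.times(2))]

print(billed_for_hour(scale_down, at(13), at(14)))  # 48 vCPUs, 320 GB RAM, 2 GPUs, 600 GB disk
print(billed_for_hour(scale_down, at(14), at(15)))  # 24 vCPUs, 160 GB RAM, 1 GPU, 300 GB disk
print(billed_for_hour(scale_up, at(13), at(14)))    # 48 vCPUs, 320 GB RAM, 2 GPUs, 600 GB disk
```

In both scenarios the hour 13:00-14:00 is billed at the two-instance level, because the peak within the hour is what counts, regardless of whether the service was scaled up or down.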

Blocking resources if there are insufficient funds in the balance

If your balance has insufficient funds at the time of debiting, all cloud platform resources are blocked automatically, and you continue to be charged for them.

To restore access to resources, top up your balance by the amount of the debt within 14 days after the blocking. The debt for resources charged during the blocking period is repaid automatically. Projects are not blocked: you can delete an entire project or individual resources via the API.

If you do not top up the balance by the amount of the debt within 14 days after the blocking, all cloud platform resources are deleted. Projects are not deleted.
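The blocking timeline can be illustrated with a small date calculation. This is a sketch only: the function names and state labels are hypothetical, and the behaviour at exactly 14 days is an assumption, because the text above does not say whether the deadline itself is inclusive.

```python
from datetime import datetime, timedelta

# The 14-day window comes from the text above.
GRACE_PERIOD = timedelta(days=14)


def deletion_deadline(blocked_at):
    """Latest moment at which repaying the debt still restores access."""
    return blocked_at + GRACE_PERIOD


def resources_state(blocked_at, now, debt_repaid_at=None):
    """Return 'active', 'blocked', or 'deleted' (hypothetical labels).

    Assumption: a repayment made exactly at the deadline still counts.
    """
    deadline = deletion_deadline(blocked_at)
    if debt_repaid_at is not None and debt_repaid_at <= min(now, deadline):
        return "active"
    if now > deadline:
        return "deleted"
    return "blocked"


blocked = datetime(2024, 3, 1, 14, 0)
print(deletion_deadline(blocked))
print(resources_state(blocked, now=datetime(2024, 3, 10)))
print(resources_state(blocked, now=datetime(2024, 3, 10), debt_repaid_at=datetime(2024, 3, 5)))
print(resources_state(blocked, now=datetime(2024, 3, 16)))
```

Note that charges keep accruing while resources are blocked, so the debt to repay grows during the 14 days.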

You can set up balance notifications and auto-replenishment to make sure your balance always has sufficient funds.

Prices

The cost depends on the selected inference service configuration. When you use an inference service, you pay only for the cloud platform resources: GPU, vCPU, RAM, and disk size. The cost does not depend on the number of tokens.
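As an illustration of configuration-based pricing, the hourly cost can be sketched as the sum of each resource quantity multiplied by its hourly unit price. The prices below are made-up placeholders, not real rates, and `hourly_cost` is a hypothetical helper; actual prices are shown in the control panel.

```python
from decimal import Decimal

# HYPOTHETICAL hourly unit prices, for illustration only.
PRICES_PER_HOUR = {
    "gpu": Decimal("1.50"),
    "vcpu": Decimal("0.01"),
    "ram_gb": Decimal("0.005"),
    "disk_gb": Decimal("0.0002"),
}


def hourly_cost(config, prices=PRICES_PER_HOUR):
    """Sum quantity * unit price over the billed resources.

    The number of tokens processed is deliberately not an input:
    the cost depends only on the configuration.
    """
    return sum((prices[name] * quantity for name, quantity in config.items()), Decimal("0"))


# One inference instance with the configuration used in the examples above.
one_instance = {"gpu": 1, "vcpu": 24, "ram_gb": 160, "disk_gb": 300}
cost = hourly_cost(one_instance)
print(cost)
```

With these placeholder prices, the instance costs the same for every hour it exists, whether it serves many requests or none.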

You can view resource prices when creating an inference service in the control panel.

Reporting documents

Reporting documents are available after payment.