Foundation Models Catalog Payment Model and Prices

Balance

Cloud platform resources are paid for from either a unified balance or a cloud platform balance, depending on the balance type in your account.

You can pay for resources with different types of funds: basic funds or bonuses.

Before paying, top up your balance.

Payment model

The cloud platform uses a pay-as-you-go payment model. Every hour, your balance is debited for the previous hour of cloud platform resource usage.

Payment for inference service resources is organized by project. Each project has a resource group made up of GPUs, vCPUs, RAM, and disk space.

The resource group cost is recalculated every clock hour (for example, 13:00–14:00).

All cloud platform resources for Foundation Models Catalog are subject to quotas. For quota-based resources, you are billed for the maximum consumption reached within each hour.

note

For example, an inference service with two inference instances is created at 13:25. Each inference instance is configured with 24 vCPU, 160 GB RAM, 1 GPU, and a 300 GB disk. At 13:40, the inference service is scaled down to one inference instance. The invoice for the 13:00–14:00 hour accounts for the consumption of 48 vCPU, 320 GB RAM, 2 GPU, and 600 GB of disk. For the 14:00–15:00 hour, only 24 vCPU, 160 GB RAM, 1 GPU, and 300 GB of disk are accounted for, provided that the number of inference instances was not increased and no other inference services were created.

If you add an inference service to a project or increase the number of inference instances in an inference service, the payment for resources changes immediately.

note

For example, an inference service with one inference instance is created at 13:25. The inference instance is configured with 24 vCPU, 160 GB RAM, 1 GPU, and a 300 GB disk. At 13:40, the inference service is scaled up from one inference instance to two. For the 13:00–14:00 hour, you are charged for 48 vCPU, 320 GB RAM, 2 GPU, and 600 GB of disk.
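
The peak-per-hour rule from the two examples above can be sketched in Python. This is a minimal illustration, not the platform's billing code: the names `InstanceConfig`, `peak_instances`, and `billed_resources` are hypothetical, the date is arbitrary, and it assumes a single inference service whose scaling events are given as (timestamp, instance count) pairs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class InstanceConfig:
    """Resources of one inference instance (hypothetical helper type)."""
    vcpu: int
    ram_gb: int
    gpu: int
    disk_gb: int


def peak_instances(events, hour_start):
    """Return the largest instance count in effect during one clock hour.

    `events` is a list of (timestamp, instance_count) pairs; each count
    stays in effect from its timestamp until the next event.
    """
    hour_end = hour_start + timedelta(hours=1)
    at_start = 0  # count already in effect when the hour begins
    peak = 0      # highest count set by an event inside the hour
    for ts, count in sorted(events):
        if ts <= hour_start:
            at_start = count
        elif ts < hour_end:
            peak = max(peak, count)
    return max(at_start, peak)


def billed_resources(config, instances):
    """Scale one instance's configuration by the number of instances."""
    return {
        "vcpu": config.vcpu * instances,
        "ram_gb": config.ram_gb * instances,
        "gpu": config.gpu * instances,
        "disk_gb": config.disk_gb * instances,
    }


config = InstanceConfig(vcpu=24, ram_gb=160, gpu=1, disk_gb=300)
day = datetime(2024, 1, 1)  # arbitrary date, for illustration only

# Scale-down example: two instances at 13:25, reduced to one at 13:40.
scale_down = [(day.replace(hour=13, minute=25), 2),
              (day.replace(hour=13, minute=40), 1)]

print(billed_resources(config, peak_instances(scale_down, day.replace(hour=13))))
# {'vcpu': 48, 'ram_gb': 320, 'gpu': 2, 'disk_gb': 600}
print(billed_resources(config, peak_instances(scale_down, day.replace(hour=14))))
# {'vcpu': 24, 'ram_gb': 160, 'gpu': 1, 'disk_gb': 300}
```

Swapping the two counts (one instance at 13:25, two at 13:40) gives the same 13:00–14:00 result, which matches the scale-up example.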

Blocking resources if there are insufficient funds on your balance

If there are insufficient funds on your balance at the time of the debit, all cloud platform resources are automatically blocked. You continue to be charged for them while they are blocked.

To restore access to resources, top up your balance by the amount of the debt within 14 days of being blocked. The debt for resources billed during the blocking period is paid off automatically. Projects are not blocked: you can delete the entire project or individual resources via the API.

If you do not top up your balance by the debt amount within 14 days of being blocked, all cloud platform resources will be deleted. Projects are not deleted in this process.
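
The blocking timeline above can be expressed as a small state check. This is a rough sketch under stated assumptions: `deletion_deadline` and `resource_status` are hypothetical names, the 14-day boundary is treated as inclusive, and deletion is assumed to take effect right after the deadline. The documentation does not specify either detail.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(days=14)  # "within 14 days of being blocked"


def deletion_deadline(blocked_at):
    """Latest moment at which paying off the debt still restores access."""
    return blocked_at + GRACE_PERIOD


def resource_status(blocked_at, now, debt_paid_at=None):
    """Classify blocked resources as 'active', 'blocked', or 'deleted'.

    Assumption: a top-up exactly at the deadline still counts as on time.
    """
    deadline = deletion_deadline(blocked_at)
    paid_in_time = (
        debt_paid_at is not None
        and debt_paid_at <= deadline
        and debt_paid_at <= now
    )
    if paid_in_time:
        return "active"
    if now > deadline:
        return "deleted"
    return "blocked"


blocked_at = datetime(2024, 3, 1, 10, 0)  # arbitrary example timestamp
print(deletion_deadline(blocked_at))       # 2024-03-15 10:00:00
print(resource_status(blocked_at, now=datetime(2024, 3, 10)))  # blocked
```

Projects are outside this check because, as stated above, they are neither blocked nor deleted.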

To make sure you always have enough funds on your balance, you can configure balance notifications and automatic balance top-ups.

Prices

The cost depends on the selected inference service configuration. When using an inference service, you only pay for cloud platform resources: GPU, vCPU, RAM, and disk size. The cost does not depend on the number of tokens.
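
As a rough illustration of configuration-based pricing, the sketch below multiplies each resource amount by a per-unit hourly price. The function `hourly_cost` is hypothetical, the per-unit-per-hour pricing shape is an assumption, and the prices are invented placeholders in minor currency units. Real prices are shown in the control panel.

```python
# Placeholder per-unit hourly prices in minor currency units.
# These numbers are invented for illustration and are NOT real prices.
PLACEHOLDER_PRICES = {"vcpu": 3, "ram_gb": 1, "gpu": 200, "disk_gb": 1}


def hourly_cost(resources, unit_prices):
    """Sum amount * unit price over every resource in the configuration.

    The token count is deliberately not a parameter: the cost depends
    only on the configured resources.
    """
    missing = set(resources) - set(unit_prices)
    if missing:
        raise KeyError(f"no price for: {sorted(missing)}")
    return sum(amount * unit_prices[name] for name, amount in resources.items())


# One inference instance with the configuration used in the examples above.
config = {"vcpu": 24, "ram_gb": 160, "gpu": 1, "disk_gb": 300}
print(hourly_cost(config, PLACEHOLDER_PRICES))  # 732 with the placeholder prices
```

Integer minor units are used here to avoid floating-point rounding when amounts are summed.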

You can view resource prices when creating an inference service in the control panel.

Accounting documents

After payment, you can receive accounting documents.