Create an inference service
1. Select a model
- In the Control Panel, on the top menu, click Products and select Foundation Models Catalog.
- In the model card, click Create.
- Enter a name for the inference service.
- To filter inference services in the list, add tags. A tag with the model name is added automatically. To add a new tag, type it in the Tags field and press Enter.
- Optional: enter a description of the inference service, for example, its purpose.
- Click Continue.
2. Set up the infrastructure
- Set the model parameters:
  1. Select the data type of the model parameters.
  2. Select the data type for the KV cache.
  3. Select the maximum context length.
- Select the inference service configuration. When choosing, consider the expected performance metrics of the model.
  Once the inference service has been created, its configuration cannot be changed.
- Click Continue.
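The data types and maximum context length chosen above determine how much accelerator memory the KV cache consumes, which is why they matter when picking a configuration. A rough back-of-the-envelope estimate, assuming a hypothetical 7B-class model (the layer count, KV-head count, and head dimension below are illustrative, not values from the catalog):

```python
import math

# Rough KV-cache size: 2 (K and V) * layers * kv_heads * head_dim
# * context_length * bytes_per_element, per sequence in the batch.
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_len: int, dtype_bytes: int, batch: int = 1) -> int:
    return 2 * layers * kv_heads * head_dim * context_len * dtype_bytes * batch

# Hypothetical model: 32 layers, 8 KV heads, head dimension 128, 8K context.
fp16 = kv_cache_bytes(32, 8, 128, 8192, dtype_bytes=2)  # FP16 KV cache
fp8 = kv_cache_bytes(32, 8, 128, 8192, dtype_bytes=1)   # FP8 halves the footprint
print(fp16 // 2**20, "MiB vs", fp8 // 2**20, "MiB")     # → 1024 MiB vs 512 MiB
```

Halving the KV-cache data type roughly doubles the context you can serve in the same memory, which is the trade-off these settings expose.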
3. Configure the inference service
- Configure the number of inference instances:
  1. To run a fixed number of instances, open the Fixed tab and specify the number of instances.
  2. To use autoscaling, open the With autoscaling tab and set the minimum and maximum number of instances. The number of instances changes automatically within the specified range, depending on the load on the inference service.
  You can change the number of inference instances after the service is created. For more information, see the Scaling an inference service instruction.
- Select the disk type for the inference instances.
- Click Continue.
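The autoscaling behavior described above — the instance count follows the load but never leaves the configured range — amounts to a clamp. A minimal sketch (the load-to-instances mapping here is a hypothetical illustration, not the platform's actual scaling policy):

```python
import math

def desired_instances(current_load: float, per_instance_capacity: float,
                      min_instances: int, max_instances: int) -> int:
    """Scale to cover the load, clamped to the configured [min, max] range."""
    needed = math.ceil(current_load / per_instance_capacity)
    return max(min_instances, min(max_instances, needed))

print(desired_instances(0, 10, 1, 5))    # idle load still keeps the minimum → 1
print(desired_instances(37, 10, 1, 5))   # 4 instances cover 37 req/s → 4
print(desired_instances(200, 10, 1, 5))  # demand beyond the maximum is capped → 5
```

Setting the minimum above zero keeps at least one warm instance serving requests; the maximum caps cost under load spikes.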
4. Confirm configuration
- Review the final configuration of the inference service.
- Review the price of the inference service.
- Click Create Inference Service. Creating an inference service can take about 15 minutes.
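Since creation can take about 15 minutes, a client script would typically poll for readiness rather than block on the console. A platform-agnostic sketch, where `get_status` is a hypothetical callable standing in for whatever status API or CLI your platform exposes:

```python
import time

def wait_until_ready(get_status, timeout_s: float = 20 * 60, poll_s: float = 30) -> bool:
    """Poll get_status() until it returns 'ready'; raise on failure or timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status == "ready":
            return True
        if status == "error":
            raise RuntimeError("inference service creation failed")
        time.sleep(poll_s)
    raise TimeoutError("inference service was not ready within the timeout")

# Example with a fake status source that becomes ready on the third poll.
statuses = iter(["creating", "creating", "ready"])
print(wait_until_ready(lambda: next(statuses), timeout_s=60, poll_s=0))  # → True
```

Polling every 30 seconds is a reasonable default for an operation on the order of minutes; adjust the interval and timeout to your platform's guidance.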