
Autoscaling a node group

For your information

Autoscaling is not available:

  • for node groups with GPUs without drivers;
  • for node groups on dedicated servers.

In a Managed Kubernetes cluster on a cloud server, node groups are autoscaled using Cluster Autoscaler. It helps to use cluster resources optimally: depending on the load on the cluster, the number of nodes in a group automatically decreases or increases.

You do not need to install Cluster Autoscaler in the cluster; it is installed automatically when the cluster is created. After the cluster is created, you can configure Cluster Autoscaler for each node group.

You can enable autoscaling of a node group in the control panel, via the Managed Kubernetes API, or through Terraform.

Managed Kubernetes uses Metrics Server to autoscale pods.
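
For example, here is a minimal HorizontalPodAutoscaler sketch that scales a hypothetical Deployment named demo-app based on the CPU metrics collected by Metrics Server (the names and thresholds are illustrative, not part of the product):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-app-hpa           # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app             # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # target average CPU utilization, %

If the new replicas do not fit on the existing nodes, they remain in PENDING status and Cluster Autoscaler adds nodes to the group.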

Principle of operation

The minimum and maximum number of nodes in a group can be set when autoscaling is enabled — Cluster Autoscaler will only change the number of nodes within these limits.

If the node group is in ACTIVE status, Cluster Autoscaler checks every 10 seconds whether there are pods in PENDING status and analyzes the load: the vCPU, RAM, and GPU requests of the pods. Depending on the results of the check, nodes are added or removed. During this time the node group moves to PENDING_SCALE_UP or PENDING_SCALE_DOWN status. The cluster status during autoscaling remains ACTIVE.

Learn more about cluster statuses in the View cluster status instructions.

Adding a node

If there are pods in PENDING status and there are not enough free resources in the cluster to accommodate them, the required number of nodes is added to the cluster. In a cluster with Kubernetes version 1.28 or higher, Cluster Autoscaler works with several groups at once and distributes nodes evenly.

note

For example, you have two node groups with autoscaling enabled. The load on the cluster has increased and requires four additional nodes. Two new nodes will be created simultaneously in each node group.

In a cluster with Kubernetes version 1.27 or below, nodes are added one at a time per check cycle.

Deleting a node

If there are no pods in PENDING status, Cluster Autoscaler checks the resource requests of the pods.

If the resources requested by the pods on a node amount to less than 50% of its resources, Cluster Autoscaler marks the node as unneeded. If the resource requests on the node do not increase within 10 minutes, Cluster Autoscaler checks whether the pods can be moved to other nodes.
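
A minimal sketch of a pod manifest with resource requests; Cluster Autoscaler sums these requests, not the actual usage, when it evaluates node load (the name and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: demo-app               # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.25        # example image
      resources:
        requests:
          cpu: "500m"          # counted by Cluster Autoscaler
          memory: "512Mi"      # when judging node utilization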

Cluster Autoscaler will not move pods, and therefore will not delete a node, if one of the following conditions is met:

  • pods use PodDisruptionBudget;
  • pods in the kube-system namespace do not have a PodDisruptionBudget;
  • pods are created without a controller such as a Deployment, ReplicaSet, or StatefulSet;
  • pods use local storage;
  • the other nodes do not have enough resources for the pods' requests;
  • there is a mismatch with nodeSelector, affinity and anti-affinity rules, or other scheduling parameters.

You can allow such pods to be moved by adding an annotation:

cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
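
The annotation is set on the pods, not on the node. For pods managed by a Deployment, for example, it goes into the pod template metadata; a minimal sketch with hypothetical names:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app               # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        # allow Cluster Autoscaler to move these pods to other nodes
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      containers:
        - name: app
          image: nginx:1.25    # example image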

If there are no restrictions, the pods will be moved and the low-loaded nodes will be removed. Nodes are removed one at a time per check cycle.

Autoscaling to zero nodes

In a node group, you can configure autoscaling to zero nodes: at low load, all nodes in the group are deleted. The node group card with all its settings is not deleted. When the load increases, nodes can be added to this node group again.

Autoscaling to zero nodes works only if at least two worker nodes remain in the other node groups of the cluster. Worker nodes must remain in the cluster to host the system components required for the cluster to function.

note

For example, autoscaling to zero nodes will not work if the cluster has:

  • two node groups with one worker node in each;
  • one node group with two worker nodes.

When there are no nodes in the group, you don't pay for unused resources.

Recommendations

For optimal performance of Cluster Autoscaler, we recommend:

  • make sure that the project has enough quotas for vCPU, RAM, GPU, and disk capacity to create the maximum number of nodes in the group;
  • specify resource requests in the pod manifests;
  • check that the nodes in a group have the same configuration and labels;
  • configure PodDisruptionBudget for pods that must not be stopped; this helps avoid downtime when pods are moved between nodes (see the example after this list);
  • do not use any other Cluster Autoscaler;
  • do not manually modify node resources through the control panel: Cluster Autoscaler will not take these changes into account, and all new nodes will be created with the original configuration.
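
For the PodDisruptionBudget recommendation above, a minimal sketch, assuming a hypothetical application labeled app: demo-app:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: demo-app-pdb           # hypothetical name
spec:
  minAvailable: 1              # keep at least one pod running while others are evicted
  selector:
    matchLabels:
      app: demo-app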

Enable autoscaling

For your information

If you set the minimum number of nodes in the group to a value greater than the current number of nodes, the group will not scale to the lower limit immediately: it will scale only after pods appear in PENDING status. The same applies to the upper limit: if the current number of nodes is greater than the upper limit, deletion will start only after the pods are checked.

  1. In the control panel, in the top menu, click Products and select Managed Kubernetes.
  2. Open the Cluster page → Cluster Composition tab.
  3. From the menu of the node group, select Change Number of Nodes.
  4. In the Number of nodes field, open the With autoscaling tab.
  5. Set the minimum and maximum number of nodes in the group; the number of nodes will change only within this range. For fault-tolerant operation of system components, we recommend keeping at least two worker nodes in the cluster; the nodes can be in different groups.
  6. Click Save.

Configure Cluster Autoscaler

You can configure Cluster Autoscaler separately for each node group.

View the parameters, their descriptions, and default values in the Cluster Autoscaler parameters table. If you do not specify a parameter in the manifest, the default value will be used.

Example manifest:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-nodegroup-options
  namespace: kube-system
data:
  config.yaml: |
    150da0a9-6ea6-4148-892b-965282e195b0:
      scaleDownUtilizationThreshold: 0.55
      scaleDownUnneededTime: 7m
      zeroOrMaxNodeScaling: true
    e3dc24ca-df9d-429c-bcd5-be85f8d28710:
      scaleDownGpuUtilizationThreshold: 0.25
      ignoreDaemonSetsUtilization: true

Here 150da0a9-6ea6-4148-892b-965282e195b0 and e3dc24ca-df9d-429c-bcd5-be85f8d28710 are the unique identifiers (UUIDs) of the node groups in the cluster. You can view them in the control panel: in the top menu, click Products ⟶ Managed Kubernetes ⟶ open the cluster page ⟶ copy the UUID above the node group card, next to the pool segment.

Cluster Autoscaler parameters

| Parameter | Description | Default value |
| --- | --- | --- |
| scaleDownUtilizationThreshold | The minimum vCPU and RAM utilization of a node at which the system can delete the node. If the node uses less than the specified share of its vCPU and RAM (for example, less than 50% with a value of 0.5), the system removes the node | 0.5 |
| scaleDownGpuUtilizationThreshold | The minimum GPU utilization at which the system can delete a node. If the node uses less than the specified share of its GPU (for example, less than 50% with a value of 0.5), the system removes the node | 0.5 |
| scaleDownUnneededTime | The wait time before removing a low-load node. The system does not remove a node as soon as its load drops; it waits the specified time to make sure that the drop in load is stable | 10m |
| scaleDownUnreadyTime | The wait time before deleting a node in NotReady status. The system does not leave a node in NotReady status in the cluster; it waits the specified time to make sure that the node is stuck and will not recover, and then deletes it | 20m |
| maxNodeProvisionTime | The wait time for adding a new node. If an error occurs and a node is not added within the specified time, the system restarts the node addition process | 15m |
| zeroOrMaxNodeScaling | Changes the number of nodes in a group only to zero or to the maximum you set. Useful if you want the system to deploy all nodes in a group at once when load appears and remove all nodes when there is no load | false |
| ignoreDaemonSetsUtilization | Ignores DaemonSet pods when the system decides whether to reduce the number of nodes in a group. If true, DaemonSet workloads are not counted | false |