Monitoring a Redis cluster, nodes, and databases

In Redis Managed Databases, you can monitor the cluster state.

To assess the general state of the cluster, check its status.

For a more detailed analysis, you can:

view the status of cluster nodes — as charts in the control panel;
view the status of databases — as charts in the control panel;
export cluster node and database metrics in Prometheus format.

When analyzing charts, keep in mind that the control panel shows time in the time zone of your device, regardless of the region where the cluster is located.

Note

For example, suppose you created a cluster in Tashkent, in the uz-1 pool, and the device you use to access the control panel is set to Moscow time. The metrics charts will then show Moscow time.
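
The time conversion in this example can be sketched in Python. This is only an illustration: the fixed UTC offsets (+3 for Moscow, +5 for Tashkent), the sample timestamp, and the `chart_time` helper are assumptions, not part of the control panel.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets used only for illustration (assumption: no DST in either zone).
MOSCOW = timezone(timedelta(hours=3), "Moscow")
TASHKENT = timezone(timedelta(hours=5), "Tashkent")


def chart_time(sample_utc: datetime, device_tz: timezone) -> str:
    """Return the HH:MM label a chart would show on a device set to device_tz."""
    return sample_utc.astimezone(device_tz).strftime("%H:%M")


# Hypothetical metric sample taken at 09:00 UTC.
sample = datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc)

print(chart_time(sample, MOSCOW))    # 12:00, what a device on Moscow time shows
print(chart_time(sample, TASHKENT))  # 14:00, local time in the cluster's region
```

The same sample is labeled differently depending only on the device's time zone, which is why charts viewed from Moscow do not match Tashkent local time.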

View cluster status

In the control panel top menu, click Products and select Managed Databases.
Open the Active tab.

In the cluster row, check the status.

ACTIVE	Cluster is available
CREATING	Cluster is being created
UPDATING	Cluster is updating
RESIZING	Cluster is scaling
ERROR	An error occurred; create a ticket
DISK FULL	Disk is full; the cluster is read-only. To restore read and write access, scale the cluster and select a configuration with a larger disk
DEGRADED	Some cluster nodes are unavailable
DELETING	Cluster is being deleted
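
If you track cluster statuses in your own tooling, the table above can be represented as a simple lookup. The sketch below is hypothetical: the descriptions are abbreviated from the table, and the choice of which statuses count as needing attention is an assumption, not something the service defines.

```python
# Statuses from the table above, with abbreviated descriptions.
CLUSTER_STATUSES = {
    "ACTIVE": "Cluster is available",
    "CREATING": "Cluster is being created",
    "UPDATING": "Cluster is updating",
    "RESIZING": "Cluster is scaling",
    "ERROR": "An error occurred",
    "DISK FULL": "Disk is full; the cluster is read-only",
    "DEGRADED": "Some cluster nodes are unavailable",
    "DELETING": "Cluster is being deleted",
}

# Assumption: these statuses are the ones that likely require action from you.
NEEDS_ATTENTION = {"ERROR", "DISK FULL", "DEGRADED"}


def describe(status: str) -> str:
    """Return a one-line description of a status, flagging problem states."""
    key = status.strip().upper()
    text = CLUSTER_STATUSES.get(key, "Unknown status")
    flag = " (needs attention)" if key in NEEDS_ATTENTION else ""
    return f"{key}: {text}{flag}"


print(describe("ACTIVE"))
print(describe("disk full"))
```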

View cluster node status

In the control panel top menu, click Products and select Managed Databases.
Open the Active tab.
Open the cluster page → Monitoring tab.
In the Cluster monitoring block, click Cluster nodes.
Select the nodes for which you want to view metrics.
View the available cluster node metrics.

Cluster node metrics in the control panel

Memory	Used memory excluding OS cache and buffers, in percent or gigabytes
vCPU	vCPU usage of cluster nodes in percent
CPU iowait	Percentage of time the CPU spent waiting for I/O
Disk	Occupied disk space in percent or gigabytes. It takes into account the part of disk space reserved for service needs and unavailable for database placement. For more information about reserving disk space, see the instructions Using disk space in a Redis cluster
Load Average	Average system load over a period of time, shown as three values: for one minute, five minutes, and 15 minutes. It reflects the number of processes being handled by the cores of a cluster node. These values should not exceed the number of cores on the node
OOM	Number of processes that terminated with an `Out of Memory` error due to insufficient RAM
Disk load	Read and write speed in KB/s or the number of read and write operations per second
Network load	Number of bits or packets sent and received via the network interface
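
The Load Average rule from the table (the values should not exceed the number of cores on a node) can be expressed as a small check. This is a minimal sketch; the function name and the sample values are made up for illustration.

```python
def overloaded_windows(load1: float, load5: float, load15: float, cores: int) -> list[str]:
    """Return the averaging windows in which load exceeds the node's core count."""
    windows = {"1m": load1, "5m": load5, "15m": load15}
    return [name for name, value in windows.items() if value > cores]


# Hypothetical node with 4 cores: only the one-minute average is above the limit.
print(overloaded_windows(5.2, 3.9, 2.0, cores=4))  # ['1m']
```

A short spike in the one-minute value alone is usually less of a concern than a 15-minute value above the core count, which suggests sustained overload.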

View database status

In the control panel top menu, click Products and select Managed Databases.
Open the Active tab.
Open the cluster page → Monitoring tab.
In the Cluster monitoring block, click Databases.
Select the nodes for which you want to view metrics.
View the available database metrics.

Database metrics in the control panel

Evicted and expired keys	Two values are displayed: `Evicted`, the number of keys evicted because the memory limit was exceeded, and `Expired`, the number of expired keys
Number of keys	Number of keys in all databases and number of keys with a defined time-to-live (TTL)
Connections	Number of connections to cluster databases
Slow queries	Number of slow queries per second. Queries that take longer than 0.01 seconds are considered slow
Queries	Number of successful and failed queries per second. A query is considered failed if the requested key does not exist, was evicted because the memory limit was exceeded, or its time-to-live (TTL) has expired
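
Two of the definitions above lend themselves to a short worked example: the 0.01-second threshold for slow queries, and the share of successful queries among all queries. The helper names and sample numbers below are assumptions for illustration; the control panel does not expose these functions.

```python
SLOW_QUERY_THRESHOLD_S = 0.01  # queries longer than this are counted as slow


def is_slow(duration_s: float) -> bool:
    """Return True if a query of this duration would be counted as slow."""
    return duration_s > SLOW_QUERY_THRESHOLD_S


def hit_ratio(successful_per_s: float, failed_per_s: float):
    """Return the share of successful queries, or None when there is no traffic."""
    total = successful_per_s + failed_per_s
    return successful_per_s / total if total else None


print(is_slow(0.02))      # True
print(is_slow(0.005))     # False
print(hit_ratio(90, 10))  # 0.9
```

A falling ratio together with a growing `Evicted` count may indicate that the cluster is running out of memory for its working set.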

Export metrics in Prometheus format

1. Get a token

The token provides access to metrics for all clusters of a project in one pool.

In the control panel top menu, click Products and select Managed Databases.
Open the Active tab.
Open the cluster page → Monitoring tab.
In the Tokens for Prometheus block, click Create token. The token will be generated automatically.
Copy the token. To do this, click in the token row.

2. Get metrics in Prometheus format

Historical information for clusters is unavailable; metrics are requested in real time only. A list of all supported metrics in Managed Databases and their descriptions can be viewed in the Metrics in Prometheus format table.

Add the following to the Prometheus configuration file:
```
scrape_configs:
  - job_name: get-metrics-from-dbaas
    scrape_interval: 1m
    static_configs:
      - targets:
        - '<domain>'
    scheme: https
    authorization:
      type: Bearer
      credentials: <token>
```
Specify:
- <domain>: the Managed Databases API domain. This is the API URL without https:// and /v1, for example ru-3.dbaas.selcloud.ru. The URL depends on the region and pool and can be found in the list of URLs;
- <token>: the token you copied in step 5 of Get a token.
In your browser, open the Prometheus targets page, where the new scrape job will be listed:
```
http://<ip_address>:9090/targets
```
Replace <ip_address> with the IP address of the host where Prometheus is installed.
Set up monitoring and alerts for database clusters on your own.
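
To check the token outside of Prometheus, the same request can be sketched in Python. This is a sketch under assumptions: the `/metrics` path is inferred from the Prometheus default `metrics_path` (the configuration above does not override it), and the helper names are hypothetical. The first helper derives `<domain>` from an API URL as described above.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen


def api_url_to_domain(api_url: str) -> str:
    """Strip the scheme and path from an API URL, leaving only the domain."""
    return urlsplit(api_url).netloc


def build_metrics_request(domain: str, token: str) -> Request:
    """Build the HTTPS request with the same Bearer authorization as the scrape job."""
    # Assumption: metrics are served at the Prometheus default path /metrics.
    return Request(
        f"https://{domain}/metrics",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_metrics(domain: str, token: str, timeout: float = 10.0) -> str:
    """Fetch the current metrics as Prometheus text (performs a network call)."""
    with urlopen(build_metrics_request(domain, token), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


print(api_url_to_domain("https://ru-3.dbaas.selcloud.ru/v1"))  # ru-3.dbaas.selcloud.ru
```

Calling `fetch_metrics(domain, token)` with a real domain and token should return the current values only, since historical data is not available.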

Metrics in Prometheus format

Metrics in Prometheus format are provided for all clusters. A specific cluster can be found by the database cluster ID in the ds_id label.
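
Selecting one cluster by its `ds_id` label can be sketched as follows. This is a simplified parser for the Prometheus text format, written for illustration: it ignores timestamps and escaped characters in label values, and the sample payload and cluster IDs are invented.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text: str) -> list:
    """Parse Prometheus text into (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def for_cluster(samples: list, ds_id: str) -> list:
    """Keep only the samples whose ds_id label matches the given cluster ID."""
    return [s for s in samples if s[1].get("ds_id") == ds_id]


# Invented payload; real responses contain more metrics and labels.
EXAMPLE = """
# HELP dbaas_cpu vCPU usage in percent
dbaas_cpu{ds_id="cluster-a"} 12.5
dbaas_cpu{ds_id="cluster-b"} 40
dbaas_role{ds_id="cluster-a"} 1
"""

for name, labels, value in for_cluster(parse_samples(EXAMPLE), "cluster-a"):
    print(name, value)
```

In Prometheus itself the same selection is a label matcher in the query, for example `dbaas_cpu{ds_id="<cluster_id>"}`.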

The metrics fall into two groups: infrastructure-level and application-level.

Infrastructure-level metrics

dbaas_memory_percent	Used memory excluding OS cache and buffers (RAM) in percent
dbaas_memory_bytes	Used memory excluding OS cache and buffers (RAM) in bytes
dbaas_oom_count	Number of processes that terminated with an `Out of Memory` error due to insufficient RAM
dbaas_cpu	vCPU usage on database cluster nodes in percent
dbaas_cpu_iowait	I/O wait time in percent
dbaas_disk_percent	Occupied disk space in percent. It takes into account the part of disk space reserved for service needs and unavailable for database placement. For more information about reserving disk space, see the instructions Using disk space in a Redis cluster
dbaas_disk_bytes	Occupied disk space in bytes. It takes into account the part of disk space reserved for service needs and unavailable for database placement. For more information about reserving disk space, see the instructions Using disk space in a Redis cluster
dbaas_disk_read_iops	Number of read operations per second
dbaas_disk_write_iops	Number of write operations per second
dbaas_disk_read_bytes	Disk read speed in bytes per second
dbaas_disk_write_bytes	Disk write speed in bytes per second
dbaas_node_load1	Average system load over one minute. Shows the number of processes being handled by the cores of a cluster node
dbaas_node_load5	Average system load over five minutes. Shows the number of processes being handled by the cores of a cluster node
dbaas_node_load15	Average system load over 15 minutes. Shows the number of processes being handled by the cores of a cluster node
dbaas_network_receive_bytes	Number of bytes received via the network interface
dbaas_network_transmit_bytes	Number of bytes sent via the network interface
dbaas_network_receive_packets	Number of packets received via the network interface per second
dbaas_network_transmit_packets	Number of packets sent via the network interface per second
dbaas_role	Node role: `0` — role unknown; `1` — master; `2` — replica
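
Because `dbaas_role` encodes the role as a number, dashboards or scripts typically need to map it back to a name. A minimal sketch, assuming the value arrives as a float, as Prometheus sample values do; the helper name is hypothetical.

```python
# Mapping taken from the dbaas_role description above.
ROLE_NAMES = {0: "unknown", 1: "master", 2: "replica"}


def role_name(value: float) -> str:
    """Translate a dbaas_role sample value into a readable role name."""
    return ROLE_NAMES.get(int(value), "unknown")


print(role_name(1.0))  # master
print(role_name(2.0))  # replica
```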

Application-level metrics

dbaas_connected_clients	Number of connections to cluster databases
dbaas_keyspace_hits_total	Number of successful requests per second
dbaas_keyspace_misses_total	Number of failed requests per second. A request is considered failed if the requested key does not exist, was evicted because the memory limit was exceeded, or its TTL has expired
dbaas_db_keys	Total number of keys in all databases
dbaas_db_keys_expiring	Total number of keys with a defined time-to-live (TTL)
dbaas_evicted_keys_total	Number of keys evicted because the memory limit was exceeded
dbaas_expired_keys_total	Number of keys with an expired time-to-live
dbaas_slowlog_length	Number of slow queries per second. Queries taking longer than 0.01 seconds are considered slow