Monitoring of Kafka cluster and nodes

You can monitor the status of the Kafka cluster and its nodes in the control panel.

Cluster and database node metrics can also be exported in Prometheus format.

View the status of cluster nodes

  1. In the control panel, go to Cloud platform → Databases.
  2. Open the cluster page → Monitoring tab.
  3. In the Monitoring of cluster servers block, view the available cluster node metrics.

Cluster node metrics in the control panel

vCPU: Percentage of the cluster node's cores that are in use
Load Average: The average system load over a period of time. Shows how many processes are being handled by the node's cores. The metric is shown as three values: for one minute, five minutes and 15 minutes. These values should not exceed the number of cores on the node.
Memory: Memory utilization excluding cache and operating system buffers, in percent or gigabytes
Disk: Occupied disk space, in percent or gigabytes
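
The Load Average guideline above (no value should exceed the number of cores on the node) can be expressed as a small check. This is a minimal sketch: the function name `overloaded_windows` and the sample readings are hypothetical and are not part of the control panel or any API.

```python
def overloaded_windows(load_averages, core_count):
    """Return the averaging windows whose load exceeds the node's core count.

    load_averages is a (1-minute, 5-minute, 15-minute) tuple, in the same
    order as the three Load Average values.
    """
    windows = ("1m", "5m", "15m")
    return [name for name, value in zip(windows, load_averages) if value > core_count]


# Hypothetical readings for a node with 4 cores.
print(overloaded_windows((5.2, 3.1, 2.4), core_count=4))
```

A short spike in the one-minute value is usually less of a concern than a sustained excess in the 15-minute value, so it can be useful to report the windows separately rather than as a single flag.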

View cluster status

  1. In the control panel, go to Cloud platform → Databases.
  2. Look at the status in the cluster row → Status column.
ACTIVE: The cluster is available
CREATING: The cluster is being created
UPDATING: Changes are being applied to the cluster
RESIZING: The cluster is being scaled
ERROR: An error has occurred; create a ticket
DISK FULL: The disk is full and the cluster is read-only. To make the cluster writable again, clear the disk or scale the cluster and select a configuration with a larger disk
DEGRADED: Some nodes in the cluster are unavailable
DELETING: The cluster is being deleted

Disk fullness notifications

If the cluster disk is 80% full, a notification is automatically sent to the account's email address.

If the cluster disk is 95% or more full, the cluster goes into the DISK_FULL status and becomes read-only. To make the cluster writable again, clear the disk or scale the cluster and select a configuration with a larger disk.
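
The two thresholds above can be mirrored in your own monitoring code. The following is a minimal sketch, not the platform's implementation: the function name `disk_state` and the returned labels are hypothetical, and only the 80% and 95% values come from the text.

```python
NOTIFY_THRESHOLD = 80.0     # percent: an email notification is sent
READ_ONLY_THRESHOLD = 95.0  # percent: the cluster goes into DISK_FULL


def disk_state(used_percent):
    """Classify disk usage using the thresholds described above."""
    if used_percent >= READ_ONLY_THRESHOLD:
        return "read-only"
    if used_percent >= NOTIFY_THRESHOLD:
        return "notify"
    return "ok"


# Hypothetical readings.
for reading in (42.0, 80.0, 96.3):
    print(reading, disk_state(reading))
```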

Clear the disk

Open a transaction with transaction_read_only = no and delete unnecessary data using one of the following queries:

  • DROP TABLE: deletes the table together with its structure (data, privileges, indexes, constraints, triggers). Use it to remove a table completely, including both data and structure:

    BEGIN;
    SET transaction_read_only = no;
    DROP TABLE table_name;
    COMMIT;
  • TRUNCATE TABLE: deletes the contents of the table but preserves its structure. Works faster than DROP TABLE. Use it to delete all rows of a table while keeping the table structure:

    BEGIN;
    SET transaction_read_only = no;
    TRUNCATE TABLE table_name;
    COMMIT;
  • DELETE: use it to delete specific rows.

For your information

We do not recommend using the query DELETE FROM table WHERE ... to clean up the disk. On large tables, this query can produce oversized intermediate result sets and write them to disk. The remaining free disk space may then run out completely, causing problems with Kafka and requiring manual recovery.

Export metrics in Prometheus format

You can export metrics in Prometheus format and then configure monitoring and alerts for Kafka clusters yourself. Historical data for clusters is not available: metrics are only provided in real time.

To export metrics, you need to get a monitoring token. The token gives access to metrics of all clusters in one project pool.

  1. In the control panel, go to Cloud platform → Databases.

  2. Open the cluster page → Monitoring tab.

  3. In the Metrics in Prometheus format block, click Manage tokens.

  4. Click Create.

  5. Enter the name of the token.

  6. Click Create. The token will be generated automatically.

  7. Add to the Prometheus configuration file:

    scrape_configs:
      - job_name: get-metrics-from-dbaas
        scrape_interval: 1m
        static_configs:
          - targets:
              - '<pool>.dbaas.selcloud.ru'
        scheme: https
        authorization:
          type: Bearer
          credentials: <monitoring_token>

    Specify:

    • <pool>: the pool in which the token operates, for example ru-3. The address (URL) depends on the region and pool and can be found in the URL list;
    • <monitoring_token>: the value of the monitoring token.
  8. Check that the metrics source has appeared at http://<localhost>:9090/targets.

    Specify <localhost>: the IP address of the host where Prometheus is installed.

  9. View the available Prometheus metrics.
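
To check a token outside Prometheus, the same endpoint can be queried directly. The sketch below rests on assumptions: the `/metrics` path is Prometheus's default `metrics_path`, which the configuration above relies on because it does not set one explicitly, and the function names `build_metrics_request` and `fetch_metrics` are hypothetical helpers, not part of any Selectel API.

```python
import urllib.request


def build_metrics_request(pool, token):
    """Build a GET request for the metrics endpoint of the given pool."""
    # Assumption: /metrics is the default Prometheus metrics_path.
    url = f"https://{pool}.dbaas.selcloud.ru/metrics"
    # Same Bearer authorization scheme as in the scrape configuration.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(pool, token, timeout=10):
    """Return the raw metrics text; raises urllib.error.URLError on failure."""
    request = build_metrics_request(pool, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_metrics("ru-3", "<monitoring_token>")` would request the metrics of all clusters that the token can access in the ru-3 pool.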

Metrics in Prometheus format

Metrics in Prometheus format are provided for all clusters. A specific cluster can be found by the database cluster identifier in the label ds_id.

dbaas_memory_percent: Memory (RAM) utilization excluding cache and operating system buffers, in percent
dbaas_memory_bytes: Occupied memory (RAM) excluding cache and operating system buffers, in bytes
dbaas_oom_count: Number of processes terminated with an Out of Memory error due to lack of RAM
dbaas_cpu: vCPU utilization on database cluster nodes, in percent
dbaas_cpu_iowait: I/O wait time, in percent
dbaas_disk_percent: Occupied disk space, in percent
dbaas_disk_bytes: Occupied disk space, in bytes
dbaas_disk_read_iops: Number of read operations per second
dbaas_disk_write_iops: Number of write operations per second
dbaas_disk_read_bytes: Disk read speed, in bytes per second
dbaas_disk_write_bytes: Disk write speed, in bytes per second
dbaas_node_load1: The average system load over one minute. Shows how many processes are being handled by the node's cores
dbaas_node_load5: The average system load over five minutes. Shows how many processes are being handled by the node's cores
dbaas_node_load15: The average system load over 15 minutes. Shows how many processes are being handled by the node's cores
dbaas_network_receive_bytes: Number of bytes received through the network interface
dbaas_network_transmit_bytes: Number of bytes sent through the network interface
dbaas_network_receive_packets: Number of packets received through the network interface per second
dbaas_network_transmit_packets: Number of packets sent through the network interface per second
dbaas_role

Role of the node:

  • 0 — role unknown;
  • 1 — master;
  • 2 — replica
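
Because one token returns metrics for all clusters, it is often useful to filter the scraped text by the ds_id label and to decode dbaas_role. The following is a minimal sketch of a parser for the Prometheus text exposition format. The sample data, the cluster identifiers and the helper names are hypothetical, and real responses may carry additional labels.

```python
import re

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>(?:[^"\\]|\\.)*)"')

# Role codes as listed above.
ROLE_NAMES = {0: "unknown", 1: "master", 2: "replica"}


def parse_samples(text):
    """Parse exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP/TYPE, comments
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = {m["key"]: m["val"] for m in LABEL_RE.finditer(match["labels"] or "")}
        samples.append((match["name"], labels, float(match["value"])))
    return samples


def samples_for_cluster(samples, ds_id):
    """Keep only the samples whose ds_id label equals the given identifier."""
    return [sample for sample in samples if sample[1].get("ds_id") == ds_id]


# Hypothetical scrape output; identifiers and values are made up.
SAMPLE = '''\
# TYPE dbaas_disk_percent gauge
dbaas_disk_percent{ds_id="cluster-a"} 81.5
dbaas_role{ds_id="cluster-a"} 1
dbaas_disk_percent{ds_id="cluster-b"} 12.0
'''

for name, labels, value in samples_for_cluster(parse_samples(SAMPLE), "cluster-a"):
    if name == "dbaas_role":
        print(name, ROLE_NAMES.get(int(value), "unknown"))
    else:
        print(name, value)
```

In a Prometheus setup the same selection is normally done with a label matcher in the query rather than in client code; the parser here is only meant for ad hoc checks of a raw response.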