Monitoring of Kafka cluster and nodes

You can monitor the Kafka cluster in the control panel: view the status of the cluster nodes and the status of the cluster itself.

Cluster and database node metrics can also be exported in Prometheus format.

View the status of cluster nodes

  1. In Control Panel, go to Cloud Platform → Databases.
  2. Open the cluster page → Monitoring tab.
  3. In the Cluster Server Monitoring block, review the available cluster node metrics.

Cluster node metrics in the control panel

vCPU: Percentage of the node's cores that are in use.

Load Average: The average system load over a period of time. Shows how many processes are using or waiting for the node's cores. The metric is reported as three values: over one minute, five minutes, and 15 minutes. These values should not exceed the number of cores on the node.

Memory: Memory utilization, excluding cache and operating system buffers, in percent or gigabytes.

Disk: Used disk space, in percent or gigabytes.
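
The rule that the load values should not exceed the number of cores can be checked with a few lines of code. The following is a minimal sketch, not part of the control panel: the function name `overloaded_windows` and the sample readings are hypothetical and only illustrate the comparison.

```python
def overloaded_windows(load_averages, cores):
    """Return the load-average windows whose value exceeds the core count.

    load_averages: three values for the 1-, 5-, and 15-minute windows.
    cores: number of cores on the node.
    """
    windows = ("1m", "5m", "15m")
    return [name for name, value in zip(windows, load_averages) if value > cores]


# Hypothetical readings for a node with 4 cores.
print(overloaded_windows((5.2, 3.9, 2.1), cores=4))
```

A value above the core count in the one-minute window alone usually points to a short spike, while high values in all three windows suggest sustained load.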

View cluster status

  1. In Control Panel, go to Cloud Platform → Databases.
  2. View the status in the cluster row → Status column.

ACTIVE: The cluster is available.
CREATING: The cluster is being created.
UPDATING: Changes are being applied to the cluster.
RESIZING: The cluster is being scaled.
ERROR: An error occurred; create a ticket.
DISK FULL: The disk is full and the cluster is read-only. To restore read and write access, clean up the disk or scale the cluster and select a configuration with a larger disk.
DEGRADED: Some of the cluster nodes are unavailable.
DELETING: The cluster is being deleted.

Disk usage notifications

If the cluster disk is 80% full, a notification is automatically sent to the account's email.

If the cluster disk is 95% or more full, the cluster switches to the DISK_FULL status and becomes read-only. To restore read and write access, clean up the disk or scale the cluster and select a configuration with a larger disk.
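
The two thresholds can be expressed as a small function, for example when building your own alerts. This is a sketch under assumptions: the labels `OK` and `NOTIFY` are hypothetical names chosen for illustration (only DISK_FULL is a real cluster status), and the 80% threshold is treated as "80% or more".

```python
def disk_state(used_percent):
    """Map disk usage in percent to the behavior described above."""
    if used_percent >= 95:
        # The cluster switches to DISK_FULL and becomes read-only.
        return "DISK_FULL"
    if used_percent >= 80:
        # Assumption: the email notification applies from 80% upward.
        return "NOTIFY"
    return "OK"


for value in (42.0, 80.0, 95.0):
    print(value, disk_state(value))
```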

Clean up the disk

Open a transaction with transaction_read_only = no and remove unnecessary data using one of the following queries:

  • DROP TABLE: deletes the table together with its data, privileges, indexes, constraints, and triggers. Use it to remove a table completely, both data and structure:

    BEGIN;
    SET transaction_read_only = no;
    DROP TABLE table_name;
    COMMIT;
  • TRUNCATE TABLE: deletes the contents of the table but preserves its structure. Use it to delete all rows of a table while keeping the table itself:

    BEGIN;
    SET transaction_read_only = no;
    TRUNCATE TABLE table_name;
    COMMIT;
  • DELETE: use it to delete specific rows.

    For your information

    We do not recommend using a DELETE FROM table WHERE ... query to clean up a disk. On large tables, this query can create large selections that are written to disk. The remaining free disk space may run out completely, which causes problems with Kafka and requires restoring its operation manually.

Export metrics in Prometheus format

You can export metrics in Prometheus format and then configure monitoring and alerts for Kafka clusters yourself. Historical data for clusters is not available: metrics are only returned in real time.

To export metrics, you need to get a monitoring token. The token gives access to the metrics of all clusters in a single project pool.

  1. In Control Panel, go to Cloud Platform → Databases.

  2. Open the cluster page → Monitoring tab.

  3. In the Metrics in Prometheus block, click Manage tokens.

  4. Click Create.

  5. Enter the name of the token.

  6. Click Create. The token is generated automatically.

  7. Add the following to the Prometheus configuration file:

    scrape_configs:
      - job_name: get-metrics-from-dbaas
        scrape_interval: 1m
        static_configs:
          - targets:
              - '<pool>.dbaas.selcloud.ru'
        scheme: https
        authorization:
          type: Bearer
          credentials: <monitoring_token>

    Specify:

    • <pool>: the pool in which the token is valid, for example ru-3. The address (URL) depends on the region and pool; see the URL list;
    • <monitoring_token>: the value of the monitoring token.
  8. Check that the metrics source appears at http://<localhost>:9090/targets.

    Replace <localhost> with the IP address of the host where Prometheus is installed.

  9. Review the available metrics in Prometheus format.
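
To check the token before wiring it into Prometheus, you can request the metrics directly. The sketch below mirrors the scrape configuration above and rests on one assumption: because the configuration does not override `metrics_path`, Prometheus uses its default path `/metrics`, so the sketch requests that path as well. The function names `build_metrics_request` and `fetch_metrics` are hypothetical.

```python
import urllib.request


def build_metrics_request(pool, token):
    """Build an HTTPS request with the same Bearer authorization as the scrape config."""
    # Assumption: metrics are served at /metrics, the Prometheus default metrics_path.
    url = f"https://{pool}.dbaas.selcloud.ru/metrics"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(pool, token, timeout=10):
    """Return the raw metrics text for all clusters the token can access."""
    request = build_metrics_request(pool, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_metrics("ru-3", "<monitoring_token>")` would return the current metrics as text if the pool and token are valid. Since metrics are only available in real time, each call returns a fresh snapshot.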

Metrics in Prometheus format

Prometheus-formatted metrics are provided for all clusters. A specific cluster can be found by the database cluster identifier in the ds_id label.

dbaas_memory_percent: Memory (RAM) utilization, excluding operating system cache and buffers, in percent
dbaas_memory_bytes: Memory (RAM) utilization, excluding operating system cache and buffers, in bytes
dbaas_cpu: CPU utilization on database cluster nodes, in percent
dbaas_cpu_iowait: I/O wait time, in percent
dbaas_disk_percent: Used disk space, in percent
dbaas_disk_bytes: Used disk space, in bytes
dbaas_disk_read_iops: Number of read operations per second
dbaas_disk_write_iops: Number of write operations per second
dbaas_disk_read_bytes: Disk read speed, in bytes per second
dbaas_disk_disk_write_bytes: Disk write speed, in bytes per second
dbaas_node_load1: Average system load over one minute. Shows how many processes are using or waiting for the node's cores
dbaas_node_load5: Average system load over five minutes. Shows how many processes are using or waiting for the node's cores
dbaas_node_load15: Average system load over 15 minutes. Shows how many processes are using or waiting for the node's cores
network_receive_bits: Number of bits received over the network interface
network_transmit_bits: Number of bits sent over the network interface
network_receive_packets: Number of packets received over the network interface per second
network_transmit_packets: Number of packets sent over the network interface per second
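
Because one response contains metrics for all clusters, it is often convenient to filter samples by the ds_id label. The following is a simplified sketch of parsing the Prometheus text format, not a full parser: the function names are hypothetical, and the sample text, including the ds_id values `aaa` and `bbb`, is made up for illustration. Real responses may carry additional labels.

```python
import re

# name{labels} value, where the {labels} part is optional.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# key="value" pairs inside the braces.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse metrics text into (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def for_cluster(samples, ds_id):
    """Keep only the samples whose ds_id label matches the given cluster."""
    return [sample for sample in samples if sample[1].get("ds_id") == ds_id]


# Made-up sample text for illustration only.
EXAMPLE = """\
# HELP dbaas_disk_percent Used disk space in percent
dbaas_disk_percent{ds_id="aaa"} 81.5
dbaas_disk_percent{ds_id="bbb"} 40
dbaas_cpu{ds_id="aaa"} 12.25
"""

for name, labels, value in for_cluster(parse_samples(EXAMPLE), "aaa"):
    print(name, value)
```

In a Prometheus setup the same selection is normally done with a label matcher in the query, such as `dbaas_disk_percent{ds_id="..."}`, so a parser like this is mainly useful for ad hoc checks.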