How to Design Metrics With Prometheus Metric Types: the USE Method

This is the third part of a series about designing metrics for event-driven systems. You can check the first part and the second part of this series before proceeding.

The first part covered the general principles of designing metrics. The second part explained Prometheus metric types and applied them using the RED method. In this article, I'll explain the USE method with Prometheus, then close with a short discussion of the Four Golden Signals and a conclusion about all the methods.

Let's go...

The USE Method

The USE method by Brendan Gregg is a set of rules for designing metrics, mainly used for systems that are not exposed to users, such as databases, message brokers, and streaming platforms.
Its key metrics are:

Utilization - the degree to which a resource has been busy
Errors - the number of error events over time
Saturation - the degree to which a resource has extra work it cannot handle. The extra work has to wait or be dropped.

Implementation

To keep things simple and close to what we use in daily work, I'll apply the USE method to a CPU, memory, and network. The examples use docker-compose, Prometheus, and Grafana, and I'm using the node-exporter to get metrics from the system. The complete example is in my GitHub repo.

CPU Utilization

CPU utilization is the percentage of time the CPU is busy. The node-exporter provides the node_cpu_seconds_total metric. This metric is a counter that counts the number of seconds each CPU has spent in each mode. One of the modes is idle, which is when the CPU is not busy.
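To make the counter semantics concrete, here is a minimal Python sketch. It is not node-exporter code: the helper name mode_shares and all sample values are made up for illustration. It takes two readings of one core's per-mode counters and works out which share of the interval was spent in each mode.

```python
def mode_shares(before, after, interval_seconds):
    """Return the fraction of the interval one CPU core spent in each mode.

    `before` and `after` map a mode name to its counter value in seconds.
    Counter resets (for example after a reboot) are not handled here.
    """
    return {
        mode: (after[mode] - before[mode]) / interval_seconds
        for mode in before
    }


# Hypothetical readings for a single core, taken 60 seconds apart.
before = {"idle": 1000.0, "user": 200.0, "system": 50.0}
after = {"idle": 1045.0, "user": 212.0, "system": 53.0}

shares = mode_shares(before, after, 60.0)
print(shares)
```

Because every second of the interval is attributed to exactly one mode, the shares add up to roughly 1. That is why the idle mode alone is enough to derive how busy the core was.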

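Over a period, say 1m, take the per-second rate of change of the idle counter and average it across all CPU cores. The result is the fraction of time the CPUs were idle. Subtracting that value from 1 gives the CPU utilization as a fraction between 0 and 1 (multiply by 100 for a percentage):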

1 - avg(rate(node_cpu_seconds_total{mode="idle"}[1m]))

It is the same principle as in the RED method. We use counters, observe the rate of change, and then calculate the average.
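The arithmetic behind that query can be sketched in a few lines of Python. This is a simplified illustration, not how Prometheus is implemented: the real rate() also handles counter resets and extrapolates to the window boundaries. The function names and the sample readings are assumptions chosen for the example.

```python
def rate(first, last, window_seconds):
    """Per-second increase of a counter over the window (ignores counter resets)."""
    return (last - first) / window_seconds


def cpu_utilization(idle_samples, window_seconds):
    """Mimic 1 - avg(rate(idle[window])).

    `idle_samples` holds one (first, last) pair of idle counter readings per core.
    """
    idle_rates = [rate(first, last, window_seconds) for first, last in idle_samples]
    return 1 - sum(idle_rates) / len(idle_rates)


# Hypothetical idle counter readings (seconds) for two cores over a 60 s window.
samples = [(1000.0, 1045.0), (2000.0, 2030.0)]
utilization = cpu_utilization(samples, 60.0)
print(f"{utilization:.1%}")  # prints 37.5%
```

Here the first core was idle 75% of the window and the second 50%, so the average idle fraction is 0.625 and the utilization is 0.375.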

If you are interested, continue to the rest on my blog.