Hello everyone, in this post I'm going to explain how to monitor your .NET Core service using Prometheus and Grafana.
Why do you need monitoring of your app
Let's start by identifying the reasons to add metrics to your app. Metrics let you check the performance of the app, find hotspots in your code, analyze app data and collect statistics. How many users do you have? How fast is your app for them? How often does your app call the database and how fast is its response? How many users perform some action on your website? Metrics allow you to answer all these questions. Typically you could use 2 kinds of monitoring in your app:
1) An application monitoring tool that monitors app performance and communication between microservices (examples: New Relic, AppDynamics)
2) Monitoring that lets you add your own metrics to the app, mostly business or performance metrics. This is what Prometheus is designed for.
Metrics can be collected on the app side and sent to some storage (often a time-series database) using 2 models:
1) Push model. Every interval (for example 30s) the app sends metrics to some endpoint in a predefined format
2) Pull model. The app exposes an endpoint with metrics and an external tool collects them from time to time. Prometheus mostly uses this model. I prefer it over push because it's less complicated and the app doesn't need to care about sending metrics anywhere.
Prometheus
Prometheus is a monitoring system and time-series database. It uses the pull model for collecting metrics.
Grafana
Grafana is a tool that allows you to visualize metrics. It supports multiple data sources with metrics and provides a way to query them and show the output as a chart.
Prometheus-net
Prometheus-net is a .NET library for collecting Prometheus metrics and exposing them on a specific endpoint.
Metric server
To expose metrics, enable the metric server in the following way:
public void Configure(IApplicationBuilder app)
{
    // some code
    app.UseMetricServer(5000, "/prometheus"); // starts exporter on port 5000 and endpoint /prometheus
}
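For reference, a scrape of that endpoint returns plain text in the Prometheus exposition format. A rough sketch of what part of the response could look like for the cache_misses_total counter used later in this post (the value 101 is made up for illustration):

```text
# HELP cache_misses_total Cache misses total count
# TYPE cache_misses_total counter
cache_misses_total 101
```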
Monitoring middleware
Out of the box, the library provides middleware that collects metrics for HTTP requests. It tracks the number of requests currently being executed and their timing. It can be enabled in the following way:
public void Configure(IApplicationBuilder app)
{
    // some code
    app.UseRouting();
    app.UseHttpMetrics();
    // some code
}
It's recommended to call UseHttpMetrics after UseRouting because prometheus-net tracks controller and action names. That's good if you want detailed statistics per endpoint or controller, but the measurements then skip all middleware registered before it! Middleware shouldn't be slow, but it's still a problem that the reported timings are not fully accurate.
Custom metrics
1) Counter
A counter is a simple metric whose value can only increase:
// Metrics are created through the static Prometheus.Metrics factory
var counter = Metrics.CreateCounter("cache_misses_total", "Cache misses total count");
counter.Inc(); // +1
counter.Inc(100); // +100
Note that every metric requires 2 mandatory fields: a metric name (I recommend following the default Prometheus naming format, as in the example above) and a short description.
In Grafana you will mostly visualize the increase of your metric value over some period of time:
increase(cache_misses_total[5m])
Note that it's not necessary to hardcode the 5m value; there are dashboard variables that can be used for that purpose: link
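As a sketch of that idea, Grafana's built-in $__rate_interval variable (available in recent Grafana versions with a Prometheus data source) can stand in for the hardcoded range. A custom dashboard variable works the same way; the name $window below is hypothetical and would have to be defined in your dashboard settings:

```promql
# Range picked by Grafana based on the scrape interval and panel resolution
increase(cache_misses_total[$__rate_interval])

# Range taken from a user-defined dashboard variable (hypothetical name)
increase(cache_misses_total[$window])
```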
2) Gauge
A gauge is like a counter, but its value can also decrease or be set directly:
var gauge = Metrics.CreateGauge("cache_misses_balance", "Cache misses balance");
gauge.Inc(); // +1
gauge.Inc(100); // +100
gauge.Dec(); // -1
gauge.Dec(20); // -20
gauge.Set(50); // set 50 as gauge value
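A typical use for a gauge is counting how many operations are running right now. A minimal sketch, assuming prometheus-net's TrackInProgress helper; the JobRunner class and the jobs_in_progress metric name are made up for this example:

```csharp
using System;
using Prometheus;

public static class JobRunner
{
    // Hypothetical metric name, chosen for this example only.
    private static readonly Gauge JobsInProgress =
        Metrics.CreateGauge("jobs_in_progress", "Number of jobs currently running");

    public static void Run(Action job)
    {
        // Increments the gauge now and decrements it when the scope is disposed,
        // even if the job throws.
        using (JobsInProgress.TrackInProgress())
        {
            job();
        }
    }
}
```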
3) Histogram
A histogram tracks the number of events split between buckets. It is often used for tracking the duration of operations. Prometheus also supports calculating quantiles from histogram data, but the result is not 100% accurate because Prometheus doesn't track the exact values of events inside buckets. If buckets are small that's fine, but if not, linear interpolation inside the bucket is used, which can result in a loss of accuracy.
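The article doesn't show a histogram being created, so here is a minimal sketch of one way to do it with prometheus-net. The QueryMetrics class, the db_query_duration_seconds metric name and the bucket layout are assumptions made for illustration, not recommendations:

```csharp
using Prometheus;

public static class QueryMetrics
{
    // Hypothetical metric. Ten buckets starting at 10 ms and doubling each step,
    // so the largest finite bucket boundary is 5.12 s.
    private static readonly Histogram QueryDuration = Metrics.CreateHistogram(
        "db_query_duration_seconds",
        "Duration of database queries in seconds",
        new HistogramConfiguration
        {
            Buckets = Histogram.ExponentialBuckets(0.01, 2, 10)
        });

    // Records a single observation, expressed in seconds.
    public static void Record(double seconds) => QueryDuration.Observe(seconds);
}
```

On the Grafana side, an approximate 95th percentile for this metric could then be queried with histogram_quantile(0.95, sum by (le) (rate(db_query_duration_seconds_bucket[5m]))). The narrower the buckets around the values you care about, the smaller the interpolation error.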
4) Summary
A summary is close to a histogram, but its quantiles are calculated on the client side, so it gives more accurate values but uses more CPU. I don't recommend using it because in most cases the trend is more important than 100% accurate metric values.
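For completeness, a summary could be declared roughly like this. It is a sketch that follows the pattern from the prometheus-net README; the RequestMetrics class, the metric name and the chosen quantiles are illustrative assumptions:

```csharp
using Prometheus;

public static class RequestMetrics
{
    // Hypothetical metric. Each pair is (quantile, allowed error).
    private static readonly Summary RequestSize = Metrics.CreateSummary(
        "myapp_request_size_bytes",
        "Summary of request sizes in bytes",
        new SummaryConfiguration
        {
            Objectives = new[]
            {
                new QuantileEpsilonPair(0.5, 0.05),
                new QuantileEpsilonPair(0.9, 0.05),
                new QuantileEpsilonPair(0.99, 0.005),
            }
        });

    // Records one observed request size.
    public static void Record(double sizeInBytes) => RequestSize.Observe(sizeInBytes);
}
```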
Tracking time
All of the metrics mentioned above support tracking the duration of operations in a convenient way:
using (histogram.NewTimer())
{
    ExecuteOperation();
}
No need for awkward Stopwatch usages in your code!
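To make the comparison concrete, this is roughly what the timer saves you from writing by hand. A sketch only: the TimedOperation helper is hypothetical, and it assumes the histogram records durations in seconds, which is what the built-in timer observes:

```csharp
using System;
using System.Diagnostics;
using Prometheus;

public static class TimedOperation
{
    public static void Run(Histogram histogram, Action executeOperation)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            executeOperation();
        }
        finally
        {
            // Record the elapsed time in seconds, even if the operation throws.
            histogram.Observe(stopwatch.Elapsed.TotalSeconds);
        }
    }
}
```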
Labels
What if a single operation can have more than one result? What if a single operation can be executed in more than one way? For example, a single HTTP request could be processed by N controllers with M actions; how do we handle that case? This is what labels were designed for. Let's say we try to access the cache and there are 2 possible results: success (the value was found) and fail (the value is missing). We can track those cases separately using labels:
var counter = Metrics.CreateCounter("cache_usages_count", "Cache usages count", new[] {"result"}); // "result" is the label name
counter.WithLabels("Success").Inc(); // "Success" is the "result" label value
counter.WithLabels("Fail").Inc();
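On the query side, labels let you split or filter the same metric. A short PromQL sketch using the counter and label values from the snippet above (the 5m window is arbitrary):

```promql
# Cache lookups over the last 5 minutes, one series per "result" value
sum by (result) (increase(cache_usages_count[5m]))

# Share of lookups that failed
sum(rate(cache_usages_count{result="Fail"}[5m])) / sum(rate(cache_usages_count[5m]))
```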
Monitoring application
In your application I recommend using a wrapper around the prometheus-net library. It makes it easier to replace prometheus-net with AppMetrics or a similar library in the future. This is what I've used:
public interface IMetricsFactory
{
    ICounter CreateCounter(CounterConfiguration configuration);
    IGauge CreateGauge(GaugeConfiguration configuration);
    // etc...
}
The metric interfaces are similar to the prometheus-net interfaces:
public interface ICounter
{
    void Increment(double value = 1, params string[] labels);
}
All of these interfaces also inherit an interface with timer support:
public interface IMetricWithTimer
{
    ITimer CreateTimer(params string[] labels);
}
public interface ICounter : IMetricWithTimer
{
    // etc...
}
I don't like the static way of using metrics libraries because then all of your NuGet packages/projects depend on a concrete implementation, so I use a more canonical OOP approach: I moved all abstractions into one NuGet package and the implementations into a second one. Now all child projects depend on interfaces only, while the root project has the package with implementations installed.
Inside the package with implementations I used wrappers around the default Prometheus metrics:
public class Counter : ICounter
{
    private readonly PrometheusCounter _counter;

    public Counter(PrometheusCounter counter)
    {
        _counter = counter;
    }

    // Signature matches IMetricWithTimer; assumes ITimer is the prometheus-net ITimer returned by NewTimer()
    public ITimer CreateTimer(params string[] labels) => _counter.WithLabels(labels).NewTimer();

    public void Increment(double value = 1, params string[] labels) => _counter.WithLabels(labels).Inc(value);
    // etc...
}
The Prometheus counter instance is injected by the metrics factory:
using PrometheusCounter = Prometheus.Counter;

public class MetricsFactory : IMetricsFactory
{
    public ICounter CreateCounter(CounterConfiguration configuration)
    {
        // prometheus-net metrics are created through the static Metrics factory
        var prometheusCounter = Prometheus.Metrics.CreateCounter(configuration.MetricName, configuration.MetricDescription);
        return new Counter(prometheusCounter);
    }
    // etc...
}
where CounterConfiguration is simple:
public class CounterConfiguration
{
    public string MetricName { get; }
    public string MetricDescription { get; }

    public CounterConfiguration(string metricName, string metricDescription)
    {
        MetricName = metricName;
        MetricDescription = metricDescription;
    }
}
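One practical benefit of depending on interfaces only is that metrics become easy to fake in unit tests. A minimal sketch, assuming a trimmed copy of the ICounter abstraction from this post; InMemoryCounter and CacheService are hypothetical names, and the fake is not thread-safe:

```csharp
using System.Collections.Generic;

// Trimmed copy of the ICounter abstraction (timer support left out for brevity).
public interface ICounter
{
    void Increment(double value = 1, params string[] labels);
}

// Hypothetical in-memory implementation for tests; no Prometheus dependency.
public class InMemoryCounter : ICounter
{
    private readonly Dictionary<string, double> _values = new Dictionary<string, double>();

    public void Increment(double value = 1, params string[] labels)
    {
        var key = string.Join(",", labels);
        _values.TryGetValue(key, out var current); // current stays 0 when the key is new
        _values[key] = current + value;
    }

    // Returns the accumulated value for a label combination, or 0 if never incremented.
    public double GetValue(params string[] labels) =>
        _values.TryGetValue(string.Join(",", labels), out var value) ? value : 0;
}

// Hypothetical consumer that depends only on the abstraction.
public class CacheService
{
    private readonly ICounter _cacheUsages;

    public CacheService(ICounter cacheUsages) => _cacheUsages = cacheUsages;

    public void OnLookup(bool found) =>
        _cacheUsages.Increment(1, found ? "Success" : "Fail");
}
```

A test can then pass an InMemoryCounter into CacheService, trigger a lookup, and assert on GetValue("Fail") or GetValue("Success") without starting a metric server.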
Enable monitoring in Prometheus
Now you need to tell Prometheus itself to scrape your app. Modify your default prometheus.yml file to achieve that:
scrape_configs:
  - job_name: prometheus
    metrics_path: /prometheus # must match the endpoint passed to UseMetricServer (the default is /metrics)
    static_configs:
      - targets: ['localhost:5000']
Grafana dashboards
Great, we added metrics to our app. How do we add cool charts like in hacker films to our Grafana? First, set up a data source in Grafana: link. After that you can create a dashboard and add charts to it. You can use this one as a base.
Enhanced dotnet metrics
If you need more dotnet metrics please check this library. Please note that adding dotnet metrics can affect performance!
Conclusion
In this article I showed an example of basic usage of the prometheus-net library for monitoring your dotnet app. What are you using to monitor your app? Tell me in the comments.
Top comments (2)
Hello, I think I found some errors in your code snippets.
The IMetricWithTimer should be:
And the Counter
It should be noted that this article only applies to aspnet, not any dotnet service