How to integrate coroot-pg-agent with Prometheus

#postgres #prometheusexporter #corootpgagent #postgresmonitoring

For monitoring a Postgres server, most open-source stacks consist of Grafana with Prometheus.

Getting Postgres metrics into Prometheus is an interesting task, and there are a few exporters available for it.

These exporters are helpful for monitoring and for writing alert rules in Prometheus:

postgres_exporter (https://github.com/prometheus-community/postgres_exporter)
coroot-pg-agent (https://github.com/coroot/coroot-pg-agent)

Today we are going to look at coroot-pg-agent in more detail.

coroot-pg-agent can be run using Docker; more information can be found here: https://github.com/coroot/coroot-pg-agent

The announcement on the official PostgreSQL site: https://www.postgresql.org/about/news/coroot-pg-agent-an-open-source-postgres-exporter-for-prometheus-2488/

When running coroot-pg-agent with Prometheus, there are a few things we should keep in mind.

When run with Docker, coroot-pg-agent listens on port 80 by default. We can run it on a custom port with the following docker command:

docker run --name coroot-pg-agent \
--env DSN="postgresql://<USER>:<PASSWORD>@<HOST>:5432/postgres?connect_timeout=1&statement_timeout=30000" \
--env LISTEN="0.0.0.0:<custom_port_for_pg_agent>" \
-p <custom_port_for_pg_agent>:<custom_port_for_pg_agent> \
ghcr.io/coroot/coroot-pg-agent

We can also set the agent's scrape interval using --env PG_SCRAPE_INTERVAL.
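
As a rough sketch of how these settings fit together, the wrapper below builds the same docker run command with the port and the scrape interval passed as arguments. The function name run_pg_agent and the DRY_RUN switch are made up for this example, and the DSN is expected in the environment. With DRY_RUN=1 the command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper (not part of coroot-pg-agent): start the agent on a
# given port with a given scrape interval. DSN must be set in the environment.
run_pg_agent() {
    port="$1"
    interval="$2"
    : "${DSN:?set DSN to the postgresql:// connection string first}"

    # Same flags as the docker command above, plus PG_SCRAPE_INTERVAL.
    set -- docker run --name coroot-pg-agent \
        --env "DSN=$DSN" \
        --env "LISTEN=0.0.0.0:$port" \
        --env "PG_SCRAPE_INTERVAL=$interval" \
        -p "$port:$port" \
        ghcr.io/coroot/coroot-pg-agent

    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"    # assumption of this sketch: DRY_RUN only prints the command
    else
        "$@"
    fi
}

# Example: DRY_RUN=1 run_pg_agent 3000 30s
```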

After starting the container with docker run, we see output like the following; custom_port_for_pg_agent is 3000 here.

I0823 21:00:58.259629       1 main.go:35] static labels: map[]
I0823 21:00:58.273610       1 main.go:41] listening on: 0.0.0.0:3000
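
To confirm that the agent is actually serving metrics before pointing Prometheus at it, the endpoint can be fetched by hand. The sketch below assumes the agent exposes the standard /metrics path (the path Prometheus scrapes by default) and that curl is installed; metric_names is a helper name invented here.

```shell
#!/bin/sh
# List the distinct metric names found in Prometheus text-format output
# read from stdin. Comment lines (# HELP / # TYPE) and blank lines are skipped.
metric_names() {
    grep -v -e '^#' -e '^[[:space:]]*$' \
        | sed -e 's/[{ ].*$//' \
        | sort -u
}

# Usage against a running agent (port 3000, as in the log above):
#   curl -s http://localhost:3000/metrics | metric_names
```

An empty result means the endpoint returned no samples, which is worth sorting out before looking at the Prometheus side.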

The Prometheus configuration file, prometheus.yml:

global:
  scrape_interval: 5m
  scrape_timeout: 3m
  evaluation_interval: 15s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: coroot-pg-agent
    static_configs:
      - targets: ["<localhost-ip>:<custom_port_for_pg_agent>"]

We can always change the scrape interval and timeout as needed; I was testing locally, hence these values.

Keep in mind while editing the above YAML that scrape_timeout must not be greater than scrape_interval; otherwise Prometheus rejects the configuration.
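
This rule can be checked mechanically before starting Prometheus. The helpers below are a minimal sketch: to_seconds and timeout_fits_interval are names invented for this example, and only single-unit durations (s, m, h, d) are handled, which covers the values used above.

```shell
#!/bin/sh
# Convert a single-unit duration such as 15s, 5m, 1h or 2d to seconds.
# Compound values such as 1h30m are not handled in this sketch.
to_seconds() {
    value="${1%[smhd]}"    # numeric part
    unit="${1#"$value"}"   # trailing unit letter
    case "$value" in
        ''|*[!0-9]*) echo "unsupported duration: $1" >&2; return 1 ;;
    esac
    case "$unit" in
        s) echo "$value" ;;
        m) echo $((value * 60)) ;;
        h) echo $((value * 3600)) ;;
        d) echo $((value * 86400)) ;;
        *) echo "unsupported duration: $1" >&2; return 1 ;;
    esac
}

# Succeeds when the timeout ($2) is no larger than the interval ($1).
timeout_fits_interval() {
    interval="$(to_seconds "$1")" || return 2
    timeout="$(to_seconds "$2")" || return 2
    [ "$timeout" -le "$interval" ]
}

# Example: timeout_fits_interval 5m 3m && echo "ok"
```

With the values from the config above (5m and 3m) the check passes; swapping them makes it fail.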

To run Prometheus using Docker, we can use the following command, which uses the official prom/prometheus image:

docker run \
    -p 9090:9090 \
    -v ~/pro/prometheus.yml:/etc/prometheus/prometheus.yml \
    prom/prometheus

After executing the above command you will see output like the following:

ts=2022-08-23T21:02:26.203Z caller=main.go:495 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2022-08-23T21:02:26.203Z caller=main.go:539 level=info msg="Starting Prometheus Server" mode=server version="(version=2.38.0, branch=HEAD, revision=818d6e60888b2a3ea363aee8a9828c7bafd73699)"
ts=2022-08-23T21:02:26.203Z caller=main.go:544 level=info build_context="(go=go1.18.5, user=root@e6b781f65453, date=20220816-13:29:14)"
ts=2022-08-23T21:02:26.204Z caller=main.go:545 level=info host_details="(Linux 5.10.47-linuxkit #1 SMP PREEMPT Sat Jul 3 21:50:16 UTC 2021 aarch64 87decec12cad (none))"
ts=2022-08-23T21:02:26.204Z caller=main.go:546 level=info fd_limits="(soft=1048576, hard=1048576)"
ts=2022-08-23T21:02:26.204Z caller=main.go:547 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2022-08-23T21:02:26.205Z caller=web.go:553 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2022-08-23T21:02:26.205Z caller=main.go:976 level=info msg="Starting TSDB ..."
ts=2022-08-23T21:02:26.206Z caller=tls_config.go:195 level=info component=web msg="TLS is disabled." http2=false
ts=2022-08-23T21:02:26.207Z caller=head.go:495 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2022-08-23T21:02:26.207Z caller=head.go:538 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=10.125µs
ts=2022-08-23T21:02:26.207Z caller=head.go:544 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2022-08-23T21:02:26.207Z caller=head.go:615 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
ts=2022-08-23T21:02:26.207Z caller=head.go:621 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=21.416µs wal_replay_duration=117.958µs total_replay_duration=159.167µs
ts=2022-08-23T21:02:26.208Z caller=main.go:997 level=info fs_type=EXT4_SUPER_MAGIC
ts=2022-08-23T21:02:26.208Z caller=main.go:1000 level=info msg="TSDB started"
ts=2022-08-23T21:02:26.208Z caller=main.go:1181 level=info msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
ts=2022-08-23T21:02:26.210Z caller=main.go:1218 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=2.047292ms db_storage=750ns remote_storage=1.709µs web_handler=292ns query_engine=583ns scrape=341.75µs scrape_sd=16.708µs notify=542ns notify_sd=792ns rules=1µs tracing=9.625µs
ts=2022-08-23T21:02:26.210Z caller=main.go:961 level=info msg="Server is ready to receive web requests."
ts=2022-08-23T21:02:26.210Z caller=manager.go:941 level=info component="rule manager" msg="Starting rule manager..."
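
Once the log reports that the server is ready, the scrape targets can also be checked from the command line. This is a minimal sketch, assuming Prometheus is reachable on localhost:9090 and using its /api/v1/targets HTTP endpoint. target_health is a name made up for this example, and the grep-based parsing is a shortcut that relies on the compact JSON Prometheus returns (a real JSON parser such as jq would be more robust).

```shell
#!/bin/sh
# Summarise target health from the JSON of /api/v1/targets read on stdin.
# Prints one "state=count" line per health state found.
target_health() {
    grep -o '"health":"[a-z]*"' \
        | sed -e 's/"health":"//' -e 's/"$//' \
        | sort | uniq -c | awk '{print $2 "=" $1}'
}

# Usage:
#   curl -s http://localhost:9090/api/v1/targets | target_health
```

Keep in mind that with a scrape_interval of 5m, a newly added target may not have been scraped yet right after startup.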

We can open localhost:9090 in a browser and see a screen like the following: