Because Prometheus does not provide long-term storage, data is erased at appropriate intervals. For long-term storage, data is usually preserved in something like InfluxDB. However, AWS S3 is a much easier place to manage.
AWS Kinesis Firehose came to the Tokyo region in July 2017. By using Kinesis Firehose, we can automatically save records to S3.
So I implemented a Prometheus remote write adapter integration that sends records to Kinesis.
https://github.com/shirou/prometheus_remote_kinesis
This is just for evaluation and has not been deployed to production, so there may be some problems.
prometheus_remote_kinesis
How to use
You need to build it with Go, but I omit the details here. A multi-stage build with Docker makes it easy.
$ prometheus_remote_kinesis -stream-name prometheus-backup
- -stream-name
  - Kinesis stream name (required)
- -listen-addr
  - listen address. If not specified, `:9501` is used.
Of course, you should set AWS credentials.
I have also pushed an image to Docker Hub.
https://hub.docker.com/r/shirou/prometheus_remote_kinesis/
You can start it like this:
docker run -d --name remote_kinesis \
--restart=always \
-p 9501:9501 \
-e STREAM_NAME=backup-prometheus-log \
shirou/prometheus_remote_kinesis
Settings on the Prometheus side
Set the remote_write setting in prometheus.yml as follows. It is important to add a - before the url to make it a sequence.
remote_write:
- url: http://localhost:9501/receive
The setup of Kinesis and Kinesis Firehose themselves is omitted here.
That completes the setup. As time goes on, more and more logs accumulate in S3.
JSON format
The data sent to Kinesis is formatted as JSON like this:
{
  "name": "scrape_duration_seconds",
  "time": 1513264725773,
  "value": 0.004345524,
  "labels": {
    "__name__": "scrape_duration_seconds",
    "instance": "localhost:9090",
    "job": "prometheus",
    "monitor": "monitor"
  }
}
In a Prometheus TimeSeries, multiple samples can be stored in one record. However, since I did not want to add much nesting, I flatten it and create one record per sample. Labels could not reasonably be flattened, so they are kept as a map. The format also assumes use from Athena or S3 Select, so records are written as newline-delimited JSON. I originally sent the data to Kinesis gzip-compressed, but removed that for now because it used too much CPU on my t2.small. Records are sent with PutRecords in batches of 500. Because records are buffered, they may be lost if remote_kinesis dies, although graceful shutdown is implemented.
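As a rough sketch of this flattening (my own illustration, not the exact code in the repository), assuming the prompb types used by remote write, one TimeSeries could be turned into per-sample records like this:

// A rough sketch of the flattening idea, not the exact repository code.
package adapter

import (
	"encoding/json"

	"github.com/prometheus/prometheus/prompb"
)

// record is one flattened sample: name, time, value plus the label map.
type record struct {
	Name   string            `json:"name"`
	Time   int64             `json:"time"`
	Value  float64           `json:"value"`
	Labels map[string]string `json:"labels"`
}

// flatten turns one TimeSeries into one newline-terminated JSON record per sample.
func flatten(ts prompb.TimeSeries) ([][]byte, error) {
	labels := make(map[string]string, len(ts.Labels))
	for _, l := range ts.Labels {
		labels[l.Name] = l.Value
	}

	var out [][]byte
	for _, s := range ts.Samples {
		r := record{
			Name:   labels["__name__"],
			Time:   s.Timestamp,
			Value:  s.Value,
			Labels: labels,
		}
		b, err := json.Marshal(r)
		if err != nil {
			return nil, err
		}
		// Newline-delimited so Athena / S3 Select can read one record per line.
		out = append(out, append(b, '\n'))
	}
	return out, nil
}

In the real adapter these marshalled records are buffered and flushed to Kinesis with PutRecords, which is why an unclean shutdown can drop whatever is still sitting in the buffer.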
Since the write request from Prometheus arrives as snappy-compressed protobuf, forwarding that byte sequence to Kinesis as-is might be the fastest option, but it would be harder to handle later.
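For reference, here is a minimal sketch of what the receiving endpoint of such an adapter looks like, assuming the golang/snappy and prompb packages that remote write adapters typically use (again, a sketch rather than the exact repository code):

// A minimal sketch of a remote write receive endpoint: the body is
// snappy-compressed protobuf and is decoded into a prompb.WriteRequest.
package main

import (
	"io/ioutil"
	"log"
	"net/http"

	"github.com/gogo/protobuf/proto"
	"github.com/golang/snappy"
	"github.com/prometheus/prometheus/prompb"
)

func receive(w http.ResponseWriter, r *http.Request) {
	compressed, err := ioutil.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}

	// Remote write uses the snappy block format, not the streaming format.
	data, err := snappy.Decode(nil, compressed)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	var req prompb.WriteRequest
	if err := proto.Unmarshal(data, &req); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	// req.Timeseries now holds the samples; flatten them and send to Kinesis here.
	_ = req
}

func main() {
	http.HandleFunc("/receive", receive)
	log.Fatal(http.ListenAndServe(":9501", nil))
}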
Summary
I created a remote storage integration that saves data to S3 via AWS Kinesis for long-term preservation of Prometheus metrics.
I have not deployed it to a production environment, so there may be problems. Another issue is reading the data back from S3, but since it is plain JSON, I think it is easy to convert if necessary. It is also possible to read it back directly into Prometheus with a remote read storage integration, but the performance would probably not be good.
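For example, here is a small sketch of reading those newline-delimited records back with aws-sdk-go (my own example, not part of the repository); the bucket name and object key are placeholders:

// Read the newline-delimited JSON records back out of S3.
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

type record struct {
	Name   string            `json:"name"`
	Time   int64             `json:"time"`
	Value  float64           `json:"value"`
	Labels map[string]string `json:"labels"`
}

func main() {
	sess := session.Must(session.NewSession())
	svc := s3.New(sess)

	obj, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: aws.String("my-prometheus-backup"),    // placeholder
		Key:    aws.String("2017/12/14/records.json"), // placeholder
	})
	if err != nil {
		log.Fatal(err)
	}
	defer obj.Body.Close()

	// One JSON record per line, as written by the adapter.
	sc := bufio.NewScanner(obj.Body)
	for sc.Scan() {
		var r record
		if err := json.Unmarshal(sc.Bytes(), &r); err != nil {
			log.Fatal(err)
		}
		fmt.Println(r.Time, r.Name, r.Value)
	}
	if err := sc.Err(); err != nil {
		log.Fatal(err)
	}
}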
By the way, AlpacaJapan is actively recruiting people to work on this kind of thing. Please contact me on Twitter: @r_rudi