In a nutshell
I have written a Salt Prometheus exporter in Go, and I am here to share my experience working on it. It is fully open source, and the repository is linked at the end of this post.
Infrastructure and development
When talking about tech, I like many things…
Prometheus
I like Prometheus, and I like to write Prometheus exporters. It truly is a game changer when you need to expose service metrics to get fancy dashboards and alerts. The metrics are elegant, efficient, and easy to operate thanks to PromQL. They are also easy to integrate into any development project.
// Metrics:
salt_expected_responses_total{function="cmd.run", state=""} 6
salt_function_responses_total{function="cmd.run",state="",success="true"} 6
// Prometheus query to get the number of missing responses:
salt_expected_responses_total - on(function) salt_function_responses_total
// elegant, is it not?
Saltstack
I like Salt to manage and operate multiple systems. It is one of the best-known infrastructure automation systems, alongside Ansible, Chef, Puppet and so on. I use it for different needs: at work, for my home lab, and for several associations whose infrastructure I operate. The system is not perfect, but from my point of view it is one of the best (some might disagree, and that is totally fine).
salt "*" state.sls do_awesome_stuffs
Coding!
There is also another thing I really enjoy doing: writing code. For the past 7 years I have been writing Python code; before that it was C++/C# and Java. But a couple of years ago, I found a new language to play with: Golang. I may be repeating my adjectives, but Go is elegant, easy to learn and quite powerful.
package main

func main() {
	go func() {
		// straightforward concurrency!!
	}()
	// …
}
// elegant, is...? (am I repeating myself?)
Open Source
There is one last thing to add to the "I like" list: Open Source :)
It is always good to see that your work benefits people. And if you are lucky enough, they may even contribute by reporting issues, requesting features or submitting changes. Most importantly, you do not need a big, well-known project to start seeing people use it and contribute to it.
By making it open-source, you usually end up with a better project:
- features you did not think about
- fixes for bugs you never encountered with your use cases
- documentation you would not have written for yourself
This is something I was lucky to experience with my previous open-source project (mqtt-exporter) which reinforced my wish to continue on the Open Source journey.
The project: salt-exporter
You might have noticed, I am kind of an enthusiast regarding technologies, especially regarding infra and code. Most importantly I like to mix them both.
You might wonder why I am telling you all of this.
Well, I had what seemed to be a simple need: metrics to monitor the health of Salt jobs. It was a perfect opportunity to try my new favorite cocktail 🍸: some Salt, some Prometheus, some Go, with a pinch of Python.
Obvious approaches != best approaches
First things first: to get metrics, I needed to find an appropriate way to get the data.
The first obvious approach was to leverage the Salt API. However, the API module is neither running nor configured by default. Using it would mean requiring users to start and monitor the Salt API. Users would also need to change the configuration of the Salt master to set up the authentication system and fine-tune the permissions.
My focus is always to have something that works out of the box, keeping requirements to a minimum. Method rejected!
Another approach would be to create a Salt engine or returner. But this would mean writing the exporter in Python while I wanted to write it in Go (because why not). And again, too much configuration for my taste. I had a feeling that I could create something the Salt master would not even know existed. Methods rejected!
Deep dive in Salt internals
To be completely honest with you, my first idea was actually to connect to the ZMQ of Salt. But I wanted to explore the other "interfaces" Salt is providing. Anyway, the ZMQ was now my best shot, or so I thought...
To better understand how ZMQ was integrated, I carefully read the well-written Salt architecture documentation. This very interesting read was suddenly interrupted by one specific sentence:
The EventPublisher picks it up and distributes it to all connected event listeners on master_event_pub.ipc.
💡 Eureka! It immediately appeared to me that this was my way into the Salt brain. I knew about IPC, but I never played with it. Yay, something new to play with!
IPC means inter-process communication. There are multiple methods to implement it. The only thing you need to know here is that Saltstack uses IPC sockets - which is like network communication but using a file.
I also read that Salt was using MessagePack to format their messages. MessagePack is a format like JSON, but more compact.
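To give a feel for "like JSON, but more compact", here is a small sketch that hand-encodes a tiny map and compares its size with the JSON equivalent. The `encodeTiny` helper is hypothetical and covers only three MessagePack types (fixmap, fixstr and positive fixint); a real project would use a MessagePack library instead. The keys and values are made up for the illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pair is one key/value entry; a slice keeps the encoding order stable.
type pair struct {
	key   string
	value int
}

// encodeTiny encodes the pairs with three MessagePack types only:
// fixmap (0x80|n, up to 15 entries), fixstr (0xa0|len, up to 31 bytes)
// and positive fixint (0x00..0x7f). Anything else is rejected.
func encodeTiny(pairs []pair) ([]byte, error) {
	if len(pairs) > 15 {
		return nil, fmt.Errorf("too many entries for a fixmap: %d", len(pairs))
	}
	out := []byte{0x80 | byte(len(pairs))}
	for _, p := range pairs {
		if len(p.key) > 31 || p.value < 0 || p.value > 127 {
			return nil, fmt.Errorf("entry %q is outside the supported subset", p.key)
		}
		out = append(out, 0xa0|byte(len(p.key))) // string header
		out = append(out, p.key...)              // string bytes
		out = append(out, byte(p.value))         // small integer, one byte
	}
	return out, nil
}

func main() {
	packed, err := encodeTiny([]pair{{"pid", 42}, {"retcode", 0}})
	if err != nil {
		panic(err)
	}
	asJSON, err := json.Marshal(map[string]int{"pid": 42, "retcode": 0})
	if err != nil {
		panic(err)
	}
	// prints "MessagePack: 15 bytes, JSON: 22 bytes"
	fmt.Printf("MessagePack: %d bytes, JSON: %d bytes\n", len(packed), len(asJSON))
}
```

The saving comes from dropping quotes, colons, commas and braces: each key and each small integer costs a single header byte instead of punctuation.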
See below an example of code to connect to the master_event_pub.ipc file and then read and decode the MessagePack payload:
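// Connect to the Salt master event bus (error handling omitted for brevity)
eventBus, err := net.Dial("unix", "/var/run/salt/master/master_event_pub.ipc")
// Decode the next MessagePack message from the stream
decoder := msgpack.NewDecoder(eventBus)
message, err := decoder.DecodeMap()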
However, the decoded message still needed work:
map[string]interface {}{
	"body": []uint8{50, 48, 50, 51, ...},
	"head": map[string]interface {}{},
}
The event details were in the body. It almost looked like MessagePack, but with a catch. This is where the deep dive into the Salt architecture went deeper: I explored the Salt source code. Fortunately, the Salt developers wrote a comprehensive docstring explaining their intent:
In the new style, when the tag is longer than 20 characters, an end of tag string is appended to the tag given by the string constant TAGEND, that is, two line feeds '\n\n'. When the tag is less than 20 characters then the tag is padded with pipes "|" out to 20 characters as before. When the tag is exactly 20 characters no padded is done.
source: https://github.com/saltstack/salt/blob/master/salt/utils/event.py
The trick was that the tag sat outside the MessagePack structure of the body, as plain text in front of the MessagePack payload. So I wrote a quick function to parse the payload properly.
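func Parse(message map[string]interface{}) (event.SaltEvent, error) {
	// The body is "<tag>\n\n<MessagePack payload>"
	body, ok := message["body"].([]byte)
	if !ok {
		return event.SaltEvent{}, errors.New("unexpected body type")
	}
	lines := strings.SplitN(string(body), "\n\n", 2)
	if len(lines) != 2 {
		return event.SaltEvent{}, errors.New("missing tag delimiter")
	}
	tag := lines[0]
	eventContent := []byte(lines[1])

	// Parse message body
	ev := event.SaltEvent{Tag: tag}
	if err := msgpack.Unmarshal(eventContent, &ev.Data); err != nil {
		return event.SaltEvent{}, err
	}
	return ev, nil
}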
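The tag/payload split can also be tried in isolation, without any MessagePack library. The sketch below is an illustration, not the exporter's code: `splitTag` and the sample body are made up, and it assumes Go 1.18 or later for `bytes.Cut`. The point it shows is that the cut must happen at the first delimiter only, since the binary payload may itself contain two consecutive line feeds.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// tagEnd mirrors the TAGEND constant from the docstring: two line feeds.
var tagEnd = []byte("\n\n")

// splitTag separates the event tag from the raw payload that follows it.
// bytes.Cut splits at the first delimiter, so later "\n\n" sequences inside
// the payload are left untouched.
func splitTag(body []byte) (string, []byte, error) {
	before, after, found := bytes.Cut(body, tagEnd)
	if !found {
		return "", nil, errors.New("no tag delimiter in event body")
	}
	return string(before), after, nil
}

func main() {
	// Made-up body: a job-return style tag, the delimiter, then 13 bytes
	// that happen to be the MessagePack encoding of {"fun": "cmd.run"}.
	body := []byte("salt/job/1234/ret/node1\n\n\x81\xa3fun\xa7cmd.run")

	tag, payload, err := splitTag(body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("tag=%q payload=%d bytes\n", tag, len(payload))
}
```

Working on `[]byte` directly also avoids converting the binary payload to a string and back.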
note: The downside of reading from the IPC is that the exporter would be impacted if Salt changed the way its different processes communicate. But it is ok, I already have plenty of ideas to adapt if such cases happen.
Expose the metrics
With the events in hand, the only remaining part was to expose them as metrics.
Again, Prometheus metrics were really easy to implement:
import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// promauto registers the counter in the default registry
newJobTotal := promauto.NewCounterVec(
	prometheus.CounterOpts{
		Name: "salt_new_job_total",
		Help: "Total number of new jobs processed",
	},
	[]string{"function", "state"},
)
newJobTotal.WithLabelValues("state.sls", "test").Inc()
Then I just needed to make these metrics accessible from a webserver:
import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

mux := http.NewServeMux()
mux.Handle("/metrics", promhttp.Handler())
httpServer := http.Server{Addr: "0.0.0.0:2112", Handler: mux}
httpServer.ListenAndServe()
Voilà.
I now had everything I needed to write my exporter while meeting all my requirements: something that works out of the box, without configuration in Salt and without possible impact on Salt itself.
Developer Experience
Let's talk a little about coding!
Go vs Python
This was not my first Go project. But it was my first project using the Prometheus library.
The development of the exporter was pleasant. Go makes it really easy to scale up using goroutines. It is good to play with concurrency without having to worry about the GIL, multiprocessing implementations, asyncio-compatible libraries, or even having to choose between threads, processes and async. In Go, everything is straightforward while still allowing more control than in Python.
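As an illustration of that point, here is a minimal sketch of a common Go pattern: several goroutines consuming from one channel. It is not how the exporter is structured; `countByFunction` and the sample function names are made up for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// countByFunction fans the given function names out to `workers` goroutines
// over a channel and merges their counts under a mutex.
// It expects workers to be at least 1.
func countByFunction(functions []string, workers int) map[string]int {
	events := make(chan string)
	counts := make(map[string]int)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls events until the channel is closed.
			for fn := range events {
				mu.Lock()
				counts[fn]++
				mu.Unlock()
			}
		}()
	}

	for _, fn := range functions {
		events <- fn
	}
	close(events) // lets the workers' range loops end
	wg.Wait()     // all counts are merged once every worker is done
	return counts
}

func main() {
	counts := countByFunction([]string{"cmd.run", "state.sls", "cmd.run"}, 4)
	// prints "2 1"
	fmt.Println(counts["cmd.run"], counts["state.sls"])
}
```

No thread pool, event loop or special library is involved: a channel, a `sync.WaitGroup` and the `go` keyword are enough.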
Other things I really like about Go are its dependency management and how easy it is to build a binary. Python, by contrast, is an interpreted language, so you have to deal with dependencies at runtime, and packaging your code can be quite annoying.
Building a Go program is simple:
go build ./...
Distributing a Go program is simple, and it is even better with GoReleaser and GitHub Actions:
# GitHub Actions job
jobs:
  goreleaser:
    runs-on: ubuntu-latest
    steps:
      - name: Run GoReleaser
        uses: goreleaser/goreleaser-action@v4
        with:
          distribution: goreleaser
          version: latest
          args: release --clean
note: if you want a GoReleaser example, you can find the one I use here
Last but not least, I really appreciate coming back to a typed language.
I will not go deeper into the comparison of Go versus Python; there are a ton of articles out there. I really like both, but I have to admit that the more I write Go code, the less I want to write Python code.
But in the end, what really matters is to pick the most appropriate language for you and your project.
Inconsistencies?
The most annoying aspect of the development was the inconsistency in some of Salt's internal behavior.
For example, failing scheduled jobs sometimes return success=true, sometimes success=false. I found some GitHub issues about it, but nothing to explain the inconsistency. It was easy to find a safe workaround, but I will probably have a look at this particular issue to fix it at the source.
Salt security
Earlier in the article I said "I immediately started to see events in MessagePack format", some of you might have noticed that it should not be that easy. Well, you would be right, but it is not as bad as you think.
Prior to Salt 3006, the IPC file permissions were too permissive. Any user logged on to the server could read the IPC, and then see all the events. By listening to these events, in certain situations, anyone could catch interesting information such as secrets. "But there are no secrets in the events," some might argue, and you would be mostly right. But what if a user runs the command pillar.items? Anyone could then get the result from the event queue. Reminder: pillars can be used to share secrets from the master to the minion…
sudo salt "*" pillar.items
node1:
----------
token: myVeryImportantSecretToken
password: myVeryImportantAdminPassword
Anyway, after I reached out to them, the Salt team fixed this issue and hardened the Salt master in version 3006.
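If you want a quick look at your own master, the sketch below reports whether the "others" permission bits on the IPC file grant any access. It is only a rough check under stated assumptions: the `othersCanAccess` helper is made up, the path is the one used earlier in this article and may differ on your system, the exact mode Salt 3006 applies is not covered here, and the permissions of the parent directories matter too.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
)

// othersCanAccess reports whether the "others" permission bits (the last
// octal digit of the mode) grant read, write or execute access.
func othersCanAccess(mode fs.FileMode) bool {
	return mode.Perm()&0o007 != 0
}

func main() {
	// Assumed location of the master event bus; adjust to your setup.
	path := "/var/run/salt/master/master_event_pub.ipc"

	info, err := os.Stat(path)
	if err != nil {
		fmt.Println("cannot stat:", err)
		return
	}
	fmt.Printf("%s mode=%v open to other users: %v\n",
		path, info.Mode().Perm(), othersCanAccess(info.Mode()))
}
```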
Time to wrap up
Yes, you see me coming... I liked working on this exporter.
It has been really cool to read tons of Python code when diving into the Salt codebase. I have deepened my knowledge of Salt, which makes advanced troubleshooting and design easier for me.
Writing the exporter itself has been quite satisfying too. As expected, Go brought great performance and scalability, while keeping the code maintainable. Building and distributing the exporter were very easy to implement and automate.
Next time I will talk about how a development tool became an actual debug/operation tool for production (spoiler: it is called Salt Live, and it lives in the same repository as salt-exporter).
Cheers!
- Salt Exporter repository: https://github.com/kpetremann/salt-exporter
- Documentation: https://kpetremann.github.io/salt-exporter/