Hey, if you followed the previous parts about writing a PID 1 (and a service launcher) in Go, you may have seen this coming.
The service-launcher we built follows a single process. If that process forks and the main process exits, the service fails or succeeds with the main process's exit code, and the forked process is never followed.
To follow services like that (which can spawn several processes), we would have to tweak service-launcher to follow a variable number of them.
Instead of doing that, and turning service-launcher into a monolith, I decided to create a helper that launches and follows processes that fork.
Important idea: what do we call the "main" process?
The main process is the one spawned by service-launcher. That process is monitored, and it decides whether a service is running or has exited (failed or succeeded).
Default behavior
From fork(2):
fork() creates a new process by duplicating the calling process. The new process is referred to as the child process. The calling process is referred to as the parent process.
So if we run a forking program and look at pstree, there will be two identical processes, one the parent and the other the child.
Take the following C code:
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main() {
    pid_t pid = fork();
    switch (pid) {
    case -1:
        // fork failed
        return -1;
    case 0:
        // Forked process. Sleep and exit
        sleep(20);
        return 0;
    }
    wait(NULL); // Wait for its child
    return 0;
}
If we run htop or pstree while that code is running, we see two ./fork processes, one the child of the other.
OK, that's as expected. But what if the parent process exits before the child? To try this, remove the wait call from the C code. For the sake of brevity I won't show it, but the orphaned "forked" process bubbles up the tree looking for the nearest ancestor marked as a child subreaper. PID 1 acts as the child reaper implicitly, so if no process in the tree has that role, PID 1 becomes the parent of our orphaned process. We lost it.
This is the default behavior, and the process that launched ./fork cannot follow the fork(2)-ed process.
Having control over them
So we want a program able to follow all its descendant processes.
Thanks, Linux, for cgroups(7). We'll be using version 2 (the unified hierarchy).
How does cgroup2 work
First, you need cgroup2 mounted on /sys/fs/cgroup. service-launcher mounts this filesystem by default on init.
Now, simply cd to /sys/fs/cgroup and mkdir a new directory. cd into that directory and surprise! The files for all the enabled controllers are already there right after the mkdir. Add your current shell PID to the cgroup with echo $$ >> cgroup.procs. You can also cat that file to see all the PIDs inside.
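The same shell steps can be done from Go. This is only a minimal sketch, not the code from the repository: joinCgroup and parseProcs are names I made up for illustration, and it assumes cgroup2 is mounted and that you have permission to write there.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// joinCgroup (hypothetical helper) creates the cgroup directory if it is
// missing and appends pid to its cgroup.procs file, roughly what
// `mkdir dir && echo $pid >> dir/cgroup.procs` does in the shell.
func joinCgroup(dir string, pid int) error {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	f, err := os.OpenFile(filepath.Join(dir, "cgroup.procs"), os.O_WRONLY|os.O_APPEND, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.WriteString(strconv.Itoa(pid) + "\n")
	return err
}

// parseProcs (hypothetical helper) turns the contents of cgroup.procs,
// one PID per line, into a slice of ints.
func parseProcs(content string) ([]int, error) {
	var pids []int
	for _, field := range strings.Fields(content) {
		pid, err := strconv.Atoi(field)
		if err != nil {
			return nil, err
		}
		pids = append(pids, pid)
	}
	return pids, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: join /sys/fs/cgroup/NAME")
		return
	}
	dir := os.Args[1]
	if err := joinCgroup(dir, os.Getpid()); err != nil {
		fmt.Fprintln(os.Stderr, "join:", err)
		os.Exit(1)
	}
	raw, err := os.ReadFile(filepath.Join(dir, "cgroup.procs"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "read:", err)
		os.Exit(1)
	}
	pids, err := parseProcs(string(raw))
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse:", err)
		os.Exit(1)
	}
	fmt.Println("PIDs in cgroup:", pids)
}
```

Note that the sketch leaves the cgroup directory behind when it exits; cleaning it up is a separate step.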
Some rules that apply to "adding processes to cgroups":
- If the added process already has child processes, they are not moved into the cgroup; only the specified PID is.
- Processes and their threads are added into the cgroup.
- Child processes created by a process inside the cgroup are also added automatically to the cgroup (and their PIDs to cgroup.procs).
With these rules we can write a program that:
- Creates a new cgroup with a specified name.
- Executes another process that inherits an fd (opened with O_APPEND|O_WRONLY) to the cgroup.procs file. Arguments are passed through to the real executable. This child process:
  - Appends its PID to fd 3 (the first inherited fd).
  - Closes the fd.
  - Calls execve(2) (golang.org/x/sys/unix.Exec in Go) with the arguments from the command line.
- Follows the child process until it fork(2)s and exits.
- From then on, watches its children (from cgroup.procs).
- Exits with the status of the last process in the cgroup.
- Removes the cgroup before exiting.
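To make the child (executor) steps more concrete, here is a rough sketch of what such a process could look like. It is not the actual cmd/exec-in-cgroup code: pidLine is a hypothetical helper, and I use syscall.Exec from the standard library as a stand-in for golang.org/x/sys/unix.Exec so the example has no external dependencies. It also assumes the parent really did pass cgroup.procs as fd 3.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"syscall"
)

// pidLine (hypothetical helper) formats a PID followed by a newline,
// the same bytes `echo $$` would append to cgroup.procs.
func pidLine(pid int) []byte {
	return []byte(strconv.Itoa(pid) + "\n")
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: exec-sketch COMMAND [ARGS...]")
		return
	}

	// fd 3 is the first inherited descriptor. Assumption: the supervisor
	// opened cgroup.procs with O_APPEND|O_WRONLY and passed it down.
	procs := os.NewFile(3, "cgroup.procs")
	if _, err := procs.Write(pidLine(os.Getpid())); err != nil {
		fmt.Fprintln(os.Stderr, "join cgroup:", err)
		os.Exit(1)
	}
	procs.Close()

	// Resolve the real executable and replace this process image with it.
	// The PID does not change, so the command starts out inside the cgroup.
	path, err := exec.LookPath(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup:", err)
		os.Exit(1)
	}
	if err := syscall.Exec(path, os.Args[1:], os.Environ()); err != nil {
		fmt.Fprintln(os.Stderr, "exec:", err)
		os.Exit(1)
	}
}
```

On the supervisor side, one way to hand over the descriptor is the ExtraFiles field of os/exec.Cmd, whose first entry becomes fd 3 in the child.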
How do we "watch"? My first thought was to create a ticker and call unix.Wait4 with WNOHANG, but then I came across a section in cgroups(7) about the cgroup.events file. So we can use inotify(7) to watch that file for changes (its populated key flips when the cgroup becomes empty or non-empty) and react to process changes inside the cgroup.
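Once inotify wakes us up, we still have to read cgroup.events and look at the populated key. A small sketch of that parsing step follows; the populated function name is mine, and the inotify part is left out on purpose.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// populated (hypothetical helper) reports whether the "populated" key in
// the contents of a cgroup.events file is 1, meaning the cgroup (or one of
// its descendants) still contains processes.
func populated(events string) (bool, error) {
	for _, line := range strings.Split(events, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "populated" {
			return fields[1] == "1", nil
		}
	}
	return false, errors.New("no populated key found")
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: events /sys/fs/cgroup/NAME/cgroup.events")
		return
	}
	raw, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "read:", err)
		os.Exit(1)
	}
	alive, err := populated(string(raw))
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse:", err)
		os.Exit(1)
	}
	fmt.Println("cgroup still has processes:", alive)
}
```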
OK, this works, but what about the process tree? It still shows the default behavior explained earlier.
Changing the default behavior
We simply call prctl(2) with PR_SET_CHILD_SUBREAPER to become a subreaper for the current tree, so orphaned descendant processes are attached to us.
But with great power comes great responsibility. If this process becomes a subreaper, it must wait(2) for the child processes that get attached to it. To do this we listen for our favourite signal, SIGCHLD, wait for the exited processes, and check whether the cgroup still contains at least one process. If the cgroup is empty, our exit code is the code of the last process that was inside the cgroup.
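Here is a rough, Linux-only sketch of the subreaper part. It is not the code from cmd/run-in-cgroup: becomeSubreaper and reapAll are names I made up, the prctl call goes through the raw syscall with the constant value 36 (PR_SET_CHILD_SUBREAPER in linux/prctl.h), and "no children left" is used as a simplified stand-in for the "cgroup is empty" check.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

// Value of PR_SET_CHILD_SUBREAPER from <linux/prctl.h>.
const prSetChildSubreaper = 36

// becomeSubreaper (hypothetical helper) marks this process as a child
// subreaper, so orphaned descendants are reparented to it instead of PID 1.
func becomeSubreaper() error {
	_, _, errno := syscall.Syscall(syscall.SYS_PRCTL, prSetChildSubreaper, 1, 0)
	if errno != 0 {
		return errno
	}
	return nil
}

// reapAll (hypothetical helper) collects every child that has already
// exited, without blocking. It returns how many were reaped, the status of
// the last one, and whether any children are still attached to us.
func reapAll() (reaped int, last syscall.WaitStatus, remaining bool) {
	for {
		var ws syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &ws, syscall.WNOHANG, nil)
		switch {
		case err == syscall.EINTR:
			continue
		case err != nil: // ECHILD: nothing left to wait for
			return reaped, last, false
		case pid == 0: // children exist, but none has exited yet
			return reaped, last, true
		}
		reaped++
		last = ws
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: reaper COMMAND [ARGS...]")
		return
	}
	if err := becomeSubreaper(); err != nil {
		fmt.Fprintln(os.Stderr, "prctl:", err)
		os.Exit(1)
	}

	// Subscribe to SIGCHLD before starting the child so no exit is missed.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGCHLD)

	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Start(); err != nil {
		fmt.Fprintln(os.Stderr, "start:", err)
		os.Exit(1)
	}

	var last syscall.WaitStatus
	for range sigs {
		n, ws, remaining := reapAll()
		if n > 0 {
			last = ws
		}
		if !remaining {
			break
		}
	}
	// ExitStatus is -1 when the last process was killed by a signal.
	os.Exit(last.ExitStatus())
}
```

Because reapAll drains every exited child on each wake-up, it does not matter that several SIGCHLD signals may be merged into one.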
Code & Run
- Main process (supervisor): cmd/run-in-cgroup
- Subprocess (executor): cmd/exec-in-cgroup
Let's run dhclient on eth0. This process forks, and previously the service succeeded as soon as the main process exited, leaving the forked process untracked.
We changed the service file as follows:
---
name: dhclient@eth0
description: DHCP client for eth0
exec: run-in-cgroup
arguments:
- -name=dhclient-eth0
- --
- dhclient
- eth0
So service-launcher detects dhclient as running because the monitor process is alive and trackable. Also, dhclient is a child of its monitor process, so the tree is clean and there are no lost processes floating around.
Do you have any thoughts here? Feedback? Can we make this better? Write a comment or feel free to create an issue.
Help is needed with the name. I don't feel good calling this thing go-pid1 (and my lack of creativity is notable: I thought of gp1 😅).
In the next article I'll make it work in a complete system with Void + SDDM + KDE (or something simpler, but KDE should work).
See you next time.