MrViK

Posted on Jan 7, 2021

Following slippery processes

#go #linux #systemsprogramming #showdev

Hey, if you followed the previous parts about writing a PID 1 (and a service launcher) with Go, you may have expected this.
The service-launcher built so far follows a single process. If that process forks and the main process exits, the service fails or succeeds with the exit code of the main process, and the forked process is never followed.

To follow such programs (which can spawn several processes) we would have to tweak the service-launcher so it follows a variable number of them.

Instead of doing that and turning service-launcher into a monolith, I have decided to create a helper that launches and follows processes that fork.

Important idea: What do we call the "main" process?
The main process is the one spawned by service-launcher. It is the process that gets monitored, and it decides whether a service is running or has exited (failed/succeeded).

Default behavior

From fork(2):

fork() creates a new process by duplicating the calling process. The new process is referred to as the child process. The calling process is referred to as the parent process.

So if we run a forking program and look at pstree, there will be two identical processes, one as the parent and the other as its child.

Take the following C code:

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();
    switch (pid) {
        case -1:
            // fork(2) failed, no child was created
            return -1;
        case 0:
            // Forked (child) process: sleep and exit
            sleep(20);
            return 0;
    }

    // Parent process: wait for its child
    wait(NULL);

    return 0;
}

If we run htop or pstree while that code is running, the tree shows ./fork with a second ./fork as its child.

Ok, that's right, but what if the parent process exits before its child? To try it, remove the wait call from the C code. For the sake of brevity I won't show the result, but the orphaned "forked" process bubbles up the tree looking for the nearest ancestor marked as a CHILD_SUBREAPER. PID 1 implicitly acts as the child reaper, so if no process in the tree has that role, PID 1 becomes the parent of our orphaned process. We lost it.

This is the default behavior, and the process that launched ./fork cannot follow the fork(2)-ed process.

Having control over them

So we want a program that is able to follow all of its descendant processes.

Thank you, Linux, for cgroups(7). We'll be using version 2 (the unified hierarchy).

How do `cgroups2` work

First, you need cgroup2 mounted on /sys/fs/cgroup. service-launcher does mount this filesystem by default on init.

Now, simply cd to /sys/fs/cgroup and mkdir a directory. cd into it and, surprise, the interface files for all the enabled controllers are already there right after the mkdir. Add your current shell's PID to the cgroup with echo $$ >> cgroup.procs. You can also cat that file to see all the PIDs inside.

Some rules that apply to "adding processes to cgroups":

If the added process already has child processes, they are not moved into the cgroup; only the specified PID is.
A process is added to the cgroup together with all of its threads.
Child processes created by a process inside the cgroup are automatically added to the cgroup too (and their PIDs to cgroup.procs).

If we take these rules, we can make a program that:

Creates a new cgroup with a specified name.
Executes another process that inherits an fd (opened with O_APPEND|O_WRONLY) to the cgroup.procs file. The arguments are passed on to the real executable. This child process:
- Appends its PID to fd 3 (the first inherited fd)
- Closes the fd
- Calls execve(2) (golang.org/x/sys/unix.Exec in Go) with the arguments from the command line.
Follows the child process until it fork(2)s and exits.
From then on, watches its descendants (listed in cgroup.procs).
Exits with the status of the last process in the cgroup.
Removes the cgroup before exiting.

How do we "watch"? My first thought was to create a ticker and call unix.Wait4 with WNOHANG, but then I came across a section of cgroups(7) that describes the cgroup.events file. So we can use inotify(7) to watch that file for changes and react whenever the set of processes inside the cgroup changes.

Ok, this works, but what about the process tree? It still shows the default behavior explained previously: orphaned processes end up under PID 1.

Changing the default behavior

We simply call prctl(2) with PR_SET_CHILD_SUBREAPER to become a subreaper for the current tree, so orphaned descendant processes are reparented to us.

But with great power comes great responsibility. If this process becomes a subreaper, it must wait(2) for the child processes it adopts. To achieve this we listen for our favourite signal, SIGCHLD, wait for the exited processes and check whether the cgroup still contains at least one process. If the cgroup is empty, our exit code will be the code of the last process that was inside the cgroup.

Code & Run

Main process (supervisor). cmd/run-in-cgroup
Subprocess (executor). cmd/exec-in-cgroup

Let's run dhclient on eth0. This process forks, and previously the service was marked as succeeded as soon as the main process exited, leaving the forked process untracked.

We changed the service file as follows:

---
name: dhclient@eth0
description: DHCP client for eth0
exec: run-in-cgroup
arguments:
  - -name=dhclient-eth0
  - --
  - dhclient
  - eth0

The result looks like this:

So service-launcher detects dhclient as running because the monitor process is alive and trackable. Also dhclient is a child of its monitor process so the tree is clean and there are no lost things floating around.

Do you have any thoughts here? Feedback? Can we make this better? Write a comment or feel free to create an issue.

Help is needed with the name, I don't feel good calling this thing go-pid1 (and my lack of creativity is notable, I thought gp1 😅).

In the next article I'll be making it work in a complete system with Void + SDDM + KDE (or something simpler, but KDE should work).

See you next time.
