DevOps VN

Posted on Aug 7, 2022

Deep into Container — Build your own container with Golang

#go #devops

Hi guys, continuing the Deep into Container series: we already know that containers are built from Linux namespaces and cgroups. To understand them more deeply, we're going to build our own container using Golang.

This article is based on Build Your Own Container Using Less than 100 Lines of Go by Julian Friedman and Building a container from scratch in Go by Liz Rice.

This is part four in the series Deep into Container:

Linux namespaces and Cgroups: What are containers made from?
Deep into Container Runtime.
How Kubernetes works with Container Runtime.
Deep into Container - Build your own container with Golang.

Building a Container

Create a file named container.go and write some simple code as follows.

package main

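func main() {
    // Empty for now. The "os" import is added in the next step,
    // because Go does not compile a file with an unused import.
}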

func must(err error) {
    if err != nil {
        panic(err)
    }
}

If you are familiar with Docker, you know that the command to run a container is docker run <image> <command>, for example:

docker run busybox echo "A"

You will see the container run and print the letter "A". If you run the following command instead:

docker run -it busybox sh

The container runs and a shell attaches to it.

/ #

Any command we type now runs inside the container.

/ # hostname
d12ccc0e00a0

/ # ps
PID   USER     TIME  COMMAND
1     root      0:00 sh
9     root      0:00 ps

The hostname command prints the hostname of the container, not of the server, and the ps command prints only two processes.

Now we will build a similar container using Golang. Update container.go as follows.

package main

import (
    "os"
)

// docker run <image> <command>
// go run container.go run <command>
func main() {
    switch os.Args[1] {
    case "run":
        run()
    default:
        panic("Error")
    }
}

func run() {

}

func must(err error) {
    if err != nil {
        panic(err)
    }
}

We add a function named run(), and in the main function we use a switch statement to check the first argument: when we run the program with the argument run, it calls run(). Running go run container.go run is now the counterpart of docker run.
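Note that os.Args[1] panics with an index-out-of-range error when the program is started without any argument. A minimal sketch of a safer dispatch, assuming a hypothetical helper named parseArgs that is not part of the original code, could look like this:

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// parseArgs is a hypothetical helper (not part of the article's code).
// It splits the program arguments into a subcommand and the remaining
// command line, and returns an error instead of panicking when the
// subcommand is missing.
func parseArgs(args []string) (subcommand string, command []string, err error) {
	if len(args) < 2 {
		return "", nil, errors.New("usage: container run <command> [args...]")
	}
	return args[1], args[2:], nil
}

func main() {
	subcommand, command, err := parseArgs(os.Args)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	switch subcommand {
	case "run":
		// The real run() from the article would be called here.
		fmt.Printf("would run %v\n", command)
	default:
		fmt.Fprintf(os.Stderr, "unknown subcommand %q\n", subcommand)
		os.Exit(1)
	}
}
```

The rest of the article keeps the shorter panic-based version to stay close to the original code.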

Next, we update the run() function as follows.

package main

import (
    "os"
    "os/exec"
)

// docker run <image> <command>
// go run container.go run <command>
func main() {
    switch os.Args[1] {
    case "run":
        run()
    default:
        panic("Error")
    }
}

func run() {
    cmd := exec.Command(os.Args[2], os.Args[3:]...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr

    must(cmd.Run())
}

func must(err error) {
    if err != nil {
        panic(err)
    }
}

We use the os/exec package to execute the user's command, which is stored in the os.Args slice. For example, when we type go run container.go run echo "A", os.Args holds:

Args[0] = "<path of the temporary binary that go run builds>"
Args[1] = "run"
Args[2] = "echo"
Args[3] = "A"

The command we pass to exec.Command() starts at index two of os.Args. The signature of the Command() function is as follows.

exec.Command(name string, arg ...string)

The first argument is the command to execute, and the remaining values are the arguments of that command.
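To see how name and arg map onto a real command, here is a small standalone sketch. The helper runAndCapture is a hypothetical name used only for illustration, and the example assumes an echo binary is available on the PATH:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runAndCapture is a hypothetical helper (not part of the article's code).
// It runs the named command with the given arguments and returns whatever
// the command wrote to standard output.
func runAndCapture(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).Output()
	return string(out), err
}

func main() {
	// Equivalent to typing: echo A
	out, err := runAndCapture("echo", "A")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

The article's run() function wires Stdin, Stdout, and Stderr to the terminal instead of capturing the output, which is what makes an interactive shell possible.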

Now, try to run the same command as docker run -it busybox sh with your program.

go run container.go run sh

You will see that it behaves almost the same as the docker command.

We have successfully taken the first step 😁, but when you type the hostname command, it prints the hostname of our server, not of a container.

# hostname
LAPTOP-2COB82RG

If you change the hostname from inside our program, the change affects the server as well.

# hostnamectl set-hostname container

Type exit and press Enter. Back on the server, run hostname and you will see that it has been changed.

Our program currently just runs the sh command; it is not a container at all. Next, we will go through each step to build the container. As we know, containers are built from Linux namespaces.

Namespaces

Namespaces provide an isolated environment that lets us run a process independently of other processes on the same server. Linux has several kinds of namespaces; this article covers the following six (newer kernels also provide cgroup and time namespaces):

PID: The PID namespace gives processes a set of process IDs (PIDs) independent of other namespaces. The first process created in a new PID namespace is assigned PID 1.
MNT: Mount namespaces isolate mount points, letting you mount and unmount filesystems without affecting other namespaces.
NET: Network namespaces give the process its own network stack.
UTS: UNIX Time-Sharing namespaces allow a process to have its own hostname and domain name.
USER: User namespaces give the process its own set of UIDs and GIDs.
IPC: IPC namespaces isolate inter-process communication resources, which prevents processes in different IPC namespaces from communicating through mechanisms such as System V shared memory or POSIX message queues.

We will use PID, UTS, and MNT namespaces in our Golang program.

UTS namespace

The first thing we need to isolate is the hostname, so that our program has its own hostname. Update container.go.

package main

import (
    "os"
    "os/exec"
    "syscall"
)

// docker run <image> <command>
// go run container.go run <command>
func main() {
    switch os.Args[1] {
    case "run":
        run()
    default:
        panic("Error")
    }
}

func run() {
    cmd := exec.Command(os.Args[2], os.Args[3:]...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS,
    }

    must(cmd.Run())
}

func must(err error) {
    if err != nil {
        panic(err)
    }
}

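To use Linux namespaces in Go, we simply pass the flags of the namespaces we want in cmd.SysProcAttr. Creating namespaces this way requires root privileges (more precisely, the CAP_SYS_ADMIN capability), so run the program as root.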

cmd.SysProcAttr = &syscall.SysProcAttr{
    Cloneflags: syscall.CLONE_NEWUTS,
}
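Several namespaces can be requested at once by combining their flags with a bitwise OR. The sketch below illustrates this with a hypothetical lookup table and a hypothetical cloneFlags helper; the names are assumptions made for this example, while the syscall.CLONE_NEW* constants come from Go's syscall package on Linux:

```go
package main

import (
	"fmt"
	"syscall"
)

// namespaceFlags is a hypothetical lookup table (not part of the article's
// code) from the short namespace names used above to Linux clone flags.
// The constants exist only when building for Linux.
var namespaceFlags = map[string]uintptr{
	"pid":  syscall.CLONE_NEWPID,
	"mnt":  syscall.CLONE_NEWNS,
	"net":  syscall.CLONE_NEWNET,
	"uts":  syscall.CLONE_NEWUTS,
	"user": syscall.CLONE_NEWUSER,
	"ipc":  syscall.CLONE_NEWIPC,
}

// cloneFlags ORs together the flags of the requested namespaces so the
// result can be assigned to SysProcAttr.Cloneflags.
func cloneFlags(names ...string) (uintptr, error) {
	var flags uintptr
	for _, name := range names {
		flag, ok := namespaceFlags[name]
		if !ok {
			return 0, fmt.Errorf("unknown namespace %q", name)
		}
		flags |= flag
	}
	return flags, nil
}

func main() {
	flags, err := cloneFlags("uts", "pid", "mnt")
	if err != nil {
		panic(err)
	}
	fmt.Printf("Cloneflags: %#x\n", flags)
}
```

The article's code writes the OR expression directly, which is simpler when the set of namespaces is fixed.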

Now let's try again.

go run container.go run sh

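Run the command to change the hostname. Here we use the hostname command, which calls sethostname directly in the current namespace; hostnamectl typically goes through the systemd-hostnamed service running on the host, so it may change the host's hostname instead.

# hostname wsl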
# hostname
wsl

Type exit and press Enter. Back on the server, run the hostname command and you'll see that the server's hostname has not changed at all. We have completed the next step in building the container 😁.
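Besides comparing hostnames, Linux exposes the namespaces a process belongs to as symbolic links under /proc/<pid>/ns. As a rough sketch, a hypothetical helper namespaceID (a name invented for this example) can read those links; running it on the host and inside our program should print different uts identifiers:

```go
package main

import (
	"fmt"
	"os"
)

// namespaceID is a hypothetical helper (not part of the article's code).
// It returns the identifier of one namespace of a process by reading the
// symbolic link /proc/<pid>/ns/<ns>. The link target has the form
// "<ns>:[<inode number>]"; two processes share a namespace when their
// identifiers are equal.
func namespaceID(pid, ns string) (string, error) {
	return os.Readlink("/proc/" + pid + "/ns/" + ns)
}

func main() {
	for _, ns := range []string{"uts", "pid", "mnt"} {
		id, err := namespaceID("self", ns)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("%-4s %s\n", ns, id)
	}
}
```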

However, for our program to be more like a container, we need to do a few more things. When we run docker run -it busybox sh and type hostname, the container already has its own hostname, whereas in our program we have to change the hostname manually. Update container.go.

package main

import (
    "os"
    "os/exec"
    "syscall"
)

// docker run <image> <command>
// ./container run <command>
func main() {
    switch os.Args[1] {
    case "run":
        run()
    case "child":
        child()
    default:
        panic("Error")
    }
}

func run() {
    cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS,
    }

    must(cmd.Run())
}

func child() {
    // Fail loudly if the hostname cannot be set (for example, when not running as root).
    must(syscall.Sethostname([]byte("container")))

    cmd := exec.Command(os.Args[2], os.Args[3:]...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr

    must(cmd.Run())
}

func must(err error) {
    if err != nil {
        panic(err)
    }
}

We add another function named child(), and in the run function we execute the child function through exec.Command.

exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)

We change the first argument to /proc/self/exe, which re-executes the current program, this time with child as its first argument. The child process now runs in an isolated UTS namespace, and we set its hostname with syscall.Sethostname([]byte("container")).

go run container.go run sh -> /proc/self/exe child sh -> syscall.Sethostname([]byte("container")) -> exec.Command("sh")
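The argument rewriting in that chain can be looked at in isolation. The following sketch uses a hypothetical helper childArgs (not part of the original code) that builds the argument list handed to /proc/self/exe:

```go
package main

import "fmt"

// childArgs is a hypothetical helper (not part of the article's code).
// Given the parent's arguments, e.g. ["container", "run", "sh"], it builds
// the argument list for the re-executed program: the "run" subcommand is
// replaced by "child" and the user's command is passed through unchanged.
func childArgs(args []string) []string {
	if len(args) <= 2 {
		// No user command was given; only the subcommand is passed on.
		return []string{"child"}
	}
	return append([]string{"child"}, args[2:]...)
}

func main() {
	fmt.Println(childArgs([]string{"container", "run", "sh"}))
	// prints "[child sh]"
}
```

In the re-executed process, os.Args[1] is therefore "child" and os.Args[2:] is the user's command, which is why child() can reuse the same indexes as the earlier run().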

Let's try again.

go run container.go run sh

Type hostname and you will see that your process has its own hostname.

# hostname
container

So we have completed the next step 😁.

Next, type the ps command to list the processes and see whether the output is the same as under docker run.

# ps
PID   TTY      TIME     CMD
11254 pts/3    00:00:00 sudo
11255 pts/3    00:00:00 bash
17530 pts/3    00:00:00 go
17626 pts/3    00:00:00 container
17631 pts/3    00:00:00 exe
17636 pts/3    00:00:00 sh
17637 pts/3    00:00:00 ps

Not at all: the processes you see are the processes of the host server.

PID namespace

We will use the PID namespace to create a process with an independent set of process IDs (PIDs). Update container.go as follows.

...
func run() {
    cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID,
    }
    must(cmd.Run())
}
...

We just need to add one flag, syscall.CLONE_NEWPID. Now let's run again.

go run container.go run sh

# ps
PID   TTY      TIME     CMD
11254 pts/3    00:00:00 sudo
11255 pts/3    00:00:00 bash
17530 pts/3    00:00:00 go
17626 pts/3    00:00:00 container
17631 pts/3    00:00:00 exe
17636 pts/3    00:00:00 sh
17637 pts/3    00:00:00 ps

What? It does not change at all. Why?

The ps program reads process information from the /proc directory in Linux. Let's look at it.

ls /proc

The filesystem of your process looks the same as the host's, because it is inherited from the server, so /proc still describes the host's processes. Let's change that.
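To make the link between ps and /proc concrete, here is a rough sketch of how a tool could discover process IDs: every running process appears as a directory under /proc whose name is its PID. The helpers isPIDDir and listPIDs are hypothetical names for this example and are not how ps is actually implemented:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// isPIDDir reports whether a /proc entry name consists only of digits,
// which is how per-process directories such as /proc/1 are named.
func isPIDDir(name string) bool {
	if name == "" {
		return false
	}
	for _, r := range name {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// listPIDs is a hypothetical helper (not part of the article's code).
// It returns the process IDs found in the given proc directory.
func listPIDs(procDir string) ([]int, error) {
	entries, err := os.ReadDir(procDir)
	if err != nil {
		return nil, err
	}
	var pids []int
	for _, entry := range entries {
		if !entry.IsDir() || !isPIDDir(entry.Name()) {
			continue
		}
		pid, err := strconv.Atoi(entry.Name())
		if err != nil {
			continue // skip names too large to be a PID
		}
		pids = append(pids, pid)
	}
	return pids, nil
}

func main() {
	pids, err := listPIDs("/proc")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(pids)
}
```

As long as /proc is the host's proc filesystem, this list contains the host's processes, regardless of the PID namespace the program runs in.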

MNT namespace

Update container.go as follows.

package main

import (
    "os"
    "os/exec"
    "syscall"
)

// docker run <image> <command>
// ./container run <command>
func main() {
    switch os.Args[1] {
    case "run":
        run()
    case "child":
        child()
    default:
        panic("Error")
    }
}

func run() {
    cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
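    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
        // On hosts where the root mount is shared, mounts made in the child
        // can propagate back to the host; this makes them private.
        Unshareflags: syscall.CLONE_NEWNS,
    }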

    must(cmd.Run())
}

func child() {
    must(syscall.Sethostname([]byte("container")))
    must(syscall.Chdir("/"))
    must(syscall.Mount("proc", "proc", "proc", 0, ""))

    cmd := exec.Command(os.Args[2], os.Args[3:]...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr

    must(cmd.Run())
}

func must(err error) {
    if err != nil {
        panic(err)
    }
}

We use the syscall.CLONE_NEWNS flag to create the process in a new MNT namespace, and then mount a fresh proc filesystem inside it.

syscall.Chdir("/")
syscall.Mount("proc", "proc", "proc", 0, "")
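One way to check which proc filesystems are mounted in the current mount namespace is to read /proc/mounts, where each line has the form "device mountpoint fstype options dump pass". The sketch below is only an illustration; procMountPoints is a hypothetical helper name, not something from the original code:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// procMountPoints is a hypothetical helper (not part of the article's code).
// It takes the text of /proc/mounts and returns the mount points whose
// filesystem type is "proc".
func procMountPoints(mounts string) []string {
	var points []string
	for _, line := range strings.Split(mounts, "\n") {
		fields := strings.Fields(line)
		// fields[1] is the mount point and fields[2] the filesystem type.
		if len(fields) >= 3 && fields[2] == "proc" {
			points = append(points, fields[1])
		}
	}
	return points
}

func main() {
	data, err := os.ReadFile("/proc/mounts")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(procMountPoints(string(data)))
}
```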

Now, let's run again.

go run container.go run sh

Type the ps command.

# ps
PID TTY      TIME     CMD
1   pts/3    00:00:00 exe
7   pts/3    00:00:00 sh
8   pts/3    00:00:00 ps

We succeeded 😁.

Conclusion

So we now know how to build a simple container using Golang. Real containers involve many more things, such as cgroups to limit the process's resources, USER namespaces, mounting files between the server and the container, and so on.

But fundamentally, the main feature that gives containers an isolated environment is Linux namespaces. If you have any questions or need more clarification, you can ask in the comment section below.
