Ivan Velichko recently made me aware of issues with debugging distroless containers in Kubernetes with kubectl debug. This blog post looks at the problem and shows how you can get access to the filesystem of a distroless pod for debugging purposes.
The problem Ivan found was a lack of permissions to access the filesystem of the container being debugged. This is best explained with some examples. With a regular (non-distroless) container, you can do the following to start an ephemeral debug container that shares various namespaces with the target container:
$ kubectl run nginx-pod --image nginx
pod/nginx-pod created
$ kubectl debug -it nginx-pod --image alpine --target nginx-pod
Targeting container "nginx-pod". If you don't see processes from this container it may be because the container runtime doesn't support this feature.
Defaulting debug container name to debugger-h4fzv.
If you don't see a command prompt, try pressing enter.
/ # ps
PID USER TIME COMMAND
1 root 0:00 nginx: master process nginx -g daemon off;
32 101 0:00 nginx: worker process
33 101 0:00 nginx: worker process
34 101 0:00 nginx: worker process
35 101 0:00 nginx: worker process
36 101 0:00 nginx: worker process
37 101 0:00 nginx: worker process
38 101 0:00 nginx: worker process
39 101 0:00 nginx: worker process
40 101 0:00 nginx: worker process
41 101 0:00 nginx: worker process
308 root 0:00 /bin/sh
317 root 0:00 ps
(You could also just use kubectl exec -it nginx-pod -- /bin/sh, but of course that's not possible in a distroless container.)
Note that the filesystem is the filesystem of the Alpine debug container, not the nginx container:
/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.19.1
PRETTY_NAME="Alpine Linux v3.19"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"
But we can get to the nginx container filesystem via the /proc/1/root path. To break this down:
- /proc is a virtual filesystem created by the kernel that contains various metadata;
- 1 refers to the process ID, in this case our running nginx master process; and
- root is a link to the root of the filesystem the process is running in.
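The breakdown above can be sketched as a small shell snippet. This is only an illustration: target_path is a hypothetical helper name (not part of kubectl or any tool mentioned here), and it assumes a Linux system with /proc mounted.

```shell
# Hypothetical helper: build the path under which a file inside the target
# container is visible from the debug container.
target_path() {
  pid="$1"   # PID of a process in the target container (1 in this post)
  file="$2"  # absolute path inside the target container
  printf '/proc/%s/root%s\n' "$pid" "$file"
}

# The root link of our own process points at our own filesystem root.
readlink /proc/self/root

# Path to the nginx index page as seen through PID 1's root link.
target_path 1 /usr/share/nginx/html/index.html
```

Any other PID from the ps listing would work the same way, as long as that process lives in the container whose filesystem you want to inspect.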
So we can access the index.html file inside the nginx container like this:
/ # cat /proc/1/root/usr/share/nginx/html/index.html
<!DOCTYPE html>
<html>
…
/ #
Now let's try that with the cgr.dev/chainguard/nginx image, which is one of Chainguard's distroless images:
$ kubectl run nginx-distroless --image cgr.dev/chainguard/nginx
pod/nginx-distroless created
$ kubectl debug -it nginx-distroless --image alpine --target nginx-distroless
Targeting container "nginx-distroless". If you don't see processes from this container it may be because the container runtime doesn't support this feature.
Defaulting debug container name to debugger-bcr26.
If you don't see a command prompt, try pressing enter.
/ # cat /proc/1/root/usr/share/nginx/html/index.html
cat: can't open '/proc/1/root/usr/share/nginx/html/index.html': Permission denied
/ # whoami
root
We get Permission denied.
It turns out that the problem is that the nginx container is running as the user nonroot with UID 65532, whose filesystem we don't have permission to access despite being root (using --profile to set a different security profile didn't help either, but I suspect it might in the future). To fix this, we need our debug container to run as the same user as the nginx container. Unfortunately there's no --user flag for kubectl debug, so we need an image that runs as this user by default. We could create one, e.g. with a Dockerfile such as:
FROM alpine
USER 65532
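If you are not sure which UID to put in the USER line, one option is to read it from the target process while you are in the debug container. The snippet below is a minimal sketch, assuming the target's main process is PID 1 in the shared process namespace (as in the ps output earlier); proc_uid is a hypothetical helper name, not an existing command.

```shell
# Hypothetical helper: print the effective UID of a process, read from the
# "Uid:" line of /proc/<pid>/status (fields: real, effective, saved, fs).
proc_uid() {
  awk '/^Uid:/ { print $3 }' "/proc/$1/status"
}

# Inside the debug container, PID 1 is assumed to be the target's main process.
proc_uid 1
```

The USER column of ps gives the same information; reading the status file is just easier to script.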
But in the case of Chainguard Images there's a much easier solution. Most of our images come with -dev variants that run as the same user but also include a shell and can be used for debugging, so we can do:
$ kubectl debug -it nginx-distroless --image cgr.dev/chainguard/nginx:latest-dev --target nginx-distroless -- /bin/sh
Targeting container "nginx-distroless". If you don't see processes from this container it may be because the container runtime doesn't support this feature.
Defaulting debug container name to debugger-nbvjt.
If you don't see a command prompt, try pressing enter.
/ $
/ $ cat /proc/1/root/usr/share/nginx/html/index.html
<!DOCTYPE html>
<html>
…
And everything works as expected.
There's actually a second wrinkle that is also solved by setting the user: if your pod is running with the runAsNonRoot policy, you won't be able to start a debug container that runs as root with the default profile.
This does point to some ways in which kubectl debug could be improved:
- Add a --user option to set the user in the debug container
- Add a formal way to access the target container filesystem. Going via /proc/1/root seems a little hacky and non-intuitive
- Add some more docs to explain all of this (which is somewhere I plan to help).
(I do see there are some proposed enhancements related to profiles that might help here)
I should also point out that Ivan addresses these problems directly with his cdebug tool. You can use cdebug to directly debug a pod:
$ cdebug exec -it --privileged pod/nginx-distroless/nginx-distroless
Debugger container name: cdebug-20ba5985
Starting debugger container...
Waiting for debugger container...
Attaching to debugger container...
If you don't see a command prompt, try pressing enter.
/ # cat /proc/1/root/usr/share/nginx/html/index.html
<!DOCTYPE html>
<html>
…
cdebug also supports a --user flag if you have the runAsNonRoot policy, e.g.:
$ cdebug exec -it --user 65532 pod/nginx-distroless/nginx-distroless
…
That's about it. Running production workloads in distroless containers is a big improvement in terms of security. With a little bit of knowledge these containers can still be straightforward to debug.