Kondo Uchio

Posted on Nov 16, 2017

Trying seccomp via mruby

#container #docker #seccomp #mruby

seccomp?

seccomp(2) is a Linux system call that filters a process's syscall invocations. Linux 3.5 introduced "seccomp mode 2", which allows syscalls to be filtered by syscall number and by their arguments.

It is also one of the basic building blocks of containers; Docker, for example, uses it.

It uses BPF (Berkeley Packet Filter), which is also used in libpcap, to filter syscalls as fast as possible.

mruby?

mruby is one of the Ruby implementations (FYI, mruby was created by Matz, who also created MRI). It is designed to be embeddable in devices and to be more lightweight than CRuby.

As a side effect of being embeddable, mruby has a very clean and concise C API, so it is easy to write C bindings and systems programs with it. It is similar to Lua in many respects.

I am going to try the basic features of seccomp via libseccomp, by writing an mruby gem (mrbgem) that binds libseccomp.

Here is mruby-seccomp. You can build an mruby binary with seccomp support by checking out this repo on a Linux machine and running make (building mruby itself requires CRuby, bison and some libraries).

Basic usage

At the C level, you create a seccomp context with seccomp_init(3), add filter rules with seccomp_rule_add(3), and then load the context into the current process with seccomp_load(3).

A seccomp context has a default action, given as a symbol that maps to a libseccomp constant:
:kill => SCMP_ACT_KILL, :allow => SCMP_ACT_ALLOW, :trap => SCMP_ACT_TRAP, ...
On top of that, you can add rules that apply a different action to specific syscalls, optionally matching on their arguments.

Use :kill as the default to build a whitelist, and :allow to build a blacklist.

Here is a blacklist that blocks uname(2) calls. Build it together with mruby-uname.
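
To make the whitelist/blacklist distinction concrete, here is a toy model in plain Ruby. It is only a sketch of the decision logic: it does not touch the kernel, libseccomp, or mruby-seccomp, and the `decide` helper and the rule hashes are hypothetical names made up for illustration.

```ruby
# Toy model only: mimics how a default action and per-syscall rules
# combine. It does not call the kernel, libseccomp, or mruby-seccomp.
def decide(default_action, rules, syscall)
  # A matching rule wins; otherwise the context's default action applies.
  rules.fetch(syscall, default_action)
end

# Blacklist: allow everything, then name the syscalls to block.
blacklist = { uname: :trap }
# Whitelist: kill by default, then name the syscalls to permit.
whitelist = { read: :allow, write: :allow, exit_group: :allow }

puts decide(:allow, blacklist, :uname)  # blocked by an explicit rule
puts decide(:allow, blacklist, :mkdir)  # falls through to the default
puts decide(:kill,  whitelist, :write)  # permitted by an explicit rule
puts decide(:kill,  whitelist, :mkdir)  # falls through to the default
```

The practical difference is what happens to a syscall you forgot to list: a blacklist lets it through, while a whitelist blocks it.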

context = Seccomp.new(default: :allow) do |rule|
  rule.trap(:uname)
end

context.load
Uname.nodename # This really calls uname(2)!

$ ./mruby/bin/mruby /tmp/test.rb 
Bad system call

"Bad system call" means the process received SIGSYS, and you can even trap it.
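
SIGSYS is an ordinary catchable signal, which can be checked without any seccomp setup. The following is a minimal sketch in plain CRuby (not mruby), where the signal is sent by hand with Process.kill as a stand-in for the kernel delivering it on a trapped syscall. No filter is installed, so it only demonstrates the signal-handling half.

```ruby
# Plain CRuby sketch: SIGSYS is sent by hand here, not by a seccomp filter.
received = []

# Install a handler; the block receives the signal number.
Signal.trap("SYS") { |signo| received << signo }

# Stand-in for the kernel delivering SIGSYS on a trapped syscall.
Process.kill("SYS", Process.pid)

# Handlers may run asynchronously, so give Ruby a moment to dispatch it.
20.times do
  break unless received.empty?
  sleep 0.05
end

puts "caught SIG#{Signal.signame(received.first)}, process still alive"
```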

Combination with `fork()/exec()`

As the kernel documentation says, a loaded seccomp filter is inherited across fork/clone and preserved across execve.

You can load a seccomp context just after fork() and then call execve() to create a "sandbox" container, which prevents child processes from calling the specified syscalls.

# fork from https://github.com/iij/mruby-process
# exec from https://github.com/haconiwa/mruby-exec
context = Seccomp.new(default: :allow) do |rule|
  rule.kill(:mkdir, Seccomp::ARG(:>=, 0), Seccomp::ARG(:>=, 0))
end

pid = Process.fork do
  context.load

  puts "==== It will be jailed. Please try to mkdir"
  exec "/bin/sh"
end

p(Process.waitpid2 pid)

$ ./mruby/bin/mruby /tmp/jail.rb 
==== It will be jailed. Please try to mkdir
sh-4.2$ mkdir /tmp/test1234
Bad system call

This is similar to Linux capabilities, but seccomp offers finer-grained control over what a program may do.
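
From the parent's side, a process killed by a seccomp rule is reported like any other child terminated by SIGSYS, so the status returned by Process.waitpid2 can be inspected for it. The sketch below uses plain CRuby (not mruby) and installs no seccomp filter: the shell sends SIGSYS to itself to stand in for the kernel's kill action, and it assumes /bin/sh is available.

```ruby
# Plain CRuby sketch: no seccomp filter is installed. The shell sends
# SIGSYS to itself to stand in for the kernel's kill action.
pid = Process.fork do
  # In the real script, `context.load` would go here, before exec.
  exec "/bin/sh", "-c", "kill -s SYS $$"
end

_, status = Process.waitpid2(pid)

if status.signaled?
  puts "child #{pid} was killed by SIG#{Signal.signame(status.termsig)}"
else
  puts "child #{pid} exited with status #{status.exitstatus}"
end
```

Note that in the jail example above it is mkdir(1), a child of the shell, that gets killed, so the parent of the shell only sees the shell's own exit status.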

Advanced features

With seccomp we can catch SIGSYS together with information about which syscall was blocked (via siginfo_t). mruby-seccomp supports this.

context = Seccomp.new(default: :allow) do |rule|
  rule.trap(:uname)
end
Seccomp.on_trap do |syscall|
  puts "Trapped: syscall #{Seccomp.syscall_to_name(syscall)} = ##{syscall}"
end
context.load

begin
  # Then hit `uname(2)`
  p "nodename: " + Uname.nodename
rescue => e
  puts "Catch as error: " + e.message
  puts "Trapping is OK"
end

$ ./mruby/bin/mruby /tmp/trap.rb 
Trapped: syscall uname = #63
Catch as error: uname failed
Trapping is OK

NOTE: Signal handlers are reset on exec(). So exec()'ing into bash and then trapping a uname(1) call there is not supported for now.

Conclusion

seccomp can restrict the syscall invocations of a process and of its whole process tree. We can experiment with it using mruby, mruby-seccomp and a few mrbgems for process control.

BTW seccomp is supported in Docker, LXC and Haconiwa.

Pull requests to mruby-seccomp are welcome!

Original Japanese article

Original article written by me in Japanese:

http://udzura.hatenablog.jp/entry/2016/11/18/160020
