Kushal Joshi

Posted on May 15, 2022 • Edited on Jun 12, 2022

Cap'n Proto - RPC at the speed of Rust - Part 1 of 2

#rust #data #network #tutorial

We're entering into a kind of niche subject here but I want to make it more accessible if I can. At the time of writing this post, at my work we've been using GRPC for a few years, because at the time when we made the decision, it was simple, fast, had some Rust support, and corporate sponsorship (it used to be that G stood for Google...). It's not that we've reached the end of using GRPC but I keep wondering what else is possible, as our work context requires us to go faster and faster.

Title photo by Emre Karataş on Unsplash

GRPC is a binary RPC protocol that serialises pretty damn quick and transfers data more than fast enough across our global system. We could use something faster but, to date, we've had other fish to fry in the area of optimisation.

However, today I have some personal projects that need a data protocol and my leaning is to go with RPC. I need speed and I want it to be well supported and simple to integrate.

While it would be easy to use GRPC again, I want to try out another RPC protocol - Cap'n Proto.

Why? Because the author of Cap'n Proto was one of the original authors of ProtoBuf 2 (Protocol Buffers 2), which is the open source serialisation format used in GRPC. Also because Cap'n Proto claims to need no serialisation/deserialisation step at all once a message is created, which means it should be very fast for transferring data around a distributed system. The same applies to saving that data structure: it can be written to disk as-is, without a serialisation step. It's apparently all handled by the protocol definition, all the way down to the endianness of the saved data.

The Cap'n Proto site is both funny and full of information as to why it was created - but I can't immediately find a way to get an example up and running or to understand how to convert my app's types to Cap'n Proto types. I think everything I need should be there as I can see there is a section on Encoding which should explain this.

The only hurdle I have is that while the documentation is extensive it is a little confusing in places and mainly focuses on C++ and the C++ RPC system which is a little different to the Rust code. There are Rust examples in the official repo which I will try and leverage here.

Installing Cap'n Proto

There is a note on their site that Homebrew can be used to install on a Mac. But at the time of writing I couldn't figure out what to install.

After some hunting I found that we need the relevant tool to process the Cap'n Proto (capnp) Schema files: https://capnproto.org/capnp-tool.html

I found this can be installed on a Mac with:

brew install capnp

If you don't have Homebrew for your Mac, go here: https://brew.sh/

If you don't have a Mac, there are installation instructions here: https://capnproto.org/install.html

Once installed, we can make sure it runs and look at the help:

capnp --help

Output

Usage: capnp [<option>...] <command> [<arg>...]

Command-line tool for Cap'n Proto development and debugging.

Commands:
  compile  Generate source code from schema files.
  convert  Convert messages between binary, text, JSON, etc.
  decode   DEPRECATED (use `convert`)
  encode   DEPRECATED (use `convert`)
  eval     Evaluate a const from a schema file.
  id       Generate a new unique ID.

See 'capnp help <command>' for more information on a specific command.

Options:
    -I<dir>, --import-path=<dir>
        Add <dir> to the list of directories searched for non-relative imports
        (ones that start with a '/').
    --no-standard-import
        Do not add any default import paths; use only those specified by -I.
        Otherwise, typically /usr/include and /usr/local/include are added by
        default.
    --verbose
        Log informational messages to stderr; useful for debugging.
    --version
        Print version information and exit.
    --help
        Display this help text and exit.

Ok it's working. What do I do now?

I guess we can start with a message example.

The docs say:

Cap’n Proto messages are strongly-typed and not self-describing. You must define your message structure in a special language, then invoke the Cap’n Proto compiler.

Ok let's have a look at the compiler tool docs.

It says I can do this:

capnp compile -oc++ myschema.capnp

This is fine but I want Rust, not the C++ code which this command seems to generate. Looking around, there are a bunch of Rust crates that I think will help, plus an examples folder, all in this repo:

https://github.com/capnproto/capnproto-rust

But the example contains an ID in the schema file, so I'm not sure if I need to generate this or it is generated by the tool and... inserted into the schema?

Some more hunting around and text searching for "generate" brought me to the language page where I found this:

So it looks like I need to generate at least 1 id and put it in my schema.

❯ capnp id
@0xb068ff5fb1c4f77e;

And let's use the example from the capnproto-rust repo, but with our ID:

I will call this file src/schema/point.capnp

@0xb068ff5fb1c4f77e;

struct Point {
    x @0 :Float32;
    y @1 :Float32;
}

interface PointTracker {
    addPoint @0 (p :Point) -> (totalPoints :UInt64);
}

What does this describe? It looks like an RPC call to add a Point (with x & y coords defined as f32's) to something like a list of points, and it returns the totalPoints, which is a u64. As this type is not a collection I will assume it means the total-number-of-points.
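
To make that reading concrete, here is a rough plain-Rust equivalent of the schema, with no Cap'n Proto involved at all. The `InMemoryTracker` type is purely hypothetical; it only illustrates the shape of the `addPoint` call and is not what the generated code looks like.

```rust
/// Plain-Rust mirror of the `Point` struct in the schema.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f32,
    y: f32,
}

/// Plain-Rust mirror of the `PointTracker` interface.
trait PointTracker {
    /// Adds a point and returns the total number of points seen so far.
    fn add_point(&mut self, p: Point) -> u64;
}

/// Hypothetical implementation that just keeps the points in a Vec.
#[derive(Default)]
struct InMemoryTracker {
    points: Vec<Point>,
}

impl PointTracker for InMemoryTracker {
    fn add_point(&mut self, p: Point) -> u64 {
        self.points.push(p);
        self.points.len() as u64
    }
}

fn main() {
    let mut tracker = InMemoryTracker::default();
    tracker.add_point(Point { x: 5.0, y: 10.0 });
    let total = tracker.add_point(Point { x: 1.5, y: -2.0 });
    let last = tracker.points[1];
    println!("total points: {}, last: ({}, {})", total, last.x, last.y);
}
```

The return value is a count rather than a collection, which matches the total-number-of-points reading of `totalPoints`.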

Quick review of the schema basics:

Capnp comments use a "#"
The capnp types are:
- Void: Void
- Boolean: Bool
- Integers: Int8, Int16, Int32, Int64
- Unsigned integers: UInt8, UInt16, UInt32, UInt64
- Floating-point: Float32, Float64
- Blobs: Text (UTF8 NUL terminated), Data
- Lists: List(T) - the T is a Capnp built-in or defined capnp Schema Struct
Struct fields are consecutively numbered (like protobuf) - but with an "@"
There are Enums but also Unions.
Interfaces wrap methods (the PointTracker interface above contains addPoint method)
".capnp" files can import other ".capnp" files
Types for a field are declared with a :colon
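
As a quick reference while reading schemas, the primitive capnp types line up with Rust primitives in the obvious way. The lookup function below is only a hypothetical helper for illustration; Text, Data and List(T) are left out on purpose because they are accessed through reader and builder types rather than a single Rust primitive.

```rust
/// Returns the Rust primitive that corresponds to a capnp built-in type name.
/// Hypothetical helper; it only covers the primitive types.
fn rust_primitive_for(capnp_type: &str) -> Option<&'static str> {
    let rust = match capnp_type {
        "Void" => "()",
        "Bool" => "bool",
        "Int8" => "i8",
        "Int16" => "i16",
        "Int32" => "i32",
        "Int64" => "i64",
        "UInt8" => "u8",
        "UInt16" => "u16",
        "UInt32" => "u32",
        "UInt64" => "u64",
        "Float32" => "f32",
        "Float64" => "f64",
        // Text, Data, List(T), structs and interfaces are not primitives.
        _ => return None,
    };
    Some(rust)
}

fn main() {
    // The two types used by the Point schema above.
    for name in ["Float32", "UInt64"].iter() {
        println!("{} -> {:?}", name, rust_primitive_for(name));
    }
}
```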

The plan

As a rough plan, I want to be able to serve this interface and use or save the file in some way as a demo of the capnp capabilities. The challenge will be to make it as simple as possible so it facilitates what is an exploratory reference (for me at least) and hopefully some info/learning for anyone else looking at this protocol or learning/exploring Rust.

I've now made a cargo new project folder and added a src/schema folder for the file above.

In case generating a capnp ID sounds like a pain - the vscode-capnp extension for VS Code can generate a capnp ID anytime you need it.

(In fact I accidentally found out later that if you forget, the compiler throws an error and generates the ID for you so you can just copy and paste it in)

Generating a Cap'n Proto Schema

Let's see what the cli tool says about compiling now:

❯ capnp help compile
Usage: capnp compile [<option>...] <source>...

Compiles Cap'n Proto schema files and generates corresponding source code in one
or more languages.

Options:
    -I<dir>, --import-path=<dir>
        Add <dir> to the list of directories searched for non-relative imports
        (ones that start with a '/').
    --no-standard-import
        Do not add any default import paths; use only those specified by -I.
        Otherwise, typically /usr/include and /usr/local/include are added by
        default.
    -o<lang>[:<dir>], --output=<lang>[:<dir>]
        Generate source code for language <lang> in directory <dir> (default:
        current directory).  <lang> actually specifies a plugin to use.  If
        <lang> is a simple word, the compiler searches for a plugin called
        'capnpc-<lang>' in $PATH.  If <lang> is a file path containing slashes,
        it is interpreted as the exact plugin executable file name, and $PATH is
        not searched.  If <lang> is '-', the compiler dumps the request to
        standard output.
    --src-prefix=<prefix>
        If a file specified for compilation starts with <prefix>, remove the
        prefix for the purpose of deciding the names of output files.  For
        example, the following command:
            capnp compile --src-prefix=foo/bar -oc++:corge foo/bar/baz/qux.capnp
        would generate the files corge/baz/qux.capnp.{h,c++}.
    --verbose
        Log informational messages to stderr; useful for debugging.
    --version
        Print version information and exit.
    --help
        Display this help text and exit.

Aha:

the compiler searches for a plugin called 'capnpc-<lang>' in $PATH...

Not sure if I have that. Let's see what the autocomplete finds:

❯ capnpc
capnpc        capnpc-c++    capnpc-capnp

Nope. Ok let's install capnpc-rust:

I couldn't find anything about needing to install this. Maybe it's magical and I can just select Rust as the language:

❯ capnp compile -orust src/schema/point-schema.capnp
rust: no such plugin (executable should be 'capnpc-rust')
rust: plugin failed: exit code 1

Yup, it's not magical.
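
The lookup that the help text describes can be sketched in a few lines of Rust. This is only my reading of "searches for a plugin called 'capnpc-<lang>' in $PATH", not the tool's actual implementation: the function names are hypothetical, and the check only asks whether the file exists, not whether it is executable.

```rust
use std::env;
use std::path::PathBuf;

/// Looks for a file called `capnpc-<lang>` in the given directories and
/// returns the first match. Hypothetical helper mirroring the help text.
fn find_plugin_in<I>(dirs: I, lang: &str) -> Option<PathBuf>
where
    I: IntoIterator<Item = PathBuf>,
{
    let name = format!("capnpc-{}", lang);
    dirs.into_iter()
        .map(|dir| dir.join(&name))
        .find(|candidate| candidate.is_file())
}

/// Runs the same search over the directories listed in $PATH.
fn find_plugin(lang: &str) -> Option<PathBuf> {
    let path = env::var_os("PATH")?;
    find_plugin_in(env::split_paths(&path), lang)
}

fn main() {
    match find_plugin("rust") {
        Some(path) => println!("found plugin at {}", path.display()),
        None => println!("no capnpc-rust on PATH"),
    }
}
```

On my machine at this point in the story, this would take the second branch, which is exactly what the compiler error says.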

Hmm... maybe it's a Cargo crate?

❯ cargo install capnpnc-rust
    Updating crates.io index
error: could not find `capnpnc-rust` in registry `crates-io` with version `*`

Nope.

Ok maybe I'm going about this the wrong way. I guess I could compile the capnpc-rust to a binary by cloning the repo but that may be unnecessary as what I really want is to compile it from within my own code. Isn't it? 🤷 - This is just a guess from reading the capnproto-rust repo:

It's also strongly hinted at in the capnproto-rust docs:

We can try...

crate::Cargo.toml:

[package]
name = "capnproto-demo"
version = "0.1.0"
edition = "2021"
build = "build.rs"

[dependencies]

[build-dependencies]
capnpc = "0.14"

crate::build.rs:

fn main() {
    capnpc::CompilerCommand::new()
        .src_prefix("src/schema")
        .file("src/schema/point.capnp")
        .run()
        .expect("schema compiler command failed");
}

And it compiles and runs with cargo build! But it doesn't seem to do anything. 😞 Or maybe it did and there's a generated schema somewhere on my drive?

It's probably this missing Env-var from the examples:

...but I think I want to specify the output folder myself:

fn main() {
    capnpc::CompilerCommand::new()
        .src_prefix("src/schema")
        .file("src/schema/point.capnp")
        .output_path("src/schema")
        .run()
        .expect("schema compiler command failed");
}

Ok! Now we have a generated schema file that is around 500 lines of code:

I'm going to cargo build again to see what happens when the schema already exists:

❯ ll src/schema 
total 56
-rw-r--r--  1 kushaljoshi  staff   159B 30 Apr 15:58 point.capnp
-rw-r--r--  1 kushaljoshi  staff    20K 30 Apr 17:54 point_capnp.rs

❯ cargo build
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s

❯ ll src/schema
total 56
-rw-r--r--  1 kushaljoshi  staff   159B 30 Apr 15:58 point.capnp
-rw-r--r--  1 kushaljoshi  staff    20K 30 Apr 17:54 point_capnp.rs

Nothing (I ran the second cargo build at 18:00)! This looks good so far. I don't want to be pointlessly regenerating the schema on every build.
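
If regeneration ever does become a concern, Cargo lets a build script say which files should trigger a rerun by printing `cargo:rerun-if-changed=<path>` lines to stdout. Below is a minimal sketch of that idea; the `rerun_if_changed` helper is hypothetical, the schema path is the one used in this post, and the capnpc call is left out so the snippet stands on its own.

```rust
/// Formats the Cargo build-script directive that limits reruns to changes
/// in `path`. Hypothetical helper, named for illustration only.
fn rerun_if_changed(path: &str) -> String {
    format!("cargo:rerun-if-changed={}", path)
}

fn main() {
    // Assumed schema location, matching the layout used in this post.
    let schemas = ["src/schema/point.capnp"];
    for schema in schemas.iter() {
        // Cargo reads these lines from the build script's stdout.
        println!("{}", rerun_if_changed(schema));
    }
    // The capnpc::CompilerCommand call from build.rs would follow here.
}
```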

Right, now we have a schema and automatically generated code in our build. That's quite nice. Now how do we use it?

Using the generated code

In the generated code there's a pub mod point module wrapper, so this seems like a good place to start. Let's use that module in our project:

We'll keep it nice and simple. First we can make a server module that will be the capnp server.

Cargo.toml:

...

[dependencies]
capnp = "0.14"

...

main.rs:

mod server;

fn main() {
    println!("Hello, world!");
}

I've left the default new project code for now as a sign-post for very new people to see what is happening and how we are building up the project.

server.rs:

#[path = "./schema/point_capnp.rs"]
mod point_capnp;

use point_capnp::{point, point_tracker};

I'm guessing we need to tell the compiler where the code is.

There's a small issue when we try to build this. The generated code expects the point_capnp mod to be at the top level and doesn't like it being declared inside server:: :

That's a little annoying. The generated code is hard coded to crate::point_capnp.

I had a read of the issues for a few hours and found this has been addressed, albeit in what feels like a hacky way, and was raised/found as an issue in an old blog article from Hoverbear, which helped immensely here (thanks Ana!).

The simple answer for us right now (if there is a better/simpler solution, please comment) is to add this file - rust.capnp to the schema folder and include it in each schema like this:

point.capnp:

@0xb068ff5fb1c4f77e;

using Rust = import "rust.capnp";
$Rust.parentModule("server");

struct Point {
    x @0 :Float32;
    y @1 :Float32;
}

interface PointTracker {
    addPoint @0 (p :Point) -> (totalPoints :UInt64);
}

This is an irritant as it's a manual change to every schema file, but it works great and compiled fine, albeit with tons of "associated function is not used" warnings for the generated code. Adding #![allow(dead_code)] at the top of the server.rs file fixed this for now. This is a pattern that works for now but probably won't scale - I'll let my server module "own" the capnp generated code for each schema that server is a host for.
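
Why the annotation is needed becomes clearer with a toy version of the problem. The sketch below uses hand-written stand-ins rather than real generated code: the nested module refers to its own items by absolute path, so that path has to include the `server::` segment once the module lives inside `server`. As far as I can tell, that extra segment is what `parentModule("server")` adds to the generated paths.

```rust
mod server {
    pub mod point_capnp {
        /// Stand-in for a generated type.
        pub struct Point {
            pub x: f32,
            pub y: f32,
        }

        /// Generated-style code names its own items by absolute path.
        /// Without the `server::` segment this would not compile here.
        pub fn origin() -> crate::server::point_capnp::Point {
            crate::server::point_capnp::Point { x: 0.0, y: 0.0 }
        }

        /// Reports where this module actually lives in the crate.
        pub fn location() -> &'static str {
            module_path!()
        }
    }
}

fn main() {
    let p = server::point_capnp::origin();
    println!("{} -> ({}, {})", server::point_capnp::location(), p.x, p.y);
}
```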

I'm making a first commit to the repo at this point as I have a compiling capnp schema 🎉.

Getting to the Point

At this stage we are almost at the end of most of the available documentation regarding Rust but the capnproto-rust repo contains both serialisation and RPC examples. Deconstructing those, I'm hoping to make the simplest implementation I can here.

Let's make a point from our Point. The docs say:

In Rust, the generated code for the example above includes a point::Reader<'a> struct with get_x() and get_y() methods, and a point::Builder<'a> struct with set_x() and set_y() methods.

To understand how to use these, we have to jump back to the beginning of the documentation to understand how capnp works:

Cap’n Proto generates classes with accessor methods that you use to traverse the message.

Ok so we need to make a message that will contain our Point. I think.
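
To get a feel for what "accessor methods" over a message means, here is a hand-rolled sketch that uses only the standard library. `PointBuilder` and `PointReader` are hypothetical stand-ins, not the generated types: they assume that the two Float32 fields sit side by side, little-endian, in one 8-byte word, and they write to and read from that buffer directly.

```rust
/// Hypothetical writer over the 8-byte data section of a Point.
struct PointBuilder<'a> {
    data: &'a mut [u8; 8],
}

impl<'a> PointBuilder<'a> {
    /// Writes x into bytes 0..4, little-endian, in place.
    fn set_x(&mut self, x: f32) {
        self.data[0..4].copy_from_slice(&x.to_le_bytes());
    }

    /// Writes y into bytes 4..8, little-endian, in place.
    fn set_y(&mut self, y: f32) {
        self.data[4..8].copy_from_slice(&y.to_le_bytes());
    }
}

/// Hypothetical read-only view over the same 8 bytes.
struct PointReader<'a> {
    data: &'a [u8; 8],
}

impl<'a> PointReader<'a> {
    fn get_x(&self) -> f32 {
        f32::from_le_bytes([self.data[0], self.data[1], self.data[2], self.data[3]])
    }

    fn get_y(&self) -> f32 {
        f32::from_le_bytes([self.data[4], self.data[5], self.data[6], self.data[7]])
    }
}

fn main() {
    let mut buffer = [0u8; 8];

    let mut builder = PointBuilder { data: &mut buffer };
    builder.set_x(5.0);
    builder.set_y(10.0);

    // The buffer is already in its final form; the reader just looks at it.
    let reader = PointReader { data: &buffer };
    println!("(x = {}, y = {})", reader.get_x(), reader.get_y());
}
```

There is no separate "encode" step between the builder and the reader: the setters already wrote the final bytes.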

In the address book example capnp::serialize_packed is used to read and write this message to a stream. Docs for this are here.

We can copy this address book code structure to make our Point.

server.rs:

#![allow(dead_code)]

#[path = "./schema/point_capnp.rs"]
mod point_capnp;

pub mod point_demo {
    use crate::server::point_capnp::point;
    use capnp::serialize_packed;

    pub fn write_to_stream() -> ::capnp::Result<()> {
        let mut message = ::capnp::message::Builder::new_default();

        let mut demo_point = message.init_root::<point::Builder>();

        demo_point.set_x(5_f32);
        demo_point.set_y(10_f32);

        serialize_packed::write_message(&mut ::std::io::stdout(), &message)
    }
}

main.rs:

mod server;

fn main() {
    let _ = server::point_demo::write_to_stream();
}

Output:

❯ cargo run
   Compiling capnproto-demo v0.1.0 (/Users/kushaljoshi/code/rust/capnproto/capnproto-demo)
    Finished dev [unoptimized + debuginfo] target(s) in 1.13s
     Running `target/debug/capnproto-demo`
 ̠@ A%

Fantastic! We "serialized" our Point and packed it into a capnp message. The message is not readable (what shows up is an underscore-like mark, an at-symbol, a space and a capital A; the trailing percent-symbol is most likely my shell flagging that the output has no final newline) because it is capnp's binary format, which does not need further serialization/deserialization over a stream to be used with an application. Can we check it?

Yes! The capnp tool provides a decode feature that needs the schema and the data structure:

❯ cargo run | capnp decode ./src/schema/point.capnp Point
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/capnproto-demo`
capnp decode: The input is not in "binary" format. It looks like it is in "packed" format. Try that instead.
Try 'capnp decode --help' for more information.

Ok so this didn't work because we need to either tell capnp that it's a packed (compressed) message, or we need to print the raw message to STDOUT. Let's do both to increase our intuition of what is happening here. First we just need to add --packed to the CLI command:

❯ cargo run | capnp decode ./src/schema/point.capnp Point --packed
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/capnproto-demo`
(x = 5, y = 10)
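
To build some intuition for what "packed" means, here is a small standard-library sketch of the packing scheme as I understand it from the capnp encoding page: every 8-byte word becomes a tag byte with one bit per non-zero byte, followed by only the non-zero bytes. It is a simplified encoder (after an all-non-zero word it never uses the optional run of uncompressed words), and the 24 input bytes are my assumption of what the unpacked Point message looks like.

```rust
/// Packs 8-byte words using Cap'n Proto's packing scheme (simplified encoder).
fn pack(input: &[u8]) -> Vec<u8> {
    assert!(input.len() % 8 == 0, "input must be a whole number of 8-byte words");
    let words: Vec<&[u8]> = input.chunks(8).collect();
    let mut out = Vec::new();
    let mut i = 0;
    while i < words.len() {
        let word = words[i];
        i += 1;

        // One tag bit per byte of the word; bit 0 is the first byte.
        let mut tag = 0u8;
        for (bit, byte) in word.iter().enumerate() {
            if *byte != 0 {
                tag |= 1 << bit;
            }
        }
        out.push(tag);
        // Only the non-zero bytes follow the tag.
        out.extend(word.iter().filter(|b| **b != 0));

        if tag == 0x00 {
            // An all-zero word is followed by a count of further zero words.
            let mut run = 0u8;
            while i < words.len() && run < u8::MAX && words[i].iter().all(|b| *b == 0) {
                run += 1;
                i += 1;
            }
            out.push(run);
        } else if tag == 0xff {
            // A fully non-zero word is followed by a count of words copied
            // verbatim; this sketch always says zero.
            out.push(0);
        }
    }
    out
}

fn main() {
    // Assumed unpacked Point message: segment table, root pointer, data word.
    let message: [u8; 24] = [
        0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
        0x00, 0x00, 0xA0, 0x40, 0x00, 0x00, 0x20, 0x41,
    ];
    let packed = pack(&message);
    let hex: Vec<String> = packed.iter().map(|b| format!("{:02x}", b)).collect();
    println!("{} bytes -> {} bytes: {}", message.len(), packed.len(), hex.join(" "));
}
```

Under those assumptions the 24 bytes shrink to 9, and the last five of them (cc a0 40 20 41) are what a UTF-8 terminal renders as the underscore-like mark, at-symbol, space and capital A seen earlier.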

Now we can see that capnp can unpack (decompress) the message and print out the Point coords that we set. But we may not always have packed data so let's send the Point in its raw message format and make sure we can decode it as we would expect. We need to make a change to server for that:

server.rs

...

pub mod point_demo {
    use crate::server::point_capnp::point;
    use capnp::serialize;

    pub fn write_to_stream() -> ::capnp::Result<()> {
        let mut message = ::capnp::message::Builder::new_default();

        let mut demo_point = message.init_root::<point::Builder>();

        demo_point.set_x(5_f32);
        demo_point.set_y(10_f32);

        serialize::write_message(&mut ::std::io::stdout(), &message)
    }
}

Output:

❯ cargo run | capnp decode ./src/schema/point.capnp Point --packed
   Compiling capnproto-demo v0.1.0 (/Users/kushaljoshi/code/rust/capnproto/capnproto-demo)
    Finished dev [unoptimized + debuginfo] target(s) in 0.65s
     Running `target/debug/capnproto-demo`
capnp decode: The input is not in "packed" format. It looks like it is in "binary" format. Try that instead.
Try 'capnp decode --help' for more information.

A very helpful message that confirms what we know we did. We can remove the --packed flag now.

Output:

❯ cargo run | capnp decode ./src/schema/point.capnp Point
    Finished dev [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/capnproto-demo`
(x = 5, y = 10)

Fabulous.

If you have followed along and got this working, you may want to see the benefit more clearly. For that, we can save the data and load it back in without any further serialization/deserialization.

server.rs:

pub mod point_demo {
    use crate::server::point_capnp::point;
    use capnp::serialize;
    use std::fs::File;

    pub fn write_to_stream() -> std::io::Result<()> {
        let mut message = ::capnp::message::Builder::new_default();

        let mut demo_point = message.init_root::<point::Builder>();

        demo_point.set_x(5_f32);
        demo_point.set_y(10_f32);

        // This Result should be consumed properly in an actual app
        let _ = serialize::write_message(&mut ::std::io::stdout(), &message);

        // Save the point
        {
            let file = File::create("point.txt")?;
            let _ = serialize::write_message(file, &message);
        }

        // Read the point from file
        {
            let point_file = File::open("point.txt")?;

            // We want this to panic in our demo in case there is an issue
            let point_reader =
                serialize::read_message(point_file, ::capnp::message::ReaderOptions::new())
                    .unwrap();

            let demo_point: point::Reader = point_reader.get_root().unwrap();
            println!("\n(x = {}, y = {})", demo_point.get_x(), demo_point.get_y());
        }

        Ok(())
    }
}

Output:

❯ cargo run
   Compiling capnproto-demo v0.1.0 (/Users/kushaljoshi/code/rust/capnproto/capnproto-demo)
    Finished dev [unoptimized + debuginfo] target(s) in 0.63s
     Running `target/debug/capnproto-demo`
@ A
(x = 5, y = 10)

So... it doesn't look like much happened there and the output, by design, looks the same.

However, you may have missed what just happened and how awesome this is 😄 !!

Let's go through it:

We created a serialized Point from our Point Schema
We set data inside the serialized Point (no need to deserialize Point or serialize x & y float 32 values)
We saved the serialized data to disk using the standard file tools
We read in the serialized data using the standard file tooling (endianness is fixed by the capnp encoding, so it is not a concern for the file)
We used accessor methods on the serialized data and printed the value without deserializing the data.

It's ok - if you are thinking "so what?", then your project use cases may not have been performance critical so far. If they have though, then this should be a wondrous thing to behold!

For the uber skeptical: The Reader above is not a deserializer. It is literally a Reader. It needs a schema and some data and it knows how to set the pointers in the data (which is made up of ordered segments) to make the accessor methods point at the correct parts of the data. For more information have a read of the capnp encoding page.
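
Here is a rough standard-library sketch of that pointer-following idea. It assumes a single-segment message shaped like our Point (an 8-byte segment table, then the root struct pointer, then the data), follows the layout described on the encoding page, and does no validation or bounds checking. `PointView` is a hypothetical name; the real capnp crate handles far more cases than this.

```rust
/// Reads a little-endian u32 starting at `offset`.
fn read_u32(bytes: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Hypothetical read-only view over a single-segment Point message.
struct PointView<'a> {
    message: &'a [u8],
    data_start: usize,
}

impl<'a> PointView<'a> {
    fn new(message: &'a [u8]) -> PointView<'a> {
        // Assumption: exactly one segment, so the segment table is 8 bytes
        // and the root pointer is the word right after it.
        let root_pointer = 8;
        let low = read_u32(message, root_pointer);
        assert_eq!(low & 0b11, 0, "root must be a struct pointer");
        // Upper 30 bits: signed offset, in words, from the end of the pointer
        // to the start of the struct's data section.
        let offset_words = (low as i32) >> 2;
        let data_start = (root_pointer as i64 + 8 + offset_words as i64 * 8) as usize;
        PointView { message, data_start }
    }

    /// Reads x straight out of the message bytes.
    fn get_x(&self) -> f32 {
        f32::from_bits(read_u32(self.message, self.data_start))
    }

    /// Reads y straight out of the message bytes.
    fn get_y(&self) -> f32 {
        f32::from_bits(read_u32(self.message, self.data_start + 4))
    }
}

fn main() {
    // Assumed bytes of the unpacked Point message (x = 5, y = 10).
    let message: [u8; 24] = [
        0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
        0x00, 0x00, 0xA0, 0x40, 0x00, 0x00, 0x20, 0x41,
    ];
    let view = PointView::new(&message);
    println!("(x = {}, y = {})", view.get_x(), view.get_y());
}
```

Nothing is copied into a new struct here: the getters compute an offset and read four bytes, which is the whole trick.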

You can decode the file data the same way as the STDOUT output above:

cat point.txt | capnp decode ./src/schema/point.capnp Point
(x = 5, y = 10)

Now this is really quite interesting; if it can be saved, it can be thrown over a network and used by any client that has the appropriate map (schema) to read the received data, without any interim deserialization steps.

That's what we will try next in Part 2.
