Nick Sutterer for Trailblazer

Trailblazer 2.1.3: How we made tracing +4x faster

(PaaS providers hate this trick!)

With the release of Trailblazer 2.1.3, #wtf? tracing not only receives a notable performance boost: by refactoring the tracing internals, we also paved the way for a web-based debugger.

Tracing faster by a factor of 4-10

When you run a (deeply nested) operation with #wtf?, the tracing code uses the runtime taskWrap to inject tracing before and after each actual step is called. BTW, we also uploaded new docs for the whole taskWrap mechanism. If you're interested in extending TRB (for example, to add logging or APM), check out the new chapter: https://trailblazer.to/2.1/docs/activity#activity-taskwrap
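
To give a rough idea of what "inject tracing before and after a step" means, here is a minimal plain-Ruby sketch. It is not Trailblazer's taskWrap API: `wrap_with_tracing`, the `trace` array and the `validate` step are all hypothetical names made up for illustration.

```ruby
# Hypothetical stand-in for a taskWrap, NOT Trailblazer's actual API.
# Wraps a callable step so a trace entry is recorded before and after it runs.
def wrap_with_tracing(name, step, trace)
  lambda do |ctx|
    trace << [:before, name, ctx.keys] # recorded before the actual step
    result = step.call(ctx)
    trace << [:after, name, ctx.keys]  # recorded after the actual step
    result
  end
end

trace = []
validate = ->(ctx) { ctx[:valid] = !ctx[:params].empty? } # illustrative step
traced_validate = wrap_with_tracing(:validate, validate, trace)

ctx = { params: { id: 1 } }
traced_validate.call(ctx)
# trace now holds one :before and one :after entry for :validate
```

The step itself stays unaware of tracing; everything extra happens in the wrapper, which is the general idea behind wrapping steps at runtime.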

The old tracing code was super slow because every time tracing ran, it took a snapshot of the entire ctx. That snapshot was literally just an #inspect of every variable in ctx.

def default_input_data_collector(wrap_config, ((ctx, _), _))
  {
    # ctx: ctx.to_h.freeze,
    ctx_snapshot: ctx.to_h.collect { |k,v| [k, v.inspect] }.to_h,
  } # TODO: proper snapshot!
end

You can see the # TODO: proper snapshot!, right? Well, that's what we have been working on over the past weeks! Instead of "photographing" the entire ctx over and over with #inspect, which is tremendously slow once the ctx grows, we now compare its variables and only snapshot those that have changed.
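
The "only snapshot what changed" idea can be sketched in a few lines of plain Ruby. This is not the code from the refactoring: `ChangeSnapshotter` is a hypothetical class, and using `value.hash` as the change detector is an assumption made for this example. It works for content-hashed values such as Hash, Array and String, but would miss mutations of objects whose #hash is identity-based.

```ruby
# Hypothetical sketch, NOT Trailblazer's implementation.
# Remembers a fingerprint per ctx variable and only calls #inspect
# for variables that are new or changed since the previous call.
class ChangeSnapshotter
  def initialize
    @fingerprints = {} # variable name => value.hash seen at the previous call
  end

  # Returns { name => inspected value } for new or changed variables only.
  def call(ctx)
    ctx.each_with_object({}) do |(name, value), changed|
      fingerprint = value.hash # assumption: #hash reflects the value's content
      next if @fingerprints.key?(name) && @fingerprints[name] == fingerprint

      @fingerprints[name] = fingerprint
      changed[name] = value.inspect # only pay for #inspect when needed
    end
  end
end

snapshotter = ChangeSnapshotter.new
ctx = { params: { id: 1 } }

first = snapshotter.call(ctx)  # snapshots :params
ctx[:model] = "Song"
second = snapshotter.call(ctx) # snapshots only :model
third = snapshotter.call(ctx)  # nothing changed, empty Hash
```

Computing a fingerprint is usually much cheaper than building an #inspect string for every variable on every step, which is where the savings in this sketch come from.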

Benchmarks

The entire refactoring is summarized in PR 42, which, by its number alone, should answer all your questions. And yes, there are still # TODOs after this refactoring, but tracing a moderately sized ctx is now 4-10 times faster and consumes a lot less memory.

Calculating -------------------------------------
        inspect-only     10.862  (± 9.2%) i/s -     54.000  in   5.066396s
            snapshot     61.024  (± 8.2%) i/s -    305.000  in   5.032931s

Comparison:
            snapshot:       61.0 i/s
        inspect-only:       10.9 i/s - 5.62x  slower

This benchmark was run in a huge client application and made my day.

Another win in this refactoring is that we can now configure a per-object snapshooter, so if you don't want your ActiveRecord models to run arbitrary queries while being snapshotted, a value_snapshooter will be your friend. Big thanks to @richardboehme for his support here.
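
Conceptually, a per-object snapshot strategy could look like the following plain-Ruby sketch. Note that this is not the value_snapshooter API itself: `ValueSnapshotter`, `register` and `FakeRecord` are hypothetical names, and `FakeRecord` merely stands in for a model whose #inspect you'd rather not call.

```ruby
# Hypothetical sketch of a per-object snapshot strategy.
# These names are illustrative and NOT Trailblazer's value_snapshooter API.
FakeRecord = Struct.new(:id) do
  def inspect
    raise "would hit the database" # stands in for an expensive #inspect
  end
end

class ValueSnapshotter
  def initialize
    @rules = [] # [matcher, snapshot callable] pairs, checked in order
  end

  # Registers a custom snapshot block for values matching `matcher` (via ===).
  def register(matcher, &snapshot)
    @rules << [matcher, snapshot]
    self
  end

  # Uses the first matching rule, falling back to plain #inspect.
  def call(value)
    _, snapshot = @rules.find { |matcher, _| matcher === value }
    snapshot ? snapshot.call(value) : value.inspect
  end
end

snapshotter = ValueSnapshotter.new
snapshotter.register(FakeRecord) { |record| "#<FakeRecord id=#{record.id}>" }

record_snapshot = snapshotter.call(FakeRecord.new(1)) # custom rule, no #inspect
symbol_snapshot = snapshotter.call(:ok)               # default #inspect
```

Matching with === lets a rule be a class, a range or a lambda, so the same lookup covers "all records" as well as more specific cases.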

Added #left alias

And for all of you who have a problem with Operation.fail because your IDE doesn't like the name, we introduced #left:
https://trailblazer.to/2.1/docs/activity#activity-strategy-railway-left
