Ayush Newatia for AppSignal

Posted on • Originally published at blog.appsignal.com

Rack for Ruby: Socket Hijacking

In the first part of this series, we set up a basic Rack app and learned how to process a request and send a response.

In this post, we'll take over connections from Rack and hold them open, which is what makes persistent protocols such as WebSockets possible.

First, though, let's look at how an HTTP connection actually works.

HTTP Connections

As this diagram shows, the client opens a TCP socket and sends a request to a server. The server responds and, in the simplest case, closes the connection. In HTTP/1.x, all communication is in plain text.

HTTP sequence diagram
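
To make the request/response cycle concrete, here is a minimal sketch using only Ruby's standard socket library. The helper names `serve_once` and `fetch` are made up for this illustration; a real server does far more parsing and error handling.

```ruby
require "socket"

# Serve exactly one HTTP request on the given TCPServer, then close the connection.
def serve_once(server)
  client = server.accept

  # Read the request line and headers, up to the blank line that ends them.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end

  body = "Hello\n"
  response = [
    "HTTP/1.1 200 OK",
    "Content-Type: text/plain",
    "Content-Length: #{body.bytesize}",
    "Connection: close",
    "",
    body
  ].join("\r\n")
  client.write(response)
ensure
  client&.close
end

# Open a socket, send a plain-text request, and read until the server closes.
def fetch(port)
  socket = TCPSocket.new("127.0.0.1", port)
  socket.write("GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
  socket.read
ensure
  socket&.close
end

server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
thread = Thread.new { serve_once(server) }
puts fetch(server.addr[1])
thread.join
server.close
```

Everything that crosses the socket here is readable text: a request line, headers, a blank line, and a body.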

Using a technique called socket hijacking, we can take control of a socket from Rack when a request comes in. Rack offers two techniques for socket hijacking:

  • Partial hijack: Rack sends the HTTP response headers and hands over the connection to the application.
  • Full hijack: Rack simply hands over the connection to the application without writing anything to the socket.

Partial Hijacking

This is how you do a partial hijack:

class App
  def call(env)
    body = proc do |stream|
      5.times do
        stream.write "#{Time.now}\n\n"
        sleep 1
      end
    ensure
      stream.close
    end

    [200, { "content-type" => "text/plain", "rack.hijack" => body }, []]
  end
end

rack.hijack is a Rack header, set in the same Hash as the HTTP response headers. Rack will look for such headers and process them as per the specification, instead of writing them to the HTTP response.
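
As a rough illustration of what "process them as per the specification" means, here is a simplified sketch of the server side of a partial hijack. `write_response` is a hypothetical helper, not a Rack or Puma API, and a `StringIO` stands in for the client socket.

```ruby
require "stringio"

# Hypothetical, heavily simplified version of what a server does with a Rack
# response. Real servers handle many more cases (reason phrases, chunking, errors).
def write_response(socket, status, headers, body)
  headers = headers.dup
  hijack = headers.delete("rack.hijack") # never sent to the client

  socket.write("HTTP/1.1 #{status} OK\r\n") # reason phrase hardcoded for brevity
  headers.each { |name, value| socket.write("#{name}: #{value}\r\n") }
  socket.write("\r\n")

  if hijack
    # Partial hijack: the headers are already written, and the app now owns
    # the socket and must close it. The body is ignored.
    hijack.call(socket)
  else
    body.each { |chunk| socket.write(chunk) }
    socket.close
  end
end

socket = StringIO.new # stands in for the client connection
callback = proc do |stream|
  stream.write("hello from the app\n")
ensure
  stream.close
end

write_response(socket, 200, { "content-type" => "text/plain", "rack.hijack" => callback }, [])
puts socket.string
```

Note that `rack.hijack` never appears in the output: the server strips it from the headers before writing them.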

Run the above app and curl to it. You'll see that it writes the time at one-second intervals.

$ curl -i localhost:9292

Full Hijacking

This is how you'd do a full hijack:

class App
  def call(env)
    headers = [
      "HTTP/1.1 200 OK",
      "Content-Type: text/plain"
    ]

    stream = env["rack.hijack"].call
    stream.write(headers.map { |header| header + "\r\n" }.join)
    stream.write("\r\n")
    stream.flush

    begin
      5.times do
        stream.write "#{Time.now}\n\n"
        sleep 1
      end
    ensure
      stream.close
    end

    [-1, {}, []]
  end
end

In this case, we call the proc passed to us using the rack.hijack key, instead of setting one ourselves in the response. This gives us complete control over the socket. At the end, we return an array with the status -1 only because Rack expects an array to be returned. The contents of this array are ignored since we've taken over the socket.
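
To see why the returned array is ignored, it can help to look at the exchange from the server's point of view. The sketch below is an assumption-laden illustration: `handle_request` and `QuickHijackApp` are hypothetical names, the env is reduced to the bare minimum, and a `StringIO` stands in for the socket.

```ruby
require "stringio"

# Hypothetical server-side handler; `socket` stands in for the client connection.
def handle_request(app, socket)
  hijacked = false
  env = {
    "REQUEST_METHOD" => "GET",
    # Calling this lambda is the full hijack: the app receives the raw socket.
    "rack.hijack" => lambda {
      hijacked = true
      socket
    }
  }

  status, headers, body = app.call(env)
  return :hijacked if hijacked # the returned triplet is ignored

  # Normal path: the server writes the response itself.
  socket.write("HTTP/1.1 #{status} OK\r\n") # reason phrase hardcoded for brevity
  headers.each { |name, value| socket.write("#{name}: #{value}\r\n") }
  socket.write("\r\n")
  body.each { |chunk| socket.write(chunk) }
  socket.close
  :served
end

# A shortened version of the full hijack app, without the sleeps.
class QuickHijackApp
  def call(env)
    stream = env["rack.hijack"].call
    stream.write("HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n")
    stream.write("hijacked\n")
    [-1, {}, []]
  ensure
    stream&.close
  end
end

socket = StringIO.new
p handle_request(QuickHijackApp.new, socket)
puts socket.string
```

Once the app has called `env["rack.hijack"]`, the server in this sketch writes nothing at all, so the status line and headers have to come from the app.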

Full hijacking is bad practice, rife with gotchas and weird behavior. Don't do it. Samuel Williams, who is a maintainer of Rack, recommends against it as well.

Streaming Bodies in Rack for Ruby

While full hijacking is a terrible idea, partial hijacking is a useful tool. But it still feels hacky, so Rack 3 formally adopted that approach into the spec by introducing the concept of streaming bodies.

class App
  def call(env)
    body = proc do |stream|
      5.times do
        stream.write "#{Time.now}\n\n"
        sleep 1
      end
    ensure
      stream.close
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end

Here we provide a proc as the response body rather than an array. The server calls it with the stream, and the connection stays open until the stream is closed.
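
A server can tell the two kinds of body apart by the methods they respond to: an array-like body responds to `each`, while a streaming body responds to `call`. The following is a minimal sketch of that dispatch, assuming a hypothetical `write_body` helper and a `StringIO` in place of the socket.

```ruby
require "stringio"

# Hypothetical dispatch on the body type, loosely following the Rack 3
# distinction between enumerable bodies (#each) and streaming bodies (#call).
def write_body(stream, body)
  if body.respond_to?(:each)
    # Enumerable body: the server writes every chunk and closes the stream.
    body.each { |chunk| stream.write(chunk) }
    stream.close
  else
    # Streaming body: the app writes to the stream and must close it itself.
    body.call(stream)
  end
end

array_stream = StringIO.new
write_body(array_stream, ["one\n", "two\n"])

proc_stream = StringIO.new
streaming_body = proc do |stream|
  2.times { |i| stream.write("tick #{i}\n") }
ensure
  stream.close
end
write_body(proc_stream, streaming_body)

puts array_stream.string
puts proc_stream.string
```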

There's a huge gotcha here when using Puma. Puma is a multi-threaded server that assigns a thread to each incoming request. We're taking over the socket from Rack, but we're still tying up a Puma thread as long as the connection is open.

Puma concurrency can be configured, but threads are limited, and tying one up for long periods is not a good idea. Let's see this in action first.

$ bundle exec puma -w 1 -t 1:1

In two separate terminal windows, run the following command at the same time:

$ curl localhost:9292

One request is immediately served, but the other is held until the first one completes. This is because we started Puma with a single worker and single thread, meaning it can only serve a single request at a time.
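
The effect can be reproduced without a web server at all. This sketch simulates a fixed pool of worker threads pulling "requests" off a queue; `serve_all` and its parameters are invented for the illustration, and it is not how Puma is actually implemented.

```ruby
# Simulate a server with a fixed pool of worker threads. Each "request" holds
# its worker for `hold_time` seconds, like a long-lived connection would.
def serve_all(request_count:, pool_size:, hold_time:)
  queue = Queue.new
  request_count.times { |id| queue << id }
  queue.close # once the queue is empty, pop returns nil and workers stop

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  workers = Array.new(pool_size) do
    Thread.new do
      while queue.pop
        sleep hold_time # the open connection ties up this worker
      end
    end
  end
  workers.each(&:join)

  # Total time needed to serve every request.
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

puts format("1 thread:  %.2fs", serve_all(request_count: 2, pool_size: 1, hold_time: 0.2))
puts format("2 threads: %.2fs", serve_all(request_count: 2, pool_size: 2, hold_time: 0.2))
```

With one worker, the second request cannot start until the first has released the thread, so the total time is roughly the sum of both hold times.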

We can get around this by creating our own thread.

class App
  def call(env)
    body = proc do |stream|
      Thread.new do
        5.times do
          stream.write "#{Time.now}\n\n"
          sleep 1
        end
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end

Now if you try the above experiment again, you'll see both curl requests are served concurrently because they don't tie up a Puma thread.

Once again, I must warn against this approach, unless you know what you're doing. These demonstrations are largely academic, as systems programming is a deep and complex topic.

Falcon Web Server

Since the threading problem is specific to the Puma web server, let's look at another option: Falcon. This is a new, highly concurrent Rack-compliant web server built on the async gem. Instead of Threads, it uses Ruby Fibers, which are cheaper to create and have much lower overhead.

The async gem hooks into all Ruby I/O and other waiting operations, such as sleep, and uses these to switch between different Fibers (ensuring a program is never held up doing nothing).
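
The switching idea can be shown with plain Fibers from Ruby's core library. In this sketch, each explicit `Fiber.yield` marks the spot where a real request would be waiting on I/O or `sleep`; the async gem performs this hand-off automatically, whereas here a naive round-robin loop (the hypothetical `interleave` method) does it by hand.

```ruby
# Each fiber represents one "request". Fiber.yield hands control back to the
# scheduler instead of blocking, which is what async does for us at I/O points.
def interleave(names, steps)
  log = []

  fibers = names.map do |name|
    Fiber.new do
      steps.times do |step|
        log << "#{name}:#{step}"
        Fiber.yield # pretend we are now waiting on I/O
      end
      :done # returned by the final resume to signal completion
    end
  end

  # Naive round-robin scheduler: resume every unfinished fiber in turn.
  pending = fibers.dup
  pending.reject! { |fiber| fiber.resume == :done } until pending.empty?

  log
end

p interleave(%w[a b], 2)
# → ["a:0", "b:0", "a:1", "b:1"]
```

Both "requests" make progress on a single thread because neither one holds on to it while waiting.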

Revert your app to the previous version where we're not spawning a new thread:

class App
  def call(env)
    body = proc do |stream|
      5.times do
        stream.write "#{Time.now}\n\n"
        sleep 1
      end
    ensure
      stream.close
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end

Then remove Puma and install Falcon.

$ bundle remove puma
$ bundle add falcon

Run the Falcon server. We need to explicitly bind it to an http:// address because it only serves HTTPS traffic by default.

$ bundle exec falcon serve -n 1 -b http://localhost:9292

The server only uses a single thread, which you can confirm with the command below (the flags shown are for the macOS version of top). You'll need to grab your specific pid from Falcon's logs.

$ top -pid <pid> -stats pid,th

The thread count printed by the above command will be 2 because MRI uses a thread internally.

Try the earlier experiment again and run two curl requests simultaneously.

$ curl localhost:9292

You'll see they're both served at the same time, thanks to Ruby Fibers!

Falcon is relatively new. Although Fibers themselves date back to Ruby 1.9, the Fiber scheduler interface that lets them switch automatically on blocking operations was only introduced in Ruby 3.0. Since Falcon is Rack-compliant, it can be used with Rails too, but the docs recommend using it with Rails v7.1 or newer only. As such, it's a bit risky to use Falcon in production, but it's a very exciting development in the Ruby world, in my opinion. I can't wait to see its progress in the next few years.

We've now learned how to create persistent connections in Rack and how to run them without blocking other requests, but the use cases so far have been academic and contrived. In the next and final part of this series, we'll examine how we can use this technique in a practical way.

Until then, happy coding!

P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!
