
The new Rack socket hijacking API

By Hongli Lai on January 23rd, 2013


Yesterday saw the release of Rack 1.5.0, which adds a new feature to the Rack specification dubbed socket hijacking. This feature allows applications to take over the client socket and perform arbitrary operations on it, e.g. implementing WebSockets, streaming data to the client, etc.

Did Rack not support streaming before? It did: you can stream by returning a body object whose #each method yields body chunks, as explained in our past article Why Rails 4 Live Streaming is a Big Deal. But that API is a bit clunky. The socket hijacking API instead gives the application an object that behaves like a Ruby IO object.

Support for socket hijacking was added to Phusion Passenger 4 yesterday. The upcoming Phusion Passenger 4 has been covered here, here and here. Phusion Passenger Enterprise customers can already test and enjoy a preview of this feature by downloading the “3.9.2 beta preview (4.0.0 beta 2)” file from the Customer Area.

The socket hijacking API was surprisingly easy to implement, but it is unfortunately poorly documented at this time. The application-level API is not immediately obvious, and the Rack specification documentation has not yet been updated to cover the hijacking API. In this article we’ll introduce the API and provide example programs for both of its modes.

What the socket hijacking API is not

Some of you may have heard of efforts to develop a “Rack 2.0” specification which properly covers things such as streaming and evented servers. According to the hijacking API developer, this API is not an attempt towards Rack 2.0. It is a “good enough” solution that works within the confines of the Rack 1.x specification. Things may change in Rack 2.0, though at this time it’s unclear what the progress towards Rack 2.0 is.

It is also unclear whether the API is supposed to be final or not. While implementing this API and writing this article we’ve discovered some room for improvement. The suggestions (which you can find later in this article) have been submitted to the developers.

Overview of the API

The hijacking API provides two modes:

  1. A full hijacking API, which gives the application complete control over what goes over the socket. In this mode, the application server doesn’t send anything over the socket, and lets the application take care of it. This mode is useful if you want to implement arbitrary (even non-HTTP) protocols over the socket. This is subject to limitations: if your application is behind a web server or an HTTP load balancer then those components dictate which protocols you can implement.
  2. A partial hijacking API, which gives the application control over the socket after the application server has already sent out headers. This mode is mostly useful for streaming.

The hijacking API is accessible through the Rack env hash. You can check whether the application server supports the hijacking API by checking env['rack.hijack?'], which returns a boolean value.
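The support check described above can be sketched as follows. This is a minimal illustration, not part of Rack itself: the helper name hijack_supported? and the constant HIJACK_PROBE_APP are hypothetical names introduced here, and checking both keys is simply a cautious assumption.

```ruby
# Hypothetical helper (not part of Rack): true only when the server
# advertises hijacking for this request and also supplies the callback.
def hijack_supported?(env)
  env['rack.hijack?'] == true && env['rack.hijack'].respond_to?(:call)
end

# Illustrative app that only reports whether hijacking could be used.
# It returns an ordinary Rack response in both cases.
HIJACK_PROBE_APP = lambda do |env|
  message =
    if hijack_supported?(env)
      "hijacking available\n"
    else
      "hijacking not available\n"
    end
  [200, { 'Content-Type' => 'text/plain' }, [message]]
end
```

In a rackup file this could be mounted with `run HIJACK_PROBE_APP`. A real application would fall back to a regular streaming body, or to an error response, when the check fails.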

Full hijacking

You can perform a full hijack by calling env['rack.hijack'].call. You can access the hijacked socket object through env['rack.hijack_io']. Phusion Passenger’s implementation of env['rack.hijack'] returns the socket object, but it is unclear whether this is supposed to be standard behavior.

You are responsible for:

  • Outputting any HTTP headers, if applicable.
  • Closing the IO object when you no longer need it.

You should output the “Connection: close” header unless you plan on implementing HTTP keep-alive yourself.

Here’s an example of the full hijacking API in action:

# encoding: utf-8
require 'thread'

# Streams the response "Line 1" .. "Line 10", with
# 1 second sleep time between each line.
# 
# Non-Phusion Passenger users may have to turn off their
# web servers' buffering options for streaming to work.
# Phusion Passenger 4 users don't have to do anything, it
# works out-of-the-box thanks to our real-time response
# buffering feature.
app = lambda do |env|
  # Fully hijack the client socket.
  env['rack.hijack'].call
  io = env['rack.hijack_io']
  begin
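    # This example was written against Phusion Passenger and sends a
    # CGI-style "Status" header. A server that hands over the raw client
    # socket would need a full HTTP status line instead, such as
    # "HTTP/1.1 200 OK".
    io.write("Status: 200\r\n")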
    io.write("Connection: close\r\n")
    io.write("Content-Type: text/plain\r\n")
    io.write("\r\n")
    10.times do |i|
      io.write("Line #{i + 1}!\n")
      io.flush
      sleep 1
    end
  ensure
    io.close
  end
end

run app
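To see the full hijacking contract from the server's side, the call sequence can be exercised without a real server. The sketch below is an assumption-laden stand-in: build_hijackable_env and FULL_HIJACK_APP are hypothetical names, a StringIO plays the role of the client socket, and the env callback returns the socket only because Phusion Passenger happens to do so.

```ruby
require 'stringio'

# Hypothetical stand-in for an application server: builds an env whose
# 'rack.hijack' callback stores the (fake) socket in 'rack.hijack_io'.
def build_hijackable_env(fake_socket)
  env = { 'rack.hijack?' => true }
  env['rack.hijack'] = lambda do
    env['rack.hijack_io'] = fake_socket
  end
  env
end

# A shortened variant of the full hijacking example, without the sleeps.
FULL_HIJACK_APP = lambda do |env|
  env['rack.hijack'].call
  io = env['rack.hijack_io']
  begin
    io.write("Status: 200\r\n")
    io.write("Connection: close\r\n")
    io.write("Content-Type: text/plain\r\n")
    io.write("\r\n")
    3.times { |i| io.write("Line #{i + 1}!\n") }
  ensure
    # The application, not the server, closes the hijacked socket.
    io.close
  end
end

# Usage: everything the app wrote ends up in the buffer.
buffer = +""
socket = StringIO.new(buffer)
FULL_HIJACK_APP.call(build_hijackable_env(socket))
puts buffer
```

Because the application owns the socket after the hijack, the fake server never writes to it and never closes it. That matches the responsibilities listed above.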

Partial hijacking

You can perform a partial hijack by assigning a lambda to the rack.hijack response header. This lambda will be called after the application server has sent out headers. The application server will ignore the body part of the Rack response, and will call the ‘rack.hijack’ lambda, passing it the client socket. You are responsible for closing the socket when it’s no longer needed.

It is unclear what the value of the Rack response body should be. Phusion Passenger’s implementation doesn’t care: you can return a two-element array, or a three-element array where the body can be anything. If the ‘rack.hijack’ response header is set, the body will be completely ignored.

Example:

# encoding: utf-8
require 'thread'

# Streams the response "Line 1" .. "Line 10", with
# 1 second sleep time between each line.
# 
# Non-Phusion Passenger users may have to turn off their
# web servers' buffering options for streaming to work.
# Phusion Passenger 4 users don't have to do anything, it
# works out-of-the-box thanks to our real-time response
# buffering feature.
app = lambda do |env|
  response_headers = {}
  response_headers["Content-Type"] = "text/plain"
  response_headers["rack.hijack"] = lambda do |io|
    # This lambda will be called after the app server has outputted
    # headers. Here we can output body data at will.
    begin
      10.times do |i|
        io.write("Line #{i + 1}!\n")
        io.flush
        sleep 1
      end
    ensure
      io.close
    end
  end
  [200, response_headers, nil]
end

run app
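How a server might process such a response can be sketched as follows. This is a simplified, hypothetical model and not Phusion Passenger's actual implementation: serve_response is a name invented for illustration, a StringIO stands in for the client socket, and the status line assumes a 200 response.

```ruby
require 'stringio'

# Hypothetical, simplified server-side handling of a Rack response.
def serve_response(response, socket)
  status, headers, body = response
  headers = headers.dup
  # The hijack callback is not a real HTTP header, so take it out first.
  hijack = headers.delete('rack.hijack')

  # The reason phrase is hardcoded because this sketch assumes status 200.
  socket.write("HTTP/1.1 #{status} OK\r\n")
  headers.each { |name, value| socket.write("#{name}: #{value}\r\n") }
  socket.write("\r\n")

  if hijack
    # Partial hijack: the body is ignored, and the callback now owns
    # the socket and is responsible for closing it.
    hijack.call(socket)
  else
    body.each { |chunk| socket.write(chunk) }
    body.close if body.respond_to?(:close)
    socket.close
  end
end

# Usage with a partial hijack response and a fake socket.
buffer = +""
socket = StringIO.new(buffer)
streamer = lambda do |io|
  begin
    io.write("Line 1!\n")
  ensure
    io.close
  end
end
serve_response([200, { 'Content-Type' => 'text/plain', 'rack.hijack' => streamer }, nil], socket)
puts buffer
```

The sketch shows why the body can be nil in the example above: once the callback is found in the headers, the body is never touched.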

Issues with the hijacking API

Here’s how we think the hijacking API can be improved.

  • env['rack.hijack?'] appears to be unnecessary. You can already check for hijacking support by checking env['rack.hijack'].
  • The partial hijacking API should not involve assigning a lambda to the response headers. As far as we can see, you can just return the lambda as the body. That would be a much more elegant solution.
  • The return value for env['rack.hijack'] should be well-defined.
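To make the second suggestion concrete, a server could decide what to do with a body by checking which methods it responds to. The following is only a sketch of that idea under our own assumptions: body_kind is a hypothetical helper, and nothing like it exists in the Rack specification. It also shows the catch, which is that an object responding to both #each and #call has no obvious interpretation.

```ruby
# Hypothetical dispatch rule for the "return the lambda as the body" idea.
def body_kind(body)
  callable = body.respond_to?(:call)
  iterable = body.respond_to?(:each)

  if callable && iterable
    :ambiguous        # The rule cannot tell which behavior is intended.
  elsif callable
    :hijack_callback  # Would be called with the client socket.
  elsif iterable
    :regular_body     # Would be iterated with #each as usual.
  else
    :invalid
  end
end

# Illustrative object that is both a regular body and callable.
class CallableBody
  def each
    yield "chunk"
  end

  def call(io)
    io.write("chunk")
  end
end

p body_kind(lambda { |io| io.close })
p body_kind(["Line 1!\n"])
p body_kind(CallableBody.new)
```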

Conclusion

The Rack hijacking API has some quirks in our opinion, but it gets the job done. We hope that the usage of the hijacking API has become clearer after reading this article. If you have any comments, questions, suggestions or corrections, please let us know.

We at Phusion are working feverishly on the upcoming Phusion Passenger 4 (covered here, here and here). Implementing the hijacking API so quickly is our way of showing you how dedicated we are. Together with Phusion Passenger Enterprise, we aim to deliver the most stable, performant and feature-rich polyglot application server out there. If you’re interested in future updates, please subscribe to our newsletter. Until next time!



  • http://twitter.com/raggi James Tucker

    Some notes from a discussion I had with Santiago when he was asking about the feedback contained here:

    1:06 my intent for env['rack.hijack?'] is that it is a boolean field that is stateful

    1:06 and that most hijacking servers can freely set env['rack.hijack'] at the start of every request

    1:06 there may be a difference between the two if:

    1:07 1) the server detects the http client started pipelining (hijack cannot work in this case)

    1:07 2) the server has been configured to disable hijacking

    1:07 3) the server has run out of concurrent resources or the like

    1:08 regarding his second point

    1:08 it’s interesting, but, that would alter SPEC in potentially breaking or unexpected ways

    1:08 he’s talking about a type, when the current spec is defined by a method

    1:08 (an interface)

    1:09 what if someone does: class Proc; def each; end; end

    1:09 now the spec has to clear up this conflict

    1:09 and this conflict exists in real world code today

    1:09 if we said “if the body responds to call, then #call will be called first”

    1:09 then you would have all the middleware breaking

    1:09 (e.g. rack::file, etc)

    1:10 as it does [200, headers, self]

    1:10 regarding the return value for env['rack.hijack'] – yes, maybe

    1:10 the problem is, that this cannot be so well defined for all systems and servers

    1:10 C backed servers will implement different things from ruby backed servers

    1:11 and the IO and concurrency models of the servers redefine what is possible also

    1:11 so, the spec is intentionally vague

    1:11 as this allows people to use additional server or value features

    1:11 but they must use them with a clear knowledge of what their server is

    1:11 in other words

    1:11 it is left to “implementation defined behavior”

    1:12 also, IO on 1.8 is different from 1.9, is different from 2.0

    1:12 so i can’t even fully spec ruby IO

    1:12 and IO is complex

    1:12 if i provided a method signature only spec

    1:12 then it is undefined if IO.select should work with it

    1:12 because IO.select is not signature based, it’s type based

    1:12 etc, etc

    1:13 so, I’m also fighting lack of ruby spec

    1:13 his comments are fair first thoughts, but they were not thought out with full context of the world

    1:14 (equally, i’m not suggesting my design is perfect, but these are some things i considered)

    1:16 regarding what the value of the rack response should be

    1:16 yes, that’s certainly vague, and should be cleared up

    1:16 my recommendation is [200, {}, []]
