The Road to Passenger 3: Technology Preview 4 – Adding new features and removing old limitations
In the past two years that we’ve been developing Phusion Passenger, we’ve received not only many feature requests but also many criticisms about certain limitations. Some feature requests have been implemented in Phusion Passenger 2.x, some have not. Some limitations were solved in the lifetime of Phusion Passenger 2, others were not because they require a lot of refactoring first. In Phusion Passenger 3 we’ve extensively refactored the code to not only make a lot of new cool features possible, but also to lift a lot of the old limitations. In this Technology Preview we are pleased to announce these changes.
Asynchronous spawning

Previously, while application processes were being spawned, Phusion Passenger was unable to handle HTTP requests until the processes were done spawning, because it held the lock on the application pool the whole time. Some websites have apps that need a very long time to spawn (30+ seconds) and this would be a problem for them. This behavior would also cause a “thundering herd” problem: suppose that a traffic spike appears, then the first request will cause Phusion Passenger to spawn a process. The other requests in the spike are blocked until that’s done, and then all of a sudden they are processed. Phusion Passenger thinks it needs to spawn more, so it spawns another one and blocks the rest, and so on. This can cause a large number of processes to be spawned all of a sudden, causing a long delay.

In Phusion Passenger 3 spawning happens in the background so that no clients have to be blocked. This turns out to work so well that application process spawning is now virtually unnoticeable, except when spawning the first application process.
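To make the idea concrete, here is a minimal Ruby sketch of a pool that spawns in a background thread and only takes its lock for short bookkeeping steps. It is not Phusion Passenger’s actual code (the real pool lives in C++); the class `AsyncPool` and its methods are invented for illustration, and error handling for a failing spawner is left out.

```ruby
# Hypothetical pool; class and method names are illustrative only.
class AsyncPool
  def initialize(max:, &spawner)
    @max          = max
    @spawner      = spawner   # block that creates one worker (may be slow)
    @lock         = Mutex.new
    @workers      = []
    @spawning     = false
    @spawn_thread = nil
  end

  # Never waits for a spawn: returns an existing worker (or nil if there
  # is none yet) and starts a background spawn when below capacity.
  def checkout
    @lock.synchronize do
      spawn_in_background if @workers.size < @max && !@spawning
      @workers.first
    end
  end

  def size
    @lock.synchronize { @workers.size }
  end

  # Helper for demos and tests: wait for the most recent background spawn.
  def wait_for_spawn
    thread = @lock.synchronize { @spawn_thread }
    thread&.join
  end

  private

  # Called with @lock held. The slow part runs in another thread,
  # which only re-takes the lock to record the finished worker.
  def spawn_in_background
    @spawning = true
    @spawn_thread = Thread.new do
      worker = @spawner.call
      @lock.synchronize do
        @workers << worker
        @spawning = false
      end
    end
  end
end
```

With a spawner that takes several seconds, `checkout` still returns immediately; a caller that gets `nil` back would have to queue the request until the first worker exists, which matches the one case where spawning remains noticeable.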
Ability to configure minimum number of processes
Phusion Passenger automatically shuts down processes that haven’t been accessed for a while, where “a while” is configurable through the PassengerPoolIdleTime directive. Many web applications are rarely accessed during the night, so what happens is that all application processes are shut down during the night and the first person in the morning who accesses the web application has to wait for some time while the first application process is being spawned. This problem could be solved by setting PassengerPoolIdleTime to 0 or to a large number which means that processes are kept around forever or for a long time, but this also means that application processes are not shut down during the night, which might still be desirable for resource utilization reasons.
Phusion Passenger 3 introduces a new, long-awaited configuration directive: PassengerMinInstances. As its name implies, PassengerMinInstances makes sure that at least the given number of processes will be started and kept around. This, in combination with asynchronous spawning, turns out to work so well that we’ve assigned a default value of 1 for PassengerMinInstances. With Phusion Passenger 3, spawning delays should become a thing of the past!
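The interaction between the idle timeout and the minimum can be sketched in a few lines of Ruby. This is only an illustration of the rule described above, not Phusion Passenger’s implementation; `PoolWorker` and `idle_workers_to_stop` are hypothetical names, and times are plain numbers in seconds.

```ruby
# Hypothetical record of one application process.
PoolWorker = Struct.new(:pid, :last_used)

# Returns the workers that may be shut down: those idle for longer than
# idle_time seconds, but never so many that fewer than min_instances
# remain. An idle_time of 0 means "never shut down idle processes".
def idle_workers_to_stop(workers, now:, idle_time:, min_instances:)
  return [] if idle_time.zero?
  removable = workers.size - min_instances
  return [] if removable <= 0
  workers
    .select { |w| now - w.last_used > idle_time }
    .sort_by(&:last_used)       # longest-idle first
    .first(removable)
end
```

With three processes that have all been idle overnight and a minimum of 1, two are stopped and one stays around for the first visitor in the morning.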
Smart spawning support for all Rack applications
Smart spawning has been a core feature of Phusion Passenger since version 1.0. It can reduce the spawning time of Rails processes by as much as 90%, and in combination with Ruby Enterprise Edition it allows you to save 33% memory on average.
However, smart spawning was limited to Rails applications only; Rack applications were not supported. Starting from Rails 3, all Rails applications are also Rack applications, and Phusion Passenger 2 only supports smart spawning of Rails 3 applications if you remove config.ru (thereby forcing Phusion Passenger to detect it as a Rails app, not a Rack app).
Phusion Passenger 3 now supports smart spawning for all Rack applications. This works transparently and without further configuration.
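The general technique behind smart spawning is “load once, then fork”. The Ruby sketch below shows only that general idea and makes no claim about how Phusion Passenger implements it; `preload_application` and `smart_spawn` are made-up names, the `sleep` stands in for slow framework loading, and `fork` requires a Unix-like system.

```ruby
# Stand-in for loading the framework and application code once.
def preload_application
  sleep 0.2                      # pretend this is the expensive part
  { loaded_by: Process.pid }     # pretend this is the loaded application
end

# Forks `count` workers from the already-loaded parent. Each child
# inherits `app` without loading anything again and reports back
# through a pipe so the effect is visible.
def smart_spawn(app, count)
  count.times.map do
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.puts "#{Process.pid} uses app loaded by #{app[:loaded_by]}"
      writer.close
      exit!(0)
    end
    writer.close
    line = reader.read
    reader.close
    Process.wait(pid)
    line.strip
  end
end
```

Calling `smart_spawn(preload_application, 3)` pays the loading cost once instead of three times. Because forked children share the parent’s memory pages until they write to them, a copy-on-write friendly Ruby can also keep much of the loaded code shared between workers.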
Ability to access individual application processes over HTTP

When one sends a request to the web server, Phusion Passenger routes the request to one of the application processes, but one never knows up front which one that’s going to be, nor is there a way to control it. This is fine for normal requests, but sometimes one wants to send a request to a specific application process, e.g. to debug something. Another use case for accessing individual application processes directly is broadcasting messages: e.g. telling every application process to clear some local in-memory caches.
This has always been possible with reverse proxy app servers like Mongrel and Thin because each of their processes listens on its own port. Phusion Passenger 3 now also allows one to access individual application processes directly. Each application process can now be accessed through its own TCP socket port. The port numbers are randomly allocated by the operating system and the protocol is plain HTTP, so you can use existing tools like ‘curl’.
We take security very seriously. These sockets are bound to 127.0.0.1 so it’s not possible to access them from remote computers. Furthermore, each socket is protected by its own unique randomly generated secure password. One can use ‘passenger-status’ to query the ports and passwords.
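The three properties mentioned above (loopback-only binding, an OS-assigned port, and a per-process random password) can be demonstrated with a small self-contained Ruby sketch. It does not reproduce Phusion Passenger’s protocol: the way the password is transmitted here, a header called `X-Demo-Password`, is purely an assumption made for the example, as are the names `DemoProcess` and `fetch_root`.

```ruby
require "socket"
require "securerandom"

# Stand-in for one application process with its own private HTTP socket.
class DemoProcess
  attr_reader :port, :password

  def initialize
    @server   = TCPServer.new("127.0.0.1", 0)  # loopback only; 0 = OS picks port
    @port     = @server.addr[1]
    @password = SecureRandom.hex(16)           # unique per process
    @thread   = Thread.new { serve }
  end

  def stop
    @thread.kill
    @server.close
  end

  private

  def serve
    loop do
      client = @server.accept
      client.gets                               # request line, ignored here
      headers = {}
      while (line = client.gets) && line != "\r\n"
        name, value = line.split(":", 2)
        headers[name.downcase] = value.to_s.strip
      end
      ok     = headers["x-demo-password"] == @password
      status = ok ? "200 OK" : "403 Forbidden"
      body   = ok ? "hello from pid #{Process.pid}\n" : "forbidden\n"
      client.write "HTTP/1.1 #{status}\r\nContent-Length: #{body.bytesize}\r\n" \
                   "Connection: close\r\n\r\n#{body}"
      client.close
    end
  end
end

# Plain HTTP over a TCP socket, which is all a tool like curl needs too.
def fetch_root(port, password)
  TCPSocket.open("127.0.0.1", port) do |sock|
    sock.write "GET / HTTP/1.1\r\nHost: localhost\r\n" \
               "X-Demo-Password: #{password}\r\nConnection: close\r\n\r\n"
    sock.read
  end
end
```

A broadcast such as “clear your in-memory cache” would then amount to looping over every process’s port and password and sending each one the same request.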
Global queuing now on by default
Many people with web apps that have long-running requests are familiar with the “slow Mongrel queue problem”. When one sets up Mongrel or Thin behind Nginx or Apache, it’s possible for new requests to be queued behind a Mongrel/Thin instance which is currently processing a long-running request. When other Mongrel/Thin instances become available, this new request is already queued behind the long-running request and cannot migrate to the other free instances. The more long-running requests one has, the bigger a problem this can become, resulting in very long response times for some users.
There are multiple ways to solve this problem, but Phusion Passenger has already solved this problem for a long time with a feature that we call global queuing. It was disabled by default because turning global queuing off would yield a little bit of extra performance in microbenchmarks. However for version 3.0 we’ve decided to turn it on by default: rather than saving those few milliseconds in benchmarks, we believe it’s much more important that all users get to have fair response times.
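A small simulation shows why a single shared queue gives fairer response times. The Ruby sketch below is a toy model rather than a description of any real server: it assumes requests are listed in arrival order, that the per-worker variant assigns them round-robin, and that times are abstract units.

```ruby
# Each request is [arrival_time, duration].

# Per-worker queues: a request is bound to a worker on arrival and then
# waits behind whatever that worker already has queued.
def per_worker_finish_times(requests, workers)
  free_at = Array.new(workers, 0)
  requests.each_with_index.map do |(arrival, duration), i|
    w = i % workers
    start = [arrival, free_at[w]].max
    free_at[w] = start + duration
  end
end

# Global queue: a request is handed to whichever worker is free first.
def global_queue_finish_times(requests, workers)
  free_at = Array.new(workers, 0)
  requests.map do |arrival, duration|
    w = free_at.index(free_at.min)
    start = [arrival, free_at[w]].max
    free_at[w] = start + duration
  end
end

# One slow request (10 units) followed by three fast ones, two workers.
requests = [[0, 10], [0, 1], [0, 1], [0, 1]]
per_worker_finish_times(requests, 2)    # => [10, 1, 11, 2]
global_queue_finish_times(requests, 2)  # => [10, 1, 2, 3]
```

In the per-worker case the third request finishes at time 11 because it is stuck behind the slow one, even though the other worker was idle from time 2 onwards. With the global queue it finishes at time 2.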
Ability to disable friendly error pages
One of the innovations that Phusion Passenger has brought us is the ability to show friendly error messages directly in the browser, e.g. when the web application fails to spawn because of a syntax error. This dramatically improves usability for developers, new and experienced alike, because one doesn’t have to manually dig into log files anymore. However this is not always desirable in production: although the error page is developer-friendly, it isn’t necessarily user-friendly. It might also expose information that the system administrator would rather not expose such as filenames.
Phusion Passenger 3 introduces a new configuration directive for controlling whether friendly error pages should be shown: ‘PassengerFriendlyErrorPages’. When turned off, Phusion Passenger will display the standard Apache/Nginx 500 Internal Server Error page instead, but all error messages are still logged to the web server error log file.
Nginx-specific improvements

In Phusion Passenger 3, Nginx is compiled with SSL support by default due to popular demand.
We’ve also introduced the following new configuration options:
- passenger_set_cgi_param (name) (value)
- This is the Phusion Passenger equivalent of proxy_module’s proxy_set_header. It allows you to pass arbitrary CGI environment parameters to the web application.
- passenger_buffer_response (on|off)
- This is the Phusion Passenger equivalent of proxy_buffering. It was and still is off by default to allow streaming responses, but when streaming responses aren’t necessary one can turn this option on so that Nginx can gracefully handle clients that are slow at receiving responses.
- passenger_ignore_client_abort (on|off)
- This is the Phusion Passenger equivalent of proxy_ignore_client_abort.
In other news
In Technology Preview 3 we unveiled Phusion Passenger Lite. Based on various feedback that we’ve gotten since then, we’ve decided to rename this thing to Phusion Passenger Standalone in order to reduce confusion about what it is.
Towards the future
If there is one thing we’ve come to understand over time, it is that different businesses have different needs and constraints when it comes to deploying their applications. In order to provide the best experience and support to these businesses, we’re working on different versions of Phusion Passenger to accommodate them even better in their respective environments. In light of this, we want to underline that the technology previews, both those already published and those still to come, describe the cool stuff that will be incorporated in the version intended for the most high-demand environments. More information on this will follow. Having said that, almost everything we’ve blogged about up till this point will be included in the version that’s available for everybody.
The Road to Passenger 3: Technology Preview 3 – Closing the gap between development and production & rethinking the word “easy”
Before Phusion Passenger came along, the most widely used Ruby app servers all implemented the same model, which we refer to as the reverse proxy model. In this model, the user had to manually set up a bunch of app server processes and had to configure the web server to proxy requests to the app server processes. The technically inclined understand this model, but it is confusing to e.g. newcomers and to people who in general don’t have a lot of system administration skills or a reasonable understanding of HTTP. Most people were and still are much more familiar with PHP’s model, where you tell the web server where your app is and then have the web server take care of the rest for you. It was this confusion that caused all the uproar about sucky Rails deployment back in 2008.
While developing Phusion Passenger for Apache, we decided to follow a PHP-like model because ease of use was one of our main goals. No manual setups of app servers. No manual proxy configuration. Upload and go. For Phusion Passenger for Nginx, we continued to follow this model. Let’s call this the automatic model. As of 2010, Phusion Passenger appears to be the only widely-used Ruby app server that implements this model; the other widely-used Ruby app servers implement the reverse proxy model.
Reverse proxy vs automatic model: which one is better?
Ever since Phusion Passenger was first released, debates popped up about which one is superior. We believe that no model is inherently superior to the other. They are just different, meaning that both models have their own pros and cons. Which one is better for you depends a lot on your server infrastructure and your system administrators’ preferences.
Phusion Passenger’s automatic model:
- Integrated into the web server. Processes are managed along with the web server itself, and configuration happens in the web server config file.
- Easier to comprehend for most people. Appears more “standard stack” to system administrators who are not familiar with Ruby specifically.
- Can spawn and shut down processes dynamically according to traffic patterns.
- Processes are automatically monitored: if they crash they are automatically restarted.
- Less manual control over individual processes because they can come and go at any time.
Reverse proxy model as implemented by most other Ruby app servers:
- App server is a separate entity. Processes are managed distinctly from the web server itself. Configuration happens outside the web server.
- Many people have a hard time comprehending this and they generally find setups like this cumbersome, but to experts this model can be seen as simple, elegant and sensible.
- Most app servers do not automatically restart crashed processes and one needs to monitor processes separately with things like Monit.
- One needs to specify the number of processes up front: no dynamic process count scaling according to traffic.
- Allows fine-grained manual control over individual processes.
We are not commenting on which points are supposed to be pros and which points are supposed to be cons because they are highly subjective. For us, integration into the web server is a strong plus because we host dozens of apps on our server(s) and we don’t like to spend time managing app server processes for each app, but other people are uncomfortable with having the web server manage things automatically and would prefer to keep a close eye on everything.
The automatic model can also be problematic to people who were on the reverse proxy model because they already had their web servers and infrastructures configured in a certain way. Switching to Phusion Passenger could mean changing a lot of web server configuration.
The hidden but unutilized potential
Reverse proxy model app servers can potentially have an extra advantage, but for some reason this hasn’t been implemented to its full potential so far:
Reverse proxy app servers are just easier to get started with. When you’ve just created a new Rails app, you can start it with script/server and you’re ready to go.
This works great in development but totally blows up in production. Reverse proxy model app servers must be put behind a reverse proxy e.g. Nginx or HAProxy for a variety of reasons such as security, load balancing between processes, handling of slow clients, etc. In production environments nobody exposes Mongrel or Thin directly to the Internet. Unicorn even explicitly documents that it is designed to be put behind a reverse proxy and that it doesn’t bother with slow clients at all.
In contrast, Phusion Passenger 2.x requires one to configure the web server, meaning the user must first install a web server. This is cumbersome when you’re in development and just want to get started. It is also cumbersome if you’re a newcomer and aren’t familiar with Apache or Nginx, and you just want to get your app running on your server.
Do you type script/server in development instead of creating a virtual host in Apache or Nginx? Well you’re not the only one: we also do this until we eventually get sick of it, but there’s always a mental blockade that tells us that editing the web server configuration file is too much work to bother with.
Well, until Phusion Passenger 3 comes along.
Phusion Passenger Lite: fusion between the reverse proxy and the automatic model
In addition to Phusion Passenger for Apache and Phusion Passenger for Nginx, Phusion Passenger 3 introduces a new component to the existing lineup: Phusion Passenger Lite.
When it comes to usage, its interface is almost identical to that of Mongrel and Thin. To run your Ruby web app, just type this in the terminal and you’re ready to go:
passenger start
Closing the gap between development and production
Phusion Passenger Lite consists of an Nginx core. Nginx is known to be extremely scalable, high-performance and lightweight. You do not need to have Nginx already installed; this is automatically taken care of. You also do not need to have any Nginx experience: Nginx is hidden from the user but its power is automatically utilized.
Unlike Mongrel, Thin and Unicorn, Phusion Passenger Lite can be directly exposed to the Internet. It can serve static files at blazing speeds thanks to the Nginx core. Mongrel and Thin can serve static files but they aren’t very good at it. Unicorn doesn’t even try.
Easy migration from existing reverse proxy app servers
Because the interface is so similar, you can easily swap Mongrel, Thin or Unicorn in your existing reverse proxy setup and replace it with Phusion Passenger Lite. Unlike Mongrel and Thin, Phusion Passenger Lite only has to listen on a single socket instead of multiple, vastly simplifying your reverse proxy configurations. Phusion Passenger Lite can listen on a Unix domain socket instead of a TCP socket, just like Thin and Unicorn. In reverse proxy setups this can yield much higher performance than TCP sockets.
Advantages over existing reverse proxy app servers
Unlike Mongrel, Thin and Unicorn, Phusion Passenger Lite can dynamically spawn and shut down processes according to traffic. However you can also configure it to use a static number of processes! In fact you can configure a minimum and a maximum and have Phusion Passenger Lite automatically figure out the number of processes to use for the current traffic.
Like Phusion Passenger for Apache/Nginx and Unicorn, worker processes that have crashed are automatically restarted.
That said, bear in mind that this advantage can be a disadvantage to some people. At its heart, Phusion Passenger Lite still manages processes for you, so you don’t have as much fine-grained control over the processes as you do with other reverse proxy app servers.
Advantage over Phusion Passenger for Apache and Phusion Passenger for Nginx
Another unintended advantage of Phusion Passenger Lite is that it runs as the same user as the shell and respects environment variables that are defined for your shell, e.g. things like PATH, LD_LIBRARY_PATH, RUBYOPT, GEM_PATH and GEM_HOME.
- Some people find that their app cannot load a certain library when the app is started in Phusion Passenger, but can when the app is started with e.g. Mongrel or Thin. This is almost always caused by some environment variable that’s set in the shell but not in the web server: nothing you set in /etc/bashrc and friends has any effect on processes started from the web server.
- Some people say that their application does not start on Phusion Passenger, but does under Mongrel/Thin. Very often this turns out to be just a permission issue or some web server configuration issue.
With Phusion Passenger Lite, even confusion like this will be a thing of the past.
Rethinking the word “easy”: automatic mass deployment
Phusion Passenger is considered by many people the easiest Ruby app server out there. But can it be easier? After some heavy thinking outside the box, we believe the answer is yes, it certainly can!
Imagine having a directory full of Ruby web apps, e.g. ~/Sites. To deploy an app, just drop your application’s directory into ~/Sites. To undeploy it, remove the application’s directory. The application directory’s name is used as the domain name. No manually signaling the web server for a restart.
This is exactly what we’ve done with Phusion Passenger Lite. We call this feature automatic mass deployment. Check out this demo.
So from now on, if you have a bunch of Ruby web apps in the same directory, just run this command in that directory
sudo passenger start -p 80 -u (some_unprivileged_username)
and you’ve immediately deployed every single app!
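The directory-to-domain mapping described above is simple enough to sketch in Ruby. This is an illustration of the idea only; the rule used here to decide what counts as an app (a `config.ru` or a `config/environment.rb` file) is an assumption for the example, and `discover_apps` is a hypothetical name.

```ruby
# Maps each application directory under `root` to a domain name.
# The directory's name is used as the domain name.
def discover_apps(root)
  Dir.children(root).sort.each_with_object({}) do |name, apps|
    dir = File.join(root, name)
    next unless File.directory?(dir)
    rack  = File.exist?(File.join(dir, "config.ru"))
    rails = File.exist?(File.join(dir, "config", "environment.rb"))
    apps[name] = dir if rack || rails
  end
end

# Example: discover_apps(File.expand_path("~/Sites")) returns a hash of
# domain name => application directory for every app found there.
```

Re-running the scan whenever the directory changes is what makes deploying and undeploying a matter of adding or removing a directory, with no explicit restart step.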
Conclusion
Phusion Passenger Lite does not replace Phusion Passenger for Apache or Phusion Passenger for Nginx. Rather, it is a complement to our existing lineup of Phusion Passenger products, optimized for different use cases. Phusion Passenger Lite closes the gap between development and production and can be used comfortably and easily in both. It can act as a drop-in replacement for your existing reverse proxy based setup. It makes Ruby web app deployment even easier than before: now you don’t even need a separate web server. On the other hand, if you need integration into the web server, then Phusion Passenger for Apache/Nginx is for you.
We hope you’ve enjoyed this Technology Preview. Please stay tuned for the next one because we have more exciting news for you!
Phusion Passenger 2.2.15 released
We know that many people are eagerly awaiting Phusion Passenger 3, but we ask these people to be patient for a while longer. We want to ensure that the initial version is of sufficient quality before we release it to the world. But for now, we haven’t forgotten about all the people who are still on 2.2, and so we’re releasing this bug fix release.
What’s new in version 2.2.15?
- [Apache] Fixed incorrect temp dir cleanup by passenger-status
- On some systems, running passenger-status could print the following message:
*** Cleaning stale folder /tmp/passenger.1234
…after which Phusion Passenger breaks because that directory is necessary for it to function properly. The cause of this problem has been found and has been fixed.
- [Apache] Fixed some upload handling problems
- Previous versions of Phusion Passenger check whether the size of the received upload data matches the contents of the Content-Length header as sent by the client. It turns out that there could be a mismatch e.g. because of mod_deflate input compression, so we can’t trust Content-Length anyway and we were being too strict. The check has now been removed.
- [Nginx] Fixed compilation issues with Nginx >= 0.7.66
- Thanks to Potamianos Gregory for reporting this issue. Issue #500.
- [Nginx] Default Nginx version changed to 0.7.67
- The previous default version was 0.7.65.
- Fixed more Bundler problems
- Previous versions of Phusion Passenger would preload some popular libraries such as mysql and sqlite3 in order to utilize copy-on-write optimizations better. However this behavior conflicts with Bundler so we’ve removed it.
How do I upgrade to 2.2.15?
Via a gem
Please install it with the following command:
gem install passenger
Next, run:
passenger-install-apache2-module
Or, if you’re an Nginx user:
passenger-install-nginx-module
Please don’t forget to copy & paste the Apache/Nginx config snippet that the installer gives you.
Via a native Linux package
John Leach from Brightbox has kindly provided Ubuntu packages for Phusion Passenger. These packages are available from the Brightbox repository which you can find at:
http://apt.brightbox.net
Add the following line to the Third Party Software Sources:
deb http://apt.brightbox.net hardy main
(The simplest way to do that is to create a file in /etc/apt/sources.list.d/ containing the deb instruction, and then run ‘apt-get update’).
Once you’ve done this then you can install Phusion Passenger by running:
sudo apt-get install libapache2-mod-passenger
-or-
sudo apt-get install nginx-brightbox
(Note that John is currently packaging 2.2.15, so it might take a while before this release shows up in the apt repository.)
Final
Phusion Passenger is provided to the community for free. If you like Phusion Passenger, please consider sending us a donation. Thank you!
The Road to Passenger 3: Technology Preview 2 – Stability, robustness, availability, self-healing
In Technology Preview 1 we showed that Phusion Passenger 3 can be up to 55% faster on OS X. Performance is good and all, but it won’t do you any good unless the software keeps running. When Phusion Passenger goes down it’s an annoyance at best, but in the worst case any amount of down time can cost your organization real money. Any HTTP request that’s dropped can mean a lost transaction.
Although stability, robustness and availability aren’t as hot and fashionable as performance, for Phusion Passenger 3 we have not neglected these areas. In fact we’ve been working hard on implementing additional safeguards, as well as refactoring our designs to make things more stable, robust and available.
Self-healing
In Phusion Passenger 2.2’s architecture, there are a number of processes that work together. At the very front there is the web server, which could consist of multiple processes. If you’ve ever typed passenger-memory-stats then you’ve seen the web server processes at work. Apache typically has a dozen processes (prefork MPM) and Nginx typically has 3.
For Phusion Passenger to work, there must be some kind of global state, shared by all web server processes. In this global state information is stored such as which Ruby app processes exist, which ones are currently handling requests, etc. This allows Phusion Passenger to make decisions such as which Ruby app process to route a request to, whether there is any need to spawn another process, etc. This global state exists in a separate process which all web server processes communicate with. On Apache this is the ApplicationPoolServerExecutable, on Nginx this is the HelperServer. For simplicity’s sake, let’s call both of them HelperServer (in Passenger 3 they’ve both been renamed to PassengerHelperAgent). The HelperServer is written in C++ and is extremely fast and lightweight, consuming only about 500 KB of real memory.
As you can see the HelperServer is essentially the core of Phusion Passenger. The problem with 2.2 is that if the HelperServer goes down, Phusion Passenger goes down with it entirely. Phusion Passenger will stay down until the web server is restarted. For various architectural reasons in Apache and Nginx, it is not easily possible to restart the HelperServer upon a crash in a reliable way.
Now, why would the HelperServer ever crash?
- Bugs. We are humans too and we can make mistakes, so it’s possible that there are crasher bugs in the HelperServer. In the past 2 years we’ve put a lot of effort into making the HelperServer stable. For example we check all system calls for error results, and we’ve put a lot of effort into making sure that uncaught exceptions are properly logged and handled. However one can never prove that a system is entirely bug free. We aren’t aware of any crasher bugs at this time but they might still exist.
- Limited system resources. For example if the system is very low on memory, the kernel will invoke the Out-Of-Memory Killer (OOM Killer). Properly selecting a process to kill in low-memory conditions is actually a pretty hard problem, and more often than not the OOM Killer selects the wrong process, e.g. our HelperServer.
- System administrator mistakes. Passing the wrong PID to the kill command and things like that.
- System configuration problems, hardware problems (faulty RAM and stuff) and operating system bugs.
There are some people who have reported problems with the HelperServer. Their HelperServer crashes tend to happen sporadically and they usually cannot reproduce the problem reliably themselves. For many people the second cause above (limited system resources) is often behind the HelperServer crashes, and increasing the amount of swap is reported to help, but for other people the problems lie elsewhere.
The crashes aren’t always our fault (i.e. not bugs), but they are always our problem. It saddens us to say that we’ve been unable to help these people so far because we simply cannot reproduce their problems even when we mimic their system configuration.
But this is going to change.
Enter Phusion Passenger 3 with self-healing architecture
Phusion Passenger 3 now introduces a lightweight watchdog process into the architecture. It monitors both the web server and the HelperServer. If the HelperServer crashes, then the watchdog restarts the HelperServer immediately.
Of course if the watchdog is killed then it’s still game over, but we’ve taken extra care in terms of code to try to make this extremely unlikely to happen. The watchdog for starters is extremely lightweight, even more so than the HelperServer. It is written in C++ and uses about 150-200 KB of memory. Its only job is to start the HelperServer and other Phusion Passenger helper processes and to monitor them. The codebase extensively uses C++ idioms that promote code stability, such as smart pointers and RAII. By employing heavy testing as well, we’re expecting to have brought the possibility that the watchdog contains crashing bugs to a minimum. The small footprint and the fact that it does nothing most of the time minimizes the chances that it’s killed by the OOM Killer. In fact, if the watchdog is running on Linux and has root access, it will register itself as not OOM-killable.
No longer will HelperServer crashes take down Phusion Passenger, even if the crash isn’t our fault.
Restarts are fast
It only takes a few hundred milliseconds to restart the HelperServer.
Crashing signals are logged
If the HelperServer crashes then the watchdog will tell you whether it crashed because of a signal, e.g. SIGSEGV. This makes it much easier for system administrators to see why a component crashed so that they might fix the underlying cause. In Phusion Passenger 2.2 this was not possible.
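The watchdog behaviour described in this section (restart the helper when it dies, and report the signal that killed it) follows a well-known Unix pattern that can be sketched in Ruby. The real watchdog is written in C++; the function below is only an illustration, `run_watchdog` is a hypothetical name, and the restart limit exists purely to keep the example finite.

```ruby
# Runs `helper` in a child process, restarts it when it exits abnormally
# (at most max_restarts times), and returns a log of what happened.
def run_watchdog(max_restarts:, &helper)
  log = []
  restarts = 0
  loop do
    pid = fork(&helper)
    _, status = Process.wait2(pid)       # blocks until the helper exits
    if status.success?
      log << "helper exited normally"
      return log
    elsif status.signaled?
      # The exit status tells us which signal terminated the process.
      log << "helper killed by SIG#{Signal.signame(status.termsig)}"
    else
      log << "helper exited with status #{status.exitstatus}"
    end
    return log if restarts >= max_restarts
    restarts += 1
    log << "restarting helper"
  end
end
```

Note how little the loop does between crashes: it sleeps inside `Process.wait2` and holds almost no state, which is the property that makes a watchdog itself unlikely to fail.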
Guaranteed cleanup
Upon shutting down or restarting the web server, Phusion Passenger 2.2 gracefully notifies application processes to shut down. It does not force them to. This would pose a problem for broken web applications that don’t shut down properly, e.g. web applications that are stuck in an infinite loop, stuck in a database call, etc.
Phusion Passenger 3 guarantees that all application processes are properly shut down when you shut down or restart the web server. It gives application processes a deadline of 30 seconds to shut down gracefully; if any of them fail to do that, they’ll be terminated with SIGKILL.
This mechanism works so well that it even extends to background processes that have been spawned off by the web application processes. All of those processes belong to the same process group. Phusion Passenger sends SIGKILL to the entire process group and terminates everything. No longer will you have to manually clean up processes; you can be confident that everything is gone if you shut down or restart the web server.
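The “ask nicely, then kill the whole process group” sequence can be sketched in Ruby as follows. This is a generic illustration rather than Phusion Passenger’s code: `shutdown_with_deadline` and `start_child` are hypothetical names, the use of SIGTERM as the graceful request is an assumption, and the example relies on the child being the leader of its own process group so that the group kill cannot touch unrelated processes.

```ruby
# Asks a process to exit, then kills its entire process group if it has
# not exited when the deadline (in seconds) passes.
def shutdown_with_deadline(pid, deadline:)
  Process.kill("TERM", pid)
  give_up_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) + deadline
  until Process.wait(pid, Process::WNOHANG)
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= give_up_at
      # A signal name with a leading minus targets the process group.
      Process.kill("-KILL", Process.getpgid(pid))
      Process.wait(pid)
      return :killed
    end
    sleep 0.05
  end
  :exited_gracefully
end

# Demo child: leads its own process group and either honours or
# ignores SIGTERM. The pipe makes the parent wait until setup is done.
def start_child(stubborn:)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    Process.setpgrp
    trap("TERM") { exit!(0) unless stubborn }
    writer.puts "ready"
    writer.close
    loop { sleep 1 }
  end
  writer.close
  reader.gets
  reader.close
  pid
end
```

Any background process that the child starts inherits its process group unless it deliberately leaves it, which is why a single group-wide SIGKILL is enough to clean up everything.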
Zero-downtime web server restart
In Phusion Passenger 2.2, whenever you restart the web server, HTTP requests that are currently in progress are dropped and the clients receive ugly “Connection reset by server” or similar error messages. This can be a major problem for large websites, because during the 1 second that Phusion Passenger is restarting hundreds of people could be getting errors. If your visitor happens to be clicking on that “Buy” button, well, tough luck.
In Phusion Passenger 3 we’ve implemented zero-downtime web server restart. Phusion Passenger and the web server are restarted in the background, and while this is happening, the old web server instance (with the old Phusion Passenger instance) will continue to process requests.
The architecture is actually a little bit more complicated than what’s shown in the diagram because behind the web server there are a bunch of Phusion Passenger processes, but you get the gist of it.
When the new web server (along with the new Phusion Passenger) has been started, it will immediately begin accepting new requests. Old requests that aren’t finished yet will continue to be processed by the old web server and Phusion Passenger instance. The old instance will shut down 5 seconds after all requests have been finished, to counter the possibility that the kernel still has leftover requests in the socket backlog that hit the old instance after it’s done processing everything already in its queue.
This works so well that we can restart the web server while running an ‘ab’ benchmark with 100 concurrent users without dropping a single request!
Zero-downtime application shutdown
Suppose that a Ruby application process has gone rogue and you want to shut it down. The most obvious way to do that is by sending it a SIGTERM or SIGKILL signal. However this would also abort whatever request it is currently processing.
In Phusion Passenger 2.2, you could also send SIGUSR1 to the process, causing it to shut down gracefully after it has processed all requests in its socket backlog. However this introduces two problems:
- If the website is very busy then the process’s socket backlog will never be empty, and so the process will never exit.
- Exiting after the process has detected an empty backlog can introduce a race condition. Suppose that, right after the process has determined that its backlog is empty but before it has shut down completely, Phusion Passenger tries to send another request to the process. This request would be lost.
In Phusion Passenger 3, SIGUSR1 will now cause the application process to first unregister itself so that Phusion Passenger won’t route any new requests to it anymore. It will then proceed with exiting 5 seconds after its socket backlog has become empty. This way you can gracefully shut down a process without losing a single request.
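Why the order “unregister first, then drain” removes the race can be shown with a toy model in Ruby. Everything here is hypothetical (`DemoRouter` and `DrainingWorker` are invented, the backlog is a plain array, and the 5 second grace period is not modelled); the sketch only illustrates the ordering.

```ruby
# Stand-in for the component that routes requests to workers.
class DemoRouter
  def initialize
    @workers = []
  end

  def register(worker)
    @workers << worker
  end

  def unregister(worker)
    @workers.delete(worker)
  end

  # Returns true if some registered worker accepted the request.
  def route(request)
    worker = @workers.first
    return false unless worker
    worker.backlog << request
    true
  end
end

class DrainingWorker
  attr_reader :backlog, :handled

  def initialize(router)
    @router  = router
    @backlog = []
    @handled = []
    router.register(self)
  end

  # Order matters: once unregistered, no new request can arrive, so
  # "backlog is empty" can no longer become false behind our back.
  def graceful_shutdown
    @router.unregister(self)
    @handled << @backlog.shift until @backlog.empty?
  end
end
```

A request that shows up after `graceful_shutdown` has started is simply routed to another worker (or queued) instead of landing in a backlog that nobody will read again.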
Conclusion
Although Phusion Passenger has been powering many high-traffic Ruby websites for a while now, some people still have some doubts about whether Phusion Passenger is fit for production. Instead of using words to convince them, we would rather convince them with real results. Phusion Passenger 3 raises the bar in the areas of performance, stability, robustness and availability yet higher, but it doesn’t stop here. Please stay tuned for the next Technology Preview, in which we will unveil even more of Phusion Passenger 3.
The Road to Passenger 3: Technology Preview 1 – Performance
It has already been two years since we first released Phusion Passenger. Time sure flies and we’ve come a long way since then. We were the first to implement a working Ruby web app deployment solution that integrates seamlessly in the web server, and all the features that we’ve developed over time – smart spawning and memory reduction, upload buffering, Nginx support, etc – have served us for a long time. Nevertheless, it is time to say goodbye to the old Phusion Passenger 2.2 codebase. In the past we had focused primarily on three things:
- Ease of use.
- Stability.
- Robustness.
Notice that “performance” is not on the above list. We strived to make Phusion Passenger “fast enough”, i.e. not ridiculously slower than the alternatives. Lately it would appear that competitors are once again focusing on performance. We of course cannot afford to stay behind. We’ve been working on Phusion Passenger 3 for a while now. Today we will begin unveiling the technology behind this new major Phusion Passenger version. This blog post is the first of multiple technology previews to come.
Making Ruby threadable: properly handling context switching in native extensions
In the previous article Does Rails Performance Need an Overhaul? we had discussed the fact that proper Ruby threading is hindered by various broken native extensions. Writing a native extension for Ruby is pretty easy, however writing it right can not only be difficult, but can also be an obscure practice that requires l33t sk1llz because of the lack of documentation in this area. We’ve written several native extensions so far and in the process of figuring out how to make threading-friendly native extensions we had to wade through tons of Ruby source code. In this article I want to teach some best practices in the writing of threading-friendly native extensions.
Threading basics
As discussed in the previous article, Ruby 1.8 implements userspace threads, meaning that no matter how many Ruby threads you have, only one can run at a time, and only on a single CPU core. The threads are scheduled by Ruby itself, not by the operating system.
Ruby 1.9 implements native operating system threads. However it has a global interpreter lock which must be locked when the thread is running Ruby code. This effectively makes Ruby 1.9 single threaded most of the time.
With both Ruby 1.8 and 1.9 threads, system calls such as I/O operations can block the thread and prevent Ruby from context switching to another. Thus, system calls require special attention. Expensive calculations that do not involve system calls can also block the thread, but something can be done about those as well, as you will read later on.
Handling I/O
Suppose that you have a file descriptor on which you want to perform some potentially blocking I/O. The naive approach is to perform the I/O command anyway and risk blocking the entire Ruby process. This is exactly what makes the mysql extension thread-unfriendly: while waiting on MySQL no other threads can run, grinding your multi-threaded Rails web app to a halt.
However there are a number of functions in your arsenal that you can use to combat this problem. And as a general rule, you should set file descriptors to non-blocking mode.
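As a rough illustration of that last rule, a file descriptor can be put into non-blocking mode with fcntl(). This is a minimal sketch using only POSIX calls; the helper name set_nonblocking() is made up for this example and is not part of the Ruby API.

```c
#include <fcntl.h>

/* Switch a file descriptor to non-blocking mode while preserving its other
 * status flags. Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        return -1;
    }
    return 0;
}
```

After this call, a read() or write() on the descriptor that would otherwise block fails with EAGAIN or EWOULDBLOCK instead, which is exactly the situation the functions below help you handle.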
rb_thread_wait_fd(fd)
Just before performing a blocking read, you should call rb_thread_wait_fd() on the file descriptor that you’re reading from. On 1.8, this function marks the current thread as waiting for readable data on this file descriptor and then invokes the scheduler. The scheduler uses the select() system call to check which file descriptors are readable and then selects a thread which may continue. If the file descriptor that you were waiting on is not readable, then your thread will be suspended until the next time the scheduler is invoked and selects your thread. But even if the file descriptor is immediately readable, the scheduler does not guarantee that your thread will be selected immediately.
On 1.9, rb_thread_wait_fd() simply unlocks the global interpreter lock, calls the select() system call on the given file descriptor, and re-acquires the global interpreter lock when select() returns. While select() is blocking, other threads can run.
As an optimization, if only the main thread exists then this function does nothing. This applies to both 1.8 and 1.9.
rb_thread_fd_writable(fd)
This works the same as rb_thread_wait_fd(), but waits until the given file descriptor becomes writable. The single-thread optimization applies here too. You should call rb_thread_fd_writable() just before you perform a write I/O operation.
rb_thread_select()
To wait on multiple file descriptors, use this function instead of select() or poll(). Unlike the native system calls, this function will take care of invoking the scheduler or unlocking the global interpreter lock. Unlike rb_thread_wait_fd() and rb_thread_fd_writable(), there is no do-nothing-when-there’s-only-one-thread optimization here so it will always invoke the scheduler and call select().
rb_io_wait_readable()
I/O system calls can return a variety of error codes that indicate that you should restart the system call, such as EINTR (system call interrupted by signal) and EAGAIN (the file descriptor is set to non-blocking mode and the data is not yet available). You should therefore always call I/O system calls in a loop until it returns success or a different error code. You must however not forget to call rb_thread_wait_fd() or rb_thread_select() before you restart the system call, or you will risk blocking the thread again.
Ruby provides a function rb_io_wait_readable() to aid you in writing restart code. This function should be called right after your I/O reading system call has returned. It checks whether the system call should be restarted (returning Qtrue) or whether you should report an error (returning Qfalse). Here’s a code example:
int done = 0;
int ret;
/* Have the Ruby scheduler suspend this thread until the file descriptor becomes
 * readable; or if this is the only thread in the system, rb_thread_wait_fd() does
 * nothing and we immediately continue to the 'do' loop.
 */
rb_thread_wait_fd(fd);
do {
    /* Actually you should surround your system call with some more code, but
     * we'll get to this later. This example code is only partial. */
    ret = ...your read system call here...
    if (ret == -1) {
        if (rb_io_wait_readable(fd) == Qfalse) {
            ...throw an exception here...
        } /* else restart loop */
    } else {
        done = 1;
    }
} while (!done);
rb_io_wait_readable() checks whether errno equals EINTR or ERESTART, in which case it will call rb_thread_wait_fd() on the file descriptor and return Qtrue. If errno is EAGAIN or EWOULDBLOCK then it calls rb_thread_select() on the file descriptor and returns Qtrue. Otherwise it returns Qfalse.
The difference between calling rb_thread_wait_fd() and rb_thread_select() here is subtle, but important. The former only blocks (calls select() on the file descriptor) when there are multiple Ruby threads in the Ruby process, while the latter always blocks no matter what. This behavior is important because EAGAIN and EWOULDBLOCK occur when a non-blocking file descriptor is not yet readable; if we don’t block here on a select() then the code will enter a 100% CPU busy loop.
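To make the distinction concrete, here is a sketch of the same restart logic in plain POSIX C, without the Ruby API. The names wait_until_readable() and read_with_retry() are hypothetical helpers invented for this example; inside a real extension you would use the Ruby functions described above so that other Ruby threads get a chance to run.

```c
/* Ask for POSIX declarations even under a strict -std=c99/c11 build. */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Block in select() until fd is readable. Returns 0 on success, -1 on error.
 * fd must be smaller than FD_SETSIZE. */
int wait_until_readable(int fd) {
    fd_set rfds;
    int ret;
    do {
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        ret = select(fd + 1, &rfds, NULL, NULL, NULL);
    } while (ret == -1 && errno == EINTR);
    return ret == -1 ? -1 : 0;
}

/* Read up to len bytes from fd, restarting on EINTR and sleeping in select()
 * on EAGAIN/EWOULDBLOCK instead of spinning. Returns the number of bytes
 * read, 0 on end-of-file, or -1 on a real error. */
ssize_t read_with_retry(int fd, void *buf, size_t len) {
    for (;;) {
        ssize_t ret = read(fd, buf, len);
        if (ret >= 0) {
            return ret;
        }
        if (errno == EINTR) {
            continue;               /* interrupted by a signal: retry */
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            if (wait_until_readable(fd) == -1) {
                return -1;
            }
            continue;               /* readable now: retry the read */
        }
        return -1;                  /* any other errno is a real error */
    }
}
```

Note how the EAGAIN branch sleeps in select() rather than looping straight back to read(); that is what prevents the busy loop.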
rb_io_wait_writable()
Works the same way as rb_io_wait_readable(). Use this for I/O write operations instead.
Sleeping
Use rb_thread_wait_for() instead of sleep() or usleep(). On 1.8 rb_thread_wait_for() marks the current thread as sleeping for a period of time and then invokes the scheduler, which does not select this thread until the period of time has expired. On 1.9 Ruby unlocks the global interpreter lock, calls some sleeping function, and then re-locks it after that function returns.
Other non-I/O blocking system calls
Sometimes you will want to wait on a blocking system call that isn’t related to I/O, such as waitpid(). There are several ways to deal with these kinds of system calls.
Blocking outside the global interpreter lock
This method only works on Ruby 1.9. Unlock the global interpreter lock, do your thing, then re-lock it. Dealing with the global interpreter lock will be discussed later.
Non-blocking polling
Some system calls have non-blocking equivalents which return immediately instead of blocking. For example waitpid() blocks by default, but it can be made non-blocking by passing the WNOHANG flag, which causes it to return 0 immediately if no child process has exited yet. You must call the non-blocking version in a loop. Upon detecting that the call would have blocked, you must call rb_thread_polling(). On 1.8 this function lets the scheduler put the current thread to sleep for 60 msec, on 1.9 for 100 msec.
For example, Ruby’s Process.waitpid function does not block other threads. On 1.9 it simply unlocks the global interpreter lock while blocking on waitpid(). On 1.8 it is implemented as follows (simplified version):
int result;
retry:
result = waitpid(..., WNOHANG);
if (result < 0) {
    if (errno == EINTR) {
        /* Interrupted by a signal. Tell the scheduler and then restart the call. */
        rb_thread_polling();
        goto retry;
    } else {
        ...throw exception...
    }
} else if (result == 0) {
    /* Process isn't ready yet. Tell the scheduler and then restart the call. */
    rb_thread_polling();
    goto retry;
}
The actual code is more optimized than this. For example if there's only a single thread in the system then it calls waitpid() without WNOHANG and just lets it block.
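Outside of Ruby, the same polling pattern looks roughly like the sketch below. It is not Ruby's actual implementation: waitpid_polling() is a hypothetical helper, and nanosleep() with a 60 msec pause stands in for rb_thread_polling().

```c
/* Ask for POSIX declarations even under a strict -std=c99/c11 build. */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Wait for child `pid` without ever blocking inside waitpid(): poll with
 * WNOHANG and pause briefly between attempts. Returns the pid on success
 * or -1 on error (for example ECHILD if there is no such child). */
pid_t waitpid_polling(pid_t pid, int *status) {
    /* 60 msec, mirroring the 1.8 polling interval mentioned in the text. */
    const struct timespec interval = { 0, 60 * 1000 * 1000 };

    for (;;) {
        pid_t result = waitpid(pid, status, WNOHANG);
        if (result > 0) {
            return result;          /* the child has exited */
        }
        if (result == -1 && errno != EINTR) {
            return -1;              /* a real error */
        }
        /* result == 0 (child still running) or EINTR: pause, then retry.
         * A Ruby 1.8 extension would call rb_thread_polling() here. */
        nanosleep(&interval, NULL);
    }
}
```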
Calling the system call in a native OS thread and use I/O to report results
This is probably the most complex way but on 1.8 sometimes you don't have any choice. On 1.9 you should always prefer unlocking the global interpreter lock over this method.
Create a pipe, then spawn a native OS thread which calls the system call. When the system call is done, have your native thread report the result back via the pipe. On the Ruby side, use rb_thread_wait_fd() and friends to block on the pipe and then receive the results. Be sure to join the thread after you've read the result because rb_thread_wait_fd() does not necessarily block until there is data, so when rb_thread_wait_fd() returns it is not guaranteed that the thread has returned yet.
Another thing to watch out for is that your thread must not refer to data that's on the Ruby thread's stack. This is because Ruby overwrites the main OS thread's C stack upon context switching to another Ruby thread. For example code like this is not OK:
static void thread_main(int *value) {
    /* 'value' here refers to the 'value' variable on foobar's stack, but
     * that data is overwritten when Ruby context switches, so we
     * really can't use 'value' here!
     */
}
/* Native extension Ruby method. */
static void foobar() {
    int value = 1234;
    thread_t thread = create_a_thread(thread_main, &value);
    ...do something which can cause a Ruby thread context switch...
    join_thread(thread);
}
To pass data to the thread, you should put the data on the heap instead of the stack. This is OK:
typedef struct {
    ...
} Data;
static void thread_main(Data *data) {
    /* 'data' is safe to access. */
}
/* Native extension Ruby method. */
static void foobar() {
    Data *data = malloc(sizeof(Data));
    thread_t thread = create_a_thread(thread_main, data);
    ...do something which can cause a Ruby thread context switch...
    join_thread(thread);
    free(data);
}
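Putting the pieces together, a sketch of the whole approach with POSIX threads could look like this. Everything here is illustrative: Job, slow_operation(), worker_main() and run_in_native_thread() are names invented for this example, slow_operation() merely stands in for your blocking system call, and the plain read() marks the spot where a Ruby 1.8 extension would first call rb_thread_wait_fd().

```c
/* Ask for POSIX declarations even under a strict -std=c99/c11 build. */
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Heap-allocated so that the worker never touches the caller's stack. */
typedef struct {
    int input;
    int result_fd;      /* write end of the pipe */
} Job;

/* Hypothetical stand-in for a blocking system call or long computation. */
static int slow_operation(int input) {
    return input * 2;
}

static void *worker_main(void *arg) {
    Job *job = arg;
    int result = slow_operation(job->input);
    ssize_t written = write(job->result_fd, &result, sizeof(result));
    (void) written;     /* on failure the reader sees a short read or EOF */
    close(job->result_fd);
    return NULL;
}

/* Runs slow_operation(input) in a native thread and waits for the result
 * on a pipe. Returns 0 and stores the result in *out, or -1 on failure. */
int run_in_native_thread(int input, int *out) {
    int fds[2];
    pthread_t thread;
    Job *job;
    ssize_t n;

    if (pipe(fds) == -1) {
        return -1;
    }
    job = malloc(sizeof(*job));
    if (job == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    job->input = input;
    job->result_fd = fds[1];
    if (pthread_create(&thread, NULL, worker_main, job) != 0) {
        free(job);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* A Ruby 1.8 extension would call rb_thread_wait_fd(fds[0]) before this
     * read so that other Ruby threads keep running in the meantime. */
    n = read(fds[0], out, sizeof(*out));
    pthread_join(thread, NULL);     /* join before freeing what it used */
    free(job);
    close(fds[0]);
    return n == (ssize_t) sizeof(*out) ? 0 : -1;
}
```

The job is freed only after the thread has been joined, so the worker can never see freed memory.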
Heavy CPU computations
Blocking system calls are not the only thing that can block other threads; CPU-heavy computation code can do so as well. While executing non-Ruby-API C code, context switching to other threads is not possible. Calls to Ruby APIs may sometimes cause context switching. However there are several ways to make context switching possible while running CPU-heavy computations.
Unlocking the global interpreter lock
This only works on 1.9. Unlock the global interpreter lock and then call the computation code, and relock when done. Consider BCrypt-Ruby as an example. BCrypt is a very heavy hashing algorithm used for securely hashing passwords; depending on the configured cost it could need several minutes to calculate a hash. We've recently patched BCrypt-Ruby to unlock the global interpreter lock while running the BCrypt algorithm, so that when you run BCrypt-Ruby in multiple threads the algorithms can be spread across multiple CPU cores.
However, be aware of the fact that unlocking and relocking the global interpreter lock comes with some overhead as well. Unlocking and relocking the global interpreter lock is only worth it if you know that the computation is going to take a while (say, longer than 50 msec). If the computation time is short then you will actually make your code slower because of all the locking overhead. Therefore BCrypt-Ruby only unlocks the global interpreter lock if the BCrypt cost is set to 9 or higher.
Explicit yielding
You can call rb_thread_schedule() once in a while to force context switching to another thread. However this approach does not allow your code to make use of multiple cores even if you're on 1.9.
Running the C code in a native OS thread
This is pretty much the same approach as described by "Calling the system call in a native OS thread and use I/O to report results". In my opinion, unless your computation takes a very long time, implementing this is almost never worth the trouble. For BCrypt-Ruby we didn't bother: if you want multi-core support in BCrypt-Ruby you need to be on 1.9.
TRAP_BEG/TRAP_END and the global interpreter lock
TRAP_BEG and TRAP_END
On 1.8, you should surround system calls with calls to TRAP_BEG and TRAP_END. TRAP_BEG performs some preparation work. TRAP_END performs a variety of things:
- It checks whether there are any pending signals, e.g. whether the user pressed Ctrl-C. If so it will raise an appropriate SignalException.
- It also calls the scheduler if a certain amount of time has been spent on the current thread.
On 1.9 TRAP_BEG and TRAP_END are macros that unlock and lock the global interpreter lock. However these macros are deprecated and are likely to disappear in the future so you should not use them on 1.9. Instead, you should use rb_thread_blocking_region().
On 1.9 TRAP_BEG and TRAP_END are defined in ruby/backward/rubysig.h.
rb_thread_blocking_region()
This is a 1.9-specific function which allows you to call a function outside the global interpreter lock. Its declaration is as follows:
VALUE rb_thread_blocking_region(rb_blocking_function_t *func, void *data1,
                                rb_unblock_function_t *ubf, void *data2);
func is a pointer to a function that is to be called outside the global interpreter lock. This function must look similar to:
VALUE foobar(void *data)
The data passed via the data1 parameter is passed to the function.
ubf is either RUBY_UBF_IO (indicating that you're performing some kind of I/O operation) or RUBY_UBF_PROCESS (indicating that you're calling some kind of process management system call). However I'm not sure what this parameter exactly does. data2 is supposedly passed to ubf when it's called.
The return value of this function is the return value of func.
Global interpreter lock caveats
Do not call any Ruby API functions while the global interpreter lock is unlocked! No rb_yield(), rb_str_new(), or anything. The entirety of the Ruby API is only safe to call while the global interpreter lock is held.
Does Rails Performance Need an Overhaul?
Igvita.com has recently published the article Rails Performance Needs an Overhaul. Rails performance… no, Ruby performance… no Rails scalability… well something is being criticized here. From my experience, talking about scalability and performance can be a bit confusing because the terms can mean different things to different people and/or in different situations, yet the meanings are used interchangeably all the time. In this post I will take a closer look at Igvita’s article.
Performance vs scalability
Let us first define performance and scalability. I define performance as throughput: the number of requests per second. I define scalability as the number of users a system can concurrently handle. There is a correlation between performance and scalability. Higher performance means each request takes less time, and so is more scalable, right? Sometimes yes, but not necessarily. It is entirely possible for a system to be scalable yet have a lower throughput than a system that’s not as scalable, or for a system to be uber-fast yet not very scalable. Throughout this blog post I will show several examples that highlight the difference.
“Scalability” is an extremely loaded word and people often confuse it with “being able to handle tons and tons of traffic”. Let’s use a different term that better reflects what Igvita’s actually criticizing: concurrency. Igvita claims that concurrency in Ruby is pathetic while referring to database drivers, Ruby application servers, etc. Some practical examples that demonstrate what he means are as follows.
Limited concurrency at the app server level
Mongrel, Phusion Passenger and Unicorn all use a “traditional” multi-process model in which multiple Ruby processes are spawned, each process handling a single request at a time. Thus, concurrency is (assuming that the load balancer has infinite concurrency) limited by the number of Ruby processes: having 5 processes allows you to handle 5 users concurrently.
Threaded servers, where the server spawns multiple threads, each handling 1 connection concurrently, allow more concurrency because it’s possible to spawn a whole lot more threads than processes. In the context of Ruby, each Ruby process needs to load its own copy of the application code and other resources, so memory usage increases very quickly as you spawn additional processes. Phusion Passenger with Ruby Enterprise Edition solves this problem somewhat by using copy-on-write optimizations which save memory, so you can spawn a few more processes, but not significantly (as in 10x) more. In contrast, a multi-threaded app server does not need as much memory because all threads share application code with each other so you can comfortably spawn tens or hundreds of threads. At least, this is the theory. I will later explain why this does not necessarily hold for Ruby.
When it comes to performance however, there’s no difference between processes and threads. If you compare a well-written multi-threaded app server with 5 threads to a well-written multi-process app server with 5 processes, you won’t find either being more performant than the other. Context switch overhead is roughly the same for processes and threads. Each process can use a different CPU core, as can each thread, so there’s no difference in multi-core utilization either. This reflects back on the difference between scalability/concurrency and performance.
Multi-process Rails app servers have a concurrency level that can be counted with a single hand, or if you have very beefy hardware, a concurrency level in the range of a couple of tens, thanks to the fact that Rails needs about 25 MB per process. Multi-threaded Rails app servers can in theory spawn a couple of hundred threads. After that it’s also game over: an operating system thread needs a couple MB of stack space, so after a couple of hundred threads you’ll run out of virtual address space on 32-bit systems even if you don’t actually use that much memory.
There is another class of servers, the evented ones. These servers are actually single-threaded, but they use a reactor style I/O dispatch architecture for handling I/O concurrency. Examples include Node.js, Thin (built on EventMachine) and Tornado. These servers can easily have a concurrency level of a couple of thousand. But due to their single-threaded nature they cannot effectively utilize multiple CPU cores, so you need to run a couple of processes, one per CPU core, to fully utilize your CPU.
The limits of Ruby threads
Ruby 1.8 uses userspace threads, not operating system threads. This means that Ruby 1.8 can only utilize a single CPU core no matter how many Ruby threads you create. This is why one typically needs multiple Ruby processes to fully utilize one’s CPU cores. Ruby 1.9 finally uses operating system threads, but it has a global interpreter lock, which means that each time a Ruby 1.9 thread is running it will prevent other Ruby threads from running, effectively making it the same multicore-wise as 1.8. This is also explained in an earlier Igvita article, Concurrency is a Myth in Ruby.
On the bright side, not all is bad. Ruby 1.8 internally uses non-blocking I/O while Ruby 1.9 unlocks the global interpreter lock while doing I/O. So if one Ruby thread is blocked on I/O, another Ruby thread can continue execution. Likewise, Ruby is smart enough to cause things like sleep() and even waitpid() to preempt to other threads.
On the dark side however, Ruby internally uses the select() system call for multiplexing I/O. select() can only handle 1024 file descriptors on most systems so Ruby cannot handle more than this number of sockets per Ruby process, even if you are somehow able to spawn thousands of Ruby threads. EventMachine works around this problem by bypassing Ruby’s I/O code completely.
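The limit comes from the fixed size of the fd_set structure that select() operates on: it can only describe file descriptors numbered below the FD_SETSIZE constant, which is commonly 1024. A small sketch of the check (fits_in_fd_set() is a hypothetical helper, not something Ruby provides):

```c
#include <sys/select.h>

/* select() can only watch descriptors numbered below FD_SETSIZE; passing a
 * larger descriptor to FD_SET() is undefined behavior. */
int fits_in_fd_set(int fd) {
    return fd >= 0 && fd < FD_SETSIZE;
}
```

Since descriptor numbers are handed out from the lowest free number upwards, a process that already has that many files and sockets open will receive descriptors that no longer pass this check.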
Naive native extensions and third party libraries
So just run a couple of multi-threaded Ruby processes, one process per core and multiple threads per process, and all is fine and we should be able to have a concurrency level of up to a couple hundred, right? Well not quite, there are a number of issues hindering this approach:
- Some third party libraries and Rails plugins are not thread-safe. Some aren’t even reentrant. For example Rails < 2.2 suffered from this problem. The app itself might not be thread-safe.
- Although Ruby is smart enough not to let I/O block all threads, the same cannot be said of all native extensions. The MySQL extension is the most infamous example: when executing queries, other threads cannot run.
Mongrel is actually multi-threaded but in practice everybody uses it in multi-process mode (mongrel_cluster) exactly because of these problems. It is also the reason why Phusion Passenger has gone the multi-process route.
And even though Thin is evented, a typical Ruby web application running on Thin cannot handle thousands of concurrent users. This is because evented servers typically require a special evented programming style, such as the one seen in Node.js and EventMachine. A Ruby web app that is written in an evented style running on Thin can definitely handle a large number of concurrent users.
When is limited application server concurrency actually a problem?
Igvita is clearly disappointed at all the issues that hinder Ruby web apps from achieving high concurrency. For many web applications I would however argue that limited concurrency is not a problem.
- Web applications that are slow, as in CPU-heavy, max out CPU resources pretty quickly so increasing concurrency won’t help you.
- Web applications that are fast are typically quick enough at handling the load so that even large number of users won’t notice the limited concurrency of the server.
Having a concurrency of 5 does not mean that the app server can only handle 5 requests per second; it’s not hard to serve hundreds of requests per second with only a couple of single-threaded processes.
The problem becomes most evident for web applications that have to wait a lot for I/O (besides their own HTTP request/response cycle). Examples include:
- Apps that have to spend a lot of time waiting on the database.
- Apps that perform a lot of external HTTP calls that respond slowly.
- Chat apps. These apps typically have thousands of users, most of them doing nothing most of the time, but they all require a connection (unless your app uses polling, but that’s a whole different discussion).
We at Phusion have developed a number of web applications for clients that fall in the second category, the most recent one being a Hyves gadget. Hyves is the most popular social network in the Netherlands and they get thousands of concurrent visitors during the day. The gadget that we’ve developed has to query external HTTP servers very often, and these servers can take 10 seconds to respond in extreme cases. The servers are running Phusion Passenger with maybe a couple tens of processes. If every request to our gadget also causes us to wait 10 seconds for the external HTTP call then we’d soon run out of concurrency.
But even suppose that our app and Phusion Passenger can have a concurrency level of a couple of thousand, all of those visitors will still have to wait 10 seconds for the external HTTP calls, which is obviously unacceptable. This is another example that illustrates the difference between scalability and performance. We had solved this problem by aggressively caching the results of the HTTP calls, minimizing the number of external HTTP calls that are necessary. The result is that even though the application’s concurrency is fairly limited, it can still comfortably serve many concurrent users with a reasonable response time.
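The arithmetic behind both observations can be sketched as follows. This is a back-of-the-envelope upper bound that assumes each process handles one request at a time and ignores queueing and other overhead; the function name and the concrete numbers (10 msec per request, 20 processes) are assumptions chosen for illustration.

```c
/* Upper bound on requests per second for a pool of workers that each
 * handle one request at a time: every worker completes
 * 1000 / ms_per_request requests per second. */
double max_requests_per_second(int workers, double ms_per_request) {
    return workers * 1000.0 / ms_per_request;
}
```

With these assumed numbers, max_requests_per_second(5, 10.0) gives 500 requests per second, while max_requests_per_second(20, 10000.0) gives only 2, which is why slow external calls hurt so much more than a low process count.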
This anecdote should explain why I believe that web apps can get very far despite having a limited concurrency level. That said, as Internet usage continues to increase and websites get more and more users, we may at some time come to a point where a much larger concurrency level is required than most of our current Ruby tools allow (assuming server capacity doesn’t scale quickly enough).
What was Igvita.com criticizing?
Igvita.com does not appear to be criticizing Ruby or Rails for being slow. It doesn’t even appear to be criticizing the lack of Ruby tools for achieving high concurrency. It appears to be criticizing these things:
- Rails and most Ruby web application servers don’t allow high concurrency by default.
- Many database drivers and libraries hinder concurrency.
- Although alternatives exist that allow concurrency, you have to go out of your way to find them.
- There appears to be little motivation in the Ruby community for making the entire stack of web framework + web app server + database drivers etc scalable by default.
This is in contrast to Node.js where everything is scalable by default.
Do I understand Igvita’s frustration? Absolutely. Do I agree with it? Not entirely. The same thing that makes Node.js so scalable is also what makes it relatively hard to program for. Node.js enforces a callback style of programming and this can eventually make your code look a lot more complicated and harder to read than regular code that uses blocking calls. Furthermore, Node.js is relatively young – of course you won’t find any Node.js libraries that don’t scale! But if people ever use Node.js for things other than high-concurrency server apps, then non-scalable libraries will at some time pop up. And then you will have to look harder to avoid these libraries. There is no silver bullet.
That said, all would be well if at least the preferred default stack could handle high concurrency by default. This means e.g. fixing the MySQL extension and having the fix published by upstream. The mysqlplus extension fixes this but for some reason their changes aren’t accepted and published by the original author, and so people end up with a multi-thread-killing database driver by default.
Is Node.js innovative? Is Ruby lacking innovation?
A minor gripe that I have with the article is that Igvita calls Node.js innovative while seemingly implying that the Ruby stack isn’t innovating. Evented servers like Node.js have actually been around for years and the evented pattern was well known long before Ruby or JavaScript became popular. Thin is also evented and predates Node.js by several years. Thin and EventMachine also allow Node.js-style evented programming. The only innovation that Node.js brings, in my opinion, is the fact that it’s JavaScript. The other “innovation” is the lack of non-scalable libraries.
Conclusion
Igvita appears to be criticizing something other than Rails performance, which is what his article’s title would imply.
I don’t think the concurrency levels that the Rails stack provides by default are that bad in practice. But as a fellow programmer, it does intuitively bother me that our laptops, which are a million times more powerful than supercomputers from two decades ago, cannot comfortably handle a couple of thousand concurrent users. We can definitely work towards something better, but in the meantime let’s not forget that the current stack is more than capable of Getting Work Done(tm).
The Road to Passenger 3: Technology Preview 1 – Performance
This post has been moved: http://blog.phusion.nl/2010/06/10/the-road-to-passenger-3-technology-preview-1-performance-2/
Ruby Enterprise Edition 1.8.7-2010.02 released
It has been a while since the last REE release. We apologize for not releasing earlier, it’s been very busy for us lately. Nonetheless, a number of important issues have motivated us to release again, including various Rails 3 compatibility issues. Read on for more information.
What is Ruby Enterprise Edition?
Ruby Enterprise Edition (REE) is a server-oriented distribution of the official Ruby interpreter, and includes various additional enhancements, such as:
- A “copy-on-write friendly” garbage collector, capable of reducing Ruby on Rails applications’ memory usage by 33% on average.
- The tcmalloc memory allocator, which lowers overall memory usage and boosts memory allocation speed.
- The ability to performance tune the garbage collector.
- The MBARI patch set, for improved garbage collection efficiency.
- The zero-copy context switching patch, included as an experimental feature.
- Various analysis and debugging features.
REE can be easily installed in parallel to your existing Ruby interpreter, allowing you to switch to REE with minimal hassle or risk. REE has been out for about a year now and is already used by many high-profile websites and organizations, such as New York Times, Shopify and 37signals.
“We switched to enterprise ruby to get the full benefit of the [copy-on-write] memory characteristics and we can absolutely confirm the memory savings of 30% some others have reported. This is many thousand dollars of savings even at today’s hardware prices.”
– Tobias Lütke (Shopify)
Ruby Enterprise Edition is 100% open source.
Changes
- Upgraded to Ruby 1.8.7-p249
- The previous REE release was based on 1.8.7-p248. p249 hasn’t changed much: it only includes some WEBrick fixes.
- Upgraded to RubyGems 1.3.7
- The previous REE release included RubyGems 1.3.5. 1.3.7 is required by the latest version of Bundler as well as Rails 3.
- Backported various bug fixes
- The following bugs have been fixed by upstream Ruby, but the fixes have not yet been released, i.e. they are not part of the latest Ruby 1.8.7-p249 release. We’ve backported these fixes because they solve important compatibility issues.
- Fixed a Marshal bug that was apparently caused by GCC optimizations. This is a major bug that appears to be responsible for all the REE crash bug reports of late. It is so severe that the Rails 3 documentation actually recommends not using 1.8.7-p248 and 1.8.7-p249:
“Note that Ruby 1.8.7 p248 and p249 has marshaling bugs that crash Rails 3.0.0. Ruby 1.9.1 outright segfaults on Rails 3.0.0, so if you want to use Rails 3 with 1.9.x, jump on 1.9.2 trunk for smooth sailing.”
Ruby bug #2557. Given that Ruby 1.8 is still so widely used, being forced to use Ruby 1.9.2 (-dev version even) is not such a good thing. With these backports Rails 3 should be once again usable on 1.8, at least until upstream releases a new version with the fix.
- Fixed an “undefined method `closed?’ for nil:NilClass” Net::HTTP bug. Ruby issue #2708 and REE issue #35.
- Fixed a bug where the ‘super’ keyword doesn’t behave correctly. Ruby issue #2537 and REE issue #40.
- Fixed various FreeBSD issues
- REE on FreeBSD would occasionally crash with a bizarre “Illegal Instruction” error. After some tough investigation, it appears that the problem is caused by the MBARI patch set in combination with some FreeBSD oddities. MBARI tries to reserve the upper part of the system stack for the garbage collector. In order to do this, it queries the OS for the size of the stack. FreeBSD reports a large size (on 64-bit FreeBSD it reports 512 MB by default), but in reality only about 4 MB can be used: if you go over that, the process crashes. We’ve fixed this issue by limiting the stack usage to 4 MB when on FreeBSD.
- Fixed some long-standing iconv installation bugs. The iconv Ruby extension is used by various important parts of Ruby and Rails. FreeBSD installs the iconv .h header files into /usr/local/include, but gcc doesn’t look in this location by default, and neither does the iconv extension’s extconf.rb. We’ve modified the REE installer to force the compiler to look in /usr/local/include while installing the iconv extension.
- Added a bootstrap binary for x86_64 FreeBSD 8. This means that on this platform you don’t have to install Ruby first before you can run the REE installer (which is written in Ruby).
- Rational and gcd performance improvement patches
- Kurt Stephens has contributed a set of patches which dramatically improve the performance of the Rational class and the #gcd method. Rational performance has been improved by over 50%. Ruby issue #2561 and REE issue #23.
- Various other minor bug fixes
- GEM_HOME, GEM_PATH and RUBYOPT are unset before running the installer so that those options can’t interfere with installation.
- RUBY_HEAP_SLOTS_GROWTH_FACTOR wasn’t properly parsed as a floating point number. This has now been fixed.
- Fixed OpenSSL compilation problems. Patch contributed by hso@nosneros.net. REE issue #39.
- More Ubuntu packages
- We now provide packages for:
- Ubuntu 8.04 32-bit
- Ubuntu 8.04 64-bit
- Ubuntu 10.04 32-bit
- Ubuntu 10.04 64-bit
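The FreeBSD stack fix listed among the changes above can be illustrated with a small Ruby sketch. This is only an illustration of the idea: the actual fix lives in REE’s C sources (the MBARI patch set), and the names `FREEBSD_STACK_CAP` and `usable_stack_bytes`, as well as the `platform` parameter, are hypothetical.

```ruby
# Hypothetical illustration; the real fix is in REE's C code, not in Ruby.
FREEBSD_STACK_CAP = 4 * 1024 * 1024 # 4 MB, the usable size mentioned above

# Returns the stack size (in bytes) to plan for, given what the OS reports.
# `platform` is a string in the style of RUBY_PLATFORM.
def usable_stack_bytes(reported, platform = RUBY_PLATFORM)
  # On other platforms, trust the size reported by the OS.
  return reported unless platform =~ /freebsd/i
  # On FreeBSD, never plan for more than the cap.
  [reported, FREEBSD_STACK_CAP].min
end

# Query the OS for the soft stack limit, where the platform supports it.
if Process.respond_to?(:getrlimit) && defined?(Process::RLIMIT_STACK)
  soft_limit, _hard_limit = Process.getrlimit(Process::RLIMIT_STACK)
  puts usable_stack_bytes(soft_limit)
end
```

The point of the clamp is that the reported size is treated as an upper bound rather than as a promise: on FreeBSD the smaller of the two values wins.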
Download & upgrade
To install Ruby Enterprise Edition, please visit the download page. To upgrade from a previous version, simply install into the same prefix that you installed to last time. Please also refer to the documentation for upgrade instructions.
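As noted in the changes, the installer now unsets GEM_HOME, GEM_PATH and RUBYOPT before it runs. If you drive an installation from your own Ruby script, the same idea could be sketched as follows. This is not the installer’s actual code; `INTERFERING_VARS` and `sanitized_env` are hypothetical names.

```ruby
# Variables that the changelog says are unset before the installer runs.
INTERFERING_VARS = %w[GEM_HOME GEM_PATH RUBYOPT]

# Returns a copy of `env` without the interfering variables.
# The hash that was passed in is left untouched.
def sanitized_env(env = ENV.to_hash)
  env.reject { |name, _value| INTERFERING_VARS.include?(name) }
end

# Example: inspect which variables would be dropped from the current environment.
dropped = ENV.to_hash.keys - sanitized_env.keys
puts "Would unset: #{dropped.join(', ')}" unless dropped.empty?
```

Working on a copy instead of mutating ENV keeps the calling process unaffected; the cleaned hash can then be handed to whatever child process performs the installation.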
Phusion Passenger 2.2.14 released
Just hours after releasing 2.2.12, some changes were made that warranted a new release. And so we uploaded the 2.2.13 gem, but before we could post the announcement some people contributed patches that warranted yet another release. So we’ve decided to skip the 2.2.13 announcement altogether and jump straight to 2.2.14.
Changes since 2.2.12
- Fixed some Rails 3 compatibility issues that were recently introduced.
- About a week ago the Rails team committed a change which broke our Rails loader. This has now been fixed. Rails 3 remains a moving target but we’ll keep moving along with it.
- [Nginx] Fix a localtime() crash on FreeBSD
- This was caused by insufficient stack space for threads. Issue #499.
- Added support for Rubinius
- Patch contributed by Evan Phoenix.
- Fixed a mistake in the SIGQUIT backtrace message.
- Patch contributed by Christoffer Sawicki.
- Fixed a typo that caused config/setup_load_paths.rb not to be loaded correctly.
- This is related to the new Bundler support.





