
Phusion Passenger 4.0.2 released

By Hongli Lai on May 7th, 2013

Phusion Passenger is software that deploys Ruby and Python web apps by integrating into Apache and Nginx and turning them into a fully-featured application server. It is very fast, stable and robust, and is thus used by the likes of the New York Times, AirBnB, Symantec, Pixar, etc. It comes with many features that make your life easier and your application perform better.

We are releasing an emergency release in response to a recently discovered remote code execution vulnerability in Nginx (CVE-2013-2028). Many versions of Nginx 1.3, as well as Nginx 1.4.0, are affected. Phusion Passenger 4.0.2 installs Nginx 1.4.1 by default. There are no other code changes.

Installing 4.0.2

Quick install/upgrade

Phusion Passenger Enterprise users can download the Enterprise version of 4.0.2 from the Customer Area.

Open source users can install the open source version of 4.0.2 with the following commands:

gem install passenger
passenger-install-apache2-module
passenger-install-nginx-module

You can also download the tarball at Google Code. We strongly encourage you to cryptographically verify files after downloading them.

In-depth instructions

In-depth installation and upgrade instructions can be found in the Installation section of the documentation. The documentation has been updated to cover 4.0 changes, including Enterprise features. You can view them online here:

Final

If you would like to stay up to date with Phusion news, please fill in your name and email address below and sign up for our newsletter. We won’t spam you, we promise.



Phusion Passenger 4.0.1 final release

By Hongli Lai on May 6th, 2013


Phusion Passenger is software that deploys Ruby and Python web apps by integrating into Apache and Nginx and turning them into a fully-featured application server. It is very fast, stable and robust, and is thus used by the likes of the New York Times, AirBnB, Symantec, Pixar, etc. It comes with many features that make your life easier and your application perform better.

After a period of being in beta, we’re proud to announce the first stable release of the Phusion Passenger 4 series. The 4.x series is a huge improvement over the 3.x series: during the development of 4.0, we’ve introduced a myriad of changes which we’ve covered in past beta preview articles:



The beta period took a while because we wanted to ensure that the first stable release is indeed rock solid. People tend to say that one should skip “x.0.0” releases and wait until “x.0.1” for the first bug fixes. But we’re confident enough about the stability of the 4.x series that we gave this first release the version number 4.0.1.

Changes in 4.0.1

Compared to 4.0.0 RC 6, the following changes have been introduced:

  • Fixed a crasher bug in the Deployment Error Resistance feature.
  • Fixed a bug in PassengerDefaultUser and PassengerDefaultGroup.
  • Fixed a bug which could cause application processes to exit before they’ve finished their request.
  • Fixed some small file descriptor leaks.
  • Bumped the preferred Nginx version to 1.4.0.
  • Editing the Phusion Passenger Standalone Nginx config template is no longer discouraged.
  • Improved documentation.

Installing and testing 4.0.1

Quick install/upgrade

Phusion Passenger Enterprise users can download the Enterprise version of 4.0.1 from the Customer Area.

Open source users can install the open source version of 4.0.1 with the following commands:

gem install passenger
passenger-install-apache2-module
passenger-install-nginx-module

You can also download the tarball at Google Code. All our gems and tarballs can be cryptographically verified.

In-depth instructions

In-depth installation and upgrade instructions can be found in the Installation section of the documentation. The documentation has been updated to cover 4.0 changes, including Enterprise features. You can view them online here:

Final

We would like to thank everybody who has helped with testing the betas and release candidates so far, and we would like to thank our Enterprise customers. We couldn’t have done it without you!

4.0.1 is just the beginning though. We have many exciting changes in the pipeline.



Phusion Passenger 4.0 Release Candidate 6

By Hongli Lai on April 9th, 2013

Phusion Passenger turns Apache and Nginx into a full-featured application server for Ruby and Python web apps. It has a strong focus on ease of use, stability and performance. Phusion Passenger is built on top of tried-and-true, battle-hardened Unix technologies, yet at the same time introduces innovations not found in most traditional Unix servers. Since mid-2012, it aims to be the ultimate polyglot application server.

Today we are pleased to announce Release Candidate 6 of Phusion Passenger 4.0. The 4.x series is a huge improvement over the 3.x series: during the development of 4.0, we’ve introduced a myriad of changes which we’ve covered in past beta preview articles:

Release Candidate 5 was a private interim release for Phusion Passenger Enterprise customers only.

Changes in 4.0 RC 5 and RC 6

The most important changes in RC 5 and RC 6 are as follows:

  • The default config snippet for Apache has changed! It must now contain a PassengerDefaultRuby option. The installer has been updated to output this option. The PassengerRuby option still exists, but it’s only used for configuring different Ruby interpreters in different contexts. Please refer to the manual for more information.
  • We now provide GPG digital signatures for all file releases by Phusion. More information can be found in the manual.
  • WebSocket support on Nginx. Requires Nginx >= 1.3.15.
  • passenger-status now displays process memory usage and time when it was last used. The latter fixes issue #853.
  • Exceptions in Rack application objects are now caught to prevent application processes from exiting.
  • The passenger-config tool now supports the --ruby-command argument, which helps the user with figuring out the correct Ruby command to use in case s/he wants to use multiple Ruby interpreters. The manual has also been updated to mention this tool.
  • Fixed streaming responses on Apache.
  • Worked around an OS X Unix domain socket bug. Fixes issue #854.
  • Out-of-Band Garbage Collection now works properly when the application has disabled garbage collection. Fixes issue #859.
  • Fixed support for /usr/bin/python on OS X. Fixes issue #855.
  • Fixed looping-without-sleeping in the ApplicationPool garbage collector if PassengerPoolIdleTime is set to 0. Fixes issue #858.
  • Fixed some process memory usage measurement bugs.
  • Fixed process memory usage measurement on NetBSD. Fixes issue #736.
  • Fixed a file descriptor leak in the Out-of-Band Work feature. Fixes issue #864.
  • The PassengerPreStart helper script now uses the default Ruby interpreter specified in the web server configuration, and no longer requires a ruby command to be in $PATH.
  • Updated preferred PCRE version to 8.32.
  • Worked around some RVM bugs and generally improved RVM support.
  • The ngx_http_stub_status_module is now enabled by default.
  • Performance optimizations.

Installing and testing 4.0.0 Release Candidate 6

Quick install/upgrade

Phusion Passenger Enterprise users can download the Enterprise version of 4.0 RC 6 from the Customer Area.

Open source users can install the open source version of 4.0 RC 6 with the following commands:

gem install passenger --pre
passenger-install-apache2-module
passenger-install-nginx-module

You can also download the tarball at Google Code.

In-depth instructions

In-depth installation and upgrade instructions can be found in the Installation section of the documentation. The documentation has been updated to cover 4.0 changes, including Enterprise features. You can view them online here:

Final

We are excited about the final release. You can help us by testing RC 6 and reporting any bugs. Please submit bug reports to our bug tracker.



Tuning Phusion Passenger’s concurrency settings

By Hongli Lai on March 12th, 2013

Phusion Passenger turns Apache and Nginx into a full-featured application server for Ruby and Python web apps. It has a strong focus on ease of use, stability and performance. Phusion Passenger is built on top of tried-and-true, battle-hardened Unix technologies, yet at the same time introduces innovations not found in most traditional Unix servers. Since mid-2012, it aims to be the ultimate polyglot application server.

We recently received a support inquiry from a Phusion Passenger Enterprise customer regarding excessive process creation activity. During peak times, Phusion Passenger would suddenly create a lot of processes, making the server slow or unresponsive for a period of time. This is because Phusion Passenger spawns and shuts down application processes according to traffic, but they apparently had irregular traffic patterns during peak times. Since their servers were dedicated to a single application, the solution was to make the number of processes constant regardless of traffic. This could be done by setting PassengerMinInstances to a value equal to PassengerMaxPoolSize.

The customer then raised the question: what is the best value for PassengerMaxPoolSize? This is a non-trivial question, and the answer encompasses more than just PassengerMaxPoolSize. In this article we’re going to shed more light on this topic.

For simplicity, we assume that your server only hosts 1 web application. Things become more complicated when more web applications are involved, but the principles in this article also apply to multi-application server environments.

Aspects of concurrency tuning

The goal of tuning is usually to maximize throughput. Increasing the number of processes or threads increases the maximum throughput and concurrency, but there are several factors that should be kept in mind.

  • Memory. More processes imply higher memory usage. If too much memory is used then the machine will hit swap, which slows everything down. You should only have as many processes as memory limits comfortably allow. Threads use less memory, so prefer threads when possible. You can create tens of threads in place of one process.
  • Number of CPUs. True (hardware) concurrency cannot be higher than the number of CPUs. In theory, if all processes/threads on your system use the CPUs constantly, then:

    • You can increase throughput up to NUMBER_OF_CPUS processes/threads.
    • Increasing the number of processes/threads after that point will increase virtual (software) concurrency, but will not increase true (hardware) concurrency and will not increase maximum throughput.

    Having more processes than CPUs may decrease total throughput a little because of context switching overhead, but the difference is not big because OSes are good at context switching these days.

    On the other hand, if your CPUs are not used constantly, e.g. because they’re often blocked on I/O, then the above does not apply and increasing the number of processes/threads does increase concurrency and throughput, at least until the CPUs are saturated.

  • Blocking I/O. This covers all blocking I/O, including hard disk access latencies, database call latencies, web API calls, etc. Handling input from the client and output to the client does not count as blocking I/O, because Phusion Passenger has buffering layers that relieve the application from worrying about this.

    The more blocking I/O calls your application process/thread makes, the more time it spends on waiting for external components. While it’s waiting it does not use the CPU, so that’s when another process/thread should get the chance to use the CPU. If no other process/thread needs CPU right now (e.g. all processes/threads are waiting for I/O) then CPU time is essentially wasted. Increasing the number of processes or threads decreases the chance of CPU time being wasted. It also increases concurrency, so that clients do not have to wait for a previous I/O call to be completed before being served.
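To make the blocking I/O point concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of Phusion Passenger. It assumes, purely for illustration, that every process/thread is blocked on I/O for the same fixed fraction of the time and independently of the others, and it estimates how often all of them are blocked at once (the moments when CPU time is wasted). The function name and the 80% figure are hypothetical.

```python
def chance_all_workers_blocked(io_wait_fraction, workers):
    """Probability that every worker is waiting on I/O at the same moment.

    Simplifying assumption (not taken from Phusion Passenger): each worker
    is blocked for `io_wait_fraction` of the time, independently of the
    other workers.
    """
    if not 0.0 <= io_wait_fraction <= 1.0:
        raise ValueError("io_wait_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return io_wait_fraction ** workers


# Illustrative run: workers that spend 80% of their time waiting on I/O.
for workers in (1, 4, 16, 32):
    wasted = chance_all_workers_blocked(0.8, workers)
    print(f"{workers:2d} workers: all blocked about {wasted:.1%} of the time")
```

Under these assumptions the wasted share shrinks quickly as workers are added, which is why I/O-heavy applications benefit from many processes or threads. Real workloads are rarely independent, so treat the numbers as intuition only.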

With these in mind, we give the following tuning recommendations. These recommendations assume that your machine is dedicated to Phusion Passenger. If your machine also hosts other software (e.g. a database) then you should look at the amount of RAM that you’re willing to reserve for Phusion Passenger and Phusion Passenger-served applications.

Tuning the application process and thread count

In our experience, a typical single-threaded Rails application process uses 100 MB of RAM on a 64-bit machine, and by contrast, a thread would only consume 10% as much. We use this fact in determining a proper formula.

Step 1: determine the system’s limits

First, let’s define the maximum number of (single-threaded) processes, or the number of threads, that you can comfortably have given the amount of RAM you have. This is a reasonable upper limit that you can reach without degrading system performance. We use the following formulas.

In purely single-threaded multi-process deployments, the formula is as follows:

max_app_processes = (TOTAL_RAM * 0.75) / RAM_PER_PROCESS

This formula is derived as follows:

  • (TOTAL_RAM * 0.75): We can assume that there must be at least 25% of free RAM that the operating system can use for other things. The result of this calculation is the RAM that is freely available for applications.
  • / RAM_PER_PROCESS: Each process consumes a roughly constant amount of RAM, so the maximum number of processes is a single division of the aforementioned result by this constant.
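The single-threaded formula can be sketched in Python as follows. The function and parameter names are illustrative, and the 100 MB default is only the rule of thumb mentioned above for a Rails process on a 64-bit machine; measure your own application where you can.

```python
def max_app_processes(total_ram_mb, ram_per_process_mb=100, usable_fraction=0.75):
    """Upper limit on single-threaded processes that comfortably fit in RAM.

    Implements: max_app_processes = (TOTAL_RAM * 0.75) / RAM_PER_PROCESS
    `usable_fraction` is the share of RAM left for applications after
    reserving about 25% for the operating system.
    """
    if ram_per_process_mb <= 0:
        raise ValueError("ram_per_process_mb must be positive")
    return (total_ram_mb * usable_fraction) / ram_per_process_mb


# A machine with 1 GB (1024 MB) of RAM and 100 MB per process.
print(max_app_processes(1024))
```

The result is deliberately left as a fraction; round it down when you turn it into a process count.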

In multithreaded deployments, the formula is as follows:

max_app_threads_per_process =
  ((TOTAL_RAM * 0.75) - (NUMBER_OF_PROCESSES * RAM_PER_PROCESS * 0.9)) /
  (RAM_PER_PROCESS / 10)

Here, NUMBER_OF_PROCESSES is the number of application processes you want to use. In case of Ruby or Python, this should be equal to NUMBER_OF_CPUS. This is because both Ruby and Python have a Global Interpreter Lock so that they cannot utilize multicore no matter how many threads they’re using. By using multiple processes, you can utilize multicore. If you’re using a language runtime that does not have a Global Interpreter Lock, e.g. JRuby or Rubinius, then NUMBER_OF_PROCESSES can be 1.

This formula is derived as follows:

  • (TOTAL_RAM * 0.75): The same as explained earlier.
  • - (NUMBER_OF_PROCESSES * RAM_PER_PROCESS): In multithreaded deployments, the application processes consume a constant amount of memory, so we deduct this from the RAM that is available to applications. The result is the amount of RAM available to application threads.
  • / (RAM_PER_PROCESS / 10): A thread consumes about 10% of the amount of memory a process would, so we divide the amount of RAM available to threads by this number. What we get is the number of threads that the system can handle.
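A minimal Python sketch of the multithreaded formula follows. The names are illustrative. Note that the formula as printed multiplies the process memory by 0.9, while the derivation bullets and the worked Example 2 leave that factor out; the sketch therefore exposes it as a parameter (hypothetically named process_ram_factor) so that either variant can be reproduced.

```python
def max_app_threads_per_process(total_ram_mb, number_of_processes,
                                ram_per_process_mb=100,
                                process_ram_factor=0.9,
                                usable_fraction=0.75):
    """Thread limit from the multithreaded formula in this article.

    Implements:
      ((TOTAL_RAM * 0.75) - (NUMBER_OF_PROCESSES * RAM_PER_PROCESS * 0.9))
      / (RAM_PER_PROCESS / 10)
    Pass process_ram_factor=1.0 to match the arithmetic used in Example 2.
    """
    if ram_per_process_mb <= 0:
        raise ValueError("ram_per_process_mb must be positive")
    ram_for_apps = total_ram_mb * usable_fraction
    ram_for_processes = (number_of_processes * ram_per_process_mb
                         * process_ram_factor)
    ram_per_thread = ram_per_process_mb / 10  # a thread uses ~10% of a process
    return (ram_for_apps - ram_for_processes) / ram_per_thread


# 1 GB of RAM and 4 processes, with and without the 0.9 factor.
print(max_app_threads_per_process(1024, 4))
print(max_app_threads_per_process(1024, 4, process_ram_factor=1.0))
```

A negative result means the chosen number of processes alone already exceeds the RAM budget.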

On 32-bit systems, max_app_threads_per_process should not be higher than about 200. Assuming an 8 MB stack size per thread, you will run out of virtual address space if you go much further. On 64-bit systems you don’t have to worry about this problem.

Step 2: derive the application’s needs

The earlier two formulas were not for calculating the number of processes or threads that your application needs, but for calculating how many the system can handle without getting into trouble. Your application may not actually need that many processes or threads! If your application is CPU-bound, then you only need a small multiple of the number of CPUs you have. Only if your application performs a lot of blocking I/O (e.g. database calls that take tens of milliseconds to complete, or calls to Twitter) do you need a large number of processes or threads.

Armed with this knowledge, we derive the formulas for calculating how many processes or threads we actually need.

  • If your application performs a lot of blocking I/O then you should give it as many processes and threads as possible:

    # Use this formula for purely single-threaded multi-process deployments.
    desired_app_processes = max_app_processes
    
    # Use this formula for multithreaded deployments.
    desired_app_threads_per_process = max_app_threads_per_process
    
  • If your application doesn’t perform a lot of blocking I/O, then you should limit the number of processes or threads to a multiple of the number of CPUs to minimize context switching:

    # Use this formula for purely single-threaded multi-process deployments.
    desired_app_processes = min(max_app_processes, NUMBER_OF_CPUS)
    
    # Use this formula for multithreaded deployments.
    desired_app_threads_per_process = min(max_app_threads_per_process, 2 * NUMBER_OF_CPUS)
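The two cases above can be folded into a pair of small Python helpers. This is only a sketch: the function names and the io_heavy flag are illustrative, and deciding whether an application "performs a lot of blocking I/O" remains a judgment call that the code cannot make for you.

```python
def desired_app_processes(max_processes, number_of_cpus, io_heavy):
    """Process count for purely single-threaded multi-process deployments."""
    if io_heavy:
        # Lots of blocking I/O: use everything the system can handle.
        return max_processes
    # Mostly CPU-bound: more processes than CPUs only adds context switching.
    return min(max_processes, number_of_cpus)


def desired_app_threads_per_process(max_threads, number_of_cpus, io_heavy):
    """Per-process thread count for multithreaded deployments."""
    if io_heavy:
        return max_threads
    return min(max_threads, 2 * number_of_cpus)


# A CPU-bound application on a 4-CPU machine.
print(desired_app_processes(7.68, 4, io_heavy=False))
print(desired_app_threads_per_process(36.8, 4, io_heavy=False))
```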
    

Step 3: configure Phusion Passenger

You should put the number for desired_app_processes into the PassengerMaxPoolSize option. Whether you want to make PassengerMinInstances equal to that number or not is up to you: doing so will make the number of processes static, regardless of traffic. If your application has very irregular traffic patterns, response times could suffer while Passenger spins up new processes to handle peak traffic. Setting PassengerMinInstances as high as possible prevents this problem.

If desired_app_processes is 1, then you should set PassengerSpawnMethod conservative (on Phusion Passenger 3 or earlier) or PassengerSpawnMethod direct (on Phusion Passenger 4 or later). By using direct/conservative spawning instead of smart spawning, Phusion Passenger will not keep an ApplicationSpawner/Preloader process around. This is because an ApplicationSpawner/Preloader process is useless when there’s only 1 application process.

In order to use multiple threads you must use Phusion Passenger Enterprise 4. The open source version of Phusion Passenger does not support multithreading, and neither does version 3 of Phusion Passenger Enterprise. At the time of writing, Phusion Passenger Enterprise 4.0 is on its 4th Release Candidate. You can download it from the Customer Area.

You should put the number for desired_app_threads_per_process into the PassengerThreadCount option. If you do this, you also need to set PassengerConcurrencyModel thread in order to turn on multithreading support.
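As a sketch of how the computed numbers map onto the options named in this step, the following Python helper prints the corresponding Apache-style directives. It is a hypothetical convenience function, not a Phusion tool. It assumes Phusion Passenger 4 (hence PassengerSpawnMethod direct rather than conservative), and rounding fractional results down is a choice made here to stay within the RAM budget.

```python
import math


def passenger_directives(desired_processes, desired_threads=None,
                         static_pool=True):
    """Render the Phusion Passenger options discussed above as text lines.

    desired_threads=None means a purely single-threaded deployment.
    Multithreading requires Phusion Passenger Enterprise 4.
    """
    processes = max(1, math.floor(desired_processes))
    lines = [f"PassengerMaxPoolSize {processes}"]
    if static_pool:
        # Keep the number of processes constant regardless of traffic.
        lines.append(f"PassengerMinInstances {processes}")
    if processes == 1:
        # No preloader process is needed for a single application process.
        lines.append("PassengerSpawnMethod direct")
    if desired_threads is not None:
        threads = max(1, math.floor(desired_threads))
        lines.append("PassengerConcurrencyModel thread")
        lines.append(f"PassengerThreadCount {threads}")
    return lines


print("\n".join(passenger_directives(4, 36.8)))
```

On Nginx the options have different (lowercase, underscore-separated) names, so consult the manual before copying the output into an Nginx configuration.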

Possible step 4: configure Rails

Only if you’re on a multithreaded deployment do you need to configure Rails.

Rails has been thread-safe since version 2.2, but you need to enable thread-safety by setting config.thread_safe! in config/environments/production.rb.

You should also increase the ActiveRecord pool size because it limits concurrency. You can configure it in config/database.yml. Set the pool size to the number of threads. But if you believe your database cannot handle that much concurrency, keep it at a low value.

Example 1: purely single-threaded multi-process deployment with lots of blocking I/O

Suppose you have 1 GB of RAM and lots of blocking I/O, and you’re on a purely single-threaded multi-process deployment.

# Use this formula for purely single-threaded multi-process deployments.
max_app_processes = (1024 * 0.75)  / 100 = 7.68
desired_app_processes = max_app_processes = 7.68

Conclusion: you should use 7 or 8 processes. Phusion Passenger should be configured as follows:

PassengerMaxPoolSize 7

However, a concurrency of 7 or 8 is way too low if your application performs a lot of blocking I/O. You should use a multithreaded deployment instead, or you need to get more RAM so you can run more processes.

Example 2: multithreaded deployment with lots of blocking I/O

Consider the same machine and application (1 GB RAM, lots of blocking I/O), but this time you’re on a multithreaded deployment. How many threads do you need per process?

Let’s assume that we’re using Ruby and that we have 4 CPUs. Then:

# Use this formula for multithreaded deployments.
max_app_threads_per_process
= ((1024 * 0.75) - (4 * 100)) / (100 / 10)
= 368 / 10
= 36.8

Conclusion: you should use 4 processes, each with 36-37 threads. Phusion Passenger Enterprise should be configured as follows:

PassengerMaxPoolSize 4
PassengerConcurrencyModel thread
PassengerThreadCount 36

Configuring the web server

If you’re using Nginx then it does not need configuring. Nginx is evented and already supports a high concurrency out of the box.

If you’re using Apache, then prefer the worker MPM (which uses a combination of processes and threads) or the event MPM (which is similar to the worker MPM, but better) over the prefork MPM (which only uses processes) whenever possible. PHP requires prefork, but if you don’t use PHP then you can probably use one of the other MPMs. Make sure you set a low number of processes and a moderate to high number of threads.

Because Apache performs a lot of blocking I/O (namely HTTP handling), you should give it a lot of threads so that it has a lot of concurrency. The number of threads should be at least the number of concurrent clients that you’re willing to serve with Apache. A small website can get away with 1 process and 100 threads. A large website may want to have 8 processes and 200 threads per process (resulting in 1600 threads in total).

If you cannot use the event MPM, consider putting Apache behind an Nginx reverse proxy, with response buffering turned on on the Nginx side. This relieves Apache of a lot of concurrency problems. If you can use the event MPM then adding Nginx to the mix does not provide many advantages.

Conclusion

  • If your application performs a lot of blocking I/O, use lots of processes/threads. You should move away from single-threaded multiprocessing in this case, and start using multithreading.
  • If your application is CPU-bound, use a small multiple of the number of CPUs.
  • Do not exceed the number of processes/threads your system can handle without swapping.


Phusion Passenger 4.0 beta 1 and 2: arbitrary file deletion vulnerability

By Hongli Lai on March 5th, 2013

The Phusion Passenger 4.0 betas contain a vulnerability which allows arbitrary files to be deleted on the system. The vulnerability is local and cannot be exploited remotely. The vulnerability can only be triggered during application startup (e.g. during evaluation of config.ru). Environments that are at risk include, but may not be limited to:

  • Environments that host arbitrary untrusted applications, e.g. shared hosting environments.
  • Applications which contain vulnerabilities that allow their own code to be modified.
  • Environments in which untrusted non-root users can modify application code.

Affected users are advised to upgrade to 4.0.0 RC 4.

Affected versions

  • Phusion Passenger open source 4.0.0 beta 1
  • Phusion Passenger open source 4.0.0 beta 2
  • Phusion Passenger Enterprise 4.0.0 beta 1
  • Phusion Passenger Enterprise 4.0.0 beta 2

Unaffected versions

  • Phusion Passenger open source 3.x and earlier
  • Phusion Passenger open source 4.0.0 RC 1 and later
  • Phusion Passenger Enterprise 3.x and earlier
  • Phusion Passenger Enterprise 4.0.0 RC 1 and later

Phusion Passenger 4.0 Release Candidate 4

By Hongli Lai on March 5th, 2013

Phusion Passenger turns Apache and Nginx into a full-featured application server for Ruby and Python web apps. It has a strong focus on ease of use, stability and performance. Phusion Passenger is built on top of tried-and-true, battle-hardened Unix technologies, yet at the same time introduces innovations not found in most traditional Unix servers. Since mid-2012, it aims to be the ultimate polyglot application server.

Today we are pleased to announce Release Candidate 4 of Phusion Passenger 4.0. Last week we said that the open source release of Release Candidate 2 would be out today. However, because of the helpful feedback and bug reports we’ve received from Enterprise customers, we’ve decided to push out these bug fixes to the open source version earlier. Release Candidate 3 was only available for Enterprise customers in order to test bug fixes, so it hasn’t been announced publicly.

The 4.x series is a huge improvement over the 3.x series: during the development of 4.0, we’ve introduced a myriad of changes which we’ve covered in past beta preview articles:

Changes in 4.0 RC 3 and RC 4

The focus of RC 3 and RC 4 has yet again been on improving stability. We’ve closed over 50 issues in our issue tracker.

The most important changes in RC 3 and RC 4 are as follows:

  • Fixed Rake autodetection.
  • Fixed compilation on systems where /tmp is mounted noexec.
  • Fixed some memory corruption bugs.
  • Phusion Passenger Standalone now sets underscores_in_headers. Fixes issue #708.
  • Fixed some process spawning compatibility problems, as reported in issue #842.
  • The Python WSGI loader now correctly shuts down client sockets even when there are child processes that keep the socket open.
  • A new configuration option PassengerPython (Apache) and passenger_python (Nginx) has been added so that users can customize the Python interpreter on a per-application basis. Fixes issue #852.
  • The Apache module now supports file uploads larger than 2 GB when on 32-bit systems. Fixes issue #838.
  • The Nginx version now supports the passenger_temp_dir option.
  • Environment variables set in the Nginx configuration file (through the env config option) are now correctly passed to all application processes. Fixes issue #371.
  • Fixed support for RVM mixed mode installations. Fixes issue #828.
  • Phusion Passenger now outputs the Date HTTP header in case the application didn’t already do that (and was violating the HTTP spec). Fixes issue #485.
  • Phusion Passenger now checks that /dev/urandom isn’t broken. Fixes issue #516.
  • Improved debugging messages.

Installing and testing 4.0.0 Release Candidate 4

Quick install

Phusion Passenger Enterprise users can download the Enterprise version of 4.0 RC 4 from the Customer Area.

Open source users can install the open source version of 4.0 RC 4 with the following commands:

gem install passenger --pre
passenger-install-apache2-module
passenger-install-nginx-module

You can also download the tarball at Google Code.

In-depth

In-depth installation and upgrade instructions can be found in the Installation section of the documentation. The documentation has been updated to cover 4.0 changes, including Enterprise features. You can view them online here:

Final

We are excited about the final release. You can help us by testing RC 4 and reporting any bugs. Please submit bug reports to our bug tracker.



Phusion Passenger 4.0 Release Candidate 2

By Hongli Lai on February 27th, 2013

Phusion Passenger is an Apache and Nginx module for deploying Ruby and Python web applications. It has a strong focus on ease of use, stability and performance. Phusion Passenger is built on top of tried-and-true, battle-hardened Unix technologies, yet at the same time introduces innovations not found in most traditional Unix servers. Since mid-2012, it aims to be the ultimate polyglot application server.

We know many users are eagerly awaiting the final release of Phusion Passenger 4.0. The 4.x series is a huge improvement over the 3.x series: during the development of 4.0, we’ve introduced a myriad of changes which we’ve covered in past beta preview articles:

Today we are proud to announce Release Candidate 2 of Phusion Passenger 4.0. Release Candidate 1 has been skipped because a few bug fixes were applied right after RC 1 was tagged.

Changes in 4.0 RC 1 and RC 2

The focus of RC 1 and RC 2 has been on improving stability and on refining previously introduced features. We’ve closed over 100 issues in our issue tracker. We couldn’t have done this without the fantastic feedback from our users, especially those from many Phusion Passenger Enterprise customers who have beta tested the RC previews in their staging environments.

The changes in RC 1 and RC 2 are as follows:

  • The Nginx version now supports the passenger_app_root configuration option.
  • The Enterprise memory limiting feature has been extended to work with non-Ruby applications as well.
  • Application processes that have been killed are now automatically detected within 5 seconds. Previously Phusion Passenger needed to send a request to the process before detecting that it’s gone. This change means that when you kill a process by sending it a signal, Phusion Passenger will automatically respawn it within 5 seconds (provided that the process limit settings allow respawning).
  • Phusion Passenger Standalone’s HTTP client body limit has been raised from 50 MB to 1 GB.
  • Python 3 support has been added.
  • The build system has been made compatible with JRuby and Ruby 2.0. This does not mean that Phusion Passenger works on Ruby 2.0; please read on for more about this subject.
  • The installers now print a lot more information about detected system settings so that the user can see whether something has been wrongly detected.
  • Some performance optimizations. These involve further extending the zero-copy architecture, and the use of hash table maps instead of binary tree maps.
  • Many potential crasher and freezer bugs have been fixed.
  • Error diagnostics have been further improved.
  • Many documentation improvements.

What about Ruby 2.0?

We are just as excited about Ruby 2.0 as many of you are. Since 2.0 was released a few days ago, we’ve been testing Phusion Passenger on it. We really wanted to release RC 2 with Ruby 2.0 support, but a few things stood in our way so we had to postpone this goal.

  • We couldn’t get Ruby 2.0.0 installed on OS X Mountain Lion. The compiled Ruby crashes during Ruby 2.0.0’s build process with a low-level error ([BUG] Stack consistency error). Apparently we aren’t the only ones.
  • We were able to get it installed on a Debian VM, but it does not pass all the Phusion Passenger unit tests. It fails on some tests with obscure errors that seem to indicate bugs in Ruby, e.g. errors in which Ruby cannot figure out where the exception came from.

We recommend sticking with 1.9.3 in the meantime, until the next Ruby 2.0 patchlevel release.

Release Candidate 2 timeline & download

Phusion Passenger Enterprise customers are given priority access to Release Candidate 2. They can download RC 2 from the Customer Area immediately.


The release of the open source version will follow in one week, on March 5 2013. Of course, open source users who want to stay on the bleeding edge are free to obtain the latest sources from the open source Phusion Passenger git repository at any time.

When the open source version is released, users can install it by following the in-depth installation and upgrade instructions in the Installation section of the documentation. The manual also covers installation of beta releases.

Final

We are excited about the final release. You can help us by testing RC 2 and reporting any bugs. Please submit bug reports to our bug tracker.



Phusion Passenger 4.0 beta 2: Syscall failure simulation framework, focus on stability

By Hongli Lai on January 24th, 2013

Phusion Passenger is an Apache and Nginx module for deploying Ruby and Python web applications. It has a strong focus on ease of use, stability and performance. Phusion Passenger is built on top of tried-and-true, battle-hardened Unix technologies, yet at the same time introduces innovations not found in most traditional Unix servers. Since mid-2012, it aims to be the ultimate polyglot application server.

Development of the Phusion Passenger 4.x series is progressing steadily. The 4.x series is a huge improvement over the 3.x series: in the announcement for Phusion Passenger 4.0.0 beta 1, we introduced a myriad of changes such as support for multiple Ruby versions, Python WSGI support, multithreading (Enterprise only), improved zero-copy architecture, better error diagnostics and more. That was just the beginning, because soon after we announced JRuby and Rubinius support, Out-of-Band Work and the Rack socket hijacking API.

Today we are proud to announce Phusion Passenger 4.0 beta 2, which brings us closer to a final release.

Better stability, documentation, test coverage

Beta 1 was usable, but not yet production-ready. While it worked well most of the time, there were some bugs that could cause crashes. So for beta 2 we haven’t introduced too many new features. Instead we’ve been focussing a lot on fixing bugs, improving stability, improving documentation and improving test coverage. It is easy to fall into the trap of constantly adding features, but we want a rock-solid product that our users and customers can rely on.

How do we ensure quality? There are a few tools and techniques that we use:

System call failure simulation framework

[Figure: SQLite, our inspiration]

The system call failure simulation framework is a new developer feature in 4.0 beta 2 and allows us to simulate random system call failures so that we can test whether error handling in Phusion Passenger is done correctly. Although we already test for many error handling scenarios in our unit tests, test coverage is not perfect. This framework gives us another tool to ensure quality.

A few months ago we sat down with a customer who was experiencing seemingly random crashes with Phusion Passenger. These crashes could not be reproduced on any of our systems, but could be reliably reproduced on theirs. The crashes would only manifest under high-concurrency scenarios. After a day of intensive investigation, we found that the crashes occurred because their systems’ file descriptor limit was much lower than that of any of our systems. Phusion Passenger did not always catch out-of-file-descriptors errors, so those errors caused Phusion Passenger to crash. Due to other unrelated issues, relevant error messages could not be printed to the log file and were lost.

All of those issues have since been fixed, but it made us realize that our testing tools were not adequate. That situation could and should have been prevented. Thus, the system call failure simulation framework was born. This framework allows us to specify which system calls should fail, and with what probability. For example, the following configuration simulates the “out of file descriptors” error in the helper agent with a probability of 1%.

export PASSENGER_SIMULATE_SYSCALL_FAILURES=PassengerHelperAgent=ENFILES=0.01

Different runs will produce different errors, but you can force determinism by specifying the same random seed that was used in the last run. The random seed is printed as a debugging message during startup and during a crash.

export PASSENGER_RANDOM_SEED=...

The system call failure simulation framework was inspired by SQLite’s testing process. Real hardware, network or OS-level errors are difficult to create, so simulating them is the next best thing. SQLite has an internal virtual filesystem layer, and it is in that layer that they simulate failures. In our case we have a similar layer, namely the system call interruption framework which was originally written to facilitate interrupting threads that are blocked on blocking system calls.
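
The combination of per-error probabilities and a reproducible random seed can be sketched in a few lines of Ruby. This is only an illustration of the idea, not Phusion Passenger's actual framework (which lives in its C++ code): the names FailureSimulator and run_simulation are hypothetical, and Errno::EMFILE ("too many open files") is our choice of error for the example.

```ruby
# Illustrative sketch only. FailureSimulator and run_simulation are
# hypothetical names and are not part of Phusion Passenger.
class FailureSimulator
  # probabilities: Hash mapping an Errno class to a failure probability (0..1).
  # seed: Integer seed; the same seed reproduces the same failure sequence.
  def initialize(probabilities, seed)
    @probabilities = probabilities
    @rng = Random.new(seed)
  end

  # Call this right before a real system call. It raises one of the
  # configured errors with the configured probability, otherwise returns nil.
  def maybe_fail!
    @probabilities.each do |error_class, probability|
      raise error_class if @rng.rand < probability
    end
    nil
  end
end

# Runs `attempts` simulated system calls and returns the indexes that failed.
def run_simulation(seed, attempts, probability = 0.01)
  simulator = FailureSimulator.new({ Errno::EMFILE => probability }, seed)
  (0...attempts).select do |_|
    begin
      simulator.maybe_fail!
      false
    rescue Errno::EMFILE
      true
    end
  end
end
```

Calling run_simulation twice with the same seed yields the same list of failing attempts, which is what makes a crash found by a random run replayable.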

Continuously expanding and improving our test suite

We already had an extensive test suite which consists of a hybrid of C++ and Ruby RSpec code. In 4.0 beta 2 we’ve improved the test suite by modernizing some dependencies, testing more edge cases, testing more failure conditions, etc.

Setting up Continuous integration

Before today our extensive test suite was run on our development machines as well as on an army of virtual machines with different OSes. We have now set up Travis CI as an additional quality assurance tool. The test suite has also been extended to cover more cases.

Ruby 1.8 is now considered legacy

[Figure: Ruby 1.9 is the future]

Ruby 1.8 is no longer supported by its authors, and Ruby Enterprise Edition was End-Of-Lifed a while ago. Many gems these days are Ruby 1.9-only. It is more than apparent that Ruby 1.8 is considered legacy by the community, and for good reasons. We too are joining the community by considering Ruby 1.8 legacy. This has the following implications:

  • Phusion Passenger 4.x will continue to support Ruby 1.8. Our support goes as far back as Ruby 1.8.5.
  • We will optimize performance for Ruby 1.9. Phusion Passenger will still work on Ruby 1.8, but we will no longer put in any effort to make it work fast on Ruby 1.8.

Installing and testing 4.0.0 beta 2

Quick install

Phusion Passenger Enterprise users can download the Enterprise version of 4.0 beta 2 from the Customer Area.

Open source users can install the open source version of 4.0 beta 2 with the following commands:

gem install passenger --pre
passenger-install-apache2-module
passenger-install-nginx-module

You can also download the tarball at Google Code.

In-depth

In-depth installation and upgrade instructions can be found in the Installation section of the documentation. The documentation has been updated to cover 4.0 changes, including Enterprise features.

Final

We are excited about the final release. You can help us by testing beta 2 and reporting any bugs. Please submit bug reports to our bug tracker.



The new Rack socket hijacking API

By Hongli Lai on January 23rd, 2013

Yesterday saw the release of Rack 1.5.0, which adds a new feature to the Rack specification dubbed socket hijacking. This feature allows applications to take over the client socket and perform arbitrary operations on it, e.g. implementing WebSockets, streaming data to the client, etc.

Did Rack not support streaming before? Actually, it did: you can stream by returning a body object that outputs body chunks in its #each method, as explained in our past article Why Rails 4 Live Streaming is a Big Deal. But that API is a bit clunky. The socket hijacking API instead gives the application an object that behaves like a Ruby IO object.

Support for socket hijacking was added to Phusion Passenger 4 yesterday. The upcoming Phusion Passenger 4 has been covered here, here and here. Phusion Passenger Enterprise customers can already test and enjoy a preview of this feature by downloading the “3.9.2 beta preview (4.0.0 beta 2)” file from the Customer Area.

The socket hijacking API was surprisingly easy to implement, but unfortunately poorly documented at this time. The application-level API is not immediately obvious, and the Rack specification documentation has not yet been updated to cover the hijacking API. In this article we’ll introduce the API and provide an example program.

What the socket hijacking API is not

Some of you may have heard of efforts to develop a “Rack 2.0” specification which properly covers things such as streaming and evented servers. According to the hijacking API developer, this API is not an attempt towards Rack 2.0. It is a “good enough” solution that works within the confines of the Rack 1.x specification. Things may change in Rack 2.0, though at this time it’s unclear what the progress towards Rack 2.0 is.

It is also unclear whether the API is supposed to be final or not. While implementing this API and writing this article we’ve discovered some room for improvement. The suggestions (which you can find later in this article) have been submitted to the developers.

Overview of the API

The hijacking API provides two modes:

  1. A full hijacking API, which gives the application complete control over what goes over the socket. In this mode, the application server doesn’t send anything over the socket, and lets the application take care of it. This mode is useful if you want to implement arbitrary (even non-HTTP) protocols over the socket. This is subject to limitations: if your application is behind a web server or an HTTP load balancer then those components dictate which protocols you can implement.
  2. A partial hijacking API, which gives the application control over the socket after the application server has already sent out headers. This mode is mostly useful for streaming.

The hijacking API is accessible through the Rack env hash. You can check whether the application server supports the hijacking API by checking env['rack.hijack?'], which returns a boolean value.
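
As a minimal sketch of that check, the wrapper below falls back to an ordinary Rack response when the server does not advertise hijacking support. The helper names hijacking_supported? and with_hijack_fallback, the streaming_app parameter and the choice of a 501 status are all our own assumptions for illustration; none of them are part of Rack or Phusion Passenger.

```ruby
# Returns true only if the application server advertises hijacking support.
def hijacking_supported?(env)
  env['rack.hijack?'] == true
end

# Wraps a hijacking app (the hypothetical `streaming_app`) so that servers
# without hijacking support get a plain, non-hijacked error response instead.
def with_hijack_fallback(streaming_app)
  lambda do |env|
    if hijacking_supported?(env)
      streaming_app.call(env)
    else
      body = "This server does not support socket hijacking.\n"
      headers = {
        "Content-Type" => "text/plain",
        "Content-Length" => body.bytesize.to_s
      }
      [501, headers, [body]]
    end
  end
end
```

In a config.ru file this would be used as `run with_hijack_fallback(app)`, where app is an application that hijacks the socket.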

Full hijacking

You can perform a full hijack by calling env['rack.hijack'].call. You can access the hijacked socket object through env['rack.hijack_io']. Phusion Passenger’s implementation of env['rack.hijack'] returns the socket object, but it is unclear whether this is supposed to be standard behavior.

You are responsible for:

  • Outputting any HTTP headers, if applicable.
  • Closing the IO object when you no longer need it.

You should output the “Connection: close” header unless you plan on implementing HTTP keep-alive yourself.

Here’s an example of the full hijacking API in action:

# encoding: utf-8
require 'thread'

# Streams the response "Line 1" .. "Line 10", with
# 1 second sleep time between each line.
# 
# Non-Phusion Passenger users may have to turn off their
# web servers' buffering options for streaming to work.
# Phusion Passenger 4 users don't have to do anything, it
# works out-of-the-box thanks to our real-time response
# buffering feature.
app = lambda do |env|
  # Fully hijack the client socket.
  env['rack.hijack'].call
  io = env['rack.hijack_io']
  begin
    io.write("Status: 200\r\n")
    io.write("Connection: close\r\n")
    io.write("Content-Type: text/plain\r\n")
    io.write("\r\n")
    10.times do |i|
      io.write("Line #{i + 1}!\n")
      io.flush
      sleep 1
    end
  ensure
    io.close
  end
end

run app

Partial hijacking

You can perform a partial hijack by assigning a lambda to the rack.hijack response header. This lambda will be called after the application server has sent out headers. The application server will ignore the body part of the Rack response, and will call the ‘rack.hijack’ lambda, passing it the client socket. You are responsible for closing the socket when it’s no longer needed.

It is unclear what the value of the Rack response body should be. Phusion Passenger’s implementation doesn’t care: you can return a two-element array, or a three-element array in which the body can be anything. If the ‘rack.hijack’ response header is set, the body will be completely ignored.

Example:

# encoding: utf-8
require 'thread'

# Streams the response "Line 1" .. "Line 10", with
# 1 second sleep time between each line.
# 
# Non-Phusion Passenger users may have to turn off their
# web servers' buffering options for streaming to work.
# Phusion Passenger 4 users don't have to do anything, it
# works out-of-the-box thanks to our real-time response
# buffering feature.
app = lambda do |env|
  response_headers = {}
  response_headers["Content-Type"] = "text/plain"
  response_headers["rack.hijack"] = lambda do |io|
    # This lambda will be called after the app server has outputted
    # headers. Here we can output body data at will.
    begin
      10.times do |i|
        io.write("Line #{i + 1}!\n")
        io.flush
        sleep 1
      end
    ensure
      io.close
    end
  end
  [200, response_headers, nil]
end

run app
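
To make the division of labour in a partial hijack more concrete, here is a rough sketch of what the application server's side could look like: it sends the status line and headers itself, then either hands the socket to the ‘rack.hijack’ lambda or writes the body normally. This is not Phusion Passenger's actual code. The send_response method and the REASON_PHRASES table are hypothetical, and a StringIO stands in for the client socket.

```ruby
require 'stringio'

# Just enough reason phrases for this sketch.
REASON_PHRASES = { 200 => 'OK' }.freeze

# Hypothetical server-side handling of a Rack response.
def send_response(socket, status, headers, body)
  hijack_callback = headers['rack.hijack']

  # The server, not the application, sends the status line and headers.
  socket.write("HTTP/1.1 #{status} #{REASON_PHRASES.fetch(status, 'Unknown')}\r\n")
  headers.each do |name, value|
    next if name.start_with?('rack.') # rack.* keys are not real HTTP headers
    socket.write("#{name}: #{value}\r\n")
  end
  socket.write("\r\n")

  if hijack_callback
    # Partial hijack: ignore the body and hand over the socket.
    # From here on the application must close the socket itself.
    hijack_callback.call(socket)
  else
    body.each { |chunk| socket.write(chunk) }
    socket.close
  end
end

# Usage with a fake socket:
socket = StringIO.new
headers = {
  'Content-Type' => 'text/plain',
  'rack.hijack' => lambda { |io| io.write("hello\n"); io.close }
}
send_response(socket, 200, headers, nil)
```

Note that the body argument is never touched when the ‘rack.hijack’ header is present, which is why returning nil as the body works in the example above.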

Issues with the hijacking API

Here’s how we think the hijacking API can be improved.

  • env['rack.hijack?'] appears to be unnecessary. You can already check for hijacking support by checking env['rack.hijack'].
  • The partial hijacking API should not involve assigning a lambda to the response headers. As far as we can see, you can just return the lambda as the body. That would be a much more elegant solution.
  • The return value for env['rack.hijack'] should be well-defined.

Conclusion

The Rack hijacking API, while having some quirks in our opinion, gets the job done. We hope that the usage of the hijacking API has become more clear after reading this article. If you have any comments, questions, suggestions or corrections, please let us know.

We at Phusion are working feverishly on the upcoming Phusion Passenger 4 (covered here, here and here). Implementing the hijacking API so quickly is our way of showing you how dedicated we are. Together with Phusion Passenger Enterprise, we aim to deliver the most stable, performant and feature-rich polyglot application server out there. If you’re interested in future updates, please subscribe to our newsletter. Until next time!



Phusion Passenger 4 Technology Preview: Out-Of-Band Work

By Hongli Lai on January 22nd, 2013

Phusion Passenger is an Apache and Nginx module for deploying Ruby and Python web applications. It has a strong focus on ease of use, stability and performance. Phusion Passenger is built on top of tried-and-true, battle-hardened Unix technologies, yet at the same time introduces innovations not found in most traditional Unix servers. Since mid-2012, it aims to be the ultimate polyglot application server.

Development of the Phusion Passenger 4.x series is progressing steadily. The 4.x series is a huge improvement over the 3.x series: in the announcement for Phusion Passenger 4.0.0 beta 1, we introduced a myriad of changes such as support for multiple Ruby versions, Python WSGI support, multithreading (Enterprise only), improved zero-copy architecture, better error diagnostics and more. That was just the beginning, because soon after we announced JRuby and Rubinius support. Today we are announcing another cool feature.

Out-of-Band Work

The Out-of-Band Work feature allows one to perform arbitrary long-running work outside request cycles without blocking HTTP clients. The primary use case is to run the garbage collector in between request cycles so that your requests will finish faster because they will be interrupted less by the garbage collector.

Normally the garbage collector fires up as soon as the Ruby interpreter thinks it needs to, which can add hundreds of milliseconds of latency to a request. With the Out-of-Band Work feature, you can run the garbage collector outside the request cycles, so that garbage collection runs inside request cycles are much less expensive. While out-of-band work is running, Phusion Passenger will not route any requests to the process performing it.

Cool properties of this feature:

  • If the process that triggered Out-Of-Band Work is the only process for that application, then Phusion Passenger will first spawn up a new process before performing the out-of-band work. When the out-of-band work has finished, the process will be eligible for idle timeout cleaning. Thus you can use this feature in any scenario, and Phusion Passenger will do the right thing for you.
  • It even works in multithreaded setups (an Enterprise-only feature). Normally the Ruby garbage collector will block all threads while doing work. So before performing the out-of-band work, Phusion Passenger will let all existing requests to the application process finish.

[Figure: response times before and after applying out-of-band GC at AppFolio]

This awesome feature has been contributed by AppFolio. They’ve been running it in production for a while now with quite some success. Average response time has gone down by 100 ms.

Compared to Unicorn’s OOBGC

Users who are familiar with Unicorn’s Out-of-Band GC (OOBGC) might notice the similarities. Our Out-of-Band Work feature (OOBW) is more general and more flexible:

  • Unicorn’s OOBGC requires a static number of single-threaded workers. OOBW is designed to be able to handle a dynamic number of workers that may even be multithreaded.
  • OOBW is designed to be able to perform arbitrary long-running work, including work that may block all threads. Unicorn’s OOBGC only works with garbage collection.

Using Out-Of-Band Work

Phusion Passenger 4.0 beta 2 provides a simple Rack middleware that you can use to enable out-of-band GC:

if defined?(PhusionPassenger)
  require 'phusion_passenger/rack/out_of_band_gc'
  # Trigger out-of-band GC every 5 requests.
  use PhusionPassenger::Rack::OutOfBandGc, 5
  ## Optional: disable normal GC triggers and only GC outside
  ## request cycles. Not recommended though, see section
  ## "What Ruby can do to improve out-of-band garbage collection"
  # GC.disable
end

It also provides a simple API to perform Out-Of-Band Work. For example out-of-band GC may be implemented as follows without using the Rack middleware:

# Somewhere in a controller method:
# Tell Phusion Passenger we want to perform OOB work.
response.headers["X-Passenger-Request-OOB-Work"] = "true"

# Somewhere during application initialization:
if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:oob_work) do
    # Phusion Passenger has told us that we're ready to perform OOB work.
    t0 = Time.now
    GC.start
    Rails.logger.info "Out-Of-Band GC finished in #{Time.now - t0} sec"
  end
end

## Optional: disable normal GC triggers and only GC outside
## request cycles. Not recommended though, see section
## "What Ruby can do to improve out-of-band garbage collection"
# GC.disable
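
To show how the “every N requests” trigger and the request header fit together, here is a minimal sketch of a Rack middleware that asks for out-of-band work once every N requests. It is not the implementation of PhusionPassenger::Rack::OutOfBandGc. The class name RequestOobWorkEvery is hypothetical, and the sketch assumes the X-Passenger-Request-OOB-Work header shown above.

```ruby
# Hypothetical middleware: requests out-of-band work every `frequency`
# requests by setting the response header that Phusion Passenger looks for.
class RequestOobWorkEvery
  def initialize(app, frequency)
    @app = app
    @frequency = frequency
    @count = 0
    @mutex = Mutex.new # the counter may be shared between threads
  end

  def call(env)
    status, headers, body = @app.call(env)

    due = @mutex.synchronize do
      @count += 1
      if @count >= @frequency
        @count = 0
        true
      else
        false
      end
    end

    headers["X-Passenger-Request-OOB-Work"] = "true" if due
    [status, headers, body]
  end
end
```

The actual work would still be performed in a PhusionPassenger.on_event(:oob_work) block, as in the example above. The middleware only decides when to ask for it.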

Inside Out-Of-Band Work: a more general mechanism

The Out-Of-Band Work feature is actually built on top of an even more general mechanism: the enable/disable process feature. This is a new feature in Phusion Passenger 4 and is, for now, internal only. Internal Phusion Passenger code can mark a process as disabled, so that Phusion Passenger will no longer route requests to it. But the actual process is kept alive. Internal code can reenable the process later, making it eligible again for processing requests.

This feature is simple to use and simple to understand, but was tricky to implement. Phusion Passenger works in a heavily concurrent environment so it may not be able to disable a process immediately. The process might be handling requests, it might be restarting, another process might be spawning, etcetera. The entire API follows an asynchronous design. If the to-be-disabled process is the only process for that application, then Phusion Passenger will spawn another process. Disabling the original process will complete when the new process has been spawned, and the original process is done processing all its requests.

Once the enable/disable feature was in place, implementing Out-Of-Band Work was almost trivial. When an application wants to perform Out-Of-Band Work, it sends a signal to Phusion Passenger. We currently use the X-Passenger-Request-OOB-Work header to do this, which is filtered out by Phusion Passenger and will never reach the client. Phusion Passenger will then try to disable the process. Once disabled, Phusion Passenger will send a signal to the application, telling it that it may proceed to perform out-of-band work. At this point the process is guaranteed not to be processing any requests, so it can freely do whatever it wants. Once the out-of-band work has finished, Phusion Passenger will reenable the process.
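
The sequence described above (spawn a replacement if needed, disable, perform the work, reenable) can be modelled with a small, deliberately simplified Ruby sketch. The real mechanism is asynchronous and also waits for in-flight requests to finish. This model is synchronous and leaves that out. AppProcess, Pool and all of their methods are hypothetical names, not Phusion Passenger internals.

```ruby
# Hypothetical, synchronous model of the enable/disable mechanism.
class AppProcess
  attr_reader :name
  attr_accessor :enabled

  def initialize(name)
    @name = name
    @enabled = true
  end
end

class Pool
  attr_reader :processes

  def initialize
    @processes = [AppProcess.new('process-1')]
  end

  # Processes that are currently eligible to receive requests.
  def routable
    @processes.select(&:enabled)
  end

  def spawn_process
    process = AppProcess.new("process-#{@processes.size + 1}")
    @processes << process
    process
  end

  # Disables `process`, yields so the caller can do the out-of-band work,
  # and reenables the process afterwards. If `process` is the only routable
  # one, a replacement is spawned first so requests can still be served.
  def perform_oob_work(process)
    spawn_process if routable == [process]
    process.enabled = false
    yield
  ensure
    process.enabled = true
  end
end
```

While the block passed to perform_oob_work runs, pool.routable never contains the process doing the work, and it is never empty.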

This simple mechanism opens the door to many other possibilities that are currently not implemented:

  • In the future we can add an admin command to access the API, so that the administrator can disable/enable processes. That way the administrator can temporarily isolate a process for debugging without disrupting production traffic.
  • Phusion Passenger Enterprise’s live IRB console feature can optionally disable the process before attaching itself, so that the administrator can debug the process without being disrupted by traffic.

What Ruby can do to improve out-of-band garbage collection

The currently recommended mode is to run the out-of-band garbage collection with the normal Ruby garbage collector turned on. This significantly reduces the latency of normal garbage collection runs, but does not eliminate them. It is possible to completely eliminate normal garbage collection latency by disabling the garbage collector so that garbage collection is only performed out-of-band, but this will result in high peak memory usage because:

  • There’s currently no way to find out when the Ruby garbage collector needs to be run. The “every x requests” option is a suboptimal heuristic.
  • The MRI Ruby interpreter does not support heap compaction, so even when memory has been reclaimed by the garbage collector, Ruby may not be able to return that memory to the operating system.

It would be great if Ruby could address both issues. This would improve the usefulness of out-of-band garbage collection significantly.
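
Until then, one rough way to judge whether an “every x requests” setting is frequent enough is to count how many normal garbage collection runs still happen inside a piece of code, using Ruby's GC.count. The helper name gc_runs_during is hypothetical and this is only a measuring aid, not a Phusion Passenger feature.

```ruby
# Hypothetical helper: runs the given block and returns how many garbage
# collection runs happened while it was executing. GC.count is the total
# number of GC runs since the Ruby process started.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

# Example: an explicit GC.start shows up as at least one run.
runs = gc_runs_during { GC.start }
puts "GC runs during block: #{runs}"
```

If wrapping request handling in such a helper regularly reports one or more runs, the normal collector is still firing inside request cycles and the out-of-band interval may be too long.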

Conclusion

Out-of-Band Work will become part of Phusion Passenger 4.0 beta 2, which will be released very soon. Phusion Passenger Enterprise customers can already test and enjoy this feature by downloading the “3.9.2 preview (4.0.0 beta 2)” file from the Customer Area.

Please stay tuned for further announcements on Phusion Passenger 4. If you like, you can subscribe to our newsletters and we’ll keep you up to date.