At Phusion we run a simple multithreaded HTTP proxy server written in Ruby (which serves our DEB and RPM packages). We've seen it consume 1.3 GB of memory. This is insane -- the server is stateless and doesn't do all that much!

Turns out we're not alone in experiencing this issue. Ruby apps can use a lot of memory. But why? According to Heroku and Nate Berkopec, a large part of excessive memory usage is caused by memory fragmentation and by memory overallocation.

Nate Berkopec concludes that there are two solutions:

  1. Either use a completely different memory allocator than the one in glibc -- usually jemalloc, or:
  2. Set the magical environment variable MALLOC_ARENA_MAX=2 (a sketch of both options follows this list).

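To make the two options concrete, here's a minimal Ruby launcher sketch. It's not taken from our proxy server: the application entry point and the jemalloc library path are assumptions that differ per system.

```ruby
# A minimal sketch of applying either workaround, assuming a Linux system.
# "app.rb" and the jemalloc path are placeholders; adjust them for your setup.

env = {
  # Option 2: cap the number of glibc malloc arenas.
  "MALLOC_ARENA_MAX" => "2",

  # Option 1 (instead): preload jemalloc so it replaces the glibc allocator.
  # The library path varies per distribution, so this one is only an example.
  # "LD_PRELOAD" => "/usr/lib/x86_64-linux-gnu/libjemalloc.so.2",
}

# Both settings are read at process startup, so relaunch the app with them set.
exec(env, "ruby", "app.rb")
```

In practice you'd usually set these variables in your shell or process manager rather than in a wrapper script, but the effect is the same: both are picked up when the Ruby process starts.
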
Both the problem description and the proposed solutions bother me. Something feels off: I'm not convinced that the problem description is entirely correct, or that those are the only solutions available. I'm also bothered by the fact that so many people treat jemalloc as a magical silver bullet.

Magic is simply science that we don't understand yet. So I set out on a research journey to find out the truth behind this matter.

In this article I'll cover:

  1. An introduction to memory allocation: how does it work?
  2. What is this "memory fragmentation" and "memory overallocation" that people speak of?
  3. What causes the high memory usage? Is the situation as people have described it so far, or is there more to it? (hint: yes, and I'll share my research results)
  4. Are there alternative solutions available? (hint: I found one)