At Phusion we run a simple multithreaded HTTP proxy server written in Ruby (which serves our DEB and RPM packages). We've seen it consume 1.3 GB of memory. This is insane -- the server is stateless and doesn't do all that much!
Turns out we're not alone in experiencing this issue. Ruby apps can use a lot of memory. But why? According to Heroku and Nate Berkopec, a large part of this excessive memory usage is caused by memory fragmentation and memory overallocation.
Nate Berkopec concludes that there are two solutions:
- Either use a completely different memory allocator than the one in glibc -- usually jemalloc, or:
- Set the magical environment variable `MALLOC_ARENA_MAX=2` (a sketch of applying both fixes follows this list).
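To make those two fixes concrete, here's a minimal launcher sketch in Ruby. Both settings are read at process startup (LD_PRELOAD by the dynamic linker, MALLOC_ARENA_MAX by glibc's malloc), so they have to be in the environment before the Ruby process boots. The `app.rb` filename and the jemalloc library path below are placeholders that you'd adjust for your own system; this is not how our proxy server is actually launched.

```ruby
require "rbconfig"

# Placeholder values: adjust to your system and application.
APP         = "app.rb"
JEMALLOC_SO = "/usr/lib/x86_64-linux-gnu/libjemalloc.so.2"

env = if File.exist?(JEMALLOC_SO)
        # Option 1: swap in jemalloc for the whole process via LD_PRELOAD.
        { "LD_PRELOAD" => JEMALLOC_SO }
      else
        # Option 2: keep glibc's malloc, but cap it at two arenas.
        { "MALLOC_ARENA_MAX" => "2" }
      end

# Replace this launcher process with the real app, with the chosen
# environment applied before Ruby (and glibc) initialize.
exec(env, RbConfig.ruby, APP)
```

This is also why setting `ENV['MALLOC_ARENA_MAX']` from inside an already-running Ruby app has no effect: by the time your code runs, glibc's malloc has long since read its configuration.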
Both the problem description and the proposed solutions bug me. Something feels off: I'm not convinced that the problem description is entirely correct, or that those are the only solutions available. It also bothers me that so many people treat jemalloc as a magical silver bullet.
Magic is simply science that we don't understand yet. So I set out on a research journey to find out the truth behind this matter.
In this article I'll cover:
- An introduction to memory allocation: how does it work?
- What is this "memory fragmentation" and "memory overallocation" that people speak of?
- What causes the high memory usage? Is the situation as people have described it so far, or is there more to it? (hint: yes, and I'll share my research results)
- Are there alternative solutions available? (hint: I found one)