Bringing HBB into the Retro-Future
For the last while we haven't been able to build native extensions for the latest versions of Ruby (2.5 and later) because they simply do not build on CentOS 5. As some of you may know, we build our Ruby gem's native extensions and precompiled binaries for Linux using the oldest possible OS. We do this because while libc is backwards compatible, it is not forwards compatible. You can use an executable or library built against an old libc on a new system, but not an executable or library built against a new libc on an old system. Since CentOS 5 is ancient in OS terms (first release was in 2007), that's what we were using up until now.
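To make the compatibility rule concrete, here is a minimal Python sketch (not part of our tooling) that looks at the versioned glibc symbols a binary references and decides whether a given system's glibc is new enough. The sample text is a hypothetical excerpt in the style of objdump -T output, and the function names are made up for illustration.

```python
import re

def max_glibc_version(symbol_dump):
    """Return the highest GLIBC_x.y[.z] version mentioned in the text, or None."""
    versions = [
        tuple(int(part) for part in found.split("."))
        for found in re.findall(r"GLIBC_(\d+(?:\.\d+)+)", symbol_dump)
    ]
    return max(versions) if versions else None

def runs_on(symbol_dump, system_glibc):
    """True if every referenced GLIBC version is available on the target system."""
    required = max_glibc_version(symbol_dump)
    return required is None or required <= system_glibc

# Hypothetical excerpt of `objdump -T` output for a binary built on a newer system.
SAMPLE = """
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 printf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.14  memcpy
"""

# CentOS 6 ships glibc 2.12, so a binary that needs GLIBC_2.14 will not load there.
print(runs_on(SAMPLE, (2, 12)))
print(runs_on(SAMPLE, (2, 17)))
```

Building on the oldest OS keeps the highest referenced version low, which is the whole point of the exercise.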
I began the process of upgrading the Binary Automation project by looking at the base images for the docker images we use when building the binaries. I saw that they are based on the Holy-Build-Box project (HBB), which in turn uses the centos:5 and phusion/centos-5-32 repos (for 64 and 32 bit respectively).
Holy Build Box is a system for building "portable" binaries for Linux: binaries that work on pretty much any Linux distribution.
While the Docker ecosystem has come a long way since we first set up this build process, and there are now some 32 bit images available, it's still not obvious how to run a 32 bit image on a 64 bit host using Docker's multi-architecture support. So I decided to simply repeat the process we used last time to create a 32 bit CentOS 6 image that will run on a 64 bit host. That involves updating the phusion-docker-classic-linux project, which is how we build our 32 bit CentOS images.
Updating CentOS
The first thing I did was copy the build-centos5.sh script as build-centos6.sh, and add entries to the gitignore and make files. This step was trivial because I was only replacing the number 5 with 6. I attempted to build the image at this point, but the URL for the CentOS release RPM followed a different pattern, and so the build failed. After a bit of googling and poking around CentOS's mirrors I found the correct URL. I also set the basearch in the yum repo lists in the image to i686, thinking that was a good idea, but I was wrong about that and later had to revert that change because that's not how the repos are organized. After that the image built successfully, and I could move on to the next step.
Updating HBB itself to CentOS 6 involved a bit more. I started out by updating the makefile and dockerfiles, with straightforward 5→6 substitutions, and bumped the project's version number. Next I bumped the versions of the libraries which we make available in the image for static linking. I switched a bunch of http urls (and one ftp url) to https (newer curl ftw!). I then attempted a build and... everything blew up in my face.
One thing that was wrong is that the new base image didn't include tar, so I had to install that. It also turns out that we bootstrap things with a local copy of the curl source code, so I had to replace the old copy with one matching the curl version number I had updated the image to. Upon rerunning the build, however, yum was periodically blowing up. Upon investigation it turns out that some yum versions and Docker do not get along. Touching the rpm database files before running yum is enough to stabilize things.
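For illustration, this is roughly what "touching the rpm database files" amounts to, written as a small Python sketch rather than the shell line a Dockerfile would normally use. It assumes the database lives in /var/lib/rpm (the usual location on CentOS); the function name is hypothetical, and the demo runs against a throwaway directory with stand-in file names.

```python
import os
import tempfile
from pathlib import Path

def touch_rpm_database(db_dir="/var/lib/rpm"):
    """Update the timestamps of every regular file in the rpm database directory."""
    touched = []
    for entry in sorted(Path(db_dir).iterdir()):
        if entry.is_file():
            os.utime(entry, None)  # same effect as `touch` on an existing file
            touched.append(entry.name)
    return touched

# Demo on a temporary directory, so nothing on the real system is modified.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("Packages", "Name"):
        Path(demo_dir, name).write_bytes(b"")
    demo_result = touch_rpm_database(demo_dir)

print(demo_result)
```

In the image build this step would have to run in the same layer as the yum command that follows it.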
The next thing to deal with was that CentOS 6 is not yet EOL, so the vault URL isn't needed yet. So I put in a test to check whether the image is being built after the support cutoff, and only rewrite the yum repos to use the vault URL if it is. Since I couldn't find a guarantee of what the last CentOS release version number would be, I replaced the hardcoded version number with the version I scrape from the /etc/centos-release file. After that the images built successfully.
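The cutoff test and the version scraping can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual build script: the cutoff is taken to be CentOS 6's end-of-life date of 30 November 2020, the release string format is the one CentOS 6 writes to /etc/centos-release, and the two URL layouts and function names are only there to make the example complete.

```python
import re
from datetime import date

# CentOS 6 end of life; treated here as the "support cutoff" (an assumption).
CENTOS6_EOL = date(2020, 11, 30)

def parse_release(release_text):
    """Extract the version number from the contents of /etc/centos-release."""
    match = re.search(r"release\s+(\d+(?:\.\d+)*)", release_text)
    if match is None:
        raise ValueError(f"unrecognized release string: {release_text!r}")
    return match.group(1)

def repo_base_url(release_text, build_date):
    """Use the regular mirror up to the cutoff and the vault after it."""
    if build_date > CENTOS6_EOL:
        # Assumed vault layout: one directory per point release.
        return f"http://vault.centos.org/{parse_release(release_text)}/"
    return "http://mirror.centos.org/centos/6/"

release = "CentOS release 6.10 (Final)\n"
print(repo_base_url(release, date(2019, 2, 1)))
print(repo_base_url(release, date(2021, 1, 1)))
```

Reading the version from the image itself means the vault path stays correct whatever the final point release turns out to be.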
Including C++11
At that point I could have been done, but there was one more thing I wanted to achieve. I've wanted to be able to use C++11 in the Passenger codebase for a long time, which requires a newer compiler than comes with the devtoolset-2 suite of packages. Lucky for me, Hongli had previously built the devtoolset-7 tools for 32 bit systems as part of a separate investigation. While that sounds like it'd be a lot of work, given that they don't provide a 32 bit release officially, it turned out to not be so bad. He had to download all the source RPMs, then build them inside an i386 container. After attempting to build each package, he obtained a list of packages that won't build (e.g. Valgrind because they're not compatible with x86 anymore O.O) and a list of packages where he had to remove a %test section from the specfile because the tests don't pass on i386. He published his work to our packagecloud repo, so I could just grab the final RPMs from there.
So my next step was to replace the installation of devtoolset-2 with devtoolset-7. In order to install devtoolset-7 I had to set up the packagecloud.io repo on i686 systems and install centos-release-scl on x86_64 systems. Luckily packagecloud provides a setup script, and centos-release-scl is available from yum without extra work. After those were set up I thought I could install the packages I wanted from the available repos using the same command on both architectures. Attempting to build the images at this point, however, did not work. I had forgotten to change the script used to set up the build environment to use the new devtoolset, and yum was not finding the packages I expected in the 32 bit build.
I realized after a bit of trial and error that yum was looking for 64 bit packages in the repo on the 32 bit side because I hadn't edited the repo file that packagecloud's script had installed for me. One sed one-liner later and yum was back to being oblivious to the arch of the kernel and docker-host. I also fixed a few file extensions at this point. Upon rerunning the build, everything installed correctly (YAY!) but the build still blew up when it got to the gcc libstdcxx library with some really weird errors about a bunch of code being deprecated. Whelp we did update the compiler so we should probably update the stdlib to match! Once the versions were matching the build succeeded and we had our updated HBB images ready to go!
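The repo file fix boils down to pinning the architecture instead of letting yum derive it from the running kernel. Here is a small Python sketch of that rewrite, standing in for the sed one-liner. The repo file contents are hypothetical, the function name is made up, and the choice of i386 as the pinned value is an assumption.

```python
def pin_basearch(repo_text, arch="i386"):
    """Replace yum's $basearch variable in baseurl lines with a fixed architecture."""
    fixed_lines = []
    for line in repo_text.splitlines():
        if line.startswith("baseurl="):
            line = line.replace("$basearch", arch)
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"

# Hypothetical repo file, shaped like what a repo setup script might install.
REPO_FILE = """[example_repo]
name=example_repo
baseurl=https://packagecloud.io/example/repo/el/6/$basearch
gpgcheck=0
"""

print(pin_basearch(REPO_FILE))
```

Only baseurl lines are touched, so any other use of yum variables in the file is left alone.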
After the umpteenth broken build
We've finally come full circle and wound up back at the Passenger Binary Automation project. First I updated the versions of all the included packages (even Ruby and rubygems YAY!), as well as the HBB base image versions, and bumped this project's versions as well (there are two, one for each of macOS and Linux, but when you upgrade the source package versions you have to bump both). Next I tried a build, and this time everything blew up in my face again! So I put my debugging hat back on and applied the yum fix from earlier to get it playing nice with Docker in this image too. On the next build I saw that pinentry wasn't building because it couldn't find the tinfo library from ncurses. It turns out to be an issue with pkg-config, but we can work around it by setting an environment variable to include -ltinfo when linking. After that pinentry builds, but we don't get very far because gnupg, the very next package, does not.
It turns out that the glibc version included in CentOS 6 has a broken iconv library and gnupg won't use it. Rather than removing the check for whether iconv works from gnupg, I grabbed the libiconv source (adding the libiconv version to the list of libraries to be updated regularly), built it as a standalone library, and provided gnupg with the path to find it. Voilà, gnupg builds! And... git doesn't. Luckily, it turns out that git also just needs to use the standalone libiconv library, as well as to avoid passing the -R flag to the linker. With a couple of environment variables we are able to configure our build environment suitably to get a git executable! Running the build again almost works. We make it all the way to the point of installing various rubies with rvm, but then things break again because RVM added an additional signing key out of the blue, and only used that key to sign the newest release.
Then, after adding the new key to the gpg import command we can install rvm again and everything builds correctly. Just kidding, I also had to update rubygems to 3.0.1 (which was released during this process) to include this bugfix.
Next up, trying to actually build Passenger's Ruby native extensions, using the new binary automation setup. I started a run, and quickly ran into an issue where the Passenger Ruby native extension was failing our libcheck test. The newly built extension was linked to an "unknown" library called libfreebl3. Since we can't link to anything that might not be on your system, we have a check in place that calls out any unexpected linking. I investigated via Google and manually with ldd and readelf, and found that both libruby and glibc on CentOS 6 link against this libfreebl3, which is part of the nss library by Mozilla.
The strange thing is that when you link a dynamic library, unless otherwise specified, you only link against your direct dependencies. In this case we do not directly depend on nss, but do depend on glibc and libruby, and it seemed that the linker was linking against their dependencies despite us not passing the -Wl,--no-undefined flag to gcc. I tried adding -Wl,--allow-shlib-undefined and -Wl,-unresolved-symbols=ignore-in-shared-libs and -Wl,--as-needed, and none of them made any difference. I checked with readelf that libfreebl3 wasn't a direct dependency, and therefore shouldn't, by all accounts, be included in the linkage, but ldd still said it was there. That's when I realized that ldd tells you everything that is LOADED by the library, not what's LINKED by the library. It makes sense that anything linked to libruby or libc on CentOS would show libfreebl3 as loaded, since those libraries depend on it. So I dumped my compiled library out of the CI into an Ubuntu install, checked with ldd and, lo and behold, no libfreebl3 listed.
With that knowledge I could finally get this train moving again. I rewrote libcheck to use readelf instead of ldd, so that it no longer complains about non-linked libraries. I then cut a new HBB release and rebuilt the binary automation Docker images on top of it, annnnnnd since the last run bundler released an update that isn't compatible with older Rubies that we still support! Yaaaaaaay! So I updated the script to install bundler 1.17 specifically, which still runs on a reasonable range of Rubies, and started a new build. Nginx also released a new version, so that's another update and build cycle.
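To show the difference the switch from ldd to readelf makes, here is a minimal Python sketch of a readelf-based check. It is not the real libcheck: it only parses the NEEDED entries from readelf -d output, which list the direct dependencies recorded in the binary itself, and compares them against an allowlist. The sample output, library names, allowlist prefixes, and function names are all hypothetical.

```python
import re

# Matches dynamic section entries such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")

def direct_dependencies(readelf_output):
    """Return the libraries listed as NEEDED, i.e. the direct dependencies."""
    return NEEDED_RE.findall(readelf_output)

def unexpected_libraries(readelf_output, allowed_prefixes):
    """Return direct dependencies whose names match none of the allowed prefixes."""
    allowed = tuple(allowed_prefixes)
    return [
        lib for lib in direct_dependencies(readelf_output)
        if not lib.startswith(allowed)
    ]

# Hypothetical `readelf -d` excerpt for a native extension.
SAMPLE = """
Dynamic section at offset 0x2e08 contains 26 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libruby.so.2.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [example_extension.so]
"""

print(direct_dependencies(SAMPLE))
print(unexpected_libraries(SAMPLE, ["libc.so", "libruby.so"]))
```

Because libfreebl3 is only pulled in by libruby and glibc at load time, it never shows up as a NEEDED entry of the extension, so a check like this stays quiet about it while ldd would still list it.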
After that the Passenger Nginx module also failed to build with gcc 7, as did the license check for Passenger Enterprise, so those needed fixing. And after all that we were able to add Ruby 2.5 and 2.6 to our release system.