On Build Reproducibility
Published on 2025-03-23
Let's say, hypothetically, that I am about to write a series of articles on creating your own scheduler for x64 machines. The plan would be to write separate articles along the lines of setting up the build environment, printing "Hello World", setting up the GDT and IDT, loading a program, and implementing the scheduler. Excitedly, I would dive into the first topic - setting up the build environment - intending to hold the reader's hand throughout the process. But 300 lines in, I would be overwhelmed with resentment. No matter how carefully I explained it, I fear it wouldn't be clear enough, thanks to factors beyond my control.
Kernel development is a dependency nightmare. It relies on tools like QEMU, a C compiler, GDB, Bochs and xorriso - each with its own tangled web of dependencies. These, in turn, depend on yet more dependencies, which might be platform-dependent. To install them, you are forced to go through package managers that love to scatter everything system-wide. The version of binutils that GCC depends on on my system? Very likely different from yours. The consequence of all this is a fragile house of cards, teetering on the edge of catastrophic collapse - potentially bricking your system if you are not meticulous. My early days of using Linux were primarily spent reinstalling Linux so many times that, at some point, I could compile Gentoo blindfolded. I have spent a lot of time debugging these issues. But now I am older (and do not compile Gentoo), and my spider senses tingle at the first whiff of a dependency quagmire. I know I'm about to waste a gargantuan amount of time for nothing.
So, I avoid installing dependencies through global package managers. I tend to compile stuff from scratch, and I have deep respect for projects that have very quick compile times and few dependencies. But this is a mirage, because the compilers that I build these projects with are themselves installed system-wide. The mirage breaks when I want to build compilers from scratch. Why? I am building a kernel that runs on x64 machines, which means the kernel should be built with a compiler that spits out x64 binaries. The Clang compiler that comes with my M-series Mac does not do this out of the box. Even if I were to target ARM, not building a cross-compiler would still be an issue. Without one, I would have to pass a lot of options to the Clang compiler, telling it that we are building for the correct target platform and telling it not to use the host's system headers and libraries, which is a pain in the ass. This is an even bigger pain in the ass when you are compiling user-land programs. There is a reason why kernel toolchains (see Serenity) tend to build their compilers from scratch. The OSDev Wiki sells my point quite well.
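To make the "wrong compiler" problem concrete, here is a minimal sketch in C of how a source file can report what its compiler is actually targeting. It relies only on the predefined macros `__x86_64__` (set by GCC and Clang when emitting x86-64 code) and `__STDC_HOSTED__` (set to 0 when compiling with `-ffreestanding`). The function names are mine and purely illustrative; this is not taken from any particular kernel's build.

```c
#include <assert.h>
#include <stdio.h>

/* 1 if the compiler is emitting x86-64 code, 0 otherwise.
   __x86_64__ is predefined by GCC and Clang for that target. */
int targets_x86_64(void) {
#if defined(__x86_64__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if compiled as a hosted program (a full libc is assumed),
   0 if compiled with -ffreestanding, as kernel code normally is. */
int is_hosted(void) {
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
    return 1;
#else
    return 0;
#endif
}

/* Writes a one-line summary into buf and returns the number of
   characters snprintf would have written (excluding the NUL). */
int describe_toolchain(char *buf, size_t size) {
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "x86_64=%d hosted=%d",
                    targets_x86_64(), is_hosted());
}
```

Built with the stock compiler on an ARM Mac, I would expect `targets_x86_64()` to return 0 and `is_hosted()` to return 1, which is exactly the combination a kernel build does not want. Getting the opposite answer out of plain Clang means passing flags such as `--target`, `-ffreestanding` and `-nostdlib` on every invocation. A dedicated cross-compiler bakes those assumptions in.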
But compiling a cross-compiler? That’s its own nightmare. Every project has its quirks. GCC, for instance, demands flags like --with-gmp, --with-mpfr, and --with-mpc if you don’t have those libraries in standard locations. I avoid standard locations to keep my system clean, but that choice bites me later. Build steps evolve, documentation lags, and there’s no governing body ensuring GCC’s process stays backward-compatible. Whose fault is this? The C/C++ committees for ignoring build woes, OSS authors for piling on dependencies, and me for choosing C in the first place. But my fate was sealed: building a kernel in C was my destiny. Alternatives like Zig tempt me with built-in cross-compilers, but C’s warts-and-all charm still holds me captive.
Back to my article on setting up the build system: I swore off maintaining toolchain scripts long ago. My goal is to learn kernel development, not wrestle with build chains. So, I scrapped my scripts and turned to shell.nix. Ironic, right? After railing against dependencies, I add another. But Nix is different: it’s a reproducible[1] build system and package manager that avoids global installs. Eelco Dolstra’s thesis sold me on it; it’s worth the learning curve over cobbling together straw-hut build systems. Still, I can’t help but think modern OSes should handle this natively. Fuchsia comes close, tackling package isolation from the ground up, with no Docker band-aids. It’s comforting to know someone’s working on it.
Footnotes:
[1] It is not fully reproducible in some circumstances.