NixOS migration with Flakes (Part 1)

Why I am moving to flakes

I wouldn’t be moving to flakes if they didn’t solve a problem I had or for a feature I deemed necessary for myself or my sanity.

Up until a few months ago, this is how I installed NixOS:

1. Boot the minimal ISO.
2. Install a few dependencies of the installer (git, tmux, parted, etc).
3. Clone my installer + NixOS config repo.
4. Install NixOS by calling my installer.sh script which takes in:
- (a) target drive
- (b) hostname
- (c) “installation type” ([desktop|rpi|virt])

You: But that’s normal though, right?
Me: Kind of. Keep reading to see how I made a huge mistake (read: unmaintainable) in my installer script.
You: Surely flakes can’t help with a bad decision in your installer script.
Me: Correct, they don’t.
You: ???
Me: They help me with the problem I was trying to solve, which led to said unmaintainable mess.

Before I delve into how my installer script works, let’s first look into how a generic installer works.

Tip

The following section on how a generic installer works isn’t technically necessary. Feel free to skip it if you already know it or if you find it boring while reading.

How does your Linux distribution’s installer work?

If you have installed Arch Linux, you know how to bootstrap a Linux system. It essentially boils down to the following steps:

1. Partition the drive with at least two partitions: one EFI partition, to be mounted on /boot/efi (or /efi), and one with a normal filesystem like Ext4/XFS, to be mounted on /.
2. Format those partitions with mkfs.$fs. Don’t forget to turn on the bootable property for the EFI partition.
3. Mount them in the hierarchy that you need under some sane mount point. This is usually /mnt.
4. If your distribution does not provide a “chroot script” (like Arch Linux’s arch-chroot), manually mount (bind) /dev, /proc and /sys so that the bootloader’s install script(s) don’t complain.
5. Start a package bootstrap (read: installation). Either install all packages or only a minimal set of packages, enough to chroot into it.
6. Once the package bootstrap is complete, chroot into the fully/minimally bootstrapped system and perform additional non-package-related steps like:
- (a) setup timezone
- (b) generate locale(s)
- (c) set machine’s hostname
- (d) perform non-root user setup (add user, set shell, add to groups, set password, etc)
- (e) perform setup for the root user (disable password, etc)
- (f) modify /etc/sudoers
- (g) enable/disable systemd units (services, targets, timers, etc)
- (h) [re]generate initramfs
- (i) install bootloader
7. Unmount all drives from the mountpoint (/mnt).
8. Reboot into the installed system.
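
The steps above can be sketched as a dry-run shell script. This is only an illustration, not any distribution's real installer: the drive name, partition sizes, partition suffixes (1 and 2), timezone and hostname are all placeholder assumptions, and run() merely prints each command instead of executing it, so nothing touches a disk.

```shell
#!/bin/sh
# Dry-run sketch of a generic bootstrap. run() only prints the command,
# so this script is safe to execute anywhere.
run() { echo "+ $*"; }

# bootstrap_plan <drive> <hostname>
bootstrap_plan() {
    drive=$1   # placeholder, e.g. /dev/sdX (NVMe drives use a "p1" suffix instead)
    host=$2    # placeholder hostname
    mnt=/mnt

    # Partition: one EFI system partition, one root partition.
    run parted --script "$drive" mklabel gpt
    run parted --script "$drive" mkpart ESP fat32 1MiB 513MiB
    run parted --script "$drive" set 1 esp on
    run parted --script "$drive" mkpart root ext4 513MiB 100%

    # Format both partitions.
    run mkfs.fat -F 32 "${drive}1"
    run mkfs.ext4 "${drive}2"

    # Mount root first, then the EFI partition inside it.
    run mount "${drive}2" "$mnt"
    run mkdir -p "$mnt/boot/efi"
    run mount "${drive}1" "$mnt/boot/efi"

    # Bind-mount the pseudo-filesystems needed inside the chroot.
    for fs in dev proc sys; do
        run mkdir -p "$mnt/$fs"
        run mount --bind "/$fs" "$mnt/$fs"
    done

    # The package bootstrap is distribution-specific (pacstrap, debootstrap, ...).
    echo "# (distribution-specific package bootstrap into $mnt goes here)"

    # A couple of the post-install steps, run inside the chroot.
    run chroot "$mnt" ln -sf /usr/share/zoneinfo/UTC /etc/localtime
    run chroot "$mnt" sh -c "echo $host > /etc/hostname"

    # Unmount everything and reboot.
    run umount -R "$mnt"
    run reboot
}

bootstrap_plan /dev/sdX myhost
```

Printing the plan first is a deliberate choice for a sketch like this: the real commands are destructive, and reading the sequence is the point.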

It doesn’t matter what Linux distribution you are installing, or even which OS you are installing. These are the general steps that you or your installer will perform to bootstrap a system from the installer medium.

Back to my installer

Now that you know what a typical OS bootstrap looks like, you might assume that my installer script works the same or similar-ish way. And you’d be right. But me being me, I overdid it by adding something called an “installer scan” to make the script “more non-interactive”. This wasn’t over-engineered in its initial form. I did have more than one NixOS machine with different enough hardware that this felt like a good idea.

The scan would check for the presence/absence of select hardware. If it found some “special” hardware (hardware that needs a config option enabled to work properly), the script would append the necessary NixOS configuration options to a host-specific-configuration.nix file, which is always imported by the configuration.nix file. (For anyone curious, here is one of the hackiest hacks of all hacks.)
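
To make the idea concrete, here is a minimal sketch of what such a scan could look like. It is not the actual installer code: the function name, the Bluetooth probe under sysfs, and the choice of `hardware.bluetooth.enable` as the option to emit are all assumptions picked for illustration.

```shell
#!/bin/sh
# Hypothetical "installer scan": probe for hardware and write the matching
# NixOS options into a host-specific module.

# generate_host_config <sysfs-root> <output-file>
generate_host_config() {
    sysfs=$1   # normally /sys; a parameter so the probe can be tried on a fake tree
    out=$2

    # Open the module.
    printf '{ ... }:\n{\n' > "$out"

    # Assumed probe: Bluetooth controllers show up as entries in class/bluetooth.
    if [ -d "$sysfs/class/bluetooth" ] && [ -n "$(ls -A "$sysfs/class/bluetooth")" ]; then
        printf '  hardware.bluetooth.enable = true;\n' >> "$out"
    fi

    # Close the module.
    printf '}\n' >> "$out"
}
```

On a live system this would be called as `generate_host_config /sys ./host-specific-configuration.nix`. Every additional piece of hardware means another probe and another if-block, which is how a script like this grows into a mess.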

Even if nothing set a given system apart from another, the host-specific-configuration.nix file would still be populated with at least two NixOS configuration options:

This is a guarantee from me to the configuration.nix file that no matter what, host-specific-configuration.nix will never be empty and by extension, won’t error out upon import due to being empty.

But as you might have imagined, it was getting quite tedious to keep the host-specific-configuration.nix file on one machine in sync with the improvements I made on another. This is because of the following factors:

1. This file is not included in the git tree, which means that I need to copy it from one machine to another to compare it.
2. I had a systemd service that would:
- (a.) Pull the configuration to make sure that it is up-to-date.
- (b.) Copy the updated configuration files and unconditionally overwrite the ones in /etc/nixos. This meant that I couldn’t include host-specific-configuration.nix in the git tree no matter how much I actually wanted to.
3. Because the file is not in the git tree, if I modify it on one machine, rebuild, reboot and forget to document that change (or worse, that change breaks something down the road), I have no way of reverting to the previous state of that file. Remember, turning on system.copySystemConfiguration will only copy the configuration.nix file even if it imports other files.
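
A small sketch of the overwrite step shows why tracking the file in git was not an option. The function name and paths are placeholders, the pull is left out so the demo runs offline, and the `networking.hostName` values are made up for the demonstration.

```shell
#!/bin/sh
# Sketch of the sync step described above; not the actual service.

# sync_config <repo-checkout> <target-dir>
sync_config() {
    repo=$1
    target=$2
    # (a) The real service would first update the checkout, for example with
    #     `git -C "$repo" pull --ff-only`. Skipped here to stay offline.
    # (b) Unconditionally overwrite the target with whatever the repo has.
    cp -f "$repo"/*.nix "$target"/
}

# Demo in a scratch directory: the repo tracks a host-specific file,
# while the "machine" has its own, different copy.
demo=$(mktemp -d)
mkdir -p "$demo/repo" "$demo/etc-nixos"
echo '{ imports = [ ./host-specific-configuration.nix ]; }' > "$demo/repo/configuration.nix"
echo '{ networking.hostName = "machine-a"; }' > "$demo/repo/host-specific-configuration.nix"
echo '{ networking.hostName = "machine-b"; }' > "$demo/etc-nixos/host-specific-configuration.nix"

sync_config "$demo/repo" "$demo/etc-nixos"

# The machine's own file has been replaced by the tracked one.
cat "$demo/etc-nixos/host-specific-configuration.nix"
```

After the sync, the second machine's file carries the first machine's settings. With a single tracked host-specific file and an unconditional copy, every machine ends up with the same "host-specific" configuration.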

I wanted to get out of this mess without moving the if-else hell from the shell script into a NixOS configuration file.

The solution? Flakes. Not exactly. But, flakes do address a lot of my issues with my installer script and give me some nice-to-have features too.

“But why male models flakes?”

As you might’ve understood by now, the current, non-flake situation is anything but a good situation. In fairness to me, this was at a time when I had only 3 Linux machines to install NixOS on. One x86 PC (no workey anymore; Hari Om) and two Raspberry Pi 4. So the setup had to be architecture agnostic from the start.

Over time, I have obtained 3 more ARM SBCs and 2 more RISC-V SBCs. In that time, I have also discovered home-manager, which essentially allows me to set up my non-root user’s home environment on non-NixOS Unixes as if it were NixOS (kind of). So the need to manage the “host-specific-configuration.nix” file for every host, in a git repository, without submitting to an insane chain of if-then-elses in the NixOS configuration file, was extremely high.

With flakes, instead of an if-else ladder, I can simply import nixos-configurations/hosts/${hostname}/default.nix from the flake.nix file in the nixosConfigurations entry for a given host. This does mean that I need a default.nix file for every system. It also does not allow me to use an arbitrary hostname for, say, a VM. But both of these trade-offs were worth the cost to me.
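
On the installer side, the per-host layout reduces the hardware scan to a single lookup. The sketch below assumes the directory layout mentioned above and assumes that each host's flake output is named after its hostname; the function name and error message are hypothetical.

```shell
#!/bin/sh
# Sketch: resolve a hostname to a flake reference, failing when the host
# has no directory in the checkout.

# flake_target <flake-checkout> <hostname>
flake_target() {
    flake=$1
    host=$2
    if [ ! -f "$flake/nixos-configurations/hosts/$host/default.nix" ]; then
        echo "no configuration for host '$host'" >&2
        return 1
    fi
    # Assumes the flake output for this host is named after the hostname.
    echo "$flake#$host"
}

# Intended use on the live ISO (not executed here):
#   nixos-install --root /mnt --flake "$(flake_target /path/to/checkout myhost)"
```

The failure branch is the trade-off in practice: a hostname without its own directory, such as an ad-hoc VM name, simply has nothing to install.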

At the moment, I am managing 5 systems on which NixOS is installed. If my dead PC worked, it would be 1 more. And, as soon as the binary caches for riscv64-linux on cache.nixos.org are available, that number will be bumped by 2. In total, I have 8 NixOS machines defined, but so far only 5 are what one would call “actively deployed”. Plus, I have a work-provided x86 Mac that has home-manager on it.

Therefore, this complexity is a necessary evil.

Conclusion

At the end of the day, flakes do solve a problem for me. Now all that remains is the actual transition. In the next post, I will document how I performed this transition from that mess of an installer to something more maintainable (in the context of easily adding an extra machine or two; yay RK3588-based SBCs!).

Given all of this, beware that my solution is very likely over-engineered for you. That doesn’t mean that you cannot learn from it. ;)