Immutable nodes for Kubernetes
Regen QR Code!
- Linux for 20+ years
- At Microsoft Azure, working on upstream Linux
- Using Azure today because that's what I have
- Works on AWS, GCP, bare metal, or this 2017 laptop in front of me
- Linux guy, not a Kubernetes guy
- Vanilla kubeadm - no special sauce
- Using Headlamp
- Errors in Kube are mine
- Slides at the QR code and on my website
- This is the question
- Most people can't answer this correctly
- Edge cases make most wrong
- Let me explain why.
- We use mutable nodes under Kube
- Mutable nodes ... mutate
- Old nodes differ from new nodes
- Mutable systems are non-deterministic at scale
- The order of package installations matters
- This is why enterprise Linux is getting rid of scriptlets
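A minimal sketch of why install order matters, assuming two hypothetical packages that both write the same file. The package names, paths, and contents are invented for illustration; real package managers and scriptlets are more involved, but the effect is the same: the last writer wins.

```python
# Hypothetical packages: each maps a file path to the content it writes.
PKG_A = {"/etc/app.conf": "owner=a", "/usr/bin/a": "a-1.0"}
PKG_B = {"/etc/app.conf": "owner=b", "/usr/bin/b": "b-1.0"}


def install(order):
    """Apply packages in order; later writes overwrite earlier ones."""
    fs = {}
    for pkg in order:
        fs.update(pkg)
    return fs


node_old = install([PKG_A, PKG_B])  # installed A, then B
node_new = install([PKG_B, PKG_A])  # installed B, then A

# Same package set, different final state.
print(node_old == node_new)
print(node_old["/etc/app.conf"], node_new["/etc/app.conf"])
```

Both nodes report the same installed packages, yet the shared file differs, which is the kind of divergence a version string will not show.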
- Perfect with Gold images and Infra as Code?
- Engineers SSH in to debug problems and leave things behind.
- Logs or dumps that create disk issues
- Replaced binaries that haven't been CI'ed
- Divergence is invisible
- Same kernel, same OS version string, same kubelet version
- Still different
- Finding it takes a package-level or file-level audit of every node
- At 100 nodes? 1000? continuously?
- Start from the same base ... Day 2 is different from Day 1
- This isn't a Kube problem
- Kubernetes is great at managing workloads
- Zero visibility into the node OS
- The kubelet health check tells you the node is alive and ready for workloads
- NOT that it is correct or consistent
- From where Kubernetes looks, nodes look identical
- So what do we do about this?
- Kubernetes Security community described "Shift Down"
- "Shift Left": move security earlier in the pipeline
- "Shift Down": push security guarantees deeper into the platform
- The white paper talks about embedding security policies deeper into Kubernetes
- I'm taking it one step further:
- push it into the node OS
- We have history here
- CNCF ecosystem has done amazing work on container image supply chain
- SLSA levels, Sigstore for signing, SBOMs for transparency
- But this is above Kube - in the workload
- Most drift defenses detect changes after the fact
- detection is not prevention
- What if we made it so drift COULDN'T happen?
- Push integrity into the platform
- Flatcar is a special-purpose OS. It runs containers. That's it.
- It's a CNCF project — vendor-neutral, community-governed.
- The brand, the assets, the IP — all CNCF.
- Incubating doesn't tell the whole story
- Long lineage: CoreOS made concepts from ChromiumOS into Container Linux
- Red Hat bought CoreOS and changed the tech
- Kinvolk forked and continued it as Flatcar
- Microsoft acquired Kinvolk and donated Flatcar to CNCF.
- /usr is read-only and dm-verity protected
- dm-verity means cryptographic hash verification at the block level
- Every block is verified against a hash at read
- If anything has been tampered with — it won't run
- Not "alert" — won't execute.
- Not even a cosmic bit-flip gets through.
- There is no package manager. You cannot install something on /usr
- Updates are atomic, deterministic
- more to come on updates and adding software
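The verify-at-read idea behind dm-verity can be sketched in a few lines. This is a simplified illustration, not the kernel's implementation: real dm-verity builds a multi-level hash tree, while this sketch uses a flat list of per-block hashes with one root hash over them. The function names and the 4096-byte block size are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def block_hashes(image: bytes) -> list[bytes]:
    """Hash every fixed-size block of the image."""
    return [hashlib.sha256(image[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(image), BLOCK_SIZE)]


def root_hash(hashes: list[bytes]) -> bytes:
    """One hash over all block hashes; this is the value you trust."""
    return hashlib.sha256(b"".join(hashes)).digest()


def read_block(image: bytes, index: int, hashes: list[bytes],
               trusted_root: bytes) -> bytes:
    """Return a block only if it verifies; otherwise refuse the read."""
    if root_hash(hashes) != trusted_root:
        raise OSError("hash table does not match the trusted root")
    block = image[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
    if hashlib.sha256(block).digest() != hashes[index]:
        raise OSError(f"block {index} failed verification")
    return block


image = bytes(range(256)) * 32          # 8192 bytes, two blocks
hashes = block_hashes(image)
root = root_hash(hashes)

tampered = bytearray(image)
tampered[5000] ^= 0x01                  # flip one bit in block 1

read_block(image, 1, hashes, root)      # verifies, returns the block
try:
    read_block(bytes(tampered), 1, hashes, root)
except OSError as err:
    print("refused:", err)
```

The point of the design is that the failure mode is a refused read rather than an alert: a single flipped bit is enough to block access to that block.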
- Two checks give you full userspace identity:
- VERSION_ID and a similar string for added software
Together, these tell you the complete userspace state
- /etc is still writable — more to come
- This is the "Shift Down" made concrete and pushed deeper
- Userspace drift isn't detected — it's impossible
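The two-check comparison could look roughly like this. It is a sketch under stated assumptions: the OS version is read from the `VERSION_ID` field of an os-release style file, and the added software is represented as a list of sysext image names. The sample version number and image names are hypothetical, and `node_identity` is a helper invented for this example.

```python
def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines in os-release format."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def node_identity(os_release_text: str, sysext_names: list[str]) -> tuple:
    """Combine the OS version and the set of sysexts into one identity."""
    version = parse_os_release(os_release_text)["VERSION_ID"]
    return (version, tuple(sorted(sysext_names)))


# Hypothetical sample data for two workers.
OS_RELEASE = 'NAME="Flatcar Container Linux"\nVERSION_ID=4230.2.1\n'
worker_a = node_identity(OS_RELEASE, ["kubernetes-v1.33.2.raw"])
worker_b = node_identity(OS_RELEASE, ["kubernetes-v1.33.2.raw"])
worker_c = node_identity(OS_RELEASE, ["kubernetes-v1.33.1.raw"])

print(worker_a == worker_b)  # True: same OS image, same sysexts
print(worker_a == worker_c)  # False: the sysext version differs
```

On a mutable node this comparison would say little, because files can change underneath an unchanged version string. It only works as an identity check because /usr cannot be modified.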
- Flatcar is an OS you deploy, not configure
- This Butane yaml defines the contract between workload and OS
- Transpile to Ignition JSON and deploy on your platform of choice
- We provide a system that meets this contract
- This is a simplified view of the Butane config I am using.
- SSH key, kubernetes sysext, kubeadm join command.
- One declarative file. Same file = same machine. Every time.
- This is the Azure CLI command.
- One command, one config file.
- --custom-data passes our Ignition JSON. Flatcar reads it on first boot.
- I'll add a worker to an existing cluster
- bring up workload on it
- prove it is identical to an existing worker
- Flip to Headlamp
- Flatcar doesn't do in-place package updates
- The entire OS image is downloaded, verified, and written to the inactive partition
- While this happens, the running system is completely untouched
- If anything goes wrong, the running system is unchanged
- Reboot swaps to the new partition. Health check runs.
- If it fails, next reboot goes back to the old partition
- In Kubernetes context: drain the node, update, reboot, rejoin, uncordon.
- FLUO (Flatcar Linux Update Operator) or Kured or whatever comes next automate this across a fleet.
- Respects your policies and rules
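The A/B update flow above can be modeled as a small state machine. This is a toy simulation, not how update_engine is implemented: the `Node` class, the slot names, and the version strings are invented, and the "fall back on the next reboot" step is collapsed into a single call for brevity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    partitions: dict  # slot name -> OS version written there
    active: str = "A"

    @property
    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def stage_update(self, version: str, verified: bool) -> bool:
        """Write the new image to the inactive slot only."""
        if not verified:
            return False  # failed verification: nothing is written
        self.partitions[self.inactive] = version
        return True

    def reboot(self, healthy: bool) -> str:
        """Boot the other slot; return to the old one if unhealthy."""
        previous = self.active
        self.active = self.inactive
        if not healthy:
            self.active = previous  # simplified rollback
        return self.partitions[self.active]


node = Node({"A": "1.0", "B": "0.9"})
node.stage_update("1.1", verified=True)
print(node.partitions["A"])          # 1.0: the running slot is untouched
print(node.reboot(healthy=True))     # 1.1: now running from slot B

node.stage_update("1.2", verified=True)
print(node.reboot(healthy=False))    # 1.1: health check failed, rolled back
```

Because the running slot is never written to, a failed download or a bad image costs nothing more than a retry.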
- If /usr is read-only, how do you add anything?
- systemd-sysext provides an overlay mechanism.
Think of it like a container layer — but for the OS.
- Each sysext is a self-contained, immutable image (usually squashfs).
It overlays its contents onto /usr at boot.
- Three tiers of sysexts:
- opt out - shipped in image
- opt in - shipped from our repos
- community - bakery - tailscale, kube
- On THIS cluster, kubelet came from a community sysext.
- sysext images have the same supply chain guarantees
- Versioned, verifiable, immutable.
- Nothing is "installed." Everything is composed.
- This is the config
- Look for Kubernetes v1.33 patch releases at this URL
- Match on z streams in this case
- Downloads the new sysext image next to the old one
- Flips the symlink, like flipping partitions
- Unlike with the OS, no reboot required
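The two steps described above, picking the newest z-stream release and flipping a symlink, can be sketched as follows. This illustrates the pattern only and is not the actual update tooling: the file naming scheme (`kubernetes-v1.33.Z.raw`) and both helper functions are assumptions made for the example.

```python
import os
import re
import tempfile


def newest_patch(names, minor="v1.33"):
    """Pick the highest patch release within one minor version."""
    pattern = re.compile(rf"^kubernetes-{re.escape(minor)}\.(\d+)\.raw$")
    matches = []
    for name in names:
        m = pattern.match(name)
        if m:
            matches.append((int(m.group(1)), name))  # compare numerically
    return max(matches)[1] if matches else None


def flip_symlink(link_path, target):
    """Point link_path at target by renaming a temporary link over it."""
    tmp = link_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link_path)  # rename is atomic on POSIX filesystems


available = ["kubernetes-v1.33.2.raw", "kubernetes-v1.33.10.raw",
             "kubernetes-v1.34.0.raw"]

with tempfile.TemporaryDirectory() as d:
    link = os.path.join(d, "kubernetes.raw")
    flip_symlink(link, "kubernetes-v1.33.2.raw")      # current version
    flip_symlink(link, newest_patch(available))       # update within v1.33
    print(os.readlink(link))  # kubernetes-v1.33.10.raw
```

The old image stays on disk next to the new one, so going back is the same operation with a different target.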
- This isn't a research project
- These tools are current, maintained, and in production.
- Cluster API standard
- What I showed you by hand today — done by Kube tooling
- FLUO and Kured watch for update_engine's "reboot needed" signal,
- drain the node, reboot it, and uncordon it — automatically.
- Follow your node policy
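The drain, reboot, uncordon loop can be sketched as a plain function with the cluster operations passed in as callables. This is not how FLUO or Kured are implemented; `rolling_reboot` and the fake operations below are hypothetical stand-ins, and the "one node at a time" policy is an assumption chosen to keep the example small.

```python
def rolling_reboot(nodes, needs_reboot, drain, reboot, uncordon):
    """Handle one node at a time so the rest of the fleet keeps serving."""
    done = []
    for node in nodes:
        if not needs_reboot(node):
            continue  # no staged update on this node
        drain(node)     # move workloads elsewhere first
        reboot(node)    # boot into the updated partition
        uncordon(node)  # allow scheduling again
        done.append(node)
    return done


# Fake cluster operations that just record what happened.
log = []
pending = {"worker-1", "worker-3"}  # nodes signalling "reboot needed"

updated = rolling_reboot(
    ["worker-1", "worker-2", "worker-3"],
    needs_reboot=lambda n: n in pending,
    drain=lambda n: log.append(("drain", n)),
    reboot=lambda n: log.append(("reboot", n)),
    uncordon=lambda n: log.append(("uncordon", n)),
)

print(updated)  # ['worker-1', 'worker-3']
for step in log:
    print(step)
```

A real operator would also handle failed reboots, maintenance windows, and disruption budgets, which is where your node policy comes in.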
- We are working on making /etc/ the same
- confext: today /usr is immutable but /etc is still writable.
- systemd-confext will bring the same overlay model to /etc
- Your implementation will vary based on your needs
- Thank you
- Flatcar: flatcar.org — docs, community, everything
- Chat: Matrix channel, CNCF Slack (#flatcar)
- Office Hours: every 2nd Wednesday, 14:30 UTC — Europe-friendly
- Slides online at the QR code and my website
- Happy to take questions now or find me in the hallway
Build Notes:
Build: marp --pptx --html slides.md -o slides.pptx --allow-local-files
QR Code: qrencode -o qr-bexelbie.png -s 10 -m 2 "https://www.bexelbie.com"