
Every homelabber’s worst nightmare is the day their network goes dark because of a single point of failure. For me, that day arrived when my UDM Pro’s SFP+ port failed silently — taking down Plex, HDHomeRun, Home Assistant, and every other service dependent on DHCP. The device had become the bottleneck for my entire local network.
That failure sparked two parallel projects: replacing the built-in DHCP server with a high-availability ISC Kea pair running in LXC containers across two unRAID servers, and later building a web-based subnet editor to replace the Mac terminal scripts I’d been using for day-to-day management. Together, these projects transformed my network from fragile to resilient — and made managing it actually enjoyable.
This post walks through both projects: why they were built, how they work, key decisions along the way, and lessons learned that might save you time if you’re tackling something similar.
The UDM Pro was serving DHCP across five VLANs (Core, Network, IoT, Backend, and Home Assistant), but its single SFP+ port failure proved that it wasn’t truly redundant. When that port died, DHCP stopped working entirely — devices couldn’t get addresses, and the network effectively went dark.
The goal was clear: build a DHCP solution that survives the loss of either server independently, while providing better visibility and management capabilities than the UDM Pro’s built-in interface.
The solution uses two LXC containers running ISC Kea 2.6.5 in hot-standby mode across separate unRAID hosts:
     DockOfTheBay (unRAID)                    SpaceDock (unRAID)
    ┌─────────────────────┐                  ┌─────────────────────┐
    │ kea-primary         │◄──HA heartbeat──►│ kea-secondary       │
    │ 10.10.10.10         │      :8080       │ 10.10.10.11         │
    │ (active/primary)    │                  │ (standby)           │
    └─────────────────────┘                  └─────────────────────┘
               │                                        │
               ▼                                        ▼
        kea-ctrl-agent                           kea-ctrl-agent
             :8000                                    :8000
               │                                        │
               ▼                                        ▼
        isc-stork-agent                          isc-stork-agent
             :8082                                    :8082
Both nodes serve the same five VLANs:
| VLAN | Name | Subnet | Pool Range |
|---|---|---|---|
| 40 | Core | 10.10.0.0/16 | 10.10.100.0–255.254 |
| 1 | Network | 172.16.1.0/24 | 172.16.1.100–254 |
| 30 | IoT | 10.30.30.0/24 | 10.30.30.100–254 |
| 42 | Backend | 10.42.42.0/24 | 10.42.42.100–254 |
| 55 | HASS | 10.55.55.0/24 | 10.55.55.100–199 |
Running Kea in LXC rather than Docker was a critical architectural decision. Kea binds directly to network interfaces, one per VLAN. LXC containers get real network interfaces via the ich777 plugin, giving them direct access to the host’s networking stack. Docker would require either --net=host or macvlan, both of which add complexity and fragility for this use case.
Kea supports two HA modes: load-balancing (splitting address pools between nodes) and hot-standby (full pool on primary, secondary takes over only if the primary fails). For a homelab with infrequent new-device events, hot-standby proved simpler and avoided pool-split complexity. The tradeoff is that the standby node’s resources sit idle until needed — but for this scale, that’s an acceptable cost for simplicity.
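To make the hot-standby choice concrete, here is a minimal sketch of what the HA hook entry might look like for each node, built in Python and dumped as JSON. The peer names and addresses come from the diagram above. The library path, the timer values, and the `ha_hook` helper itself are assumptions for illustration, not a copy of the running config. Kea's HA hook also expects the lease_cmds hook to be loaded alongside it, which is omitted here.

```python
import json

# Assumption: the hook library path varies by distribution; adjust as needed.
HA_LIBRARY = "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so"

# Peer names and IPs follow the diagram; port 8080 is the HA listener.
PEERS = [
    {"name": "kea-primary", "url": "http://10.10.10.10:8080/",
     "role": "primary", "auto-failover": True},
    {"name": "kea-secondary", "url": "http://10.10.10.11:8080/",
     "role": "standby", "auto-failover": True},
]


def ha_hook(this_server_name: str) -> dict:
    """Build the HA hook entry for one node of a hot-standby pair."""
    if this_server_name not in {peer["name"] for peer in PEERS}:
        raise ValueError(f"unknown peer: {this_server_name}")
    return {
        "library": HA_LIBRARY,
        "parameters": {
            "high-availability": [{
                # The only field that differs between the two nodes.
                "this-server-name": this_server_name,
                "mode": "hot-standby",
                "heartbeat-delay": 10000,     # milliseconds (assumed value)
                "max-response-delay": 60000,  # milliseconds (assumed value)
                "peers": [dict(peer) for peer in PEERS],
            }]
        },
    }


if __name__ == "__main__":
    print(json.dumps(ha_hook("kea-primary"), indent=2))
```

In this sketch the peer list is identical on both nodes and only `this-server-name` changes, which is why hot-standby stays easy to reason about at homelab scale.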
ISC Stork 2.4.0 provides a monitoring dashboard for both Kea nodes. The architecture runs:
– Stork agent natively inside each LXC container (not in Docker) because the Docker agent can’t discover Kea running in a different process namespace via Unix socket
– Stork server as a Docker Compose stack on SpaceDock with PostgreSQL backend
The Stork UI at `http://10.10.20.79:8080` shows both nodes’ health, lease counts, and HA state in real-time.
Before the web editor, all DHCP management happened through a suite of shell scripts on my Mac:
| Script | Purpose |
|---|---|
| `kea-leases.sh` | List active leases with VLAN selection menu |
| `kea-reservations.sh` | List all fixed reservations across VLANs |
| `kea-add-reservation.sh` | Add reservation, sync both nodes, reload |
| `kea-remove-reservation.sh` | Remove reservation, sync both nodes, reload |
| `kea-set-hostname.sh` | Update hostname on active lease or reservation |
| `kea-subnet-info.sh` | Show pool, gateway, DNS per VLAN |
| `kea-reload.sh` | Reload config on both nodes without restart |
The sync workflow was: Mac pulls config from primary → applies sed swap for secondary name → pushes to secondary. This avoided cross-LXC SSH host key issues that plagued early attempts at direct primary-to-secondary sync.
Port conflicts are silent killers. Kea’s HA hook binds port 8080, and Stork agent defaults to the same port. They collide silently — no error, just broken communication. Fix: set STORK_AGENT_PORT=8082 in /etc/stork/agent.env.
Certificate permissions matter. After running stork-agent register, always fix ownership:
    chown -R stork-agent:stork-agent /var/lib/stork-agent/certs/ \
        /var/lib/stork-agent/tokens/

The registration runs as root, but the service runs as the stork-agent user.
Kea’s API returns arrays. The REST API always wraps responses in a JSON array [{}]. Access arguments with data[0]['arguments'], not data['arguments'].
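A small helper makes the array wrapping hard to forget. This is a sketch rather than the code used in the scripts: the `unwrap` name and the sample payload are made up for illustration, while the `result`, `text`, and `arguments` keys are the ones the Control Agent returns (a `result` of 0 means success).

```python
import json


def unwrap(raw: str) -> dict:
    """Return the 'arguments' object from a Control Agent reply."""
    data = json.loads(raw)
    # The reply is a JSON array with one entry per queried service.
    if not isinstance(data, list) or not data:
        raise ValueError("expected a non-empty JSON array")
    first = data[0]
    if first.get("result", 1) != 0:
        raise RuntimeError(first.get("text", "Kea returned an error"))
    return first.get("arguments", {})


# Hypothetical reply, shaped like a config-get response.
sample = '[{"result": 0, "arguments": {"Dhcp4": {"valid-lifetime": 86400}}}]'
print(unwrap(sample)["Dhcp4"]["valid-lifetime"])  # prints 86400
```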
LXC systemctl is unreliable via lxc-attach. Use pgrep -x kea-dhcp4 instead of systemctl is-active when checking service status through LXC containers.
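The same check can be wrapped in a few lines of Python. This sketch assumes it runs on the unRAID host with permission to call `lxc-attach`, and that the container is named as in the diagram. The `kea_running` helper and its injectable `run` parameter are illustrative, not part of the original scripts.

```python
import subprocess


def kea_running(container: str, run=subprocess.run) -> bool:
    """True if a kea-dhcp4 process exists inside the named LXC container."""
    cmd = ["lxc-attach", "-n", container, "--", "pgrep", "-x", "kea-dhcp4"]
    result = run(cmd, capture_output=True, text=True)
    # pgrep exits 0 when at least one process matched, non-zero otherwise.
    return result.returncode == 0
```

Calling `kea_running("kea-primary")` then gives a plain boolean. Passing a stub as `run` keeps the function testable without a real container.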
The Mac operation scripts worked, but they had limitations:
– Required SSH access and terminal familiarity
– No visual feedback on current values before editing
– Manual sync between nodes prone to human error
– Harder for team members or family to use
The solution was a Flask-based web application (kea-web) that provides an in-browser editor for Kea subnet settings. Previously, changing a VLAN’s gateway, DNS servers, NTP, IP pool, or lease time required running shell scripts from the Mac terminal. Now it’s a form in the Subnets tab of the dashboard.
Browser → kea-web (SpaceDock:8085) → SSH/SFTP → kea-primary & kea-secondary
→ HTTP Control Agent → Config reload
The application is built as a Docker container deployed on SpaceDock, with the following flow for subnet edits:
– The browser sends POST /subnets/edit with subnet_id and the changed fields
– For each node, the app reads kea-dhcp4.conf as JSON, patches the target subnet’s pools/valid-lifetime/option-data entries in place, and writes it back
– Each node is patched independently, so this-server-name is never corrupted

Independent node patching vs scp + sed. The earlier shell scripts synced by pulling from primary, running sed 's/kea-primary/kea-secondary/', and pushing to secondary. The web app takes a safer approach: each node’s config is read and patched locally. This preserves any node-specific fields beyond this-server-name and avoids the assumption that the only difference between configs is that one string.
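The independent-patching idea can be sketched as below. This is not the kea-web source: `patch_subnet`, `node_config`, the subnet id 30, and the starting lease time are hypothetical, and the SSH/SFTP read and write steps are left out so the example runs on its own.

```python
import copy


def patch_subnet(config: dict, subnet_id: int, changes: dict) -> dict:
    """Return a copy of a kea-dhcp4.conf document with one subnet patched."""
    patched = copy.deepcopy(config)
    for subnet in patched["Dhcp4"]["subnet4"]:
        if subnet["id"] == subnet_id:
            if "pool" in changes:
                subnet["pools"] = [{"pool": changes["pool"]}]
            if "valid-lifetime" in changes:
                subnet["valid-lifetime"] = int(changes["valid-lifetime"])
            return patched
    raise KeyError(f"subnet id {subnet_id} not found")


def node_config(name: str) -> dict:
    """Hypothetical minimal config; real files carry many more fields."""
    return {"Dhcp4": {
        "hooks-libraries": [{"parameters": {
            "high-availability": [{"this-server-name": name}]}}],
        "subnet4": [{"id": 30, "subnet": "10.30.30.0/24",
                     "pools": [{"pool": "10.30.30.100 - 10.30.30.254"}],
                     "valid-lifetime": 3600}],
    }}


# Each node's own document is patched; nothing is copied between nodes.
changes = {"valid-lifetime": 7200}
configs = {name: patch_subnet(node_config(name), 30, changes)
           for name in ("kea-primary", "kea-secondary")}
for name, cfg in configs.items():
    ha = cfg["Dhcp4"]["hooks-libraries"][0]["parameters"]["high-availability"][0]
    print(name, ha["this-server-name"], cfg["Dhcp4"]["subnet4"][0]["valid-lifetime"])
```

Because only the matched subnet entry is touched, every other key in each node's file, including `this-server-name`, passes through unchanged.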
Empty field behavior. The modal pre-populates all fields with live running values from the Kea Control Agent. Submitting with any field unchanged preserves the current value. An empty field removes the option-data entry at the subnet level (the global default then takes over). This matches the mental model of the shell scripts — you only change what you intend to change.
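One way to express that rule in code is a helper that treats a blank submission as "remove this option at the subnet level" and anything else as "set it". The `apply_option` function below is a hypothetical sketch, not the app's actual handler. The option names used in the example (`routers`, `ntp-servers`) are standard Kea DHCPv4 option names, and the addresses are invented.

```python
def apply_option(option_data: list, name: str, value: str) -> list:
    """Return a new option-data list with `name` set to `value`,
    or with that entry dropped when `value` is blank."""
    value = value.strip()
    result, replaced = [], False
    for opt in option_data:
        if opt.get("name") != name:
            result.append(opt)          # untouched options pass through
        elif value:
            result.append({**opt, "data": value})
            replaced = True
        # blank value: skip the entry so the global default applies
    if value and not replaced:
        result.append({"name": name, "data": value})
    return result


options = [
    {"name": "routers", "data": "10.30.30.1"},
    {"name": "ntp-servers", "data": "10.30.30.1"},
]
print(apply_option(options, "ntp-servers", ""))
print(apply_option(options, "routers", "10.30.30.254"))
```

Returning a new list instead of mutating in place keeps the live values available for comparison until the write succeeds.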
Single pool range support. The form supports one pool range per VLAN (e.g., 10.10.100.0 - 10.10.199.255). All production VLANs have exactly one pool, so multi-pool support was deferred to keep the UI simple.
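With a single range per VLAN, validating the form input is straightforward. The sketch below uses Python's standard `ipaddress` module to check that a "start - end" string sits inside its subnet and is in order. The `parse_pool` helper is an illustration under those assumptions, not necessarily how kea-web validates input.

```python
import ipaddress


def parse_pool(pool: str, subnet: str):
    """Parse 'start - end' and check both ends are inside the subnet, in order."""
    start_text, sep, end_text = pool.partition("-")
    if not sep:
        raise ValueError("pool must look like 'start - end'")
    start = ipaddress.ip_address(start_text.strip())
    end = ipaddress.ip_address(end_text.strip())
    net = ipaddress.ip_network(subnet)
    if start not in net or end not in net:
        raise ValueError(f"{pool} is not inside {subnet}")
    if start > end:
        raise ValueError("pool start is above pool end")
    return start, end


start, end = parse_pool("10.10.100.0 - 10.10.199.255", "10.10.0.0/16")
print(int(end) - int(start) + 1)  # prints 25600
```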
The application is managed through a custom Komodo stack:
– Source code in Gitea repository (homelab-containers)
– Build job creates Docker image gitea.cossaboon.net/kcossabo/kea-web:latest
– ResourceSync pushes config to SpaceDock
– Deploy runs on SpaceDock at `http://10.10.20.84:8085`
Watch for invisible bugs in TOML. A clone_path = " " (two spaces, not empty) in resources.toml silently broke deployments once ResourceSync overwrote the valid path with the invalid one. Whitespace-only values are invisible in many editors, so check for them whenever a deploy fails at Stage 1.
Jinja2 template limitations. Jinja2 has no statement form for in-place mutation, so calling dict.update() inside a template is awkward unless the do extension is enabled. Register custom filters on the Flask app for list-to-dict conversions instead of trying inline mutations.
Building high-availability DHCP and a web management interface transformed my homelab from fragile to resilient, and more importantly, made network management actually enjoyable rather than a chore.
Hot-standby beats load-balancing for homelabs. Simpler to configure, simpler to debug, and your standby resources are available when you actually need them.
Web interfaces beat terminal scripts for shared environments. Even if you’re the only operator, a browser-based editor with pre-populated values and visual feedback reduces errors and makes changes more intuitive.
Monitor everything. ISC Stork’s real-time visibility into both Kea nodes’ health and lease counts was invaluable during troubleshooting and gave confidence that the HA setup actually worked as intended.
The result is a DHCP infrastructure that survives server failures, provides clear monitoring, and can be managed from any browser — not just my Mac terminal. If you’re running a homelab with multiple VLANs and devices depending on DHCP, these patterns are worth considering for your own setup.