Kubernetes homelab server with Ubuntu Server 24.04 (octavo)
Need. More. Server. Need. More. POWER!!!
Just a bit more, maybe quite a bit more, to run services that are more CPU-intensive than those already running in the current Single-node Kubernetes cluster on an Intel NUC: lexicon.
Hardware
Ubuntu's list of Recommended and Certified Hardware is just as short and ancient as it was 2 years ago, but NUC systems now have a long history of being well supported under Linux, so this time I'm taking my chances:
- ASUS NUC 13 Pro Tall PC Kit RNUC13ANHI700000I w/ Intel Core i7-1360P ($570)
- Kingston FURY Impact 1x 32GB, 3200 MHz, DDR4-RAM, SODIMM ($110)
- Kingston FURY Renegade 4000 GB, M.2 2280 ($250)
Bootable USB stick
Get Ubuntu Server (24.04.2 LTS) and create a bootable USB stick on Ubuntu.
Install Ubuntu Server 24.04
Installing Ubuntu Server 24.04 went smoothly and without any problems: the NUC booted from the USB stick, and secure boot, enabled by default, caused no issues. Screen and USB keyboard worked seamlessly through a DisplayPort+USB+Ethernet Cable Matters Hub on a Thunderbolt port in the NUC.
Once the installer boots, the installation steps are:
- Choose language and keyboard layout.
- Choose Ubuntu Server (the default, not the minimized variant).
- Checked the option to Search for third-party drivers.
- Networking: DHCP on wired network.
  - IP address .155 is assigned to the enx5c857e3e1129 interface, the RTL8153 Gigabit Ethernet NIC in the Cable Matters Hub.
  - The enp86s0 interface is the NUC's integrated 2.5Gbps NIC (Intel I226-V), which had no network cable attached during the installation.
- Pick a local Ubuntu mirror to install packages from.
- Set up a Custom storage layout as follows:
  - Select the disk (Kingston FURY Renegade 4000 GB) to Use As Boot Device. This automatically creates a 1GB partition for /boot/efi (formatted as fat32).
  - Create a 60G partition to mount as / (formatted as btrfs).
  - Create a 60G partition to reserve for a future OS.
  - Create a 60G partition to mount as /var/lib (formatted as btrfs).
  - Create a partition using all remaining space (3.46T) to mount as /home (formatted as btrfs).
- Confirm partitions & changes.
- Set up a Profile: username (ponder), hostname (octavo) and password.
- Skip Upgrade to Ubuntu Pro (to be done later).
- Install OpenSSH server and allow password authentication (for now).
- A selection of snap packages is available at this point; none were selected.
- Confirm all previous choices and start to install software.
- Once the installation is complete, remove the USB stick and hit Enter to reboot.
Disable swap
With 32 GB on a single DIMM and the possibility to double that in the future, there is
no need for swap, and it can be problematic for Kubernetes later, so it should be
disabled: remove the relevant line in /etc/fstab, reboot, and delete the swap file
(typically /swap.img or /swapfile).
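A minimal sketch of those steps, assuming the installer created the default /swap.img (adjust the path if it is /swapfile instead):

$ sudo swapoff -a
$ sudo sed -i '/\sswap\s/d' /etc/fstab
$ sudo rm /swap.img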
Tweak OpenSSH server
Set the root password by first escalating with sudo su - and then using the
passwd command. This password will hardly ever be used, but should be set so
that it can be used in case of emergency.
Copy SSH public keys into the .ssh/authorized_keys of both ponder and root,
then disable password authentication in the OpenSSH server:
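One way to do this, assuming the stock Ubuntu layout where sshd reads drop-in files from /etc/ssh/sshd_config.d/ and uses the first value it finds for each keyword (the file name below is arbitrary, chosen to sort before the installer's 50-cloud-init.conf which may still allow passwords):

$ echo 'PasswordAuthentication no' | sudo tee /etc/ssh/sshd_config.d/00-no-passwords.conf
$ sudo systemctl restart ssh
$ sudo sshd -T | grep -i passwordauthentication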
Test SSH connections as both ponder and root from all the relevant hosts in the LAN.
Setup Fail2Ban
On top of disabling password authentication, the SSH server will be less busy if those pesky bad actors are blocked from reaching its port. To do this, install fail2ban:
apt install fail2ban -y
# apt install fail2ban -y
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
python3-pyasyncore python3-pyinotify whois
Suggested packages:
mailx monit sqlite3 python-pyinotify-doc
The following NEW packages will be installed:
fail2ban python3-pyasyncore python3-pyinotify whois
0 upgraded, 4 newly installed, 0 to remove and 1 not upgraded.
Need to get 496 kB of archives.
After this operation, 2,572 kB of additional disk space will be used.
Get:1 http://ch.archive.ubuntu.com/ubuntu noble/main amd64 python3-pyasyncore all 1.0.2-2 [10.1 kB]
Get:2 http://ch.archive.ubuntu.com/ubuntu noble-updates/universe amd64 fail2ban all 1.0.2-3ubuntu0.1 [409 kB]
Get:3 http://ch.archive.ubuntu.com/ubuntu noble/main amd64 python3-pyinotify all 0.9.6-2ubuntu1 [25.0 kB]
Get:4 http://ch.archive.ubuntu.com/ubuntu noble/main amd64 whois amd64 5.5.22 [51.7 kB]
Fetched 496 kB in 0s (2,059 kB/s)
Selecting previously unselected package python3-pyasyncore.
(Reading database ... 86910 files and directories currently installed.)
Preparing to unpack .../python3-pyasyncore_1.0.2-2_all.deb ...
Unpacking python3-pyasyncore (1.0.2-2) ...
Selecting previously unselected package fail2ban.
Preparing to unpack .../fail2ban_1.0.2-3ubuntu0.1_all.deb ...
Unpacking fail2ban (1.0.2-3ubuntu0.1) ...
Selecting previously unselected package python3-pyinotify.
Preparing to unpack .../python3-pyinotify_0.9.6-2ubuntu1_all.deb ...
Unpacking python3-pyinotify (0.9.6-2ubuntu1) ...
Selecting previously unselected package whois.
Preparing to unpack .../whois_5.5.22_amd64.deb ...
Unpacking whois (5.5.22) ...
Setting up whois (5.5.22) ...
Setting up python3-pyasyncore (1.0.2-2) ...
Setting up fail2ban (1.0.2-3ubuntu0.1) ...
/usr/lib/python3/dist-packages/fail2ban/tests/fail2banregextestcase.py:224: SyntaxWarning: invalid escape sequence '\s'
"1490349000 test failed.dns.ch", "^\s*test <F-ID>\S+</F-ID>"
/usr/lib/python3/dist-packages/fail2ban/tests/fail2banregextestcase.py:435: SyntaxWarning: invalid escape sequence '\S'
'^'+prefix+'<F-ID>User <F-USER>\S+</F-USER></F-ID> not allowed\n'
/usr/lib/python3/dist-packages/fail2ban/tests/fail2banregextestcase.py:443: SyntaxWarning: invalid escape sequence '\S'
'^'+prefix+'User <F-USER>\S+</F-USER> not allowed\n'
/usr/lib/python3/dist-packages/fail2ban/tests/fail2banregextestcase.py:444: SyntaxWarning: invalid escape sequence '\d'
'^'+prefix+'Received disconnect from <F-ID><ADDR> port \d+</F-ID>'
/usr/lib/python3/dist-packages/fail2ban/tests/fail2banregextestcase.py:451: SyntaxWarning: invalid escape sequence '\s'
_test_variants('common', prefix="\s*\S+ sshd\[<F-MLFID>\d+</F-MLFID>\]:\s+")
/usr/lib/python3/dist-packages/fail2ban/tests/fail2banregextestcase.py:537: SyntaxWarning: invalid escape sequence '\['
'common[prefregex="^svc\[<F-MLFID>\d+</F-MLFID>\] connect <F-CONTENT>.+</F-CONTENT>$"'
/usr/lib/python3/dist-packages/fail2ban/tests/servertestcase.py:1375: SyntaxWarning: invalid escape sequence '\s'
"`{ nft -a list chain inet f2b-table f2b-chain | grep -oP '@addr-set-j-w-nft-mp\s+.*\s+\Khandle\s+(\d+)$'; } | while read -r hdl; do`",
/usr/lib/python3/dist-packages/fail2ban/tests/servertestcase.py:1378: SyntaxWarning: invalid escape sequence '\s'
"`{ nft -a list chain inet f2b-table f2b-chain | grep -oP '@addr6-set-j-w-nft-mp\s+.*\s+\Khandle\s+(\d+)$'; } | while read -r hdl; do`",
/usr/lib/python3/dist-packages/fail2ban/tests/servertestcase.py:1421: SyntaxWarning: invalid escape sequence '\s'
"`{ nft -a list chain inet f2b-table f2b-chain | grep -oP '@addr-set-j-w-nft-ap\s+.*\s+\Khandle\s+(\d+)$'; } | while read -r hdl; do`",
/usr/lib/python3/dist-packages/fail2ban/tests/servertestcase.py:1424: SyntaxWarning: invalid escape sequence '\s'
"`{ nft -a list chain inet f2b-table f2b-chain | grep -oP '@addr6-set-j-w-nft-ap\s+.*\s+\Khandle\s+(\d+)$'; } | while read -r hdl; do`",
Created symlink /etc/systemd/system/multi-user.target.wants/fail2ban.service → /usr/lib/systemd/system/fail2ban.service.
Setting up python3-pyinotify (0.9.6-2ubuntu1) ...
Processing triggers for man-db (2.12.0-4build2) ...
Scanning processes...
Scanning processor microcode...
Scanning linux images...
Running kernel seems to be up-to-date.
The processor microcode seems to be up-to-date.
No services need to be restarted.
No containers need to be restarted.
No user sessions are running outdated binaries.
No VM guests are running outdated hypervisor (qemu) binaries on this host.
Enable the service so it starts after each system restart:
# systemctl enable --now fail2ban
Synchronizing state of fail2ban.service with SysV service script with /usr/lib/systemd/systemd-sysv-install.
The default configuration should be enough because this system will probably not expose its port 22 to the Internet, but will instead be accessed remotely via Tailscale. If port 22 is later exposed to the Internet, see lexicon's Fail2ban setup.
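To confirm the default sshd jail is active and see any currently banned addresses, query fail2ban-client:

# fail2ban-client status sshd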
Tweak network config
This system will use static IP addresses ending in .8 so they can be added to the /etc/hosts
file of the relevant hosts in the LAN; the file can then be copied into the new server,
preserving the line that points 127.0.0.1 to itself.
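For example, the lines added to /etc/hosts on the other hosts would look something like this, one per network from which they reach octavo:

10.0.0.8      octavo
192.168.0.8   octavo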
While using the Cable Matters hub, the LAN IP address is set up on the enx5c857e3e1129
interface:
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: enp86s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 48:21:0b:6d:3e:9b brd ff:ff:ff:ff:ff:ff
3: enx5c857e3e1129: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 5c:85:7e:3e:11:29 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.155/24 metric 100 brd 192.168.0.255 scope global dynamic enx5c857e3e1129
valid_lft 85920sec preferred_lft 85920sec
inet6 fe80::5e85:7eff:fe3e:1129/64 scope link
valid_lft forever preferred_lft forever
4: wlo1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether d0:65:78:a5:8b:dd brd ff:ff:ff:ff:ff:ff
altname wlp0s20f3
dmesg lines for Intel I226-V 2.5Gbps NIC (igc) and USB NIC (r8152):
[ 1.736823] Intel(R) 2.5G Ethernet Linux Driver
[ 1.738154] Copyright(c) 2018 Intel Corporation.
[ 1.741460] igc 0000:56:00.0: enabling device (0000 -> 0002)
[ 1.744312] igc 0000:56:00.0: PTM enabled, 4ns granularity
[ 1.756421] ahci 0000:00:17.0: version 3.0
[ 1.756759] xhci_hcd 0000:00:0d.0: xHCI Host Controller
--
[ 1.821412] intel-lpss 0000:00:15.1: enabling device (0004 -> 0006)
[ 1.823953] idma64 idma64.1: Found Intel integrated DMA 64-bit
[ 1.842492] igc 0000:56:00.0: 4.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x1 link)
[ 1.843094] igc 0000:56:00.0 eth0: MAC: 48:21:0b:6d:3e:9b
[ 2.056525] typec port0: bound usb3-port6 (ops connector_ops)
[ 2.058975] typec port0: bound usb2-port1 (ops connector_ops)
[ 2.061054] usb 3-1: new full-speed USB device number 2 using xhci_hcd
[ 2.086270] ata2: SATA link down (SStatus 4 SControl 300)
[ 2.092729] igc 0000:56:00.0 enp86s0: renamed from eth0
[ 2.100868] ucsi_acpi USBC000:00: UCSI_GET_PDOS failed (-95)
[ 2.114965] nvme 0000:01:00.0: platform quirk: setting simple suspend
[ 2.116031] nvme nvme0: pci function 0000:01:00.0
[ 2.125670] nvme nvme0: Shutdown timeout set to 10 seconds
[ 2.129358] nvme nvme0: 16/0/0 default/read/poll queues
--
[ 4.164720] usbcore: registered new device driver r8152-cfgselector
[ 4.352207] r8152-cfgselector 3-6.2: reset high-speed USB device number 6 using xhci_hcd
[ 4.581005] r8152 3-6.2:1.0: load rtl8153a-4 v2 02/07/20 successfully
[ 4.621215] r8152 3-6.2:1.0 eth0: v1.12.13
[ 4.624728] usbcore: registered new interface driver r8152
[ 4.660226] usbcore: registered new interface driver cdc_ether
[ 4.669749] r8152 3-6.2:1.0 enx5c857e3e1129: renamed from eth0
[ 4.720488] usb 3-6.4.1: new high-speed USB device number 8 using xhci_hcd
[ 4.808357] raid6: avx2x4 gen() 29500 MB/s
[ 4.825355] raid6: avx2x2 gen() 36875 MB/s
[ 4.842356] raid6: avx2x1 gen() 35157 MB/s
[ 4.842362] raid6: using algorithm avx2x2 gen() 36875 MB/s
Servers are better set up with static IP addresses in a known range; for this system
the .8 addresses have been reserved and can be set up in the Netplan configuration:
network:
version: 2
ethernets:
enx5c857e3e1129:
dhcp4: no
dhcp6: no
# Set IP address & subnet mask
addresses: [ 10.0.0.7/24, 192.168.0.7/24 ]
# Set default gateway
routes:
- to: default
via: 192.168.0.1
# Set DNS name servers
nameservers:
addresses: [62.2.24.158, 62.2.17.61]
enp86s0:
dhcp4: no
dhcp6: no
# Set IP address & subnet mask
addresses: [ 10.0.0.8/24, 192.168.0.8/24 ]
# Set default gateway
# Set DNS name servers
nameservers:
addresses: [62.2.24.158, 62.2.17.61]
Note
Adding routes to both Ethernet interfaces will result in warnings, and is not
necessary so long as one can SSH in from another host in the LAN.
After applying the configuration with netplan apply, both Ethernet NICs will always have
static IP addresses:
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: enp86s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 48:21:0b:6d:3e:9b brd ff:ff:ff:ff:ff:ff
3: enx5c857e3e1129: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 5c:85:7e:3e:11:29 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.7/24 brd 10.0.0.255 scope global enx5c857e3e1129
valid_lft forever preferred_lft forever
inet 192.168.0.7/24 brd 192.168.0.255 scope global enx5c857e3e1129
valid_lft forever preferred_lft forever
inet6 fe80::5e85:7eff:fe3e:1129/64 scope link
valid_lft forever preferred_lft forever
4: wlo1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether d0:65:78:a5:8b:dd brd ff:ff:ff:ff:ff:ff
altname wlp0s20f3
The server can now be relocated to the 2.5Gbps network, where no screen is available.
Once relocated to the 2.5Gbps network, no longer using the Cable Matters hub, and finished
booting, SSH back into the server using the .8 address:
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: enp86s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 48:21:0b:6d:3e:9b brd ff:ff:ff:ff:ff:ff
inet 10.0.0.8/24 brd 10.0.0.255 scope global enp86s0
valid_lft forever preferred_lft forever
inet 192.168.0.8/24 brd 192.168.0.255 scope global enp86s0
valid_lft forever preferred_lft forever
inet6 fe80::4a21:bff:fe6d:3e9b/64 scope link
valid_lft forever preferred_lft forever
3: wlo1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether d0:65:78:a5:8b:dd brd ff:ff:ff:ff:ff:ff
altname wlp0s20f3
To re-enable the server's access to the Internet, move the routes to the enp86s0
interface and run netplan apply once again:
network:
version: 2
ethernets:
enx5c857e3e1129:
dhcp4: no
dhcp6: no
# Set IP address & subnet mask
addresses: [ 10.0.0.7/24, 192.168.0.7/24 ]
enp86s0:
dhcp4: no
dhcp6: no
# Set IP address & subnet mask
addresses: [ 10.0.0.8/24, 192.168.0.8/24 ]
# Set default gateway
routes:
- to: default
via: 192.168.0.1
# Set DNS name servers
nameservers:
addresses: [62.2.24.158, 62.2.17.61]
At this point the network configuration is finalized.
Set correct timezone
The installation process does not offer setting the system timezone and defaults to UTC:
# timedatectl
Local time: Sun 2025-04-13 20:47:28 UTC
Universal time: Sun 2025-04-13 20:47:28 UTC
RTC time: Sun 2025-04-13 20:47:28
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Since UTC is not the local timezone anywhere around here, it is better (more convenient) to set the local timezone, e.g.
# timedatectl set-timezone "Europe/Amsterdam"
# timedatectl
Local time: Sun 2025-04-13 22:48:45 CEST
Universal time: Sun 2025-04-13 20:48:45 UTC
RTC time: Sun 2025-04-13 20:48:45
Time zone: Europe/Amsterdam (CEST, +0200)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Update system packages
Updating system packages is recommended after installing from USB media:
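For example:

$ sudo apt update
$ sudo apt full-upgrade -y
$ sudo reboot   # only needed if a new kernel was installed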
Tweak Bash prompt
Tweak the Bash prompt for root to make the user name red, the host name blue and the path green, with this in .bashrc:
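A sketch of the kind of prompt meant here, using standard ANSI color escapes (the exact codes are a matter of taste):

PS1='\[\e[01;31m\]\u\[\e[00m\]@\[\e[01;34m\]\h\[\e[00m\]:\[\e[01;32m\]\w\[\e[00m\]\$ '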
Other users' Bash prompt is left as default, which renders all
green. The idea is that root's prompt is visually different,
to remind me that with great power comes great responsibility.
Upgrade to Ubuntu Pro
This step was skipped during the installation process so the server is not attached to an Ubuntu Pro account:
# pro security-status
684 packages installed:
684 packages from Ubuntu Main/Restricted repository
To get more information about the packages, run
pro security-status --help
for a list of available options.
This machine is receiving security patching for Ubuntu Main/Restricted
repository until 2029.
This machine is NOT attached to an Ubuntu Pro subscription.
Ubuntu Pro with 'esm-infra' enabled provides security updates for
Main/Restricted packages until 2034.
Try Ubuntu Pro with a free personal subscription on up to 5 machines.
Learn more at https://ubuntu.com/pro
Take the command and token from https://ubuntu.com/pro/dashboard and attach the server to an active Ubuntu Pro account:
# pro attach ______________________________
Enabling Ubuntu Pro: ESM Apps
Ubuntu Pro: ESM Apps enabled
Enabling Ubuntu Pro: ESM Infra
Ubuntu Pro: ESM Infra enabled
Enabling Livepatch
Livepatch enabled
This machine is now attached to 'Ubuntu Pro - free personal subscription'
SERVICE ENTITLED STATUS DESCRIPTION
anbox-cloud yes disabled Scalable Android in the cloud
esm-apps yes enabled Expanded Security Maintenance for Applications
esm-infra yes enabled Expanded Security Maintenance for Infrastructure
landscape yes disabled Management and administration tool for Ubuntu
livepatch yes enabled Canonical Livepatch service
realtime-kernel* yes disabled Ubuntu kernel with PREEMPT_RT patches integrated
usg yes disabled Security compliance and audit tools
* Service has variants
NOTICES
Operation in progress: pro attach
For a list of all Ubuntu Pro services and variants, run 'pro status --all'
Enable services with: pro enable <service>
Account: [email protected]
Subscription: Ubuntu Pro - free personal subscription
Weekly btrfs scrub
To keep BTRFS file systems healthy, it is recommended to run a weekly scrub to check
everything for consistency. For this, run the script below from crontab every Saturday
morning, early enough that it will be done by the time anyone wakes up (see the example
crontab entry after the script):
/usr/local/bin/btrfs-scrub-all
#! /bin/bash
# By Marc MERLIN <[email protected]> 2014/03/20
# License: Apache-2.0
# http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html
which btrfs >/dev/null || exit 0
export PATH=/usr/local/bin:/sbin:$PATH
FILTER='(^Dumping|balancing, usage)'
test -n "$DEVS" || DEVS=$(grep '\<btrfs\>' /proc/mounts | awk '{ print $1 }' | sort -u)
for btrfs in $DEVS
do
tail -n 0 -f /var/log/syslog | grep -i "BTRFS" | grep -Evi '(disk space caching is enabled|unlinked .* orphans|turning on discard|device label .* devid .* transid|enabling SSD mode|BTRFS: has skinny extents|BTRFS: device label)' &
mountpoint="$(grep "$btrfs" /proc/mounts | awk '{ print $2 }' | sort | head -1)"
logger -s "Quick Metadata and Data Balance of $mountpoint ($btrfs)" >&2
# Even in 4.3 kernels, you can still get in places where balance
# won't work (no place left, until you run a -m0 one first)
btrfs balance start -musage=0 -v $mountpoint 2>&1 | grep -Ev "$FILTER"
btrfs balance start -musage=20 -v $mountpoint 2>&1 | grep -Ev "$FILTER"
# After metadata, let's do data:
btrfs balance start -dusage=0 -v $mountpoint 2>&1 | grep -Ev "$FILTER"
btrfs balance start -dusage=20 -v $mountpoint 2>&1 | grep -Ev "$FILTER"
# And now we do scrub. Note that scrub can fail with "no space left
# on device" if you're very out of balance.
logger -s "Starting scrub of $mountpoint" >&2
echo btrfs scrub start -Bd $mountpoint
ionice -c 3 nice -10 btrfs scrub start -Bd $mountpoint
pkill -f 'tail -n 0 -f /var/log/syslog'
logger "Ended scrub of $mountpoint" >&2
done
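For reference, a root crontab entry along these lines (the exact time is arbitrary) runs the script at 05:30 every Saturday:

30 5 * * 6 /usr/local/bin/btrfs-scrub-all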
Stop apparmor spew in the logs
As seen in previous installs of Ubuntu 24.04 on Rapture and Raven, there is some
log spam from audit in the dmesg logs. To stop this, simply install auditd:
apt install auditd
# apt install auditd -y
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
libauparse0t64
Suggested packages:
audispd-plugins
The following NEW packages will be installed:
auditd libauparse0t64
0 upgraded, 2 newly installed, 0 to remove and 1 not upgraded.
Need to get 274 kB of archives.
After this operation, 893 kB of additional disk space will be used.
Get:1 http://ch.archive.ubuntu.com/ubuntu noble-updates/main amd64 libauparse0t64 amd64 1:3.1.2-2.1build1.1 [58.9 kB]
Get:2 http://ch.archive.ubuntu.com/ubuntu noble-updates/main amd64 auditd amd64 1:3.1.2-2.1build1.1 [215 kB]
Fetched 274 kB in 0s (1,148 kB/s)
Selecting previously unselected package libauparse0t64:amd64.
(Reading database ... 87398 files and directories currently installed.)
Preparing to unpack .../libauparse0t64_1%3a3.1.2-2.1build1.1_amd64.deb ...
Adding 'diversion of /lib/x86_64-linux-gnu/libauparse.so.0 to /lib/x86_64-linux-gnu/libauparse.so.0.usr-is-merged by libauparse0t64'
Adding 'diversion of /lib/x86_64-linux-gnu/libauparse.so.0.0.0 to /lib/x86_64-linux-gnu/libauparse.so.0.0.0.usr-is-merged by libauparse0t64'
Unpacking libauparse0t64:amd64 (1:3.1.2-2.1build1.1) ...
Selecting previously unselected package auditd.
Preparing to unpack .../auditd_1%3a3.1.2-2.1build1.1_amd64.deb ...
Unpacking auditd (1:3.1.2-2.1build1.1) ...
Setting up libauparse0t64:amd64 (1:3.1.2-2.1build1.1) ...
Setting up auditd (1:3.1.2-2.1build1.1) ...
Created symlink /etc/systemd/system/multi-user.target.wants/auditd.service → /usr/lib/systemd/system/auditd.service.
Processing triggers for man-db (2.12.0-4build2) ...
Processing triggers for libc-bin (2.39-0ubuntu8.4) ...
Scanning processes...
Scanning processor microcode...
Scanning linux images...
Running kernel seems to be up-to-date.
The processor microcode seems to be up-to-date.
No services need to be restarted.
No containers need to be restarted.
No user sessions are running outdated binaries.
No VM guests are running outdated hypervisor (qemu) binaries on this host.
NAS NFS mount
Add the NFS mount
in /etc/fstab to mount /home/nas, create the directory and mount it.
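A sketch of what that looks like, with a hypothetical NAS host name (nas) and export path (/volume1/nas); nfs-common provides the NFS client:

nas:/volume1/nas  /home/nas  nfs  defaults,nofail  0  0

$ sudo apt install -y nfs-common
$ sudo mkdir -p /home/nas
$ sudo mount /home/nas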
Continuous Monitoring
Install Continuous Monitoring and
report metrics to lexicon on its NodePort (30086).
Remote Access
Remote access options for self-hosted services may no longer require opening any ports in the router for this server, although that option remains available should it become necessary.
Cloudflare Tunnel
Cloudflare Tunnels in alfred
proved to be a good solution for making web sites externally available but still
protected behind SSO with Zero Trust Web Access.
Since all that was already set up, and there are no services running on this server yet,
all there is to do here and now is just install cloudflared and join the tunnel.
Install the latest cloudflared using the instructions provided for
Ubuntu 24.04:
# curl -fsSL https://pkg.cloudflare.com/cloudflare-main.gpg \
| tee /usr/share/keyrings/cloudflare-main.gpg >/dev/null
# echo 'deb [signed-by=/usr/share/keyrings/cloudflare-main.gpg] https://pkg.cloudflare.com/cloudflared noble main' \
| tee /etc/apt/sources.list.d/cloudflared.list
# install cloudflared
sudo apt-get update && sudo apt-get install cloudflared
Then run the command to connect to the new tunnel:
# cloudflared service install eyJhIjoiMD...
2025-04-18T21:45:27Z INF Using Systemd
2025-04-18T21:45:27Z INF Linux service for cloudflared installed successfully
Although a public hostname is not yet necessary for this tunnel, create one for https://kubernetes-octavo.very-very-dark-gray.top/ to be used later for the Kubernetes Dashboard, just pointing to https://localhost for now. Make sure to enable the TLS option No TLS Verify, so the tunnel can reach the origin without a valid HTTPS certificate; proper certificates do not seem to be necessary when using Cloudflare Tunnels.
Tailscale
Install Tailscale and connect the server to the already existing tailnet:
Installation of Tailscale on octavo
# curl -fsSL https://tailscale.com/install.sh | sh
Installing Tailscale for ubuntu noble, using method apt
+ mkdir -p --mode=0755 /usr/share/keyrings
+ curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/noble.noarmor.gpg
+ tee /usr/share/keyrings/tailscale-archive-keyring.gpg
+ chmod 0644 /usr/share/keyrings/tailscale-archive-keyring.gpg
+ curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/noble.tailscale-keyring.list
+ tee /etc/apt/sources.list.d/tailscale.list
# Tailscale packages for ubuntu noble
deb [signed-by=/usr/share/keyrings/tailscale-archive-keyring.gpg] https://pkgs.tailscale.com/stable/ubuntu noble main
+ chmod 0644 /etc/apt/sources.list.d/tailscale.list
+ apt-get update
Hit:1 http://ch.archive.ubuntu.com/ubuntu noble InRelease
Get:2 http://ch.archive.ubuntu.com/ubuntu noble-updates InRelease [126 kB]
Hit:3 https://pkg.cloudflare.com/cloudflared noble InRelease
Hit:4 http://ch.archive.ubuntu.com/ubuntu noble-backports InRelease
Hit:5 http://security.ubuntu.com/ubuntu noble-security InRelease
Get:6 https://pkgs.tailscale.com/stable/ubuntu noble InRelease
Get:7 https://pkgs.tailscale.com/stable/ubuntu noble/main all Packages [354 B]
Get:8 https://pkgs.tailscale.com/stable/ubuntu noble/main amd64 Packages [12.8 kB]
Get:9 https://esm.ubuntu.com/apps/ubuntu noble-apps-security InRelease [7,595 B]
Get:10 https://esm.ubuntu.com/apps/ubuntu noble-apps-updates InRelease [7,480 B]
Get:11 https://esm.ubuntu.com/infra/ubuntu noble-infra-security InRelease [7,474 B]
Get:12 https://esm.ubuntu.com/infra/ubuntu noble-infra-updates InRelease [7,473 B]
Fetched 176 kB in 3s (64.1 kB/s)
Reading package lists... Done
+ apt-get install -y tailscale tailscale-archive-keyring
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
tailscale tailscale-archive-keyring
0 upgraded, 2 newly installed, 0 to remove and 1 not upgraded.
Need to get 31.5 MB of archives.
After this operation, 59.1 MB of additional disk space will be used.
Get:2 https://pkgs.tailscale.com/stable/ubuntu noble/main all tailscale-archive-keyring all 1.35.181 [3,082 B]
Get:1 https://pkgs.tailscale.com/stable/ubuntu noble/main amd64 tailscale amd64 1.82.5 [31.5 MB]
Fetched 31.5 MB in 13s (2,462 kB/s)
Selecting previously unselected package tailscale.
(Reading database ... 125870 files and directories currently installed.)
Preparing to unpack .../tailscale_1.82.5_amd64.deb ...
Unpacking tailscale (1.82.5) ...
Selecting previously unselected package tailscale-archive-keyring.
Preparing to unpack .../tailscale-archive-keyring_1.35.181_all.deb ...
Unpacking tailscale-archive-keyring (1.35.181) ...
Setting up tailscale-archive-keyring (1.35.181) ...
Setting up tailscale (1.82.5) ...
Created symlink /etc/systemd/system/multi-user.target.wants/tailscaled.service → /usr/lib/systemd/system/tailscaled.service.
Scanning processes...
Scanning processor microcode...
Scanning linux images...
Running kernel seems to be up-to-date.
The processor microcode seems to be up-to-date.
No services need to be restarted.
No containers need to be restarted.
No user sessions are running outdated binaries.
No VM guests are running outdated hypervisor (qemu) binaries on this host.
+ [ false = true ]
+ set +x
Installation complete! Log in to start using Tailscale by running:
tailscale up
After installing the software, running tailscale up will provide a URL to
authenticate and add the server to the tailnet.
Once added to the tailnet, an SSH connection to octavo.royal-penny.ts.net instantly
connects to octavo and SSH key authentication just works (after accepting this new
hostname).
Additional setup will be needed later on, once services are running on Kubernetes:
- Install the Tailscale Kubernetes operator and add an Ingress to use it.
- Optional: enable Public access through Funnel for services that need to be accessible from outside the tailnet.
Kubernetes
Kubernetes on Raspberry Pi 5 (alfred)
showed quite a few new hurdles caused by newer versions of Kubernetes (v1.32.2) and
a few components; this most recent installation will be the main guide this time. On top
of that, Applications Installed
and each application-specific post linked from there, will provide most of the guidance
to install those applications to be migrated (here) to the new server. The old
Single-node Kubernetes cluster on an Intel NUC: lexicon
may not add much at this point.
GitHub Repository
Use the same GitHub Repository as before and create a new directory for the new server:
$ git clone [email protected]:xxxx/kubernetes-deployments.git
$ cd kubernetes-deployments/
$ mkdir octavo
Initial Git+SSH setup
Each new server requires an initial setup to authenticate with GitHub, using
a new SSH key,
adding to the authorized SSH keys in the GitHub account, and coping .gitconfig
from a previous server.
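A minimal sketch of that first step (the key comment is arbitrary):

$ ssh-keygen -t ed25519 -C "ponder@octavo"
$ cat ~/.ssh/id_ed25519.pub   # add this under GitHub → Settings → SSH and GPG keys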
Storage Requirements
Docker and containerd store images under /var/lib by default, which is why the
when installing Ubuntu Server a dedicated partition
is created for /var/lib, so that it will (should) not be necessary to move images
to the /home partition.
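A quick way to confirm the dedicated partition is in place:

$ df -hT /var/lib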
Install Helm
Install helm from APT because it will be required to install certain components later (e.g. Kubernetes Dashboard):
$ curl https://baltocdn.com/helm/signing.asc \
| sudo gpg --dearmor -o /etc/apt/keyrings/helm.gpg
$ echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" \
| sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
$ sudo apt-get update
$ sudo apt-get install -y helm
Install Kubernetes
Kubernetes' current stable release is now v1.33.0, which is the very first release in the 1.33 series and came out only a few days ago, so instead install the more mature v1.32. Install kubeadm, kubelet and kubectl from Debian packages:
$ curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key \
| sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
$ sudo chmod 644 /etc/apt/keyrings/kubernetes-apt-keyring.gpg
$ echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /' \
| sudo tee /etc/apt/sources.list.d/kubernetes.list
$ sudo apt-get update
Once the APT repository is ready, install the packages:
$ sudo apt-get install -y kubelet kubeadm kubectl
$ sudo apt-get install -y kubelet kubeadm kubectl
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
conntrack cri-tools kubernetes-cni
The following NEW packages will be installed:
conntrack cri-tools kubeadm kubectl kubelet kubernetes-cni
0 upgraded, 6 newly installed, 0 to remove and 1 not upgraded.
Need to get 92.7 MB of archives.
After this operation, 338 MB of additional disk space will be used.
Get:1 http://ch.archive.ubuntu.com/ubuntu noble/main amd64 conntrack amd64 1:1.4.8-1ubuntu1 [37.9 kB]
Get:2 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb cri-tools 1.32.0-1.1 [16.3 MB]
Get:3 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb kubeadm 1.32.4-1.1 [12.2 MB]
Get:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb kubectl 1.32.4-1.1 [11.2 MB]
Get:5 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb kubernetes-cni 1.6.0-1.1 [37.8 MB]
Get:6 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb kubelet 1.32.4-1.1 [15.2 MB]
Fetched 92.7 MB in 1s (66.3 MB/s)
Selecting previously unselected package conntrack.
(Reading database ... 140718 files and directories currently installed.)
Preparing to unpack .../0-conntrack_1%3a1.4.8-1ubuntu1_amd64.deb ...
Unpacking conntrack (1:1.4.8-1ubuntu1) ...
Selecting previously unselected package cri-tools.
Preparing to unpack .../1-cri-tools_1.32.0-1.1_amd64.deb ...
Unpacking cri-tools (1.32.0-1.1) ...
Selecting previously unselected package kubeadm.
Preparing to unpack .../2-kubeadm_1.32.4-1.1_amd64.deb ...
Unpacking kubeadm (1.32.4-1.1) ...
Selecting previously unselected package kubectl.
Preparing to unpack .../3-kubectl_1.32.4-1.1_amd64.deb ...
Unpacking kubectl (1.32.4-1.1) ...
Selecting previously unselected package kubernetes-cni.
Preparing to unpack .../4-kubernetes-cni_1.6.0-1.1_amd64.deb ...
Unpacking kubernetes-cni (1.6.0-1.1) ...
Selecting previously unselected package kubelet.
Preparing to unpack .../5-kubelet_1.32.4-1.1_amd64.deb ...
Unpacking kubelet (1.32.4-1.1) ...
Setting up conntrack (1:1.4.8-1ubuntu1) ...
Setting up kubectl (1.32.4-1.1) ...
Setting up cri-tools (1.32.0-1.1) ...
Setting up kubernetes-cni (1.6.0-1.1) ...
Setting up kubeadm (1.32.4-1.1) ...
Setting up kubelet (1.32.4-1.1) ...
Processing triggers for man-db (2.12.0-4build2) ...
Scanning processes...
Scanning processor microcode...
Scanning linux images...
Running kernel seems to be up-to-date.
The processor microcode seems to be up-to-date.
No services need to be restarted.
No containers need to be restarted.
No user sessions are running outdated binaries.
No VM guests are running outdated hypervisor (qemu) binaries on this host.
And then, because updating Kubernetes is a rather involved process, hold them:
$ sudo apt-mark hold kubelet kubeadm kubectl
kubelet set on hold.
kubeadm set on hold.
kubectl set on hold.
Note that the latest patch at this time is v1.32.4:
# kubectl version --output=yaml
clientVersion:
buildDate: "2025-04-22T16:03:58Z"
compiler: gc
gitCommit: 59526cd4867447956156ae3a602fcbac10a2c335
gitTreeState: clean
gitVersion: v1.32.4
goVersion: go1.23.6
major: "1"
minor: "32"
platform: linux/amd64
kustomizeVersion: v5.5.0
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Enabling shell autocompletion for kubectl is very easy, since bash-completion is already installed:
$ kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl > /dev/null
$ sudo chmod a+r /etc/bash_completion.d/kubectl
Enable the kubelet service
This step is only really necessary later, before
bootstrapping the cluster with kubeadm,
but it can be done any time; the service will just be waiting:
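The command itself is simply:

$ sudo systemctl enable --now kubelet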
Install container runtime
Networking setup
Enabling IPv4 packet forwarding is required for Kubernetes networking and it is not enabled by default:
$ sudo sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 0
$ cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF
$ sudo sysctl --system
* Applying /usr/lib/sysctl.d/10-apparmor.conf ...
* Applying /etc/sysctl.d/10-bufferbloat.conf ...
* Applying /etc/sysctl.d/10-console-messages.conf ...
* Applying /etc/sysctl.d/10-ipv6-privacy.conf ...
* Applying /etc/sysctl.d/10-kernel-hardening.conf ...
* Applying /etc/sysctl.d/10-magic-sysrq.conf ...
* Applying /etc/sysctl.d/10-map-count.conf ...
* Applying /etc/sysctl.d/10-network-security.conf ...
* Applying /etc/sysctl.d/10-ptrace.conf ...
* Applying /etc/sysctl.d/10-zeropage.conf ...
* Applying /usr/lib/sysctl.d/50-pid-max.conf ...
* Applying /usr/lib/sysctl.d/99-protect-links.conf ...
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.d/k8s.conf ...
* Applying /etc/sysctl.conf ...
kernel.apparmor_restrict_unprivileged_userns = 1
net.core.default_qdisc = fq_codel
kernel.printk = 4 4 1 7
net.ipv6.conf.all.use_tempaddr = 2
net.ipv6.conf.default.use_tempaddr = 2
kernel.kptr_restrict = 1
kernel.sysrq = 176
vm.max_map_count = 1048576
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.all.rp_filter = 2
kernel.yama.ptrace_scope = 1
vm.mmap_min_addr = 65536
kernel.pid_max = 4194304
fs.protected_fifos = 1
fs.protected_hardlinks = 1
fs.protected_regular = 2
fs.protected_symlinks = 1
net.ipv4.ip_forward = 1
This alone was not enough for the successful deployment of the
Network plugin in alfred
required after bootstrapping the cluster. Additional setup proved to be required to
let iptables see bridged traffic,
supposedly required only up to v1.29, when omitting these steps led to
the kube-flannel deployment failing to start up
(stuck in a crash loop):
$ sudo modprobe overlay
$ sudo modprobe br_netfilter
$ sudo tee /etc/modules-load.d/k8s.conf<<EOF
br_netfilter
overlay
EOF
$ sudo tee /etc/sysctl.d/k8s.conf<<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
Reboot the system and verify that the changes persist:
$ sudo sysctl -a | egrep 'net.ipv4.ip_forward |net.bridge.bridge-nf-call-ip'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
$ lsmod | egrep 'overlay|bridge'
overlay 212992 0
bridge 421888 1 br_netfilter
stp 12288 1 bridge
llc 16384 2 bridge,stp
Install containerd
Installing a container runtime
comes next, with containerd being the runtime of choice.
Install using the apt repository:
$ sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
-o /etc/apt/keyrings/docker.asc
$ sudo chmod a+r /etc/apt/keyrings/docker.asc
$ echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update
Once the APT repository is ready, install the packages:
$ sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
$ sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
docker-ce-rootless-extras libltdl7 libslirp0 pigz slirp4netns
Suggested packages:
cgroupfs-mount | cgroup-lite
The following NEW packages will be installed:
containerd.io docker-buildx-plugin docker-ce docker-ce-cli docker-ce-rootless-extras docker-compose-plugin
libltdl7 libslirp0 pigz slirp4netns
0 upgraded, 10 newly installed, 0 to remove and 1 not upgraded.
Need to get 120 MB of archives.
After this operation, 440 MB of additional disk space will be used.
Get:1 http://ch.archive.ubuntu.com/ubuntu noble/universe amd64 pigz amd64 2.8-1 [65.6 kB]
Get:2 https://download.docker.com/linux/ubuntu noble/stable amd64 containerd.io amd64 1.7.27-1 [30.5 MB]
Get:3 http://ch.archive.ubuntu.com/ubuntu noble/main amd64 libltdl7 amd64 2.4.7-7build1 [40.3 kB]
Get:4 http://ch.archive.ubuntu.com/ubuntu noble/main amd64 libslirp0 amd64 4.7.0-1ubuntu3 [63.8 kB]
Get:5 http://ch.archive.ubuntu.com/ubuntu noble/universe amd64 slirp4netns amd64 1.2.1-1build2 [34.9 kB]
Get:6 https://download.docker.com/linux/ubuntu noble/stable amd64 docker-buildx-plugin amd64 0.23.0-1~ubuntu.24.04~noble [34.6 MB]
Get:7 https://download.docker.com/linux/ubuntu noble/stable amd64 docker-ce-cli amd64 5:28.1.1-1~ubuntu.24.04~noble [15.8 MB]
Get:8 https://download.docker.com/linux/ubuntu noble/stable amd64 docker-ce amd64 5:28.1.1-1~ubuntu.24.04~noble [19.2 MB]
Get:9 https://download.docker.com/linux/ubuntu noble/stable amd64 docker-ce-rootless-extras amd64 5:28.1.1-1~ubuntu.24.04~noble [6,092 kB]
Get:10 https://download.docker.com/linux/ubuntu noble/stable amd64 docker-compose-plugin amd64 2.35.1-1~ubuntu.24.04~noble [13.8 MB]
Fetched 120 MB in 1s (94.1 MB/s)
Selecting previously unselected package pigz.
(Reading database ... 140777 files and directories currently installed.)
Preparing to unpack .../0-pigz_2.8-1_amd64.deb ...
Unpacking pigz (2.8-1) ...
Selecting previously unselected package containerd.io.
Preparing to unpack .../1-containerd.io_1.7.27-1_amd64.deb ...
Unpacking containerd.io (1.7.27-1) ...
Selecting previously unselected package docker-buildx-plugin.
Preparing to unpack .../2-docker-buildx-plugin_0.23.0-1~ubuntu.24.04~noble_amd64.deb ...
Unpacking docker-buildx-plugin (0.23.0-1~ubuntu.24.04~noble) ...
Selecting previously unselected package docker-ce-cli.
Preparing to unpack .../3-docker-ce-cli_5%3a28.1.1-1~ubuntu.24.04~noble_amd64.deb ...
Unpacking docker-ce-cli (5:28.1.1-1~ubuntu.24.04~noble) ...
Selecting previously unselected package docker-ce.
Preparing to unpack .../4-docker-ce_5%3a28.1.1-1~ubuntu.24.04~noble_amd64.deb ...
Unpacking docker-ce (5:28.1.1-1~ubuntu.24.04~noble) ...
Selecting previously unselected package docker-ce-rootless-extras.
Preparing to unpack .../5-docker-ce-rootless-extras_5%3a28.1.1-1~ubuntu.24.04~noble_amd64.deb ...
Unpacking docker-ce-rootless-extras (5:28.1.1-1~ubuntu.24.04~noble) ...
Selecting previously unselected package docker-compose-plugin.
Preparing to unpack .../6-docker-compose-plugin_2.35.1-1~ubuntu.24.04~noble_amd64.deb ...
Unpacking docker-compose-plugin (2.35.1-1~ubuntu.24.04~noble) ...
Selecting previously unselected package libltdl7:amd64.
Preparing to unpack .../7-libltdl7_2.4.7-7build1_amd64.deb ...
Unpacking libltdl7:amd64 (2.4.7-7build1) ...
Selecting previously unselected package libslirp0:amd64.
Preparing to unpack .../8-libslirp0_4.7.0-1ubuntu3_amd64.deb ...
Unpacking libslirp0:amd64 (4.7.0-1ubuntu3) ...
Selecting previously unselected package slirp4netns.
Preparing to unpack .../9-slirp4netns_1.2.1-1build2_amd64.deb ...
Unpacking slirp4netns (1.2.1-1build2) ...
Setting up docker-buildx-plugin (0.23.0-1~ubuntu.24.04~noble) ...
Setting up containerd.io (1.7.27-1) ...
Created symlink /etc/systemd/system/multi-user.target.wants/containerd.service → /usr/lib/systemd/system/containerd.service.
Setting up docker-compose-plugin (2.35.1-1~ubuntu.24.04~noble) ...
Setting up libltdl7:amd64 (2.4.7-7build1) ...
Setting up docker-ce-cli (5:28.1.1-1~ubuntu.24.04~noble) ...
Setting up libslirp0:amd64 (4.7.0-1ubuntu3) ...
Setting up pigz (2.8-1) ...
Setting up docker-ce-rootless-extras (5:28.1.1-1~ubuntu.24.04~noble) ...
Setting up slirp4netns (1.2.1-1build2) ...
Setting up docker-ce (5:28.1.1-1~ubuntu.24.04~noble) ...
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.
Created symlink /etc/systemd/system/sockets.target.wants/docker.socket → /usr/lib/systemd/system/docker.socket.
Processing triggers for man-db (2.12.0-4build2) ...
Processing triggers for libc-bin (2.39-0ubuntu8.4) ...
Scanning processes...
Scanning processor microcode...
Scanning linux images...
Running kernel seems to be up-to-date.
The processor microcode seems to be up-to-date.
No services need to be restarted.
No containers need to be restarted.
No user sessions are running outdated binaries.
No VM guests are running outdated hypervisor (qemu) binaries on this host.
In just a few moments docker is already running:
$ systemctl status docker
$ systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; preset: enabled)
Active: active (running) since Sat 2025-04-26 15:17:27 CEST; 52s ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 198205 (dockerd)
Tasks: 22
Memory: 25.1M (peak: 27.5M)
CPU: 335ms
CGroup: /system.slice/docker.service
└─198205 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.329995883+02:00" level=info msg="detected 127.0.0.53 nameserver, assuming systemd-resolved, so using resolv.conf: /run/systemd/resolve/resolv.conf"
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.356938837+02:00" level=info msg="Creating a containerd client" address=/run/containerd/containerd.sock timeout=1m0s
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.391167891+02:00" level=info msg="Loading containers: start."
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.623474369+02:00" level=info msg="Loading containers: done."
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.651891549+02:00" level=info msg="Docker daemon" commit=01f442b containerd-snapshotter=false storage-driver=overlay2 version=28.1.1
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.652003229+02:00" level=info msg="Initializing buildkit"
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.686521864+02:00" level=info msg="Completed buildkit initialization"
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.694113345+02:00" level=info msg="Daemon has completed initialization"
Apr 26 15:17:27 octavo dockerd[198205]: time="2025-04-26T15:17:27.694180217+02:00" level=info msg="API listen on /run/docker.sock"
Apr 26 15:17:27 octavo systemd[1]: Started docker.service - Docker Application Container Engine.
The Docker daemon is indeed reachable and its version can be checked:
$ sudo docker version
$ sudo docker version
Client: Docker Engine - Community
Version: 28.1.1
API version: 1.49
Go version: go1.23.8
Git commit: 4eba377
Built: Fri Apr 18 09:52:14 2025
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 28.1.1
API version: 1.49 (minimum version 1.24)
Go version: go1.23.8
Git commit: 01f442b
Built: Fri Apr 18 09:52:14 2025
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.7.27
GitCommit: 05044ec0a9a75232cad458027ca83437aae3f4da
runc:
Version: 1.2.5
GitCommit: v1.2.5-0-g59923ef
docker-init:
Version: 0.19.0
GitCommit: de40ad0
And the basic hello-world example just works:
$ sudo docker run hello-world
$ sudo docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
e6590344b1a5: Pull complete
Digest: sha256:c41088499908a59aae84b0a49c70e86f4731e588a737f1637e73c8c09d995654
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
Configure containerd for Kubernetes
The default configuration that comes with containerd, at least when installed from
the APT repository, needs two adjustments to work with Kubernetes:
- Enable the systemd cgroup driver, because Ubuntu (22.04+) uses both systemd and cgroup v2.
- Enable CRI integration, which is disabled in /etc/containerd/config.toml but needed to use containerd with Kubernetes.
The safest method to set these configurations is to start from the default configuration:
$ containerd config default \
| sed 's/disabled_plugins.*/disabled_plugins = []/' \
| sed 's/SystemdCgroup = false/SystemdCgroup = true/' \
| sudo tee /etc/containerd/config.toml > /dev/null
$ sudo systemctl restart containerd
$ sudo systemctl restart kubelet
Resulting /etc/containerd/config.toml
The resulting configuration enables SystemdCgroup under
plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options:
disabled_plugins = []
imports = []
oom_score = 0
plugin_dir = ""
required_plugins = []
root = "/var/lib/containerd"
state = "/run/containerd"
temp = ""
version = 2
[cgroup]
path = ""
[debug]
address = ""
format = ""
gid = 0
level = ""
uid = 0
[grpc]
address = "/run/containerd/containerd.sock"
gid = 0
max_recv_message_size = 16777216
max_send_message_size = 16777216
tcp_address = ""
tcp_tls_ca = ""
tcp_tls_cert = ""
tcp_tls_key = ""
uid = 0
[metrics]
address = ""
grpc_histogram = false
[plugins]
[plugins."io.containerd.gc.v1.scheduler"]
deletion_threshold = 0
mutation_threshold = 100
pause_threshold = 0.02
schedule_delay = "0s"
startup_delay = "100ms"
[plugins."io.containerd.grpc.v1.cri"]
cdi_spec_dirs = ["/etc/cdi", "/var/run/cdi"]
device_ownership_from_security_context = false
disable_apparmor = false
disable_cgroup = false
disable_hugetlb_controller = true
disable_proc_mount = false
disable_tcp_service = true
drain_exec_sync_io_timeout = "0s"
enable_cdi = false
enable_selinux = false
enable_tls_streaming = false
enable_unprivileged_icmp = false
enable_unprivileged_ports = false
ignore_deprecation_warnings = []
ignore_image_defined_volumes = false
image_pull_progress_timeout = "5m0s"
image_pull_with_sync_fs = false
max_concurrent_downloads = 3
max_container_log_line_size = 16384
netns_mounts_under_state_dir = false
restrict_oom_score_adj = false
sandbox_image = "registry.k8s.io/pause:3.8"
selinux_category_range = 1024
stats_collect_period = 10
stream_idle_timeout = "4h0m0s"
stream_server_address = "127.0.0.1"
stream_server_port = "0"
systemd_cgroup = false
tolerate_missing_hugetlb_controller = true
unset_seccomp_profile = ""
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
conf_template = ""
ip_pref = ""
max_conf_num = 1
setup_serially = false
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
disable_snapshot_annotations = true
discard_unpacked_layers = false
ignore_blockio_not_enabled_errors = false
ignore_rdt_not_enabled_errors = false
no_pivot = false
snapshotter = "overlayfs"
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
privileged_without_host_devices_all_devices_allowed = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = ""
sandbox_mode = ""
snapshotter = ""
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime.options]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
privileged_without_host_devices_all_devices_allowed = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = "io.containerd.runc.v2"
sandbox_mode = "podsandbox"
snapshotter = ""
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
BinaryName = ""
CriuImagePath = ""
CriuPath = ""
CriuWorkPath = ""
IoGid = 0
IoUid = 0
NoNewKeyring = false
NoPivotRoot = false
Root = ""
ShimCgroup = ""
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
privileged_without_host_devices_all_devices_allowed = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = ""
sandbox_mode = ""
snapshotter = ""
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime.options]
[plugins."io.containerd.grpc.v1.cri".image_decryption]
key_model = "node"
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = ""
[plugins."io.containerd.grpc.v1.cri".registry.auths]
[plugins."io.containerd.grpc.v1.cri".registry.configs]
[plugins."io.containerd.grpc.v1.cri".registry.headers]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
[plugins."io.containerd.internal.v1.opt"]
path = "/opt/containerd"
[plugins."io.containerd.internal.v1.restart"]
interval = "10s"
[plugins."io.containerd.internal.v1.tracing"]
[plugins."io.containerd.metadata.v1.bolt"]
content_sharing_policy = "shared"
[plugins."io.containerd.monitor.v1.cgroups"]
no_prometheus = false
[plugins."io.containerd.nri.v1.nri"]
disable = true
disable_connections = false
plugin_config_path = "/etc/nri/conf.d"
plugin_path = "/opt/nri/plugins"
plugin_registration_timeout = "5s"
plugin_request_timeout = "2s"
socket_path = "/var/run/nri/nri.sock"
[plugins."io.containerd.runtime.v1.linux"]
no_shim = false
runtime = "runc"
runtime_root = ""
shim = "containerd-shim"
shim_debug = false
[plugins."io.containerd.runtime.v2.task"]
platforms = ["linux/amd64"]
sched_core = false
[plugins."io.containerd.service.v1.diff-service"]
default = ["walking"]
sync_fs = false
[plugins."io.containerd.service.v1.tasks-service"]
blockio_config_file = ""
rdt_config_file = ""
[plugins."io.containerd.snapshotter.v1.aufs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.blockfile"]
fs_type = ""
mount_options = []
root_path = ""
scratch_file = ""
[plugins."io.containerd.snapshotter.v1.btrfs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.devmapper"]
async_remove = false
base_image_size = ""
discard_blocks = false
fs_options = ""
fs_type = ""
pool_name = ""
root_path = ""
[plugins."io.containerd.snapshotter.v1.native"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.overlayfs"]
mount_options = []
root_path = ""
sync_remove = false
upperdir_label = false
[plugins."io.containerd.snapshotter.v1.zfs"]
root_path = ""
[plugins."io.containerd.tracing.processor.v1.otlp"]
[plugins."io.containerd.transfer.v1.local"]
config_path = ""
max_concurrent_downloads = 3
max_concurrent_uploaded_layers = 3
[[plugins."io.containerd.transfer.v1.local".unpack_config]]
differ = ""
platform = "linux/amd64"
snapshotter = "overlayfs"
[proxy_plugins]
[stream_processors]
[stream_processors."io.containerd.ocicrypt.decoder.v1.tar"]
accepts = ["application/vnd.oci.image.layer.v1.tar+encrypted"]
args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
path = "ctd-decoder"
returns = "application/vnd.oci.image.layer.v1.tar"
[stream_processors."io.containerd.ocicrypt.decoder.v1.tar.gzip"]
accepts = ["application/vnd.oci.image.layer.v1.tar+gzip+encrypted"]
args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
path = "ctd-decoder"
returns = "application/vnd.oci.image.layer.v1.tar+gzip"
[timeouts]
"io.containerd.timeout.bolt.open" = "0s"
"io.containerd.timeout.metrics.shimstats" = "2s"
"io.containerd.timeout.shim.cleanup" = "5s"
"io.containerd.timeout.shim.load" = "5s"
"io.containerd.timeout.shim.shutdown" = "3s"
"io.containerd.timeout.task.state" = "2s"
[ttrpc]
address = ""
gid = 0
uid = 0
Bootstrap with kubeadm
Creating a cluster with kubeadm
is the next big step towards creating the Kubernetes cluster.
Initialize the control-plane
Having reviewed the requirements and installed all the components already, initialize the control-plane node with flags:
- --cri-socket=unix:/run/containerd/containerd.sock to make sure Kubernetes uses the containerd runtime.
- --pod-network-cidr=10.244.0.0/16 as required by flannel, which is the network plugin to be installed later.
$ sudo kubeadm init
$ sudo kubeadm init \
--cri-socket=unix:/run/containerd/containerd.sock \
--pod-network-cidr=10.244.0.0/16
I0426 16:55:42.979255 33862 version.go:261] remote version is much newer: v1.33.0; falling back to: stable-1.32
[init] Using Kubernetes version: v1.32.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W0426 16:55:43.482424 33862 checks.go:846] detected that the sandbox image "registry.k8s.io/pause:3.8" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "registry.k8s.io/pause:3.10" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local octavo] and IPs [10.96.0.1 10.0.0.8]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost octavo] and IPs [10.0.0.8 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost octavo] and IPs [10.0.0.8 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.89299ms
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is healthy after 3.503255484s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node octavo as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node octavo as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 7k3y4k.r8ebhonqgtm6anyr
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.0.8:6443 --token 7k3y4k.r8ebhonqgtm6anyr \
--discovery-token-ca-cert-hash sha256:18d968e92516e1a2808166d90a7d7c8b6f7b37cbac6328c49793863f9ae2b982
Once this is done, the Kubernetes control plane is running at 10.0.0.8:6443:
# export KUBECONFIG=/etc/kubernetes/admin.conf
# kubectl cluster-info
Kubernetes control plane is running at https://10.0.0.8:6443
CoreDNS is running at https://10.0.0.8:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
# kubectl version --output=yaml
clientVersion:
buildDate: "2025-04-22T16:03:58Z"
compiler: gc
gitCommit: 59526cd4867447956156ae3a602fcbac10a2c335
gitTreeState: clean
gitVersion: v1.32.4
goVersion: go1.23.6
major: "1"
minor: "32"
platform: linux/amd64
kustomizeVersion: v5.5.0
serverVersion:
buildDate: "2025-04-22T15:56:15Z"
compiler: gc
gitCommit: 59526cd4867447956156ae3a602fcbac10a2c335
gitTreeState: clean
gitVersion: v1.32.4
goVersion: go1.23.6
major: "1"
minor: "32"
platform: linux/amd64
Setup kubectl access
To run kubectl as a non-root user, copy the Kubernetes config file into the
~/.kube directory and set the right permissions on it:
$ mkdir $HOME/.kube
$ sudo cp -f /etc/kubernetes/admin.conf $HOME/.kube/config
[sudo] password for ponder:
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ ls -l $HOME/.kube/config
-rw------- 1 ponder ponder 5648 Apr 26 17:03 /home/ponder/.kube/config
$ kubectl cluster-info
Kubernetes control plane is running at https://10.0.0.8:6443
CoreDNS is running at https://10.0.0.8:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Troubleshooting bootstrap
The first time around, sudo kubeadm init failed. At first this did not seem to be caused by
swap not being disabled,
but in the end disabling swap seems to have been the (only) solution.
$ sudo kubeadm init
$ sudo kubeadm init \
--cri-socket=unix:/run/containerd/containerd.sock \
--pod-network-cidr=10.244.0.0/16
I0426 15:49:34.806666 537229 version.go:261] remote version is much newer: v1.33.0; falling back to: stable-1.32
[init] Using Kubernetes version: v1.32.4
[preflight] Running pre-flight checks
[WARNING Swap]: swap is supported for cgroup v2 only. The kubelet must be properly configured to use swap. Please refer to https://kubernetes.io/docs/concepts/architecture/nodes/#swap-memory, or disable swap on the node
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W0426 15:49:35.293520 537229 checks.go:846] detected that the sandbox image "registry.k8s.io/pause:3.8" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "registry.k8s.io/pause:3.10" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local octavo] and IPs [10.96.0.1 10.0.0.8]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost octavo] and IPs [10.0.0.8 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost octavo] and IPs [10.0.0.8 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is not healthy after 4m0.000397441s
Unfortunately, an error has occurred:
The HTTP call equal to 'curl -sSL http://127.0.0.1:10248/healthz' returned error: Get "http://127.0.0.1:10248/healthz": context deadline exceeded
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: could not initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Swap was disabled (and the server was restarted) only after this happened, but it does not seem likely to have been the cause.
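For the record, disabling swap amounted to the usual steps, roughly as sketched below (the name of the swap file may differ):
$ sudo swapoff -a                              # turn swap off immediately
$ sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab   # comment out the swap entry
$ sudo rm /swap.img                            # or /swapfile, whichever exists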
The kubelet is running but possibly unhealthy; systemctl status kubelet shows the
same line every 5 seconds:
Apr 26 16:11:29 octavo kubelet[3081]: E0426 16:11:29.654554 3081 kubelet.go:3002] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Many errors and warnings are logged and shown by journalctl -xeu kubelet, up to the
point where the above error is repeated every 5 seconds:
$ journalctl -xeu kubelet
$ journalctl -xeu kubelet
Apr 26 16:00:24 octavo systemd[1]: Started kubelet.service - kubelet: The Kubernetes Node Agent.
░░ Subject: A start job for unit kubelet.service has finished successfully
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit kubelet.service has finished successfully.
░░
░░ The job identifier is 192.
Apr 26 16:00:24 octavo kubelet[3081]: Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Apr 26 16:00:24 octavo kubelet[3081]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.35. Image garbage collector will get sandbox image information from CRI.
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.873338 3081 server.go:215] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.878116 3081 server.go:520] "Kubelet version" kubeletVersion="v1.32.4"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.878137 3081 server.go:522] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.878359 3081 server.go:954] "Client rotation is on, will bootstrap in background"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.879746 3081 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.881693 3081 dynamic_cafile_content.go:161] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.884220 3081 log.go:32] "RuntimeConfig from runtime service failed" err="rpc error: code = Unimplemented desc = unknown method RuntimeConfig for service runtime.v1.RuntimeService"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.884242 3081 server.go:1421] "CRI implementation should be updated to support RuntimeConfig when KubeletCgroupDriverFromCRI feature gate has been enabled. Falling back to using cgroupDriver from kubelet config."
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.892519 3081 server.go:772] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.892745 3081 container_manager_linux.go:268] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.892782 3081 container_manager_linux.go:273] "Creating Container Manager object based on Node Config" nodeConfig={"NodeName":"octavo","RuntimeCgroupsName":"","SystemCgroupsName":"","KubeletCgroupsName":"","KubeletOOMScoreAdj":-999,"ContainerRuntime":"","CgroupsPerQOS":true,"CgroupRoot":"/","CgroupDriver":"systemd","Kubele>
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.892963 3081 topology_manager.go:138] "Creating topology manager with none policy"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.892973 3081 container_manager_linux.go:304] "Creating device plugin manager"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.893096 3081 state_mem.go:36] "Initialized new in-memory state store"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.893362 3081 kubelet.go:446] "Attempting to sync node with API server"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.893383 3081 kubelet.go:341] "Adding static pod path" path="/etc/kubernetes/manifests"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.893401 3081 kubelet.go:352] "Adding apiserver pod source"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.893412 3081 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Apr 26 16:00:24 octavo kubelet[3081]: W0426 16:00:24.893859 3081 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.Service: Get "https://10.0.0.8:6443/api/v1/services?fieldSelector=spec.clusterIP%21%3DNone&limit=500&resourceVersion=0": dial tcp 10.0.0.8:6443: connect: connection refused
Apr 26 16:00:24 octavo kubelet[3081]: W0426 16:00:24.893887 3081 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.Node: Get "https://10.0.0.8:6443/api/v1/nodes?fieldSelector=metadata.name%3Doctavo&limit=500&resourceVersion=0": dial tcp 10.0.0.8:6443: connect: connection refused
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.893911 3081 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.Service: failed to list *v1.Service: Get \"https://10.0.0.8:6443/api/v1/services?fieldSelector=spec.clusterIP%21%3DNone&limit=500&resourceVersion=0\": dial tcp 10.0.0.8:6443: connect: connection refused" lo>
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.893993 3081 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.Node: failed to list *v1.Node: Get \"https://10.0.0.8:6443/api/v1/nodes?fieldSelector=metadata.name%3Doctavo&limit=500&resourceVersion=0\": dial tcp 10.0.0.8:6443: connect: connection refused" logger="Unhan>
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.894044 3081 kuberuntime_manager.go:269] "Container runtime initialized" containerRuntime="containerd" version="1.7.27" apiVersion="v1"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.894351 3081 kubelet.go:890] "Not starting ClusterTrustBundle informer because we are in static kubelet mode"
Apr 26 16:00:24 octavo kubelet[3081]: W0426 16:00:24.894385 3081 probe.go:272] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ does not exist. Recreating.
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.894658 3081 watchdog_linux.go:99] "Systemd watchdog is not enabled"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.894673 3081 server.go:1287] "Started kubelet"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.894770 3081 server.go:169] "Starting to listen" address="0.0.0.0" port=10250
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.894761 3081 ratelimit.go:55] "Setting rate limiting for endpoint" service="podresources" qps=100 burstTokens=10
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.895021 3081 server.go:243] "Starting to serve the podresources API" endpoint="unix:/var/lib/kubelet/pod-resources/kubelet.sock"
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.895066 3081 event.go:368] "Unable to write event (may retry after sleeping)" err="Post \"https://10.0.0.8:6443/api/v1/namespaces/default/events\": dial tcp 10.0.0.8:6443: connect: connection refused" event="&Event{ObjectMeta:{octavo.1839e3187cc326ed default 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [>
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.895314 3081 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.895343 3081 dynamic_serving_content.go:135] "Starting controller" name="kubelet-server-cert-files::/var/lib/kubelet/pki/kubelet.crt::/var/lib/kubelet/pki/kubelet.key"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.895416 3081 volume_manager.go:297] "Starting Kubelet Volume Manager"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.895449 3081 desired_state_of_world_populator.go:150] "Desired state populator starts to run"
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.895492 3081 kubelet_node_status.go:466] "Error getting the current node from lister" err="node \"octavo\" not found"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.895521 3081 reconciler.go:26] "Reconciler: start to sync state"
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.896004 3081 controller.go:145] "Failed to ensure lease exists, will retry" err="Get \"https://10.0.0.8:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/octavo?timeout=10s\": dial tcp 10.0.0.8:6443: connect: connection refused" interval="200ms"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.896248 3081 factory.go:219] Registration of the crio container factory failed: Get "http://%2Fvar%2Frun%2Fcrio%2Fcrio.sock/info": dial unix /var/run/crio/crio.sock: connect: no such file or directory
Apr 26 16:00:24 octavo kubelet[3081]: W0426 16:00:24.897181 3081 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.CSIDriver: Get "https://10.0.0.8:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0": dial tcp 10.0.0.8:6443: connect: connection refused
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.897377 3081 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: Get \"https://10.0.0.8:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0\": dial tcp 10.0.0.8:6443: connect: connection refused" logger="UnhandledEr>
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.897627 3081 server.go:479] "Adding debug handlers to kubelet server"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.898015 3081 factory.go:221] Registration of the containerd container factory successfully
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.898031 3081 factory.go:221] Registration of the systemd container factory successfully
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.898126 3081 kubelet.go:1555] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.907984 3081 cpu_manager.go:221] "Starting CPU manager" policy="none"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.907991 3081 cpu_manager.go:222] "Reconciling" reconcilePeriod="10s"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.908003 3081 state_mem.go:36] "Initialized new in-memory state store"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.909090 3081 policy_none.go:49] "None policy: Start"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.909098 3081 memory_manager.go:186] "Starting memorymanager" policy="None"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.909104 3081 state_mem.go:35] "Initializing new in-memory state store"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.948672 3081 manager.go:519] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.948805 3081 eviction_manager.go:189] "Eviction manager: starting control loop"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.948817 3081 container_log_manager.go:189] "Initializing container log rotate workers" workers=1 monitorPeriod="10s"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.948983 3081 plugin_manager.go:118] "Starting Kubelet Plugin Manager"
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.949523 3081 eviction_manager.go:267] "eviction manager: failed to check if we have separate container filesystem. Ignoring." err="no imagefs label for configured runtime"
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.949568 3081 eviction_manager.go:292] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"octavo\" not found"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.952661 3081 kubelet_network_linux.go:50] "Initialized iptables rules." protocol="IPv4"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.953773 3081 kubelet_network_linux.go:50] "Initialized iptables rules." protocol="IPv6"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.953793 3081 status_manager.go:227] "Starting to sync pod status with apiserver"
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.953815 3081 watchdog_linux.go:127] "Systemd watchdog is not enabled or the interval is invalid, so health checking will not be started."
Apr 26 16:00:24 octavo kubelet[3081]: I0426 16:00:24.953821 3081 kubelet.go:2382] "Starting kubelet main sync loop"
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.953861 3081 kubelet.go:2406] "Skipping pod synchronization" err="PLEG is not healthy: pleg has yet to be successful"
Apr 26 16:00:24 octavo kubelet[3081]: W0426 16:00:24.954293 3081 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.RuntimeClass: Get "https://10.0.0.8:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0": dial tcp 10.0.0.8:6443: connect: connection refused
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.954314 3081 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get \"https://10.0.0.8:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0\": dial tcp 10.0.0.8:6443: connect: connection refused" logger="Unha>
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.051160 3081 kubelet_node_status.go:75] "Attempting to register node" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.051955 3081 kubelet_node_status.go:107] "Unable to register node with API server" err="Post \"https://10.0.0.8:6443/api/v1/nodes\": dial tcp 10.0.0.8:6443: connect: connection refused" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096663 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/ad9e0985cbbefcb442de36a1fa2a7651-ca-certs\") pod \"kube-controller-manager-octavo\" (UID: \"ad9e0985cbbefcb442de36a1fa2a7651\") " pod="kube-system/kube-controller->
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096703 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/ad9e0985cbbefcb442de36a1fa2a7651-usr-local-share-ca-certificates\") pod \"kube-controller-manager-octavo\" (UID: \"ad9e0985cbbefcb442de36a1f>
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096736 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/9143a118ddce4da18128e4a82a4a703f-ca-certs\") pod \"kube-apiserver-octavo\" (UID: \"9143a118ddce4da18128e4a82a4a703f\") " pod="kube-system/kube-apiserver-octavo"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096755 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/9143a118ddce4da18128e4a82a4a703f-k8s-certs\") pod \"kube-apiserver-octavo\" (UID: \"9143a118ddce4da18128e4a82a4a703f\") " pod="kube-system/kube-apiserver-octavo"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096790 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/9143a118ddce4da18128e4a82a4a703f-usr-local-share-ca-certificates\") pod \"kube-apiserver-octavo\" (UID: \"9143a118ddce4da18128e4a82a4a703f\">
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096824 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/ad9e0985cbbefcb442de36a1fa2a7651-etc-ca-certificates\") pod \"kube-controller-manager-octavo\" (UID: \"ad9e0985cbbefcb442de36a1fa2a7651\") " pod="kube-s>
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096849 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"flexvolume-dir\" (UniqueName: \"kubernetes.io/host-path/ad9e0985cbbefcb442de36a1fa2a7651-flexvolume-dir\") pod \"kube-controller-manager-octavo\" (UID: \"ad9e0985cbbefcb442de36a1fa2a7651\") " pod="kube-system/kube>
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096876 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/ad9e0985cbbefcb442de36a1fa2a7651-k8s-certs\") pod \"kube-controller-manager-octavo\" (UID: \"ad9e0985cbbefcb442de36a1fa2a7651\") " pod="kube-system/kube-controlle>
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096896 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-certs\" (UniqueName: \"kubernetes.io/host-path/6a67e3805915a015cc4032245f92dbab-etcd-certs\") pod \"etcd-octavo\" (UID: \"6a67e3805915a015cc4032245f92dbab\") " pod="kube-system/etcd-octavo"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096920 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-data\" (UniqueName: \"kubernetes.io/host-path/6a67e3805915a015cc4032245f92dbab-etcd-data\") pod \"etcd-octavo\" (UID: \"6a67e3805915a015cc4032245f92dbab\") " pod="kube-system/etcd-octavo"
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.096941 3081 controller.go:145] "Failed to ensure lease exists, will retry" err="Get \"https://10.0.0.8:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/octavo?timeout=10s\": dial tcp 10.0.0.8:6443: connect: connection refused" interval="400ms"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.096967 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/ad9e0985cbbefcb442de36a1fa2a7651-kubeconfig\") pod \"kube-controller-manager-octavo\" (UID: \"ad9e0985cbbefcb442de36a1fa2a7651\") " pod="kube-system/kube-control>
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.097001 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/ad9e0985cbbefcb442de36a1fa2a7651-usr-share-ca-certificates\") pod \"kube-controller-manager-octavo\" (UID: \"ad9e0985cbbefcb442de36a1fa2a7651\") ">
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.097020 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/9a50c799dacb081ee9c958350b51d8ec-kubeconfig\") pod \"kube-scheduler-octavo\" (UID: \"9a50c799dacb081ee9c958350b51d8ec\") " pod="kube-system/kube-scheduler-octavo"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.097037 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/9143a118ddce4da18128e4a82a4a703f-etc-ca-certificates\") pod \"kube-apiserver-octavo\" (UID: \"9143a118ddce4da18128e4a82a4a703f\") " pod="kube-system/kub>
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.097054 3081 reconciler_common.go:251] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/9143a118ddce4da18128e4a82a4a703f-usr-share-ca-certificates\") pod \"kube-apiserver-octavo\" (UID: \"9143a118ddce4da18128e4a82a4a703f\") " pod="kub>
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.247411 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.251941 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.253846 3081 kubelet_node_status.go:75] "Attempting to register node" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.254394 3081 kubelet_node_status.go:107] "Unable to register node with API server" err="Post \"https://10.0.0.8:6443/api/v1/nodes\": dial tcp 10.0.0.8:6443: connect: connection refused" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.275473 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.301544 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:24 octavo kubelet[3081]: E0426 16:00:24.974201 3081 controller.go:145] "Failed to ensure lease exists, will retry" err="Get \"https://10.0.0.8:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/octavo?timeout=10s\": dial tcp 10.0.0.8:6443: connect: connection refused" interval="800ms"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.134490 3081 kubelet_node_status.go:75] "Attempting to register node" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.135362 3081 kubelet_node_status.go:107] "Unable to register node with API server" err="Post \"https://10.0.0.8:6443/api/v1/nodes\": dial tcp 10.0.0.8:6443: connect: connection refused" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: W0426 16:00:25.220172 3081 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.Node: Get "https://10.0.0.8:6443/api/v1/nodes?fieldSelector=metadata.name%3Doctavo&limit=500&resourceVersion=0": dial tcp 10.0.0.8:6443: connect: connection refused
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.220349 3081 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.Node: failed to list *v1.Node: Get \"https://10.0.0.8:6443/api/v1/nodes?fieldSelector=metadata.name%3Doctavo&limit=500&resourceVersion=0\": dial tcp 10.0.0.8:6443: connect: connection refused" logger="Unhan>
Apr 26 16:00:25 octavo kubelet[3081]: W0426 16:00:25.451675 3081 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.Service: Get "https://10.0.0.8:6443/api/v1/services?fieldSelector=spec.clusterIP%21%3DNone&limit=500&resourceVersion=0": dial tcp 10.0.0.8:6443: connect: connection refused
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.451792 3081 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.Service: failed to list *v1.Service: Get \"https://10.0.0.8:6443/api/v1/services?fieldSelector=spec.clusterIP%21%3DNone&limit=500&resourceVersion=0\": dial tcp 10.0.0.8:6443: connect: connection refused" lo>
Apr 26 16:00:25 octavo kubelet[3081]: W0426 16:00:25.487336 3081 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.RuntimeClass: Get "https://10.0.0.8:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0": dial tcp 10.0.0.8:6443: connect: connection refused
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.487463 3081 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get \"https://10.0.0.8:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0\": dial tcp 10.0.0.8:6443: connect: connection refused" logger="Unha>
Apr 26 16:00:25 octavo kubelet[3081]: W0426 16:00:25.504868 3081 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.CSIDriver: Get "https://10.0.0.8:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0": dial tcp 10.0.0.8:6443: connect: connection refused
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.504969 3081 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: Get \"https://10.0.0.8:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0\": dial tcp 10.0.0.8:6443: connect: connection refused" logger="UnhandledEr>
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.775421 3081 controller.go:145] "Failed to ensure lease exists, will retry" err="Get \"https://10.0.0.8:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/octavo?timeout=10s\": dial tcp 10.0.0.8:6443: connect: connection refused" interval="1.6s"
Apr 26 16:00:25 octavo kubelet[3081]: I0426 16:00:25.938373 3081 kubelet_node_status.go:75] "Attempting to register node" node="octavo"
Apr 26 16:00:25 octavo kubelet[3081]: E0426 16:00:25.939217 3081 kubelet_node_status.go:107] "Unable to register node with API server" err="Post \"https://10.0.0.8:6443/api/v1/nodes\": dial tcp 10.0.0.8:6443: connect: connection refused" node="octavo"
Apr 26 16:00:26 octavo kubelet[3081]: E0426 16:00:26.436928 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:26 octavo kubelet[3081]: E0426 16:00:26.438217 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:26 octavo kubelet[3081]: E0426 16:00:26.439369 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:26 octavo kubelet[3081]: E0426 16:00:26.440112 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:27 octavo kubelet[3081]: E0426 16:00:27.442281 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:27 octavo kubelet[3081]: E0426 16:00:27.442300 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:27 octavo kubelet[3081]: E0426 16:00:27.442354 3081 kubelet.go:3190] "No need to create a mirror pod, since failed to get node info from the cluster" err="node \"octavo\" not found" node="octavo"
Apr 26 16:00:27 octavo kubelet[3081]: I0426 16:00:27.540869 3081 kubelet_node_status.go:75] "Attempting to register node" node="octavo"
Apr 26 16:00:27 octavo kubelet[3081]: E0426 16:00:27.804881 3081 nodelease.go:49] "Failed to get node when trying to set owner ref to the node lease" err="nodes \"octavo\" not found" node="octavo"
Apr 26 16:00:27 octavo kubelet[3081]: I0426 16:00:27.998050 3081 kubelet_node_status.go:78] "Successfully registered node" node="octavo"
Apr 26 16:00:28 octavo kubelet[3081]: I0426 16:00:28.072723 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/etcd-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: E0426 16:00:28.079480 3081 kubelet.go:3196] "Failed creating a mirror pod" err="pods \"etcd-octavo\" is forbidden: no PriorityClass with name system-node-critical was found" pod="kube-system/etcd-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: I0426 16:00:28.079531 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/kube-apiserver-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: E0426 16:00:28.086877 3081 kubelet.go:3196] "Failed creating a mirror pod" err="pods \"kube-apiserver-octavo\" is forbidden: no PriorityClass with name system-node-critical was found" pod="kube-system/kube-apiserver-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: I0426 16:00:28.086955 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/kube-controller-manager-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: E0426 16:00:28.092824 3081 kubelet.go:3196] "Failed creating a mirror pod" err="pods \"kube-controller-manager-octavo\" is forbidden: no PriorityClass with name system-node-critical was found" pod="kube-system/kube-controller-manager-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: I0426 16:00:28.092885 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/kube-scheduler-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: E0426 16:00:28.094961 3081 kubelet.go:3196] "Failed creating a mirror pod" err="pods \"kube-scheduler-octavo\" is forbidden: no PriorityClass with name system-node-critical was found" pod="kube-system/kube-scheduler-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: I0426 16:00:28.372556 3081 apiserver.go:52] "Watching apiserver"
Apr 26 16:00:28 octavo kubelet[3081]: I0426 16:00:28.442538 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/kube-apiserver-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: E0426 16:00:28.444604 3081 kubelet.go:3196] "Failed creating a mirror pod" err="pods \"kube-apiserver-octavo\" is forbidden: no PriorityClass with name system-node-critical was found" pod="kube-system/kube-apiserver-octavo"
Apr 26 16:00:28 octavo kubelet[3081]: I0426 16:00:28.472896 3081 desired_state_of_world_populator.go:158] "Finished populating initial desired state of world"
Apr 26 16:00:29 octavo kubelet[3081]: I0426 16:00:29.338842 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/etcd-octavo"
Apr 26 16:00:31 octavo kubelet[3081]: I0426 16:00:31.031022 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/kube-controller-manager-octavo"
Apr 26 16:00:32 octavo kubelet[3081]: I0426 16:00:32.884678 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/kube-scheduler-octavo"
Apr 26 16:00:34 octavo kubelet[3081]: I0426 16:00:34.459610 3081 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="kube-system/etcd-octavo" podStartSLOduration=5.459582191 podStartE2EDuration="5.459582191s" podCreationTimestamp="2025-04-26 16:00:29 +0200 CEST" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:0>
Apr 26 16:00:34 octavo kubelet[3081]: I0426 16:00:34.483008 3081 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="kube-system/kube-controller-manager-octavo" podStartSLOduration=3.482988267 podStartE2EDuration="3.482988267s" podCreationTimestamp="2025-04-26 16:00:31 +0200 CEST" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling=>
Apr 26 16:00:34 octavo kubelet[3081]: I0426 16:00:34.483102 3081 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="kube-system/kube-scheduler-octavo" podStartSLOduration=2.483094445 podStartE2EDuration="2.483094445s" podCreationTimestamp="2025-04-26 16:00:32 +0200 CEST" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01->
Apr 26 16:00:35 octavo kubelet[3081]: I0426 16:00:35.015311 3081 kuberuntime_manager.go:1702] "Updating runtime config through cri with podcidr" CIDR="10.244.0.0/24"
Apr 26 16:00:35 octavo kubelet[3081]: I0426 16:00:35.016020 3081 kubelet_network.go:61] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="10.244.0.0/24"
Apr 26 16:00:36 octavo kubelet[3081]: I0426 16:00:36.698529 3081 kubelet.go:3194] "Creating a mirror pod for static pod" pod="kube-system/kube-apiserver-octavo"
Apr 26 16:00:36 octavo kubelet[3081]: I0426 16:00:36.724746 3081 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="kube-system/kube-apiserver-octavo" podStartSLOduration=0.724721039 podStartE2EDuration="724.721039ms" podCreationTimestamp="2025-04-26 16:00:36 +0200 CEST" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01->
Apr 26 16:02:24 octavo kubelet[3081]: E0426 16:02:24.440322 3081 kubelet_node_status.go:460] "Node not becoming ready in time after startup"
Apr 26 16:02:24 octavo kubelet[3081]: E0426 16:02:24.460985 3081 kubelet.go:3002] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Apr 26 16:02:29 octavo kubelet[3081]: E0426 16:02:29.463219 3081 kubelet.go:3002] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Apr 26 16:02:34 octavo kubelet[3081]: E0426 16:02:34.465049 3081 kubelet.go:3002] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Apr 26 16:02:39 octavo kubelet[3081]: E0426 16:02:39.467066 3081 kubelet.go:3002] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Apr 26 16:02:44 octavo kubelet[3081]: E0426 16:02:44.468109 3081 kubelet.go:3002] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
...
Despite all these errors, kubelet and containerd are running, and something is
listening on port 6443, but the kubernetes-admin user has no access at all.
Regenerating kubeconfig file for the admin user
did not help either. Running kubectl proxy starts serving on port 8001 but all
requests are still denied:
$ curl 2>/dev/null -X GET \
http://127.0.0.1:8001/api/v1/nodes/octavo/proxy/configz \
| jq .
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "nodes \"octavo\" is forbidden: User \"kubernetes-admin\" cannot get resource \"nodes/proxy\" in API group \"\" at the cluster scope",
"reason": "Forbidden",
"details": {
"name": "octavo",
"kind": "nodes"
},
"code": 403
}
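For reference, the regeneration attempt mentioned above was along these lines (a sketch; kubeadm can rewrite admin.conf through an init phase):
$ sudo kubeadm init phase kubeconfig admin
$ sudo cp -f /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config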
At this point all that was left to do was tear down the cluster:
$ sudo kubeadm reset
$ sudo kubeadm reset -f \
--cri-socket=unix:/run/containerd/containerd.sock \
--cleanup-tmp-dir
[reset] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[reset] Use 'kubeadm init phase upload-config --config your-config.yaml' to re-upload it.
W0426 16:44:31.464050 537333 reset.go:143] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: configmaps "kubeadm-config" is forbidden: User "kubernetes-admin" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[preflight] Running pre-flight checks
W0426 16:44:31.464336 537333 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Deleted contents of the etcd data directory: /var/lib/etcd
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki /etc/kubernetes/tmp]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
$ sudo mv /etc/kubernetes/ /etc/bad-kubernetes/
$ sudo mv /etc/cni /etc/bad-cni
$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
ts-input all -- anywhere anywhere
KUBE-FIREWALL all -- anywhere anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
ts-forward all -- anywhere anywhere
DOCKER-USER all -- anywhere anywhere
DOCKER-FORWARD all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-FIREWALL all -- anywhere anywhere
Chain DOCKER (1 references)
target prot opt source destination
DROP all -- anywhere anywhere
Chain DOCKER-BRIDGE (1 references)
target prot opt source destination
DOCKER all -- anywhere anywhere
Chain DOCKER-CT (1 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
Chain DOCKER-FORWARD (1 references)
target prot opt source destination
DOCKER-CT all -- anywhere anywhere
DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere
DOCKER-BRIDGE all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- anywhere anywhere
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere
Chain KUBE-FIREWALL (2 references)
target prot opt source destination
DROP all -- !localhost/8 localhost/8 /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
Chain ts-forward (1 references)
target prot opt source destination
MARK all -- anywhere anywhere MARK xset 0x40000/0xff0000
ACCEPT all -- anywhere anywhere mark match 0x40000/0xff0000
DROP all -- 100.64.0.0/10 anywhere
ACCEPT all -- anywhere anywhere
Chain ts-input (1 references)
target prot opt source destination
ACCEPT all -- octavo.royal-penny.ts.net anywhere
RETURN all -- 100.115.92.0/23 anywhere
DROP all -- 100.64.0.0/10 anywhere
ACCEPT all -- anywhere anywhere
ACCEPT udp -- anywhere anywhere udp dpt:41641
$ sudo iptables -L -t nat
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
DOCKER all -- anywhere anywhere ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
DOCKER all -- anywhere !localhost/8 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
ts-postrouting all -- anywhere anywhere
MASQUERADE all -- 172.17.0.0/16 anywhere
Chain DOCKER (2 references)
target prot opt source destination
RETURN all -- anywhere anywhere
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
Chain ts-postrouting (1 references)
target prot opt source destination
MASQUERADE all -- anywhere anywhere mark match 0x40000/0xff0000
$ sudo iptables -L -t mangle
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
Chain KUBE-IPTABLES-HINT (0 references)
target prot opt source destination
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
$ sudo iptables -F
$ sudo iptables -t nat -F
$ sudo iptables -t mangle -F
$ sudo iptables -X
$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
$ sudo iptables -L -t nat
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
Chain DOCKER (0 references)
target prot opt source destination
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
Chain ts-postrouting (0 references)
target prot opt source destination
$ sudo iptables -L -t mangle
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
Chain KUBE-IPTABLES-HINT (0 references)
target prot opt source destination
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
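With the leftover state cleared, it doesn't hurt to restart containerd before trying kubeadm init again (a precaution rather than a documented requirement):
$ sudo systemctl restart containerd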
Network plugin
Installing a Pod network add-on is the next required step and, once again, in the absence of other suggestions, deploying flannel manually as in previous clusters seems the way to go:
$ wget \
https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
$ kubectl apply -f kube-flannel.yml
namespace/kube-flannel created
serviceaccount/flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
$ kubectl get all -n kube-flannel
NAME READY STATUS RESTARTS AGE
pod/kube-flannel-ds-m8h8n 1/1 Running 0 8s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-flannel-ds 1 1 1 1 1 <none> 8s
Enable single-node cluster
Control plane node isolation
must be removed for a single-node cluster, because otherwise the cluster will not schedule
Pods on the control-plane node, for security reasons.
This is reflected in the Taints found in the node details:
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
octavo Ready control-plane 14m v1.32.4 192.168.0.8 <none> Ubuntu 24.04.2 LTS 6.8.0-58-generic containerd://1.7.27
$ kubectl describe node octavo
Name: octavo
Roles: control-plane
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=octavo
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node.kubernetes.io/exclude-from-external-load-balancers=
Annotations: flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"0a:7f:28:09:c7:77"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 10.0.0.8
kubeadm.alpha.kubernetes.io/cri-socket: unix:/run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sat, 26 Apr 2025 16:55:47 +0200
Taints: node-role.kubernetes.io/control-plane:NoSchedule
Unschedulable: false
Remove this taint to allow other pods to be scheduled:
$ kubectl taint nodes --all node-role.kubernetes.io/control-plane-
node/octavo untainted
$ kubectl describe node octavo | grep -i taint
Taints: <none>
Allow external load balancers
The node.kubernetes.io/exclude-from-external-load-balancers label highlighted
above will later lead to the problem of
MetalLB is not advertising my service from my control-plane nodes or from my single node cluster;
the recommended solution is to remove this label:
$ kubectl label nodes octavo \
node.kubernetes.io/exclude-from-external-load-balancers-
node/octavo unlabeled
Test pod scheduling
Before moving forward, run a test pod to confirm that pods can be scheduled:
$ kubectl apply -f https://k8s.io/examples/pods/commands.yaml
pod/command-demo created
$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/command-demo 0/1 ContainerCreating 0 1s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 24m
After a minute or two, the pod becomes Completed, indicating a successful run.
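To double-check the result and clean up afterwards (optional):
$ kubectl logs command-demo
$ kubectl delete pod command-demo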
With the cluster now ready to run pods and services, move on to installing more components that will be used by the actual services:
MetalLB Load Balancer,
Kubernetes Dashboard,
Ingress Controller,
HTTPS certificates
with Let’s Encrypt, including automatic renewal.
LocalPath PV provisioner
for simple persistent storage in local file systems would seem unnecessary,
based on experience with previous clusters.
Logs reader helper
Troubleshooting pods and services often involves reading or watching the logs,
which means combining two kubectl commands: one to find the relevant pod/service
and one to request the logs. To make this easier, put the following in ~/bin/klogs
(and add ~/bin/ to your PATH if it is not there already):
#!/bin/bash
#
# Watch (follow) logs from a Kubernetes pod/service.
#
# Usage: klogs <namespace> <pod/service>
ns=$1
pd=$2
if [ -z "$ns" ] || [ -z "$pd" ]; then
  echo "Usage: klogs <namespace> <pod/service>" >&2
  exit 1
fi
# Follow the logs of the first pod whose name matches the given string.
kubectl logs -n "$ns" -f \
  "$(kubectl get pods -n "$ns" --no-headers | grep "$pd" | head -n1 | cut -f1 -d' ')"
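For example, to follow the logs of the MetalLB controller installed in the next section:
$ klogs metallb-system controller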
MetalLB Load Balancer
A Load Balancer is going to be necessary for the Dashboard
and other services, to expose them via virtual IP addresses
rather than just open ports on the server (NodePort).
Installation By Manifest
is as simple as applying the provided manifest:
$ wget \
https://raw.githubusercontent.com/metallb/metallb/v0.14.9/config/manifests/metallb-native.yaml
$ kubectl apply -f metallb/metallb-native.yaml
namespace/metallb-system created
customresourcedefinition.apiextensions.k8s.io/bfdprofiles.metallb.io created
customresourcedefinition.apiextensions.k8s.io/bgpadvertisements.metallb.io created
customresourcedefinition.apiextensions.k8s.io/bgppeers.metallb.io created
customresourcedefinition.apiextensions.k8s.io/communities.metallb.io created
customresourcedefinition.apiextensions.k8s.io/ipaddresspools.metallb.io created
customresourcedefinition.apiextensions.k8s.io/l2advertisements.metallb.io created
customresourcedefinition.apiextensions.k8s.io/servicel2statuses.metallb.io created
serviceaccount/controller created
serviceaccount/speaker created
role.rbac.authorization.k8s.io/controller created
role.rbac.authorization.k8s.io/pod-lister created
clusterrole.rbac.authorization.k8s.io/metallb-system:controller created
clusterrole.rbac.authorization.k8s.io/metallb-system:speaker created
rolebinding.rbac.authorization.k8s.io/controller created
rolebinding.rbac.authorization.k8s.io/pod-lister created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker created
configmap/metallb-excludel2 created
secret/metallb-webhook-cert created
service/metallb-webhook-service created
deployment.apps/controller created
daemonset.apps/speaker created
validatingwebhookconfiguration.admissionregistration.k8s.io/metallb-webhook-configuration created
Soon enough the deployment should have the controller and speaker running:
$ kubectl get all -n metallb-system
NAME READY STATUS RESTARTS AGE
pod/controller-bb5f47665-vt57w 1/1 Running 0 62s
pod/speaker-92c2g 1/1 Running 0 62s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/metallb-webhook-service ClusterIP 10.98.162.115 <none> 443/TCP 62s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/speaker 1 1 1 1 1 kubernetes.io/os=linux 62s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/controller 1/1 1 1 62s
NAME DESIRED CURRENT READY AGE
replicaset.apps/controller-bb5f47665 1 1 1 62s
MetalLB remains idle until configured, which is done by deploying resources into its namespace. A small range of IP addresses is advertised via Layer 2 Configuration, which does not require the IPs to be bound to the network interfaces:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: production
namespace: metallb-system
spec:
addresses:
- 192.168.0.171-192.168.0.180
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: l2-advert
namespace: metallb-system
The range is based on the local DHCP server configuration and which IPs are currently in use; this particular range has simply never been leased so far. The reason to use IPs from the DHCP range is that the router only allows adding port forwarding rules for addresses in that range. The range is intentionally on the same subnet as the DHCP server, so no routing is needed to reach the MetalLB IPs.
$ kubectl apply -f metallb/ipaddress-pool-octavo.yaml
ipaddresspool.metallb.io/production created
l2advertisement.metallb.io/l2-advert created
$ kubectl get ipaddresspool.metallb.io -n metallb-system
NAME AUTO ASSIGN AVOID BUGGY IPS ADDRESSES
production true false ["192.168.0.171-192.168.0.180"]
$ kubectl get l2advertisement.metallb.io -n metallb-system
NAME IPADDRESSPOOLS IPADDRESSPOOL SELECTORS INTERFACES
l2-advert
$ kubectl describe ipaddresspool.metallb.io production -n metallb-system
Name: production
Namespace: metallb-system
Labels: <none>
Annotations: <none>
API Version: metallb.io/v1beta1
Kind: IPAddressPool
Metadata:
Creation Timestamp: 2025-04-26T15:40:23Z
Generation: 1
Resource Version: 3974
UID: 3a7c5d52-ab54-4cb8-b339-2a81930bf199
Spec:
Addresses:
192.168.0.171-192.168.0.180
Auto Assign: true
Avoid Buggy I Ps: false
Events: <none>
Kubernetes Dashboard
Install the Helm repository for the Kubernetes dashboard (this requires having previously installed Helm):
$ helm repo add \
kubernetes-dashboard \
https://kubernetes.github.io/dashboard/
"kubernetes-dashboard" has been added to your repositories
And install the Kubernetes dashboard without any customization:
$ helm upgrade \
--install kubernetes-dashboard \
kubernetes-dashboard/kubernetes-dashboard \
--create-namespace \
--namespace kubernetes-dashboard
Release "kubernetes-dashboard" does not exist. Installing it now.
NAME: kubernetes-dashboard
LAST DEPLOYED: Sat Apr 26 17:56:44 2025
NAMESPACE: kubernetes-dashboard
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
*************************************************************************************************
*** PLEASE BE PATIENT: Kubernetes Dashboard may need a few minutes to get up and become ready ***
*************************************************************************************************
Congratulations! You have just installed Kubernetes Dashboard in your cluster.
To access Dashboard run:
kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443
NOTE: In case port-forward command does not work, make sure that kong service name is correct.
Check the services in Kubernetes Dashboard namespace using:
kubectl -n kubernetes-dashboard get svc
Dashboard will be available at:
https://localhost:8443
After a minute or two, all services are running:
$ kubectl get all -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
pod/kubernetes-dashboard-api-64c997cbcc-cxbjt 1/1 Running 0 30s
pod/kubernetes-dashboard-auth-5cf6848ffd-5vcm7 1/1 Running 0 30s
pod/kubernetes-dashboard-kong-79867c9c48-dwncj 1/1 Running 0 30s
pod/kubernetes-dashboard-metrics-scraper-76df4956c4-bx6wj 1/1 Running 0 30s
pod/kubernetes-dashboard-web-56df7655d9-jc8ss 1/1 Running 0 30s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes-dashboard-api ClusterIP 10.96.67.106 <none> 8000/TCP 30s
service/kubernetes-dashboard-auth ClusterIP 10.110.81.112 <none> 8000/TCP 30s
service/kubernetes-dashboard-kong-proxy ClusterIP 10.97.89.215 <none> 443/TCP 30s
service/kubernetes-dashboard-metrics-scraper ClusterIP 10.111.10.215 <none> 8000/TCP 30s
service/kubernetes-dashboard-web ClusterIP 10.99.213.93 <none> 8000/TCP 30s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kubernetes-dashboard-api 1/1 1 1 30s
deployment.apps/kubernetes-dashboard-auth 1/1 1 1 30s
deployment.apps/kubernetes-dashboard-kong 1/1 1 1 30s
deployment.apps/kubernetes-dashboard-metrics-scraper 1/1 1 1 30s
deployment.apps/kubernetes-dashboard-web 1/1 1 1 30s
NAME DESIRED CURRENT READY AGE
replicaset.apps/kubernetes-dashboard-api-64c997cbcc 1 1 1 30s
replicaset.apps/kubernetes-dashboard-auth-5cf6848ffd 1 1 1 30s
replicaset.apps/kubernetes-dashboard-kong-79867c9c48 1 1 1 30s
replicaset.apps/kubernetes-dashboard-metrics-scraper-76df4956c4 1 1 1 30s
replicaset.apps/kubernetes-dashboard-web-56df7655d9 1 1 1 30s
The dashboard is now behind the kubernetes-dashboard-kong-proxy service, and
the suggested kubectl port-forward command can be used to map port 8443 to it:
$ kubectl -n kubernetes-dashboard port-forward \
svc/kubernetes-dashboard-kong-proxy 8443:443 \
--address 0.0.0.0
Forwarding from 0.0.0.0:8443 -> 8443
The dashboard is then available at https://octavo:8443/:
Note
Documentation pages omit the --address 0.0.0.0 flag, but without it the
dashboard is either unreachable or non-functional, see
Troubleshooting Dashboard
for details of how this issue was encountered before.
Accessing the Dashboard UI requires creating a sample user; the setup in the tutorial creates an example admin user with all privileges, good enough for now:
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin-user
namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
namespace: kubernetes-dashboard
name: admin-user
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: admin-user
namespace: kubernetes-dashboard
---
apiVersion: v1
kind: Secret
metadata:
name: admin-user
namespace: kubernetes-dashboard
annotations:
kubernetes.io/service-account.name: "admin-user"
type: kubernetes.io/service-account-token
$ kubectl apply -f dashboard/admin-sa-rbac.yaml
serviceaccount/admin-user created
clusterrolebinding.rbac.authorization.k8s.io/admin-user created
secret/admin-user created
To log in to the dashboard as the admin user, a new token has to be generated
each time, for example with the kubectl create token subcommand:
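$ kubectl -n kubernetes-dashboard create token admin-user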
The next step is to make the dashboard available at a stable URL, without running
the kubectl port-forward command.
Ingress Controller
An Nginx Ingress Controller will be used to route HTTPS requests to different services depending on the Host header, while all those requests hit
the same IP address. The current Nginx
Installation Guide
suggests several methods to install Nginx, of which the first is Helm:
$ helm repo add \
ingress-nginx \
https://kubernetes.github.io/ingress-nginx
"ingress-nginx" has been added to your repositories
To enable the use of snippet annotations, used here to hide server headers, override
allow-snippet-annotations,
which is set to false by default to mitigate the known vulnerability
CVE-2021-25742.
The nginx.ingress.kubernetes.io/configuration-snippet annotation is rated Critical, so
using it also requires raising the annotations-risk-level. To tweak both of these
in the Helm chart, use the following nginx-values.yaml:
controller:
allowSnippetAnnotations: true
config:
annotations-risk-level: "Critical"
$ helm upgrade \
--install ingress-nginx \
ingress-nginx/ingress-nginx \
--create-namespace \
--namespace ingress-nginx \
--values nginx-values.yaml
Release "ingress-nginx" does not exist. Installing it now.
NAME: ingress-nginx
LAST DEPLOYED: Sat Apr 26 19:31:25 2025
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the load balancer IP to be available.
You can watch the status by running 'kubectl get service --namespace ingress-nginx ingress-nginx-controller --output wide --watch'
An example Ingress that makes use of the controller:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: example
namespace: foo
spec:
ingressClassName: nginx
rules:
- host: www.example.com
http:
paths:
- pathType: Prefix
backend:
service:
name: exampleService
port:
number: 80
path: /
# This section is only required if TLS is to be enabled for the Ingress
tls:
- hosts:
- www.example.com
secretName: example-tls
If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:
apiVersion: v1
kind: Secret
metadata:
name: example-tls
namespace: foo
data:
tls.crt: <base64 encoded cert>
tls.key: <base64 encoded key>
type: kubernetes.io/tls
After just half a minute the service is available on the LoadBalancer IP address:
$ kubectl get all -n ingress-nginx
NAME READY STATUS RESTARTS AGE
pod/ingress-nginx-controller-b49d9c7b9-w26hb 1/1 Running 0 25s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ingress-nginx-controller LoadBalancer 10.99.252.250 192.168.0.171 80:30278/TCP,443:30974/TCP 25s
service/ingress-nginx-controller-admission ClusterIP 10.96.96.221 <none> 443/TCP 25s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ingress-nginx-controller 1/1 1 1 25s
NAME DESIRED CURRENT READY AGE
replicaset.apps/ingress-nginx-controller-b49d9c7b9 1 1 1 25
The first virtual IP address is assigned to the ingress-nginx-controller service and
Nginx is happily returning 404 Not Found, reachable from other hosts:
$ curl -k https://192.168.0.171/
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
Kubernetes Dashboard Ingress
With both Nginx and the Kubernetes dashboard up and running, it is now possible to make
the dashboard more conveniently accessible via Nginx. This Ingress is a slightly more
complete one based on the example above:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: kubernetes-dashboard-ingress
namespace: kubernetes-dashboard
annotations:
nginx.ingress.kubernetes.io/backend-protocol: HTTPS
nginx.ingress.kubernetes.io/auth-tls-verify-client: "false"
nginx.ingress.kubernetes.io/whitelist-source-range: 10.244.0.0/16
nginx.ingress.kubernetes.io/configuration-snippet: |
more_set_headers "server: hide";
spec:
ingressClassName: nginx
rules:
- host: k8s.octavo
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kubernetes-dashboard-kong-proxy
port:
number: 443
This points to the kubernetes-dashboard-kong-proxy service which is the one listening
on the standard HTTPS port 443, and the one previously targeted by the
kubectl port-forward command above. After applying this Ingress, adding the
Host: k8s.octavo header makes the request correctly reach the dashboard:
$ kubectl apply -f dashboard/octavo-ingress.yaml
ingress.networking.k8s.io/kubernetes-dashboard-ingress created
$ curl 2>/dev/null \
-H "Host: k8s.octavo" \
-k https://192.168.0.171/ \
| head -2
<!--
Copyright 2017 The Kubernetes Authors.
Cloudflare Ingress
Having a Cloudflare Tunnel already set up with
https://kubernetes-octavo.very-very-dark-gray.top/ (pointing to https://localhost),
updating it to point to https://192.168.0.171/ makes requests reach Nginx, and
updating the host value above to kubernetes-octavo.very-very-dark-gray.top
makes the dashboard available at that address.
Tailscale Ingress
To make the dashboard available over Tailscale, start by installing the Tailscale Kubernetes operator:
$ helm repo add tailscale https://pkgs.tailscale.com/helmcharts && \
helm repo update
"tailscale" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "kubernetes-dashboard" chart repository
...Successfully got an update from the "ingress-nginx" chart repository
...Successfully got an update from the "tailscale" chart repository
Update Complete. ⎈Happy Helming!⎈
$ kubectl create namespace tailscale
namespace/tailscale created
$ kubectl label namespace tailscale pod-security.kubernetes.io/enforce=privileged
namespace/tailscale labeled
$ helm upgrade \
--install \
tailscale-operator \
tailscale/tailscale-operator \
--namespace=tailscale \
--create-namespace \
--set-string oauth.clientId="_________________" \
--set-string oauth.clientSecret="tskey-client-_________________-__________________________________" \
--wait
Release "tailscale-operator" does not exist. Installing it now.
NAME: tailscale-operator
LAST DEPLOYED: Sat Apr 26 20:15:26 2025
NAMESPACE: tailscale
STATUS: deployed
REVISION: 1
TEST SUITE: None
$ sudo tailscale cert octavo.royal-penny.ts.net
Wrote public cert to octavo.royal-penny.ts.net.crt
Wrote private key to octavo.royal-penny.ts.net.key
Using the Tailscale Ingress Controller, it is now possible to make the dashboard available at https://kubernetes-octavo.royal-penny.ts.net by adding a new Ingress
(with its own unique metadata.name) in octavo-ingress.yaml:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: kubernetes-dashboard-ingress
namespace: kubernetes-dashboard
annotations:
nginx.ingress.kubernetes.io/backend-protocol: HTTPS
nginx.ingress.kubernetes.io/auth-tls-verify-client: "false"
nginx.ingress.kubernetes.io/whitelist-source-range: 10.244.0.0/16
nginx.ingress.kubernetes.io/configuration-snippet: |
more_set_headers "server: hide";
spec:
ingressClassName: nginx
rules:
- host: kubernetes-octavo.very-very-dark-gray.top
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kubernetes-dashboard-kong-proxy
port:
number: 443
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: kubernetes-dashboard-ingress-tailscale
namespace: kubernetes-dashboard
spec:
defaultBackend:
service:
name: kubernetes-dashboard-kong-proxy
port:
number: 443
ingressClassName: tailscale
tls:
- hosts:
- kubernetes-octavo
$ kubectl apply -f dashboard/octavo-ingress.yaml
ingress.networking.k8s.io/kubernetes-dashboard-ingress unchanged
ingress.networking.k8s.io/kubernetes-dashboard-ingress-tailscale created
$ kubectl get ingress -A
NAMESPACE NAME CLASS HOSTS ADDRESS PORTS AGE
kubernetes-dashboard kubernetes-dashboard-ingress nginx kubernetes-octavo.very-very-dark-gray.top 192.168.0.171 80 39m
kubernetes-dashboard kubernetes-dashboard-ingress-tailscale tailscale * kubernetes-octavo.royal-penny.ts.net 80, 443 31s
After some time the dashboard is also available at https://kubernetes-octavo.royal-penny.ts.net/.
HTTPS certificates
The Kubernetes dashboard uses a self-signed certificate, and so does Nginx by default, which works insofar as traffic is encrypted, but it provides no guarantee that the traffic is coming from the actual servers, and it is very annoying when browsers complain every time the service is accessed. It is now time to get properly signed HTTPS certificates. This is the way.
Install cert-manager
cert-manager is the native Kubernetes certificate management controller of choice to issue certificates from Let's Encrypt to secure NGINX-ingress.
Having already installed Helm (3.17), deployed the NGINX
Ingress Controller,
assigned octavo.uu.am to the router's external IP address, and deployed the
Kubernetes dashboard
service, the system is ready for
Step 5 - Deploy cert-manager,
starting with the
installation with Helm:
$ helm repo add jetstack https://charts.jetstack.io --force-update
"jetstack" has been added to your repositories
$ helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.17.2 \
--set crds.enabled=true
NAME: cert-manager
LAST DEPLOYED: Sat Apr 26 21:05:57 2025
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.17.2 has been deployed successfully!
In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).
More information on the different types of issuers and how to configure them
can be found in our documentation:
https://cert-manager.io/docs/configuration/
For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:
https://cert-manager.io/docs/usage/ingress/
The helm install command takes several seconds to come back with the above output,
at which point the pods and services are all up and running:
$ kubectl get all -n cert-manager
NAME READY STATUS RESTARTS AGE
pod/cert-manager-7d67448f59-c2fgn 1/1 Running 0 102s
pod/cert-manager-cainjector-666b8b6b66-fl6rp 1/1 Running 0 102s
pod/cert-manager-webhook-78cb4cf989-wb4wz 1/1 Running 0 102s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cert-manager ClusterIP 10.103.213.89 <none> 9402/TCP 102s
service/cert-manager-cainjector ClusterIP 10.108.222.22 <none> 9402/TCP 102s
service/cert-manager-webhook ClusterIP 10.108.56.97 <none> 443/TCP,9402/TCP 102s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cert-manager 1/1 1 1 102s
deployment.apps/cert-manager-cainjector 1/1 1 1 102s
deployment.apps/cert-manager-webhook 1/1 1 1 102s
NAME DESIRED CURRENT READY AGE
replicaset.apps/cert-manager-7d67448f59 1 1 1 102s
replicaset.apps/cert-manager-cainjector-666b8b6b66 1 1 1 102s
replicaset.apps/cert-manager-webhook-78cb4cf989 1 1 1 102s
Test cert-manager
Verify the installation by creating a simple self-signed certificate:
Test certificate: test-resources.yaml
apiVersion: v1
kind: Namespace
metadata:
name: cert-manager-test
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
name: test-selfsigned
namespace: cert-manager-test
spec:
selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: selfsigned-cert
namespace: cert-manager-test
spec:
dnsNames:
- example.com
secretName: selfsigned-cert-tls
issuerRef:
name: test-selfsigned
Deploying this succeeds within seconds, after which it can be cleaned up:
$ kubectl apply -f test-resources.yaml
namespace/cert-manager-test created
issuer.cert-manager.io/test-selfsigned created
certificate.cert-manager.io/selfsigned-cert created
$ kubectl describe certificate -n cert-manager-test
Name: selfsigned-cert
Namespace: cert-manager-test
Labels: <none>
Annotations: <none>
API Version: cert-manager.io/v1
Kind: Certificate
Metadata:
Creation Timestamp: 2025-04-26T19:11:05Z
Generation: 1
Resource Version: 21857
UID: e16b3f4e-494b-4866-b328-bb92759fd482
Spec:
Dns Names:
example.com
Issuer Ref:
Name: test-selfsigned
Secret Name: selfsigned-cert-tls
Status:
Conditions:
Last Transition Time: 2025-04-26T19:11:05Z
Message: Certificate is up to date and has not expired
Observed Generation: 1
Reason: Ready
Status: True
Type: Ready
Not After: 2025-07-25T19:11:05Z
Not Before: 2025-04-26T19:11:05Z
Renewal Time: 2025-06-25T19:11:05Z
Revision: 1
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Issuing 8s cert-manager-certificates-trigger Issuing certificate as Secret does not exist
Normal Generated 8s cert-manager-certificates-key-manager Stored new private key in temporary Secret resource "selfsigned-cert-fzsgh"
Normal Requested 8s cert-manager-certificates-request-manager Created new CertificateRequest resource "selfsigned-cert-1"
Normal Issuing 8s cert-manager-certificates-issuing The certificate has been successfully issued
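With the certificate reported as Ready, the test resources can be removed again with the matching delete:
$ kubectl delete -f test-resources.yaml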
Configure Let's Encrypt
Step 6 - Configure a Let's Encrypt Issuer
shows how to create an Issuer, but in this system with multiple services running in
different namespaces, a ClusterIssuer is needed instead:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: [email protected]
privateKeySecretRef:
name: letsencrypt-prod
solvers:
- http01:
ingress:
class: nginx
$ kubectl create -f cert-manager-issuer.yaml
clusterissuer.cert-manager.io/letsencrypt-prod created
$ kubectl describe clusterissuer.cert-manager.io/letsencrypt-prod
Name: letsencrypt-prod
Namespace:
Labels: <none>
Annotations: <none>
API Version: cert-manager.io/v1
Kind: ClusterIssuer
Metadata:
Creation Timestamp: 2025-04-26T19:18:14Z
Generation: 1
Resource Version: 22532
UID: c7063ffb-d530-4f72-8b0a-7f323b39c593
Spec:
Acme:
Email: [email protected]
Private Key Secret Ref:
Name: letsencrypt-prod
Server: https://acme-v02.api.letsencrypt.org/directory
Solvers:
http01:
Ingress:
Class: nginx
Status:
Acme:
Last Private Key Hash: Qb3v8RLD0ixqgBLKVsI/tEgX16kauzYTQXAaC3pdqCE=
Last Registered Email: [email protected]
Uri: https://acme-v02.api.letsencrypt.org/acme/acct/2364237637
Conditions:
Last Transition Time: 2025-04-26T19:18:15Z
Message: The ACME account was registered with the ACME server
Observed Generation: 1
Reason: ACMEAccountRegistered
Status: True
Type: Ready
Events: <none>
End-to-end test Let's Encrypt
Step 7 - Deploy a TLS Ingress Resource
is the last step left to actually make practical use of all the above. Based on the
provided example, apply the equivalent changes to dashboard/octavo-ingress.yaml,
using cert-manager.io/cluster-issuer instead of cert-manager.io/issuer and adding
the tls section with just the relevant FQDN under hosts:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: kubernetes-dashboard-ingress
namespace: kubernetes-dashboard
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
nginx.ingress.kubernetes.io/backend-protocol: HTTPS
nginx.ingress.kubernetes.io/auth-tls-verify-client: "false"
nginx.ingress.kubernetes.io/whitelist-source-range: 10.244.0.0/16
nginx.ingress.kubernetes.io/configuration-snippet: |
more_set_headers "server: hide";
spec:
ingressClassName: nginx
rules:
- host: kubernetes-octavo.very-very-dark-gray.top
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kubernetes-dashboard-kong-proxy
port:
number: 443
tls:
- secretName: tls-secret-cloudflare
hosts:
- kubernetes-octavo.very-very-dark-gray.top
Note
The Tailscale Ingress is omitted here because it needs no change; this cert-manager
setup cannot obtain and renew Let's Encrypt certificates for Tailscale hostnames.
Applying the changes to this Ingress triggers the request for a certificate signed by
Let's Encrypt; there will be a pending order and challenge for a new certificate:
$ kubectl apply -f dashboard/octavo-ingress.yaml
ingress.networking.k8s.io/kubernetes-dashboard-ingress configured
ingress.networking.k8s.io/kubernetes-dashboard-ingress-tailscale unchanged
$ kubectl get svc -A | grep acme
kubernetes-dashboard cm-acme-http-solver-pg4lj NodePort 10.105.119.248 <none> 8089:31654/TCP 40s
The pod behind the service is listening and the logs can be monitored in real time:
$ klogs kubernetes-dashboard acme
I0426 19:51:06.525439 1 solver.go:52] "starting listener" logger="cert-manager.acmesolver" expected_domain="kubernetes-octavo.very-very-dark-gray.top" expected_token="A8NepZWnfMUnjHcB01FaBct-OtTlyevnybrzEu2d2lo" expected_key="A8NepZWnfMUnjHcB01FaBct-OtTlyevnybrzEu2d2lo.iqPqqTpFo6Xc2HKxELaaa6msFZd96MSHPdgrxtrPdwM" listen_port=8089
Now, instead of routing requests to the different NodePort assigned each time,
it is more convenient to always route requests to a fixed port 32080,
and then patch the ACME resolver service to
change the service's NodePort.
The patch command is directed at the specific ACME resolver service in each namespace:
$ kubectl -n kubernetes-dashboard patch \
service cm-acme-http-solver-pg4lj \
-p '{"spec":{"ports": [{"port": 8089, "nodePort": 32080}]}}'
service/cm-acme-http-solver-pg4lj patched
$ kubectl get svc -A | grep acme
kubernetes-dashboard cm-acme-http-solver-pg4lj NodePort 10.105.119.248 <none> 8089:32080/TCP 2m47s
Once the service is patched, external connections can be selectively routed to this
port by adding a public hostname to route only requests for paths under
/.well-known to port 32080 over plain HTTP. Combined with the patching of
the ACME solver above, this makes solvers for HTTP01 challenges reachable, so
certificates can be issued and renewed.
Very soon after making the ACME solver reachable, the challenge is resolved and a valid certificate is obtained and installed.
At this point it is no longer necessary to have No TLS Verify enabled in the Cloudflare
Tunnel settings, because Nginx is now using a certificate signed by Let's Encrypt. To complete the
end-to-end test, disable No TLS Verify and set Origin Server Name
to the FQDN (kubernetes-octavo.very-very-dark-gray.top) so that Cloudflare accepts
the new certificate.
Although this is not really necessary when accessing services through a
Cloudflare Tunnel, it does serve as a good end-to-end test
and secures the communication between the Cloudflare connector and Nginx. Later,
the same mechanism will make it possible to obtain and renew Let's Encrypt certificates
for services exposed externally by means of forwarding the router's external port 443
to the LoadBalancer IP address of Nginx on this server, so that services can be
accessed directly, securely and without bandwidth constraints through a different
domain, e.g. https://k8s.octavo.uu.am.
Automatic renewal
The above patch operation can be automated to update each ACME resolver service
when it starts:
#!/bin/bash
#
# Patch the nodePort of running cert-manager renewal challenge, to listen
# on port 32080 which is the one the router is forwarding port 80 to.
# Check if there is a LetsEncrypt challenge resolver (acme) running.
export KUBECONFIG=/etc/kubernetes/admin.conf
namespace=$(kubectl get svc -A | grep acme | awk '{print $1}' | head -1)
service=$(kubectl get svc -A | grep acme | awk '{print $2}' | head -1)
# Patch the service to listen on port 32080 (set up in router).
if [ -n "${namespace}" ] && [ -n "${service}" ]; then
kubectl \
-n "${namespace}" patch service "${service}" \
-p '{"spec":{"ports": [{"port": 8089, "nodePort":32080}]}}'
fi
Install this script in a convenient location and set up a crontab entry to run it every minute:
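For example, with the script saved as /usr/local/bin/patch-acme-nodeport (the name and location here are only placeholders):
# crontab -e
* * * * * /usr/local/bin/patch-acme-nodeport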
Test automatic renewal
Expose the Kubernetes dashboard at https://k8s.octavo.uu.am to test the automatic renewal of Let's Encrypt certificates end-to-end. More generally, this is the process to expose services directly without tunnels for high-bandwidth applications:
- Add a DNS record to point k8s.octavo.uu.am to the router's external IP address.
- Add a port forwarding rule to redirect port 80 to port 32080 on the server.
- Add a port forwarding rule to redirect port 443 to the IP address of Nginx.
- In the kubernetes-dashboard-ingress (not a new one),
  - duplicate the host and tls objects,
  - replace the .top FQDN with k8s.octavo.uu.am and
  - rename the new tls.secretName to avoid hitting Let's Encrypt rate limits.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: kubernetes-dashboard-ingress
namespace: kubernetes-dashboard
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
nginx.ingress.kubernetes.io/backend-protocol: HTTPS
nginx.ingress.kubernetes.io/auth-tls-verify-client: "false"
nginx.ingress.kubernetes.io/whitelist-source-range: 10.244.0.0/16
nginx.ingress.kubernetes.io/configuration-snippet: |
more_set_headers "server: hide";
spec:
ingressClassName: nginx
rules:
- host: kubernetes-octavo.very-very-dark-gray.top
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kubernetes-dashboard-kong-proxy
port:
number: 443
- host: k8s.octavo.uu.am
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kubernetes-dashboard-kong-proxy
port:
number: 443
tls:
- secretName: tls-secret-cloudflare
hosts:
- kubernetes-octavo.very-very-dark-gray.top
- secretName: tls-secret-uu-am
hosts:
- k8s.octavo.uu.am
Once the DNS record has propagated and port 443 is redirected, the dashboard
is reachable only by adding the Host header and ignoring SSL verification (-k):
$ curl 2>/dev/null -k \
-H "Host: kubernetes-octavo.very-very-dark-gray.top"\
https://k8s.octavo.uu.am/ \
| head -2
<!--
Copyright 2017 The Kubernetes Authors.
$ curl \
-H "Host: kubernetes-octavo.very-very-dark-gray.top"\
https://k8s.octavo.uu.am/
curl: (60) SSL certificate problem: self-signed certificate
More details here: https://curl.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
Applying the changes to the Ingress triggers the request for a new certificate,
and the automatic patching of the service gets the process completed within a couple
of minutes (provided port 80 is correctly redirected):
$ kubectl apply -f dashboard/octavo-ingress.yaml
ingress.networking.k8s.io/kubernetes-dashboard-ingress configured
ingress.networking.k8s.io/kubernetes-dashboard-ingress-tailscale unchanged
$ kubectl get svc -A | grep acme
kubernetes-dashboard cm-acme-http-solver-wlk2b NodePort 10.102.244.182 <none> 8089:32562/TCP 0s
$ klogs kubernetes-dashboard acme
I0427 06:42:50.918410 1 solver.go:52] "starting listener" logger="cert-manager.acmesolver" expected_domain="k8s.octavo.uu.am" expected_token="NtYo8LxQxMIGK78bsXvv65RwI4skIolgdtSrWNuLeRs" expected_key="NtYo8LxQxMIGK78bsXvv65RwI4skIolgdtSrWNuLeRs.iqPqqTpFo6Xc2HKxELaaa6msFZd96MSHPdgrxtrPdwM" listen_port=8089
I0427 06:43:09.789648 1 solver.go:89] "validating request" logger="cert-manager.acmesolver" host="k8s.octavo.uu.am" path="/.well-known/acme-challenge/NtYo8LxQxMIGK78bsXvv65RwI4skIolgdtSrWNuLeRs" base_path="/.well-known/acme-challenge" token="NtYo8LxQxMIGK78bsXvv65RwI4skIolgdtSrWNuLeRs" headers={"Accept-Encoding":["gzip"],"Connection":["close"],"User-Agent":["cert-manager-challenges/v1.17.2 (linux/amd64) cert-manager/f3ffb86641f75d94d01e5a2606b9871ff89645ef"]}
...
I0427 06:43:18.934689 1 solver.go:112] "got successful challenge request, writing key" logger="cert-manager.acmesolver" host="k8s.octavo.uu.am" path="/.well-known/acme-challenge/NtYo8LxQxMIGK78bsXvv65RwI4skIolgdtSrWNuLeRs" base_path="/.well-known/acme-challenge" token="NtYo8LxQxMIGK78bsXvv65RwI4skIolgdtSrWNuLeRs" headers={"Accept":["*/*"],"Accept-Encoding":["gzip"],"Connection":["close"],"User-Agent":["Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"]}
E0427 06:43:20.531057 1 main.go:42] "error executing command" err="http: Server closed" logger="cert-manager"
With the new certificate installed, the dashboard is now reachable at
https://k8s.octavo.uu.am/ without ignoring SSL verification (no -k needed):
$ curl 2>/dev/null \
https://k8s.octavo.uu.am/ \
| head -2
<!--
Copyright 2017 The Kubernetes Authors.
Migration from Lexicon
With the new cluster up and running, the next step is to migrate (most of) the
applications installed
previously in lexicon over to octavo, while preserving their storage and status.
Audiobookshelf
Audiobookshelf has been easily my
most used application for over a year, and it was already migrated to rapture once as a
test,
so the process here is essentially the same. The deployment differs only in the
securityContext UID/GID and the FQDN where the service will be available:
Kubernetes deployment: audiobookshelf.yaml
First, stop Audiobookshelf in lexicon (it won't be used moving forward):
$ kubectl scale -n audiobookshelf deployment audiobookshelf --replicas=0
deployment.apps/audiobookshelf scaled
Then copy the /config and /metadata directories over from lexicon to octavo:
root@octavo ~ # groupadd audiobookshelf -g 117
root@octavo ~ # useradd audiobookshelf -u 117 -g 117 -s /usr/sbin/nologin
root@octavo ~ # rsync -ua lexicon:/home/k8s/audiobookshelf /home/k8s/
root@octavo ~ # chown -R audiobookshelf:audiobookshelf /home/k8s/audiobookshelf
root@octavo ~ # ls -hal /home/k8s/audiobookshelf
drwxr-xr-x 1 audiobookshelf audiobookshelf 28 Feb 27 2024 .
drwxr-xr-x 1 root root 104 Apr 28 22:22 ..
drwxr-xr-x 1 audiobookshelf audiobookshelf 102 Apr 28 21:27 config
drwxr-xr-x 1 audiobookshelf audiobookshelf 230 Jan 14 20:54 metadata
Finally, start the deployment in octavo:
$ kubectl apply -f audiobookshelf.yaml
namespace/audiobookshelf created
persistentvolume/audiobookshelf-pv-config created
persistentvolume/audiobookshelf-pv-metadata created
persistentvolume/audiobookshelf-pv-audiobooks created
persistentvolume/audiobookshelf-pv-podcasts created
persistentvolumeclaim/audiobookshelf-pvc-config created
persistentvolumeclaim/audiobookshelf-pvc-metadata created
persistentvolumeclaim/audiobookshelf-pvc-audiobooks created
persistentvolumeclaim/audiobookshelf-pvc-podcasts created
deployment.apps/audiobookshelf created
service/audiobookshelf-svc created
ingress.networking.k8s.io/audiobookshelf-ingress created
$ kubectl get all -n audiobookshelf
NAME READY STATUS RESTARTS AGE
pod/audiobookshelf-5b486f64b4-8zd2g 1/1 Running 0 20s
pod/cm-acme-http-solver-wd8rk 1/1 Running 0 16s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/audiobookshelf-svc NodePort 10.108.231.47 <none> 13388:31378/TCP 20s
service/cm-acme-http-solver-vblgl NodePort 10.105.105.53 <none> 8089:31546/TCP 16s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/audiobookshelf 1/1 1 1 20s
NAME DESIRED CURRENT READY AGE
replicaset.apps/audiobookshelf-5b486f64b4 1 1 1 20s
$ kubectl get svc -n audiobookshelf
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
audiobookshelf-svc NodePort 10.108.231.47 <none> 13388:31378/TCP 13s
cm-acme-http-solver-vblgl NodePort 10.105.105.53 <none> 8089:31546/TCP 9s
$ kubectl get ingress -n audiobookshelf
NAME CLASS HOSTS ADDRESS PORTS AGE
audiobookshelf-ingress nginx audiobookshelf.very-very-dark-gray.top 192.168.0.171 80, 443 60s
After a couple of minutes Audiobookshelf is available at https://audiobookshelf.very-very-dark-gray.top/ and works fine.
Home Assistant
Home Assistant
can be deployed in octavo in very much the same way it was on lexicon, using
the same base manifests
and a new kustomization.yaml file just to set different hostnames:
Prepare remote access
first, then create /home/k8s/home-assistant/ anew and apply the deployment.
$ sudo mkdir /home/k8s/home-assistant
$ ls -lah /home/k8s/home-assistant
total 0
drwxr-xr-x 1 root root 0 Apr 27 14:03 .
drwxr-xr-x 1 root root 28 Apr 27 14:03 ..
$ kubectl apply -k octavo
namespace/home-assistant created
configmap/home-assistant-config-59kccc4bcd created
configmap/home-assistant-configmap created
service/home-assistant-svc created
persistentvolume/home-assistant-pv-config created
persistentvolumeclaim/home-assistant-config-root created
deployment.apps/home-assistant created
ingress.networking.k8s.io/home-assistant-nginx created
ingress.networking.k8s.io/home-assistant-tailscale created
$ kubectl get all -n home-assistant
NAME READY STATUS RESTARTS AGE
pod/cm-acme-http-solver-ssftt 1/1 Running 0 54s
pod/home-assistant-77bf44c47b-tqz2s 1/1 Running 0 59s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cm-acme-http-solver-f7jbv NodePort 10.110.189.111 <none> 8089:31135/TCP 54s
service/home-assistant-svc ClusterIP 10.107.114.233 <none> 8123/TCP 59s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/home-assistant 1/1 1 1 59s
NAME DESIRED CURRENT READY AGE
replicaset.apps/home-assistant-77bf44c47b 1 1 1 59s
$ sudo ls -lah /home/k8s/home-assistant
total 916K
drwxr-xr-x 1 root root 372 Apr 27 14:14 .
drwxr-xr-x 1 root root 28 Apr 27 14:03 ..
drwxr-xr-x 1 root root 32 Apr 27 14:14 blueprints
drwxr-xr-x 1 root root 0 Apr 27 14:14 .cloud
-rw-r--r-- 1 root root 0 Apr 27 14:14 configuration.yaml
-rw-r--r-- 1 root root 8 Apr 27 14:14 .HA_VERSION
-rw-r--r-- 1 root root 0 Apr 27 14:14 home-assistant.log
-rw-r--r-- 1 root root 0 Apr 27 14:14 home-assistant.log.1
-rw-r--r-- 1 root root 0 Apr 27 14:14 home-assistant.log.fault
-rw-r--r-- 1 root root 4.0K Apr 27 14:14 home-assistant_v2.db
-rw-r--r-- 1 root root 32K Apr 27 14:30 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 866K Apr 27 14:30 home-assistant_v2.db-wal
drwxr-xr-x 1 root root 428 Apr 27 14:29 .storage
drwxr-xr-x 1 root root 0 Apr 27 14:14 tts
After less than a minute, the ACME solver is patched to listen on port 32080,
and after just about another minute the solver is gone. At that point Home Assistant
is available at both https://home-assistant-octavo.royal-penny.ts.net/ and
https://home-assistant-octavo.very-very-dark-gray.top/ ready to start the
onboarding process.
Bluetooth setup
Bluetooth failed setup
in lexicon and the same is bound to repeat in octavo now; even though the base
deployment manifests already mount /run/dbus it is also required to
switch from dbus-daemon to dbus-broker and install BlueZ:
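A minimal sketch of that host-level change, assuming the stock Ubuntu packages (dbus-broker and bluez) are sufficient:
$ sudo apt install dbus-broker bluez
$ sudo systemctl enable dbus-broker.service
$ sudo reboot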
Restore backup
At this point a fresh backup from lexicon could be restored in octavo to
initialize Home Assistant, including the tweaked configuration.yaml file,
Home Assistant Community Store
and all the cards and dashboards.
However, restoring a backup fails consistently with an OS-level error that
doesn't seem to make sense; the logs for the pod show that the restore process was unable
to remove the main configuration.yaml file, even though there was definitely
nothing accessing it at the time:
$ klogs home-assistant home-assistant
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun home-assistant (no readiness notification)
s6-rc: info: service legacy-services successfully started
[12:58:06] INFO: Home Assistant Core finish process exit code 100
INFO:homeassistant.backup_restore:Restoring /config/backups/lexicon-last_2025-04-27_14.40_32799259.tar
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/usr/src/homeassistant/homeassistant/__main__.py", line 227, in <module>
sys.exit(main())
~~~~^^
File "/usr/src/homeassistant/homeassistant/__main__.py", line 186, in main
if restore_backup(config_dir):
~~~~~~~~~~~~~~^^^^^^^^^^^^
File "/usr/src/homeassistant/homeassistant/backup_restore.py", line 197, in restore_backup
_extract_backup(
~~~~~~~~~~~~~~~^
config_dir=config_dir,
^^^^^^^^^^^^^^^^^^^^^^
restore_content=restore_content,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/src/homeassistant/homeassistant/backup_restore.py", line 143, in _extract_backup
_clear_configuration_directory(config_dir, keep)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
File "/usr/src/homeassistant/homeassistant/backup_restore.py", line 87, in _clear_configuration_directory
entrypath.unlink()
~~~~~~~~~~~~~~~~^^
File "/usr/local/lib/python3.13/pathlib/_local.py", line 746, in unlink
os.unlink(self)
~~~~~~~~~^^^^^^
OSError: [Errno 16] Resource busy: '/config/configuration.yaml'
[12:58:08] INFO: Home Assistant Core finish process exit code 1
[12:58:08] INFO: Home Assistant Core service shutdown
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
The error persists after Home Assistant (pod) is restarted:
$ klogs home-assistant home-assistant
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun home-assistant (no readiness notification)
s6-rc: info: service legacy-services successfully started
2025-04-27 14:58:16.784 WARNING (MainThread) [homeassistant.components.backup]
Backup restore failed with OSError: [Errno 16] Resource busy: '/config/configuration.yaml'
In the meantime, the Home Assistant onboarding page keeps waiting even though nothing is happening in the backend. Opening the same page on a new browser tab and selecting the option to restore the backup shows the same error:
The backup could not be restored. Please try again.
Error:
[Errno 16] Resource busy: '/config/configuration.yaml'
Trying again leads to the same error time and again, even after stopping
the pods, deleting the entire /home/k8s/home-assistant directory and starting anew
with a fresh empty one.
Hardcore migration
The solution turned out to be to simply copy over the /home/k8s/home-assistant
directory at the right time between stopping and starting the relevant services:
- Stop Home Assistant on both servers by scaling the deployment to 0 replicas:
- Copy /home/k8s/home-assistant over from lexicon to octavo:
  root@octavo ~ # rm -rf /home/k8s/home-assistant
  root@octavo ~ # rsync -ua lexicon:/home/k8s/home-assistant /home/k8s/
  root@octavo ~ # ls -hal /home/k8s/home-assistant
  total 110M
  drwxr-xr-x 1 root root  570 Apr 27 23:28 .
  drwxr-xr-x 1 root root   76 Apr 28 21:14 ..
  -rw-r--r-- 1 root root    0 Apr 25 22:39 automations.yaml
  drwxr-xr-x 1 root root  484 Apr 28 05:37 backups
  drwxr-xr-x 1 root root   48 Apr 25 22:42 blueprints
  drwxr-xr-x 1 root root    0 Apr 21 11:13 .cloud
  -rw-r--r-- 1 root root  670 Apr 27 19:37 configuration.yaml
  drwxr-xr-x 1 root root   26 Apr 24 21:04 custom_components
  drwxr-xr-x 1 root root    0 Apr 21 17:58 deps
  -rw-r--r-- 1 root root    8 Apr 26 05:08 .HA_VERSION
  -rw-r--r-- 1 root root 2.2M Apr 28 22:20 home-assistant.log
  -rw-r--r-- 1 root root 5.8K Apr 27 23:28 home-assistant.log.1
  -rw-r--r-- 1 root root    0 Apr 27 23:28 home-assistant.log.fault
  -rw-r--r-- 1 root root 103M Apr 28 22:18 home-assistant_v2.db
  -rw-r--r-- 1 root root  32K Apr 28 22:20 home-assistant_v2.db-shm
  -rw-r--r-- 1 root root 4.4M Apr 28 22:20 home-assistant_v2.db-wal
  drwxr-xr-x 1 root root  448 Apr 23 21:07 image
  -rw-r--r-- 1 root root 4.9K Apr 22 22:59 install-hacs.sh
  -rw-r--r-- 1 root root    0 Apr 25 22:39 scenes.yaml
  -rw-r--r-- 1 root root    0 Apr 25 22:39 scripts.yaml
  drwxr-xr-x 1 root root 1.3K Apr 28 22:13 .storage
  drwxr-xr-x 1 root root   14 Apr 25 22:29 templates
  drwxr-xr-x 1 root root    0 Apr 21 11:13 tts
  drwxr-xr-x 1 root root   18 Apr 23 22:58 www
- Start Home Assistant on octavo only (scaling back to 1 replica):
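For reference, a minimal sketch of the scale commands used in these steps, assuming the namespace and deployment are both named home-assistant as in the kubectl get all output above:
$ kubectl scale -n home-assistant deployment home-assistant --replicas=0   # on both servers
$ kubectl scale -n home-assistant deployment home-assistant --replicas=1   # on octavo only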
Now Home Assistant is available at https://home-assistant-octavo.royal-penny.ts.net/ and https://home-assistant-octavo.very-very-dark-gray.top/ and working flawlessly, with the same user credentials, except for just one little quirk:
DC:21:48:43:B7:C2 is the Bluetooth adapter in lexicon, so it can be removed, and
D0:65:78:A5:8B:E1 is the Bluetooth adapter in octavo so that's the one to add.
InfluxDB
The InfluxDB integration
stopped reporting metrics to octavo even though the configuration remained valid;
the integration had gone missing and would no longer be found under
Settings > Devices & services. To restore reporting it was found necessary to
- Remove the influxdb section from Home Assistant's configuration.yaml.
- Restart Home Assistant.
- Add the influxdb section again to configuration.yaml.
- Restart Home Assistant again.
This persuaded Home Assistant to add the InfluxDB integration back, and after that the integration simply resumed sending all metrics again.
Synology DSM
Another little surprise from Home Assistant after moving to octavo is
Synology DSM discovering the
Synology DS423+ (luggage).
Create a new NAS user in the admin group, without access to any files but with
access to all services:
Then add this user's credentials to the Synology DSM integration and specify port 5001 to use HTTPS, check the option for Uses an SSL certificate but uncheck the one for Verify SSL certificate. After submitting, Home Assistant creates devices for the NAS, each volume and each drive. This adds 5 devices and 43 entities, to be used later to create a cool dashboard.
InfluxDB and Grafana
InfluxDB and Grafana
running in lexicon collect data from all sources reporting through
Continuous Monitoring and serve the dashboards.
This setup is about a year old, running Grafana 10.4.2 and InfluxDB 1.8.
It seems like time to update these:
- Upgrading InfluxDB from v1.8.10 to v1.11.7 is a large jump and there is no reason to switch to v2 given the future of Flux (deprecated and not planned for InfluxDB v3).
- Breaking changes in Grafana v11.0 suggest it should be fine to update from 10.4.2, or at least worth a try.
At this time the latest versions available in Docker images are InfluxDB 1.11.8 and Grafana 11.6.1.
Combined deployment for InfluxDB and Grafana
apiVersion: v1
kind: Namespace
metadata:
name: monitoring
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: influxdb-pv
labels:
type: local
namespace: monitoring
spec:
storageClassName: manual
capacity:
storage: 30Gi
accessModes:
- ReadWriteOnce
hostPath:
path: /home/k8s/influxdb
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: influxdb-pv-claim
namespace: monitoring
spec:
storageClassName: manual
volumeName: influxdb-pv
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 30Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: monitoring
labels:
app: influxdb
name: influxdb
spec:
replicas: 1
selector:
matchLabels:
app: influxdb
template:
metadata:
labels:
app: influxdb
spec:
hostname: influxdb
containers:
- image: docker.io/influxdb:1.11.8
env:
- name: "INFLUXDB_HTTP_AUTH_ENABLED"
value: "true"
name: influxdb
volumeMounts:
- mountPath: /var/lib/influxdb
name: influxdb-data
securityContext:
runAsUser: 114
runAsGroup: 114
volumes:
- name: influxdb-data
persistentVolumeClaim:
claimName: influxdb-pv-claim
---
apiVersion: v1
kind: Service
metadata:
labels:
app: influxdb
name: influxdb-svc
namespace: monitoring
spec:
ports:
- port: 18086
protocol: TCP
targetPort: 8086
nodePort: 30086
selector:
app: influxdb
type: NodePort
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: grafana-pv
labels:
type: local
namespace: monitoring
spec:
storageClassName: manual
capacity:
storage: 3Gi
accessModes:
- ReadWriteOnce
hostPath:
path: /home/k8s/grafana
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: grafana-pv-claim
namespace: monitoring
spec:
storageClassName: manual
volumeName: grafana-pv
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 3Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: monitoring
labels:
app: grafana
name: grafana
spec:
replicas: 1
selector:
matchLabels:
app: grafana
template:
metadata:
labels:
app: grafana
spec:
containers:
- image: docker.io/grafana/grafana:11.6.1
env:
- name: HOSTNAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: "GF_AUTH_ANONYMOUS_ENABLED"
value: "true"
- name: "GF_SECURITY_ADMIN_USER"
value: "admin"
- name: "GF_SECURITY_ADMIN_PASSWORD"
value: "__________________________"
name: grafana
volumeMounts:
- name: grafana-data
mountPath: /var/lib/grafana
securityContext:
runAsUser: 115
runAsGroup: 115
fsGroup: 115
volumes:
- name: grafana-data
persistentVolumeClaim:
claimName: grafana-pv-claim
---
apiVersion: v1
kind: Service
metadata:
labels:
app: grafana
name: grafana-svc
namespace: monitoring
spec:
ports:
- port: 13000
protocol: TCP
targetPort: 3000
nodePort: 30300
selector:
app: grafana
type: NodePort
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: grafana-ingress
namespace: monitoring
annotations:
acme.cert-manager.io/http01-edit-in-place: "true"
cert-manager.io/issue-temporary-certificate: "true"
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
ingressClassName: nginx
rules:
- host: grafana.very-very-dark-gray.top
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: grafana-svc
port:
number: 3000
tls:
- secretName: tls-secret-grafana
hosts:
- grafana.very-very-dark-gray.top
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: influxdb-ingress
namespace: monitoring
annotations:
acme.cert-manager.io/http01-edit-in-place: "true"
cert-manager.io/issue-temporary-certificate: "true"
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
ingressClassName: nginx
rules:
- host: influxdb.very-very-dark-gray.top
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: influxdb-svc
port:
number: 8086
tls:
- secretName: tls-secret-influxdb
hosts:
- influxdb.very-very-dark-gray.top
Before deploying the above, dedicated users and home directories must be created for
influxdb and grafana:
root@octavo ~ # groupadd influxdb -g 114
root@octavo ~ # groupadd grafana -g 115
root@octavo ~ # useradd influxdb -u 114 -g 114 -s /usr/sbin/nologin
root@octavo ~ # useradd grafana -u 115 -g 115 -s /usr/sbin/nologin
Then files need to be copied over at the right time between stopping the services
in lexicon and starting them again:
- Stop Grafana and InfluxDB (in this order) in lexicon:
- Copy data over from lexicon to octavo:
  root@octavo ~ # rsync -ua lexicon:/home/k8s/influxdb /home/k8s/
  root@octavo ~ # rsync -ua lexicon:/home/k8s/grafana /home/k8s/
  root@octavo ~ # chown -R influxdb:influxdb /home/k8s/influxdb
  root@octavo ~ # chown -R grafana:grafana /home/k8s/grafana
  root@octavo ~ # ls -hal /home/k8s/influxdb /home/k8s/grafana
  /home/k8s/grafana:
  total 3.9M
  drwxr-xr-x 1 grafana grafana   68 Apr 27 19:08 .
  drwxr-xr-x 1 root    root      58 Apr 27 19:10 ..
  drwxr-x--- 1 grafana grafana    2 Apr 20  2024 alerting
  drwx------ 1 grafana grafana    0 Apr 20  2024 csv
  -rw-r----- 1 grafana grafana 3.9M Apr 27 19:08 grafana.db
  drwx------ 1 grafana grafana    0 Apr 20  2024 pdf
  drwxr-xr-x 1 grafana grafana    0 Apr 20  2024 plugins
  drwx------ 1 grafana grafana    0 Apr 20  2024 png
  /home/k8s/influxdb:
  total 0
  drwxr-xr-x 1 influxdb influxdb 22 Apr 20  2024 .
  drwxr-xr-x 1 root     root     58 Apr 27 19:10 ..
  drwxr-xr-x 1 influxdb influxdb 90 Apr 26 08:20 data
  drwxr-xr-x 1 influxdb influxdb 14 Apr 27 02:08 meta
  drwx------ 1 influxdb influxdb 90 Apr 26 08:20 wal
- Start InfluxDB and Grafana (in this order) in lexicon:
Finally, start the deployment in octavo:
$ kubectl apply -f monitoring.yaml
namespace/monitoring created
persistentvolume/influxdb-pv created
persistentvolumeclaim/influxdb-pv-claim created
deployment.apps/influxdb created
service/influxdb-svc created
persistentvolume/grafana-pv created
persistentvolumeclaim/grafana-pv-claim created
deployment.apps/grafana created
service/grafana-svc created
ingress.networking.k8s.io/grafana-ingress created
ingress.networking.k8s.io/influxdb-ingress created
$ kubectl get all -n monitoring
NAME READY STATUS RESTARTS AGE
pod/cm-acme-http-solver-dkgn5 1/1 Running 0 59s
pod/grafana-6fff9dbb6c-v22hg 1/1 Running 0 62s
pod/influxdb-5974bf664f-8r5mf 1/1 Running 0 62s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cm-acme-http-solver-n8l8h NodePort 10.98.3.183 <none> 8089:30378/TCP 59s
service/grafana-svc NodePort 10.110.29.239 <none> 13000:30300/TCP 62s
service/influxdb-svc NodePort 10.110.65.108 <none> 18086:30086/TCP 62s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/grafana 1/1 1 1 62s
deployment.apps/influxdb 1/1 1 1 62s
NAME DESIRED CURRENT READY AGE
replicaset.apps/grafana-6fff9dbb6c 1 1 1 62s
replicaset.apps/influxdb-5974bf664f 1 1 1 62s
$ kubectl get svc -n monitoring
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cm-acme-http-solver-n8l8h NodePort 10.98.3.183 <none> 8089:30378/TCP 63s
grafana-svc NodePort 10.110.29.239 <none> 13000:30300/TCP 66s
influxdb-svc NodePort 10.110.65.108 <none> 18086:30086/TCP 66s
$ kubectl get ingress -n monitoring
NAME CLASS HOSTS ADDRESS PORTS AGE
grafana-ingress nginx grafana.very-very-dark-gray.top 192.168.0.171 80, 443 69s
influxdb-ingress nginx influxdb.very-very-dark-gray.top 192.168.0.171 80, 443 69s
This worked surprisingly well, both services became quickly available at their
assigned URLs, with authentication working as intended and all dashboards working
as before. The only other change needed was getting data flowing into the new InfluxDB
server, by updating the conmon scripts in all the reporting systems.
Clean up lexicon
Once InfluxDB is running in octavo there are a few additional monitoring scripts to
relocate to octavo (because they've been running in lexicon), some of which require
additional dependencies to be installed system-wide:
- conmon-speedtest depends on speedtest-cli
- conmon-tapo.py has more complex Python dependencies.
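For the first one, a minimal sketch assuming the stock Ubuntu package is acceptable (a pip install would also work):
$ sudo apt install speedtest-cli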
Once the scripts work, they need to run from root's crontab:
# crontab -e
*/3 * * * * /home/k8s/code-server/conmon/conmon-speedtest
*/5 * * * * /home/k8s/code-server/conmon/conmon-tapo.py
Komga
Komga (eBook library)
was very easy to migrate from lexicon, with only the minor quirk that its /config
directory had been inadvertently placed inside itself; fixing this, and the location
of eBooks, are the only changes in this deployment:
Kubernetes deployment: komga.yaml
Then again, the trick is to copy the /config directory between stopping the service
in lexicon and starting it in octavo.
First, stop komga in lexicon (it won't be used moving forward):
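A minimal sketch of that step, following the same pattern used for Audiobookshelf above and assuming the namespace and deployment are both named komga as in the kubectl output below:
$ kubectl scale -n komga deployment komga --replicas=0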
Copy data over from lexicon to octavo, moving the /config directory one level up:
root@octavo ~ # groupadd komga -g 118
root@octavo ~ # useradd komga -u 118 -g 118 -s /usr/sbin/nologin
root@octavo ~ # rsync -ua lexicon:/home/k8s/komga/config/ /home/k8s/komga
root@octavo ~ # chown -R komga:komga /home/k8s/komga
root@octavo ~ # ls -hal /home/k8s/komga
total 14M
drwxr-xr-x 1 komga komga 74 Apr 28 23:09 .
drwxr-xr-x 1 root root 114 Apr 28 23:26 ..
-rw-r--r-- 1 komga komga 14M Apr 28 23:09 database.sqlite
drwxr-xr-x 1 komga komga 368 Apr 28 05:05 logs
drwxr-xr-x 1 komga komga 668 Mar 16 06:08 lucene
-rw-r--r-- 1 komga komga 156K Apr 28 23:09 tasks.sqlite
Warning
The trailing / in lexicon:/home/k8s/komga/config/ is critical to make
the destination directory /home/k8s/komga in octavo be that config
directory itself, instead of containing it.
Finally, start the deployment in octavo:
$ kubectl apply -f komga.yaml
namespace/komga created
persistentvolume/komga-pv-config created
persistentvolume/komga-pv-books created
persistentvolumeclaim/komga-pvc-config created
persistentvolumeclaim/komga-pvc-books created
deployment.apps/komga created
service/komga-svc created
ingress.networking.k8s.io/komga-ingress created
$ kubectl get all -n komga
NAME READY STATUS RESTARTS AGE
pod/cm-acme-http-solver-v8rp6 1/1 Running 0 21s
pod/komga-5cc699fdcd-xwqlh 1/1 Running 0 24s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cm-acme-http-solver-m625s NodePort 10.108.18.62 <none> 8089:31587/TCP 21s
service/komga-svc NodePort 10.105.251.174 <none> 25600:30600/TCP 24s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/komga 1/1 1 1 24s
NAME DESIRED CURRENT READY AGE
replicaset.apps/komga-5cc699fdcd 1 1 1 24s
$ kubectl get svc -n komga
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
komga-svc NodePort 10.105.251.174 <none> 25600:30600/TCP 67s
$ kubectl get ingress -n komga
NAME CLASS HOSTS ADDRESS PORTS AGE
komga-ingress nginx komga.very-very-dark-gray.top 192.168.0.171 80, 443 72s
After a couple of minutes Komga is available at https://komga.very-very-dark-gray.top/ and everything works fine.
Navidrome
Navidrome (music streaming)
was very easy to migrate from lexicon, with only the minor quirk that its /data
directory had been inadvertently placed inside itself; fixing this, and the location
of music files, are the only changes in this deployment:
Kubernetes deployment: navidrome.yaml
Here again, the trick is to copy the /data directory between stopping the service
in lexicon and starting it in octavo.
First, stop Navidrome in lexicon (it won't be used moving forward):
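Presumably the same scale-down again; a sketch, with the namespace and deployment name taken from the kubectl output below:
$ kubectl scale -n navidrome deployment navidrome --replicas=0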
Copy data over from lexicon to octavo, moving the /data directory one level up:
root@octavo ~ # groupadd navidrome -g 116
root@octavo ~ # useradd navidrome -u 116 -g 116 -s /usr/sbin/nologin
root@octavo ~ # rsync -ua lexicon:/home/k8s/navidrome/data/ /home/k8s/navidrome
root@octavo ~ # chown -R navidrome:navidrome /home/k8s/navidrome
root@octavo ~ # ls -hal /home/k8s/navidrome
total 33M
drwxr-xr-x 1 navidrome navidrome 98 Apr 28 21:22 .
drwxr-xr-x 1 root root 76 Apr 28 21:14 ..
drwxr-xr-x 1 navidrome navidrome 56 Dec 23 06:09 cache
-rw-r--r-- 1 navidrome navidrome 32M Apr 28 21:08 navidrome.db
-rw-r--r-- 1 navidrome navidrome 32K Apr 28 22:18 navidrome.db-shm
-rw-r--r-- 1 navidrome navidrome 620K Apr 28 22:18 navidrome.db-wal
Warning
The trailing / in lexicon:/home/k8s/navidrome/data/ is critical to make
the destination directory /home/k8s/navidrome in octavo be that data
directory, instead of containing it.
Finally, start the deployment in octavo:
$ kubectl apply -f navidrome.yaml
$ kubectl get all -n navidrome
NAME                             READY   STATUS              RESTARTS   AGE
pod/cm-acme-http-solver-tq8wv 1/1 Running 0 6s
pod/navidrome-588d4d77c7-pcxqm 0/1 ContainerCreating 0 9s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cm-acme-http-solver-l9wp2 NodePort 10.97.50.141 <none> 8089:31185/TCP 6s
service/navidrome-svc NodePort 10.110.51.110 <none> 4533:30533/TCP 9s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/navidrome 0/1 1 0 9s
NAME DESIRED CURRENT READY AGE
replicaset.apps/navidrome-588d4d77c7 1 1 0 9s
$ kubectl get svc -n navidrome
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cm-acme-http-solver-l9wp2 NodePort 10.97.50.141 <none> 8089:31185/TCP 10s
navidrome-svc NodePort 10.110.51.110 <none> 4533:30533/TCP 13s
$ kubectl get ingress -n navidrome
NAME CLASS HOSTS ADDRESS PORTS AGE
navidrome-ingress nginx navidrome.very-very-dark-gray.top 192.168.0.171 80, 443 17
After a couple of minutes Navidrome is available at https://navidrome.very-very-dark-gray.top/ and everything works fine.
Visual Studio Code Server
Visual Studio Code Server
has been running in lexicon for nearly 2 years and it is still sometimes
preferable to using Visual Studio Code on the desktop, so this service is
also migrated over to octavo in very much the same fashion.
Kubernetes deployment: code-server.yaml
First, stop the service in lexicon (it won't be used moving forward):
$ kubectl scale -n code-server deployment code-server --replicas=0
deployment.apps/code-server scaled
Copy data over from lexicon to octavo (keeping ownership to ponder):
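A sketch of that copy, assuming the code-server data lives under /home/k8s/code-server (as the conmon paths above suggest) and should end up owned by ponder on octavo:
# path and ownership are assumptions based on the conmon paths above
root@octavo ~ # rsync -ua lexicon:/home/k8s/code-server /home/k8s/
root@octavo ~ # chown -R ponder:ponder /home/k8s/code-server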
Start the deployment in octavo:
$ kubectl apply -f code-server.yaml
namespace/code-server created
service/code-server created
persistentvolume/code-server-pv created
persistentvolumeclaim/code-server-pv-claim created
deployment.apps/code-server created
ingress.networking.k8s.io/code-server-ingress created
$ kubectl get all -n code-server
NAME READY STATUS RESTARTS AGE
pod/cm-acme-http-solver-44r69 1/1 Running 0 16s
pod/code-server-69d64cd9cd-llgqq 1/1 Running 0 18s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cm-acme-http-solver-x75d2 NodePort 10.106.2.52 <none> 8089:32080/TCP 16s
service/code-server ClusterIP 10.103.60.210 <none> 80/TCP 18s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/code-server 1/1 1 1 18s
NAME DESIRED CURRENT READY AGE
replicaset.apps/code-server-69d64cd9cd 1 1 1 18s
$ kubectl get svc -n code-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cm-acme-http-solver-x75d2 NodePort 10.106.2.52 <none> 8089:32080/TCP 5s
code-server ClusterIP 10.103.60.210 <none> 80/TCP 7s
$ kubectl get ingress -n code-server
NAME CLASS HOSTS ADDRESS PORTS AGE
code-server-ingress nginx code-server.very-very-dark-gray.top 80, 443 11s
After a couple of minutes Visual Studio Code is available at https://code-server.very-very-dark-gray.top/ and everything works fine, although due to the new URL it needs to be re-authorized on the linked GitHub account.
Plex Media Server
Migrating my Plex Media Server to Kubernetes
on lexicon was relatively involved, and this Plex server has already been replaced by
Jellyfin, so it's not clear that it needs to be migrated in the same way as before.
To make sure the service won't be missed, the deployment is now stopped:
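A sketch of that scale-down; the namespace and deployment name here are assumptions, since they don't appear in the output elsewhere in this section:
# both the namespace and the deployment name are assumptions
$ kubectl scale -n plex deployment plex --replicas=0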
For now, its database will be copied over to octavo so the "same" Plex
server can be restored later if necessary, although it may make more sense
to instead set it up as a new Plex server with only a subset of the library.
root@octavo ~ # rsync -ua lexicon:/home/k8s/plexmediaserver /home/k8s/
root@octavo ~ # ls -lan /home/k8s/plexmediaserver/
total 0
drwxr-xr-x 1 998 998 42 Sep 16 2023 .
drwxr-xr-x 1 0 0 306 May 1 19:16 ..
drwxr-xr-x 1 998 998 38 Jul 6 2022 Library
drwxr-xr-x 1 998 998 38 Sep 15 2023 Library.backup
UID/GID 998 is already taken by systemd-network so the files would need to have their
ownership changed if and when a dedicated plex user is created for a new Plex Media
Server deployment.
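If and when that happens, a sketch of the fix, following the same pattern as the komga and navidrome users above (UID/GID 120 is hypothetical; pick whatever is actually free):
# 120 is just an example of the next free UID/GID; adjust as needed
root@octavo ~ # groupadd plex -g 120
root@octavo ~ # useradd plex -u 120 -g 120 -s /usr/sbin/nologin
root@octavo ~ # chown -R plex:plex /home/k8s/plexmediaserver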
Minecraft Server
Running Minecraft Java Server for Bedrock clients on Kubernetes
was fun while the kids wanted to play Minecraft, but they haven't seemed so inclined
for quite a while and have admitted they won't be using it any time soon. Since the
server was taking up a sizeable 10GB of RAM (to do essentially nothing with it),
it has been scaled down to zero replicas and won't be set up in octavo until there is
demand for it again. Maybe by then it can be added to a working
Pterodactyl®.
$ kubectl scale -n minecraft-server deployment minecraft-server --replicas=0
deployment.apps/minecraft-server scaled
In the meantime, backups are kept in their configured paths for potential future use:
root@octavo ~ # rsync -ua lexicon:/home/k8s/minecraft-server /home/k8s/
root@octavo ~ # rsync -ua lexicon:/home/k8s/minecraft-server-backups /home/k8s/
root@octavo ~ # du -sh /home/k8s/minecraft-server*
1.8G /home/k8s/minecraft-server
40G /home/k8s/minecraft-server-backups
Firefly III
Self-hosted accountancy with Firefly III has not been used much; in all honesty, keeping track of expenses and transactions is a lot less fun and a lot more of a grind than everything else going on. Even so, just in case it will be used in the future, it is worth migrating rather than throwing it away.
Warning
Do not bother trying to change which user these processes run as: the users are
hard-coded in the images and (at least) the mariadb pod will fail to start
if it runs as a different user.
Kubernetes deployment: firefly-iii.yaml
First, stop the service in lexicon (it won't be used moving forward):
$ kubectl scale -n firefly-iii deployment firefly-iii --replicas=0
deployment.apps/firefly-iii scaled
$ kubectl scale -n firefly-iii deployment firefly-iii-mysql --replicas=0
deployment.apps/firefly-iii-mysql scaled
Copy data over from lexicon to octavo:
root@octavo ~ # rsync -ua lexicon:/home/k8s/firefly-iii /home/k8s/
root@octavo ~ # ls -hal /home/k8s/firefly-iii
drwxr-xr-x 1 root root 22 May 19 2024 .
drwxr-xr-x 1 root root 254 May 1 15:20 ..
drwxr-xr-x 1 dhcpcd lxd 566 May 1 15:32 mysql
drwxrwxr-x 1 www-data www-data 0 May 19 2024 upload
Start the deployment in octavo:
$ kubectl apply -f firefly-iii.yaml
namespace/firefly-iii created
persistentvolume/firefly-iii-pv-mysql created
persistentvolumeclaim/firefly-iii-pvc-mysql created
service/firefly-iii-mysql-svc created
persistentvolume/firefly-iii-pv-upload created
persistentvolumeclaim/firefly-iii-pvc-upload created
service/firefly-iii-svc created
ingress.networking.k8s.io/firefly-iii-ingress created
deployment.apps/firefly-iii created
ingress.networking.k8s.io/firefly-iii-ingress configured
$ kubectl get all -n firefly-iii
NAME READY STATUS RESTARTS AGE
pod/cm-acme-http-solver-rhfrm 1/1 Running 0 15s
pod/firefly-iii-7c6f8597c9-j7nlf 1/1 Running 0 3m27s
pod/firefly-iii-mysql-859cd77d57-85pjp 1/1 Running 0 4m1s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cm-acme-http-solver-4dln6 NodePort 10.99.155.239 <none> 8089:32080/TCP 15s
service/firefly-iii-mysql-svc NodePort 10.105.106.204 <none> 3306:30306/TCP 10m
service/firefly-iii-svc NodePort 10.98.129.183 <none> 8080:30080/TCP 10m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/firefly-iii 1/1 1 1 9m57s
deployment.apps/firefly-iii-mysql 1/1 1 1 9m57s
NAME DESIRED CURRENT READY AGE
replicaset.apps/firefly-iii-7c6f8597c9 1 1 1 4m22s
replicaset.apps/firefly-iii-mysql-859cd77d57 1 1 1 4m23s
$ kubectl get svc -n firefly-iii
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cm-acme-http-solver-hgc27 NodePort 10.110.227.244 <none> 8089:32080/TCP 72s
firefly-iii-mysql-svc NodePort 10.105.106.204 <none> 3306:30306/TCP 76s
firefly-iii-svc NodePort 10.98.129.183 <none> 8080:30080/TCP 76s
$ kubectl get ingress -n firefly-iii
NAME CLASS HOSTS ADDRESS PORTS AGE
firefly-iii-ingress nginx firefly-iii.very-very-dark-gray.top 192.168.0.171 80, 443 79s
After a couple of minutes Firefly III is available at https://firefly-iii.very-very-dark-gray.top/ and ready to use with the migrated database and everything else.
New services (post migration)
After all the above migrations, new services have been added to octavo that were not
previously available in lexicon:
- Jellyfin to watch videos from anywhere, including hardware acceleration for transcoding AV1 videos using the onboard Intel Iris Xe Graphics GPU.
- Unifi Network Application
was never actually used in lexicon, so instead of migrating an empty deployment it was deployed anew in octavo using a slightly updated manifest, setting newer versions of both images and an updated UID/GID (119).
Migration to new ISP
Replacing the router with a UniFi Gateway Fiber (UXG-FIBER) required first adopting the router, by connecting it to the LAN on both sides:
- Ethernet uplink port gets an IP on the current LAN, to communicate with both the Internet (via the old router) and the LAN (to reach the UniFi Network app). The IP range on the LAN is 192.168.0.0/24 as set by the old router, this being its default setting.
- Ethernet downlink port sets its own static IP range, seemingly choosing
  the 192.168.1.0/24 range automatically, it being the next available
  /24 network.
Take 1: Unifi Network Application on new LAN
The Unifi Network Application, being only available on the 0.0/24 network, would not be reachable by the new router once the old router is disconnected from the LAN. To make the Unifi Network Application reachable on the new router's LAN side, it needs an IP address on the 1.0/24 network.
Kubernetes services can be exposed on a new LAN by adding a suitable range
to the pool of IP addresses for MetalLB Load Balancer
and adding a new Service using one of those IP addresses:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: production
namespace: metallb-system
spec:
addresses:
- 192.168.0.171-192.168.0.180
- 192.168.1.171-192.168.1.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: l2-advert
namespace: metallb-system
$ kubectl apply -f metallb/ipaddress-pool-octavo.yaml
ipaddresspool.metallb.io/production configured
l2advertisement.metallb.io/l2-advert unchanged
$ kubectl get ipaddresspool.metallb.io -n metallb-system
NAME AUTO ASSIGN AVOID BUGGY IPS ADDRESSES
production true false ["192.168.0.171-192.168.0.180","192.168.1.201-192.168.1.220"]
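The new unifi-tcp-1 Service added to unifi-network-app.yaml is essentially a copy of the existing unifi-tcp Service pinned to an address from the new range. A sketch, assuming spec.loadBalancerIP is used to request 192.168.1.220 and that the selector label and port names match the existing deployment (those names are assumptions; the port numbers mirror the output below):
apiVersion: v1
kind: Service
metadata:
  name: unifi-tcp-1
  namespace: unifi
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.1.220   # pinned to the new 192.168.1.x MetalLB range
  selector:
    app: unifi                    # label assumed to match the existing unifi pods
  ports:
    - name: speedtest             # port names are assumptions; numbers mirror unifi-tcp
      port: 6789
    - name: inform
      port: 8080
    - name: https-gui
      port: 8443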
$ kubectl apply -f unifi-network-app.yaml
namespace/unifi unchanged
persistentvolume/mongo-pv-data unchanged
persistentvolume/mongo-pv-init unchanged
persistentvolumeclaim/mongo-pvc-data unchanged
persistentvolumeclaim/mongo-pvc-init unchanged
deployment.apps/mongo unchanged
service/mongo-svc unchanged
persistentvolume/unifi-pv-config unchanged
persistentvolumeclaim/unifi-pvc-config unchanged
deployment.apps/unifi unchanged
service/unifi-tcp unchanged
service/unifi-tcp-1 created
service/unifi-udp unchanged
ingress.networking.k8s.io/unifi-ingress unchanged
$ kubectl get services -A | grep '192.168.'
ingress-nginx ingress-nginx-controller LoadBalancer 10.99.252.250 192.168.0.171 80:30278/TCP,443:30974/TCP 150d
unifi unifi-tcp LoadBalancer 10.105.232.48 192.168.0.173 6789:31231/TCP,8080:32034/TCP,8443:30909/TCP 145d
unifi unifi-tcp-1 LoadBalancer 10.109.194.34 192.168.1.220 6789:30860/TCP,8080:31627/TCP,8443:30889/TCP 6d1h
unifi unifi-udp LoadBalancer 10.108.54.45 192.168.0.173 3478:31805/UDP,10001:32694/UDP,1900:30234/UDP 145d
Take 2: Old router on new LAN
Turns out the UniFi router, once adopted, would never report to the
Unifi Network Application on 192.168.1.220, even though the application
is available on https://192.168.1.220:8443 just as well as on
https://192.168.0.173:8443, because, unlike access points, the router has no
usable SSH connection on which to run set-inform https://192.168.0.220:8080.
Eventually the solution was to set the old router to lease addresses (as DHCP server) on the 1.0/24 network, which let the UniFi router re-configure its LAN ports to adopt the 0.0/24 network. After this change, the router was again able to reach the Unifi Network Application on 192.168.0.173 and all was well once again.



