IDIAL Kubernetes Setup Guide

Version | Date | Author | Change
1.0 | 25.03.2026 | Maximilian Wilke (BxC Security) | First Version

This document provides the formal installation and deployment procedure for the IDIAL platform on a Kubernetes cluster. It covers the initial cluster setup, installation of the required infrastructure components, deployment of the IDIAL application, service access, failover expectations, and troubleshooting guidance.

The guide is intended for administrators responsible for preparing and operating the target Kubernetes environment. Commands and configuration examples should be executed in the order shown; placeholders in angle brackets (for example <MASTER-IP-ADDRESS>) must be replaced with values from the target environment.


1. General

1.1 Introduction

IDIAL runs on a bare-metal Kubernetes cluster with one control-plane node and two worker nodes. The infrastructure stack consists of:

  • containerd as the container runtime
  • Flannel as the CNI plugin (pod CIDR 10.244.0.0/16)
  • Longhorn for distributed, replicated persistent storage
  • MetalLB for LoadBalancer IP assignment on bare metal
  • nginx-ingress as the single TLS entry point

2. Base Preparation

Begin by updating the package index and upgrading all installed packages on every node. Then install the container runtime.

sudo apt update -y && sudo apt upgrade -y
sudo apt install -y containerd

Configure containerd by creating the required configuration directory and generating the default configuration file.

sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml

In /etc/containerd/config.toml, change SystemdCgroup from false to true in order to align the container runtime with Kubernetes systemd-based cgroup management.

sudo vim /etc/containerd/config.toml
# Find: SystemdCgroup = false
# Change: SystemdCgroup = true
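
Alternatively, the same change can be applied non-interactively (assuming the generated default configuration contains a single SystemdCgroup entry, which is the case for the containerd 1.x default):

sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd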

Load the kernel modules required by Kubernetes networking.

sudo vim /etc/modules-load.d/k8s.conf

Add the following content:

overlay
br_netfilter

Configure the required network-related kernel parameters.

sudo vim /etc/sysctl.conf

Add or uncomment the following lines:

net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1

Reboot the node to ensure all configuration changes and kernel settings are applied.
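
If a reboot is not possible immediately, the kernel modules and sysctl parameters can also be activated at runtime:

sudo modprobe overlay
sudo modprobe br_netfilter
sudo sysctl -p

Note that kubeadm's preflight checks additionally expect swap to be disabled; if swap is active on a node, disable it with sudo swapoff -a and remove the corresponding entry from /etc/fstab so the setting survives reboots.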


2.1 Installation of Kubernetes Packages

Install the prerequisite packages required to access the Kubernetes package repository. Then add the Kubernetes signing key and repository, update the package index, and install the Kubernetes node components. This must be done on all nodes.

sudo apt-get install -y apt-transport-https ca-certificates curl gpg

# Ensure the keyring directory exists
sudo mkdir -p /etc/apt/keyrings

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.35/deb/Release.key \
| sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] \
https://pkgs.k8s.io/core:/stable:/v1.35/deb/ /' \
| sudo tee /etc/apt/sources.list.d/kubernetes.list

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
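
To confirm that identical versions were installed on every node:

kubeadm version -o short
kubectl version --client
kubelet --version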

2.2 Initialization of the Control Plane

Initialize the Kubernetes control plane on the master node using the specified control-plane endpoint, node name, and pod network CIDR.

sudo kubeadm init \
--control-plane-endpoint=<MASTER-IP-ADDRESS> \
--node-name k8s-idial-master-orange \
--pod-network-cidr=10.244.0.0/16

After initialization, configure kubectl access for the current user.

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Install the Flannel CNI plugin to enable pod networking within the cluster.

kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

Verify that all system pods are operational.

kubectl get pods --all-namespaces

2.3 Joining the Worker Nodes

Generate the worker node join command on the master node.

kubeadm token create --print-join-command

Execute the generated output on each worker node in order to join the cluster.

sudo kubeadm join <MASTER-IP-ADDRESS>:6443 --token <TOKEN> \
--discovery-token-ca-cert-hash sha256:<HASH>

After all worker nodes have joined, verify the cluster node status from the master node.

kubectl get nodes

Expected output: all three nodes report a STATUS of Ready.
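
For illustration, the output should resemble the following (the worker node names and the patch version are placeholders for the actual environment):

NAME                        STATUS   ROLES           AGE   VERSION
k8s-idial-master-orange     Ready    control-plane   15m   v1.35.x
k8s-idial-worker01-orange   Ready    <none>          6m    v1.35.x
k8s-idial-worker02-orange   Ready    <none>          6m    v1.35.x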


2.4 Preparation of Worker Nodes for Longhorn

Required for Longhorn Failover

Only needed if you plan to use Longhorn with volume replication for failover. Skip this step for a single-worker setup.

Install the packages required by Longhorn and enable the iSCSI daemon.

sudo apt-get install -y open-iscsi nfs-common
sudo systemctl enable --now iscsid

To prevent conflicts between multipathd and Longhorn block devices, apply the following configuration and restart the service.

cat << 'EOF' | sudo tee /etc/multipath.conf
defaults {
    user_friendly_names yes
}
blacklist {
    devnode "^sd[a-z0-9]+"
}
EOF
sudo systemctl restart multipathd
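
Both services should report active afterwards:

systemctl is-active iscsid multipathd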

3. Infrastructure Components

3.1 Longhorn – Distributed Storage

Optional

Only deploy Longhorn if failover with replicated volumes is required. Otherwise skip to 3.2 MetalLB.

Deploy Longhorn into the cluster.

kubectl apply -f \
https://raw.githubusercontent.com/longhorn/longhorn/v1.7.2/deploy/longhorn.yaml

# Wait until all Longhorn components are ready (~2–3 min)
kubectl wait --namespace longhorn-system \
--for=condition=ready pod \
--selector=app=longhorn-manager \
--timeout=300s

kubectl get pods -n longhorn-system

Adjust the default Longhorn storage settings to better accommodate environments with smaller disks (optional).

# Reduce minimum free space from 25% to 10%
kubectl patch settings.longhorn.io storage-minimal-available-percentage \
-n longhorn-system --type=merge -p '{"value":"10"}'

# Reduce reserved storage per node from 30% to 10%
kubectl patch settings.longhorn.io storage-reserved-percentage-for-default-disk \
-n longhorn-system --type=merge -p '{"value":"10"}'

# Automatically delete pods on node failure to enable failover
kubectl patch settings.longhorn.io node-down-pod-deletion-policy \
-n longhorn-system --type=merge \
-p '{"value":"delete-both-statefulset-and-deployment-pod"}'

The Longhorn web interface may optionally be accessed locally for monitoring and administration purposes.

kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80
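
The interface is then reachable at http://localhost:8080. Additionally, confirm that the longhorn StorageClass was created; the IDIAL persistent volume claims are expected to bind against it:

kubectl get storageclass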

3.2 MetalLB

Deploy MetalLB to provide load-balancer functionality in the bare-metal Kubernetes environment.

kubectl apply -f \
https://raw.githubusercontent.com/metallb/metallb/v0.14.9/config/manifests/metallb-native.yaml

kubectl wait --namespace metallb-system \
--for=condition=ready pod \
--selector=app=metallb \
--timeout=120s

Apply the predefined IP address pool configuration.

Adjust IP Range

The IP range defined in metallb/metallb-config.yaml must be adjusted to match available addresses in the node network before applying.

kubectl apply -f metallb/metallb-config.yaml
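
For reference, a minimal Layer 2 configuration of this kind looks as follows (the resource names and the address range are illustrative and must match the node network):

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: idial-pool
  namespace: metallb-system
spec:
  addresses:
  - 10.10.10.230-10.10.10.240
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: idial-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - idial-pool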


3.3 NGINX-Ingress Controller

Deploy the nginx ingress controller.

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.11.3/deploy/static/provider/cloud/deploy.yaml

kubectl wait --namespace ingress-nginx \
--for=condition=ready pod \
--selector=app.kubernetes.io/component=controller \
--timeout=120s

# Verify external IP – must show an address from the MetalLB pool
kubectl get svc -n ingress-nginx ingress-nginx-controller

For improved availability, scale the controller to two replicas and enforce node-level distribution using pod anti-affinity.

kubectl scale deployment ingress-nginx-controller -n ingress-nginx --replicas=2

kubectl patch deployment ingress-nginx-controller -n ingress-nginx --type=merge -p '{
  "spec": {
    "template": {
      "spec": {
        "affinity": {
          "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [{
              "labelSelector": {
                "matchLabels": {"app.kubernetes.io/component": "controller"}
              },
              "topologyKey": "kubernetes.io/hostname"
            }]
          }
        }
      }
    }
  }
}'
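
Afterwards, verify that the two replicas run on different worker nodes:

kubectl get pods -n ingress-nginx -o wide

Because the anti-affinity rule is required rather than preferred, one replica will remain Pending while a worker node is down; this is expected and does not affect the surviving replica.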

4. Deployment of IDIAL

4.1 Adjustment of Configuration Files

Set the IDIAL image version (mandatory). To review the available image tags, use the following command:

curl -s -u <DOCKER_USERNAME>:<TOKEN> \
"https://hub.docker.com/v2/repositories/bxc2security/idial/tags/?page_size=25" \
| python3 -c "import sys,json; [print(t['name']) for t in json.load(sys.stdin)['results']]"

Specify a stable image version in 05-idial.yaml:

image: docker.io/bxc2security/idial:<STABLE_VERSION>

Update the application secrets in 02-app-secrets.yaml:

  • SECRET_KEY must be set to a secure random value (see the example below).
  • PKCS8_PW must match the password used during certificate generation.
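
A suitable random value for SECRET_KEY can be generated, for example, with:

openssl rand -hex 32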

4.2 Creation of Namespace and Registry Credentials

Create the target namespace and then create the Docker registry secret required for image pulls.

kubectl apply -f 00-namespace.yaml

kubectl create secret docker-registry registry-credentials \
--docker-server=https://index.docker.io/v1/ \
--docker-username=<DOCKER_USERNAME> \
--docker-password=<DOCKER_PASSWORD_OR_ACCESS_TOKEN> \
--docker-email=<EMAIL> \
-n idial

4.3 Creation of the TLS Secret for Ingress

Generate a self-signed wildcard certificate for *.company.local, create the Kubernetes TLS secret, and remove the local certificate files afterward.

Optional

You can also use your own certificate. In that case, skip the openssl command and create the secret directly from your existing .crt and .key files.

openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
-keyout idial-ingress.key \
-out idial-ingress.crt \
-subj "/CN=*.company.local/O=Company" \
-addext "subjectAltName=DNS:idial.company.local,DNS:idial-api.company.local,DNS:ua.company.local,DNS:db.company.local"

kubectl create secret tls idial-ingress-tls \
--cert=idial-ingress.crt \
--key=idial-ingress.key \
-n idial

rm idial-ingress.key idial-ingress.crt
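
Verify that the secret exists before continuing:

kubectl get secret idial-ingress-tls -n idial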

4.4 Deployment of All Application Components

Deploy the complete application stack using the kustomization in the current directory.

kubectl apply -k .

Monitor the deployment progress until all pods are running and all persistent volume claims are bound.

kubectl get pods -n idial -w
kubectl get pvc -n idial # All PVCs must be Bound
kubectl get ingress -n idial
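
To block until the main application deployment has finished rolling out (the deployment name idial matches the one used in the troubleshooting section):

kubectl rollout status deployment/idial -n idial --timeout=300s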

4.5 DNS Configuration

After successful deployment, create A records pointing to the MetalLB VIP. Example:

Hostname | IP
idial.bxc.local | 10.10.10.230
idial-api.bxc.local | 10.10.10.230
ua.bxc.local | 10.10.10.230

Alternative – local testing via hosts file:

<METALLB-VIP> idial.company.local idial-api.company.local ua.company.local

  • Linux / macOS: /etc/hosts
  • Windows: C:\Windows\System32\drivers\etc\hosts

4.6 Accessing the Services

Service | URL
IDIAL Web UI | https://idial.bxc.local
IDIAL REST API | https://idial-api.bxc.local
OPC UA Expert (VNC) | https://ua.bxc.local

To confirm the MetalLB virtual IP assignment, run:

kubectl get svc -n ingress-nginx ingress-nginx-controller

5. Failover Behavior

The environment operates with two worker nodes and Longhorn volume replication using two copies per volume. If one worker node fails, workloads are expected to restart automatically on the remaining worker node.

Expected failover timeline:

Phase | Duration
Node becomes NotReady | ~40 s
Pods are evicted | ~5 min (Kubernetes default)
Longhorn volumes reattach | ~60 s
Pods start on Worker 2 | ~30 s
Total | ~7 minutes

Failover test:

# Watch from the master while shutting down one node:
kubectl get pods -n idial -o wide -w
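
In a second terminal, the node transition to NotReady can be followed in parallel:

kubectl get nodes -w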

Manual recovery procedure if pods remain stuck:

# Force-delete stuck Terminating pods
kubectl delete pods -n idial --all --force --grace-period=0

# Release Longhorn volumes stuck on the failed node
kubectl get volumes.longhorn.io -n longhorn-system -o json | \
python3 -c "
import sys, json, subprocess
data = json.load(sys.stdin)
for v in data['items']:
    # Adjust the node name to the failed worker
    if v['status'].get('currentNodeID') == 'k8s-idial-worker01-orange':
        name = v['metadata']['name']
        # Clear the node binding so Longhorn can reattach the volume elsewhere
        subprocess.run([
            'kubectl', '-n', 'longhorn-system', 'patch', 'volume.longhorn.io', name,
            '--type=merge', '--patch', '{\"spec\":{\"nodeID\":\"\",\"migrationNodeID\":\"\"}}'
        ])
"

6. Troubleshooting

Diagnostic Commands

# Pod details (e.g. for Pending or CrashLoop)
kubectl describe pod -n idial <POD_NAME>

# Application logs
kubectl logs -n idial deployment/idial
kubectl logs -n idial deployment/idial-web-backend

# Longhorn volume status
kubectl get volumes.longhorn.io -n longhorn-system

# Ingress controller logs
kubectl logs -n ingress-nginx deployment/ingress-nginx-controller

# Events in the namespace
kubectl get events -n idial --sort-by='.lastTimestamp'

# MetalLB status
kubectl get ipaddresspool -n metallb-system
kubectl get l2advertisement -n metallb-system

Common Symptoms

Symptom | Likely Cause | Solution
Pod Pending | Longhorn volume not ready | kubectl get pods -n longhorn-system
ImagePullBackOff | Registry credentials missing or wrong | Redo step 4.2
PVC Pending | Longhorn not ready or DiskPressure | kubectl get nodes.longhorn.io -n longhorn-system
Longhorn volume faulted | DiskPressure – not enough free space | Set storage-minimal-available-percentage to 10
CrashLoopBackOff (idial) | Wrong image tag (latest is broken) | Set a stable tag in 05-idial.yaml
Ingress EXTERNAL-IP: <pending> | MetalLB not installed or no IP pool | Check step 3.2
503 Service Unavailable | Backend pod not ready | kubectl get pods -n idial
502 Bad Gateway after failover | Pods not yet started on Worker 2 | Wait (~7 min) or force-delete stuck pods
Longhorn manager 1/2 | multipathd conflict on worker | Apply /etc/multipath.conf fix (step 2.4)
Volume stuck Attaching | Old pod holding volume (Multi-Attach) | kubectl delete pod <POD> -n idial --force --grace-period=0

Non-Migrated Services

The following services from the Docker Compose deployment are not included in the Kubernetes setup:

Service | Reason
dockhand | Requires Docker socket — not available with containerd
status_check | Docker-specific pre-flight check — not needed in K8S
finish-idial | Interactive Docker Compose hint — not relevant in K8S