Physical layout

tx100s3-01 (Debian 11, 16GB)                tx100s3-02 (Debian 11, 16GB)
├── KVM guest 1: ha-master (3GB)         ├── KVM guest 1: hb-master (3GB)
├── KVM guest 2: ha-worker1 (4GB)        ├── KVM guest 2: hb-worker1 (4GB)
└── KVM guest 3: ha-worker2 (4GB)        └── KVM guest 3: hb-worker2 (4GB)

Logical layout

    This step    [ Highly available K3s cluster ] ------------ made redundant with Keepalived + HAProxy
                           ↓
    Already done    [ External PostgreSQL ] ------------------ made redundant with Keepalived + HAProxy
                 (runs on either host A or host B)
                           ↓
          [ Storage: Longhorn (distributed across 4 nodes) ]
                           ↓
            [ Mail server + other services ]

Setup

Create the following on host-a and run it:

/v/bin/k3s/07-deploy-k3s-ha-cluster.sh

bash /v/bin/k3s/07-deploy-k3s-ha-cluster.sh
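The deploy script itself is not reproduced here, but the flags it ends up passing on ha-master can be read from the `ps` output in the verification step below. Reformatted for readability (token as logged; the datastore password stays redacted):

```shell
# k3s server invocation on ha-master, as observed in the process list.
# All servers share the external PostgreSQL datastore behind the VIP.
k3s server \
  --token 'k3s-reinstalled-20260131-2209' \
  --datastore-endpoint 'postgres://k3s:<password>@172.31.0.200:6432/k3s?sslmode=disable' \
  --node-name ha-master \
  --node-label datacenter=ha \
  --node-taint CriticalAddonsOnly=true:NoExecute \
  --tls-san 172.31.0.200 \
  --cluster-init \
  --write-kubeconfig-mode 644
```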

Verification

ha-master:~$ kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.43.0.1    <none>        443/TCP   2m4s
ha-master:~$ kubectl get nodes
NAME         STATUS   ROLES           AGE    VERSION
ha-master    Ready    control-plane   2m9s   v1.34.3+k3s1
ha-worker1   Ready    <none>          108s   v1.34.3+k3s1
ha-worker2   Ready    <none>          100s   v1.34.3+k3s1
hb-master    Ready    control-plane   71s    v1.34.3+k3s1
hb-worker1   Ready    <none>          68s    v1.34.3+k3s1
hb-worker2   Ready    <none>          60s    v1.34.3+k3s1
ha-master:~$ ls -la /etc/rancher/k3s/
total 12
drwxr-xr-x 2 root root 4096 Jan 31 22:09 .
drwxr-xr-x 4 root root 4096 Jan 31 22:09 ..
-rw------- 1 root root    0 Jan 31 22:09 k3s.env
-rw-r--r-- 1 root root 2945 Jan 31 22:09 k3s.yaml
ha-master:~$ sudo netstat -ntlpu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 127.0.0.1:10010         0.0.0.0:*               LISTEN      5774/containerd 
tcp        0      0 127.0.0.1:10257         0.0.0.0:*               LISTEN      5731/k3s server
tcp        0      0 127.0.0.1:10256         0.0.0.0:*               LISTEN      5731/k3s server
tcp        0      0 127.0.0.1:10259         0.0.0.0:*               LISTEN      5731/k3s server
tcp        0      0 127.0.0.1:10258         0.0.0.0:*               LISTEN      5731/k3s server
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      5731/k3s server
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      5731/k3s server
tcp        0      0 127.0.0.1:6444          0.0.0.0:*               LISTEN      5731/k3s server
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      2224/sshd [listener
tcp        0      0 :::6443                 :::*                    LISTEN      5731/k3s server
tcp        0      0 :::22                   :::*                    LISTEN      2224/sshd [listener
tcp        0      0 :::10250                :::*                    LISTEN      5731/k3s server
udp        0      0 0.0.0.0:8472            0.0.0.0:*                           -
ha-master:~$ ps awxu|grep kube
 5730 root      0:00 supervise-daemon k3s --start --stdout /var/log/k3s.log --stderr /var/log/k3s.log --pidfile /var/run/k3s.pid --respawn-delay 5 --respawn-max 0 --respawn-period 1800 /usr/local/bin/k3s -- server --token k3s-reinstalled-20260131-2209 --datastore-endpoint postgres://k3s:**************@172.31.0.200:6432/k3s?sslmode=disable --node-name ha-master --node-label datacenter=ha --node-taint CriticalAddonsOnly=true:NoExecute --tls-san 172.31.0.200 --cluster-init --write-kubeconfig-mode 644
 7640 k3sadmin  0:00 grep kube

Make kubectl usable from the host machines and a local workstation

Download the kubeconfig

# $VM_A_MASTER_IP: ha-master's address; $VIP: the Keepalived VIP (172.31.0.200)
scp k3sadmin@$VM_A_MASTER_IP:/etc/rancher/k3s/k3s.yaml ~/.kube/config
# Double quotes are required here so that $VIP actually expands
sed -i "s/127.0.0.1/$VIP/g" ~/.kube/config

Add the following to the existing haproxy.conf (separate from the postgres-haproxy entries):

frontend k3s-api-lan
    bind 172.24.44.200:6443
    mode tcp
    option tcplog
    default_backend k3s_api_backend

frontend k3s-api-vlan3
    bind 172.31.0.200:6443
    mode tcp
    option tcplog
    default_backend k3s_api_backend

backend k3s_api_backend
    mode tcp
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100

    # K3s master nodes
    server k3s_master1 172.31.1.11:6443 check
    server k3s_master2 172.31.1.21:6443 check
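
After appending the new sections, it's worth syntax-checking the merged file before reloading, so a typo doesn't take down the existing postgres-haproxy frontends (standard haproxy/systemd commands; the config path follows this setup):

```shell
# Validate the merged configuration, then reload without
# dropping established connections:
haproxy -c -f /etc/haproxy/haproxy.conf
systemctl reload haproxy
```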

Modify k3s on the nodes to allow connections from another network

Add the new external VIP (or a domain name) to --tls-san:

/v/bin/k3s/08-add-k3s-cert-alpine.sh

#!/bin/bash
# Append the external VIP to --tls-san in each master's OpenRC init
# script, then restart k3s so the new SAN is picked up.

EXTERNAL_IP="172.24.44.200"
for master in 172.31.1.11 172.31.1.21; do
    echo "Updating: $master"
    ssh k3sadmin@$master "sudo sed -i \"/'--tls-san'/{n;s/'172.31.0.200'/'172.31.0.200,$EXTERNAL_IP'/}\" /etc/init.d/k3s && sudo rc-service k3s restart"
done
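
The sed expression is easy to get wrong, so it can be dry-run locally against a mock of the init script's argument block before touching the nodes (the file below is a hypothetical excerpt; only the two quoted lines matter to the expression):

```shell
# Mock of the relevant lines in /etc/init.d/k3s (hypothetical excerpt)
cat > /tmp/k3s-initd-sample <<'EOF'
  '--tls-san'
  '172.31.0.200'
EOF

EXTERNAL_IP="172.24.44.200"
# Same expression as in the script, without -i, so the result prints:
# after matching the '--tls-san' line, `n` advances to the next line,
# where the substitution appends the external IP to the SAN list.
sed "/'--tls-san'/{n;s/'172.31.0.200'/'172.31.0.200,$EXTERNAL_IP'/}" /tmp/k3s-initd-sample
```

The second output line should now read `'172.31.0.200,172.24.44.200'`.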

Operation check

Test pods

k3s-test.sh 
=== Kubernetes Redundancy Test Script ===
Starting tests at: 2026年  2月  1日 日曜日 13:57:30 JST

1. Deploying sample applications...
deployment.apps/nginx-deployment unchanged
service/nginx-service unchanged
configmap/health-monitor-html unchanged
deployment.apps/health-monitor unchanged
service/health-monitor-service unchanged
Waiting for pods to be ready...
pod/nginx-deployment-6ff54694c8-g77zx condition met
pod/nginx-deployment-6ff54694c8-lnvgv condition met
pod/nginx-deployment-6ff54694c8-qbbwq condition met
pod/health-monitor-5bd64f6f84-htlmf condition met
pod/health-monitor-5bd64f6f84-m4qh9 condition met

2. Current cluster status:
NAME         STATUS   ROLES           AGE   VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
ha-master    Ready    control-plane   15h   v1.34.3+k3s1   172.31.1.11   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
ha-worker1   Ready    <none>          15h   v1.34.3+k3s1   172.31.1.12   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
ha-worker2   Ready    <none>          15h   v1.34.3+k3s1   172.31.1.13   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
hb-master    Ready    control-plane   15h   v1.34.3+k3s1   172.31.1.21   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
hb-worker1   Ready    <none>          15h   v1.34.3+k3s1   172.31.1.22   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
hb-worker2   Ready    <none>          15h   v1.34.3+k3s1   172.31.1.23   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1

NAME                                READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
health-monitor-5bd64f6f84-htlmf     1/1     Running   0          85s   10.42.16.26   hb-worker1   <none>           <none>
health-monitor-5bd64f6f84-m4qh9     1/1     Running   0          75s   10.42.18.5    hb-worker2   <none>           <none>
nginx-deployment-6ff54694c8-g77zx   1/1     Running   0          40m   10.42.2.4     ha-worker2   <none>           <none>
nginx-deployment-6ff54694c8-lnvgv   1/1     Running   0          40m   10.42.1.5     ha-worker1   <none>           <none>
nginx-deployment-6ff54694c8-qbbwq   1/1     Running   0          40m   10.42.16.4    hb-worker1   <none>           <none>

NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE   SELECTOR
health-monitor-service   ClusterIP   10.43.227.157   <none>        80/TCP         40m   app=health-monitor
kubernetes               ClusterIP   10.43.0.1       <none>        443/TCP        15h   <none>
nginx-service            NodePort    10.43.98.251    <none>        80:30080/TCP   40m   app=nginx

3. Testing connectivity...
Test 1:
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-g77zx</p>
Test 2:
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-qbbwq</p>
Test 3:
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-lnvgv</p>
Test 4:
If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/test-4, falling back to streaming logs: Internal error occurred: unable to upgrade connection: container test-4 not found in pod test-4_default
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-qbbwq</p>
Test 5:
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-g77zx</p>
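
The warnings on Test 4 are a known race in `kubectl run -it --rm`-style checks: the short-lived container can exit before the attach completes. A one-shot pod plus `kubectl logs` avoids it (sketch; the `curl-test` pod name and `curlimages/curl` image are assumptions, the service name comes from the output above):

```shell
# Run curl once against the nginx service, wait for completion,
# then read the result from the pod log instead of attaching:
kubectl run curl-test --image=curlimages/curl --restart=Never -- \
  -s http://nginx-service.default.svc.cluster.local/
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/curl-test --timeout=60s
kubectl logs curl-test
kubectl delete pod curl-test
```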

Take down one control-plane node

/home/k3sadmin # date
Sun Feb  1 14:09:00 JST 2026
/home/k3sadmin # kubectl get pods -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
health-monitor-5bd64f6f84-htlmf     1/1     Running   0          12m   10.42.16.26   hb-worker1   <none>           <none>
health-monitor-5bd64f6f84-m4qh9     1/1     Running   0          12m   10.42.18.5    hb-worker2   <none>           <none>
nginx-deployment-6ff54694c8-g77zx   1/1     Running   0          52m   10.42.2.4     ha-worker2   <none>           <none>
nginx-deployment-6ff54694c8-lnvgv   1/1     Running   0          52m   10.42.1.5     ha-worker1   <none>           <none>
nginx-deployment-6ff54694c8-qbbwq   1/1     Running   0          52m   10.42.16.4    hb-worker1   <none>           <none>
/home/k3sadmin # poweroff 
/home/k3sadmin # Connection to 172.31.1.11 closed by remote host.
Connection to 172.31.1.11 closed.

Test app status

k3s-test.sh 
=== Kubernetes Redundancy Test Script ===
Starting tests at: 2026年  2月  1日 日曜日 14:09:49 JST

1. Deploying sample applications...
deployment.apps/nginx-deployment unchanged
service/nginx-service unchanged
configmap/health-monitor-html unchanged
deployment.apps/health-monitor unchanged
service/health-monitor-service unchanged
Waiting for pods to be ready...
pod/nginx-deployment-6ff54694c8-g77zx condition met
pod/nginx-deployment-6ff54694c8-lnvgv condition met
pod/nginx-deployment-6ff54694c8-qbbwq condition met
pod/health-monitor-5bd64f6f84-htlmf condition met
pod/health-monitor-5bd64f6f84-m4qh9 condition met

2. Current cluster status:
NAME         STATUS   ROLES           AGE   VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
ha-master    Ready    control-plane   16h   v1.34.3+k3s1   172.31.1.11   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
ha-worker1   Ready    <none>          15h   v1.34.3+k3s1   172.31.1.12   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
ha-worker2   Ready    <none>          15h   v1.34.3+k3s1   172.31.1.13   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
hb-master    Ready    control-plane   15h   v1.34.3+k3s1   172.31.1.21   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
hb-worker1   Ready    <none>          15h   v1.34.3+k3s1   172.31.1.22   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1
hb-worker2   Ready    <none>          15h   v1.34.3+k3s1   172.31.1.23   <none>        Alpine Linux v3.23   6.18.5-0-virt    containerd://2.1.5-k3s1

NAME                                READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
health-monitor-5bd64f6f84-htlmf     1/1     Running   0          13m   10.42.16.26   hb-worker1   <none>           <none>
health-monitor-5bd64f6f84-m4qh9     1/1     Running   0          13m   10.42.18.5    hb-worker2   <none>           <none>
nginx-deployment-6ff54694c8-g77zx   1/1     Running   0          53m   10.42.2.4     ha-worker2   <none>           <none>
nginx-deployment-6ff54694c8-lnvgv   1/1     Running   0          53m   10.42.1.5     ha-worker1   <none>           <none>
nginx-deployment-6ff54694c8-qbbwq   1/1     Running   0          53m   10.42.16.4    hb-worker1   <none>           <none>

NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE   SELECTOR
health-monitor-service   ClusterIP   10.43.227.157   <none>        80/TCP         53m   app=health-monitor
kubernetes               ClusterIP   10.43.0.1       <none>        443/TCP        16h   <none>
nginx-service            NodePort    10.43.98.251    <none>        80:30080/TCP   53m   app=nginx

3. Testing connectivity...
Test 1:
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-lnvgv</p>
Test 2:
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-lnvgv</p>
Test 3:
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-qbbwq</p>
Test 4:
If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/test-4, falling back to streaming logs: Internal error occurred: unable to upgrade connection: container test-4 not found in pod test-4_default
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-g77zx</p>
Test 5:
If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/test-5, falling back to streaming logs: Internal error occurred: Internal error occurred: error attaching to container: container is in CONTAINER_EXITED state
    <p><strong>Pod Name:</strong> nginx-deployment-6ff54694c8-lnvgv</p>

Confirmed no impact: the workloads run on separate worker VMs, so taking down one control-plane node leaves them untouched. (ha-master still shows Ready above because the poweroff happened moments earlier and the node-status grace period had not yet expired.)

Recover ha-master

[172.31.0.2][root@tx100s3-01 14:11:43 ~]# virsh list
 Id    Name         State
----------------------------
 20    adm01        running
 157   ha-worker2   running
 158   ha-worker1   running

[172.31.0.2][root@tx100s3-01 14:11:45 ~]# virsh start ha-master
Domain 'ha-master' started
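
Once the VM boots, the node should rejoin and return to Ready on its own; watching the node list confirms the recovery (standard kubectl, run from any machine with the kubeconfig set up earlier):

```shell
# Watch the node list until ha-master reports Ready again
kubectl get nodes -w
```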