Debian cb5691cf3c feat(cluster): (c) Phase-3 MVP — stable node-id + self-register + Cluster-Page
Minimal slice for the Phase-3 cluster:
* internal/cluster/node_id.go: stable node ID 'n-<16hex>' in
  /var/lib/edgeguard/node-id, idempotent across reboots.
* internal/cluster/store.go: ha_nodes repo (List/Get/UpsertSelf)
  via pgxpool. EnsureSelfRegistered upserts the local row at
  boot with the FQDN from setup.json.
* internal/handlers/cluster.go: GET /api/v1/cluster/nodes returns
  all ha_nodes plus local_id (for UI highlighting).
* main.go: after the DB pool is opened, EnsureSelfRegistered runs
  (only if setup.completed) and ClusterHandler is registered.
* management-ui/src/pages/Cluster/index.tsx: table with node ID,
  FQDN, role, join time; the local node is marked with a
  "this node" tag. Sidebar entry + i18n de/en.

Deliberately NOT in this round: cluster-init/cluster-join CLIs, KeyDB
active-active config-gen, PG streaming replication, mTLS between
peers, license leader election. These arrive with the first real
multi-node test (Phase 3.1); until then the code could not be smoke-tested.

End-to-end smoke test: setup → restart → ha_nodes has 1 row with
fqdn=eg.example.com, and /cluster/nodes returns it correctly with
the local_id marker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 11:52:54 +02:00


// Package cluster owns the local cluster identity (node ID + role)
// and self-registration into ha_nodes on boot.
//
// v1 is single-node only — we register the local node so the UI's
// Cluster page has something to show and so multi-node Phase 3.1
// can build on a stable identity. Real cluster-join + KeyDB AA +
// PG streaming replication come later.
package cluster

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

const (
	// DefaultNodeIDPath persists the node identifier across restarts.
	// Lives in the EdgeGuard data dir so /etc/machine-id collisions
	// (cloned VMs) don't matter; only this file determines identity.
	DefaultNodeIDPath = "/var/lib/edgeguard/node-id"
	nodeIDPrefix      = "n-"
)

// EnsureNodeID returns the stable cluster node identifier, generating
// and persisting one on first call. The format is `n-<16 hex chars>`.
//
// A missing, unreadable, or malformed file is treated as "no ID yet"
// and a new ID is minted. If persisting it fails (directory cannot be
// created, permission denied), the function returns the freshly minted
// in-memory ID together with the persistence error so the caller can
// decide whether to abort or proceed with an ephemeral ID (development
// boxes typically don't have /var/lib/edgeguard/ writable).
func EnsureNodeID(path string) (string, error) {
	if path == "" {
		path = DefaultNodeIDPath
	}
	// Reuse the persisted ID if the file exists and is well-formed.
	if b, err := os.ReadFile(path); err == nil {
		s := strings.TrimSpace(string(b))
		if validNodeID(s) {
			return s, nil
		}
	}
	id, err := mintNodeID()
	if err != nil {
		return "", err
	}
	// From here on the ID is valid; errors only mean it is not persisted.
	if err := os.MkdirAll(filepath.Dir(path), 0o750); err != nil {
		return id, fmt.Errorf("ensure node-id dir: %w", err)
	}
	if err := os.WriteFile(path, []byte(id+"\n"), 0o640); err != nil {
		return id, fmt.Errorf("write node-id: %w", err)
	}
	return id, nil
}

// mintNodeID builds a new ID from 8 random bytes (16 hex characters).
func mintNodeID() (string, error) {
	buf := make([]byte, 8)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return nodeIDPrefix + hex.EncodeToString(buf), nil
}

// validNodeID reports whether s is the prefix followed by exactly
// 16 lowercase hex characters.
func validNodeID(s string) bool {
	if !strings.HasPrefix(s, nodeIDPrefix) {
		return false
	}
	rest := s[len(nodeIDPrefix):]
	if len(rest) != 16 {
		return false
	}
	for _, r := range rest {
		ok := (r >= '0' && r <= '9') || (r >= 'a' && r <= 'f')
		if !ok {
			return false
		}
	}
	return true
}