feat(alerts): Health-Alarme via Webhook + Email-SMTP

Sidebar → System → Alarme.

Migration 0021: alert_channels (kind=webhook|email, target, settings,
active) + alert_events (kind, severity=info/warning/error/critical,
subject, message, sent_to JSONB).

internal/services/alerts/:
  - Fire(kind, severity, subject, message) — broadcastet an alle
    aktiven Channels + persistiert Event mit per-Channel-Result
    (ok/error) in sent_to.
  - Webhook-Sender: POST JSON {kind, severity, subject, message,
    content, text, fired_at}. Slack/Discord/Teams akzeptieren das
    out-of-the-box ohne Adapter (content + text-Felder gleichzeitig).
  - Email-Sender: net/smtp + STARTTLS optional. Settings (smtp_host,
    smtp_port, username/password, from, use_tls) liegen in
    channel.settings JSONB.

internal/handlers/alerts.go: CRUD + POST /alerts/test + GET
/alerts/events (history).

Scheduler-Trigger:
  - cert.expiring  — TLS-Cert <14 Tage Restzeit (12h-dedupe pro cert)
                     severity warning, <3 Tage → error
  - cert.renew_failed       — Renewer-Cycle hat fails
  - cert.renewer.run_failed — Renewer-Cycle abgebrochen
  - backup.failed  — Scheduled Backup error
  - license.invalid — License-Server liefert valid=false

In-process Dedupe (12h TTL, map[key]time.Time) verhindert dass
identische Alerts in Schleifen feuern.

UI (pages/Alerts): Tabs Channels (CRUD-Tabelle, Add-Modal mit
conditional-Email-Fields) + History (200 letzte Events mit
severity-Tag + per-Channel-Delivery-Status). Header-Button
„Test-Alert" feuert einen Test-Event in alle aktiven Channels.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Debian
2026-05-13 15:57:05 +02:00
parent 4a34629023
commit 81a8217493
13 changed files with 1012 additions and 14 deletions

View File

@@ -30,6 +30,7 @@ import (
wgrender "git.netcell-it.de/projekte/edgeguard-native/internal/wireguard"
"git.netcell-it.de/projekte/edgeguard-native/internal/handlers/response"
"git.netcell-it.de/projekte/edgeguard-native/internal/services/acme"
"git.netcell-it.de/projekte/edgeguard-native/internal/services/alerts"
"git.netcell-it.de/projekte/edgeguard-native/internal/services/audit"
"git.netcell-it.de/projekte/edgeguard-native/internal/services/backends"
"git.netcell-it.de/projekte/edgeguard-native/internal/services/backendservers"
@@ -52,7 +53,7 @@ import (
wgsvc "git.netcell-it.de/projekte/edgeguard-native/internal/services/wireguard"
)
var version = "1.0.73"
var version = "1.0.74"
func main() {
addr := os.Getenv("EDGEGUARD_API_ADDR")
@@ -205,6 +206,7 @@ func main() {
// Jobs laufen im edgeguard-scheduler.
handlers.NewBackupHandler(backup.New(pool), auditRepo, nodeID, version).Register(authed)
handlers.NewDiagnosticsHandler().Register(authed)
handlers.NewAlertsHandler(alerts.New(pool), auditRepo, nodeID).Register(authed)
handlers.NewTLSCertsHandler(tlsRepo, auditRepo, nodeID, acmeService).Register(authed)
// Firewall reload: nach jeder Mutation den Renderer neu fahren
// (writes ruleset.nft + sudo nft -f). Errors loggen, nicht failen.