feat(alerts): health alerts via webhook + SMTP email

Sidebar → System → Alerts ("Alarme" in the German locale).

Migration 0021: alert_channels (kind=webhook|email, target, settings,
active) + alert_events (kind, severity=info/warning/error/critical,
subject, message, sent_to JSONB).

internal/services/alerts/:
  - Fire(kind, severity, subject, message): broadcasts to all active
    channels and persists the event, recording the per-channel result
    (ok/error) in sent_to.
  - Webhook sender: POST JSON {kind, severity, subject, message,
    content, text, fired_at}. Slack, Discord, and Teams accept this
    out of the box without an adapter, because the content and text
    fields are both set.
  - Email sender: net/smtp with optional STARTTLS. Settings (smtp_host,
    smtp_port, username/password, from, use_tls) are stored as JSONB
    in channel.settings.

internal/handlers/alerts.go: CRUD + POST /alerts/test + GET
/alerts/events (history).

Scheduler triggers:
  - cert.expiring: TLS cert has <14 days remaining (12h dedupe per
                   cert); severity warning, <3 days → error
  - cert.renew_failed: renewer cycle had failures
  - cert.renewer.run_failed: renewer cycle aborted
  - backup.failed: scheduled backup errored
  - license.invalid: license server returns valid=false

In-process dedupe (12h TTL, map[key]time.Time) prevents identical
alerts from firing in a loop.

UI (pages/Alerts): tabs Channels (CRUD table, add modal with
conditional email fields) + History (last 200 events with
severity tag + per-channel delivery status). The header button
"Test-Alert" fires a test event to all active channels.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Debian
2026-05-13 15:57:05 +02:00
parent 4a34629023
commit 81a8217493
13 changed files with 1012 additions and 14 deletions


@@ -22,6 +22,7 @@
"logs": "Logs",
"backups": "Backups",
"diagnostics": "Diagnose",
"alerts": "Alarme",
"license": "Lizenz",
"settings": "Einstellungen",
"section": {
@@ -682,6 +683,34 @@
"src": "Quell-IP"
}
},
"alerts": {
"title": "Health-Alarme",
"intro": "Notification-Channels für kritische Events. Webhook (Slack/Discord/Teams/Generic-HTTP) oder Email (SMTP). Triggers: cert.expiring (<14 d), cert.renew_failed, backup.failed, license.invalid.",
"scopeTitle": "Was triggert Alarme?",
"scopeDesc": "cert.expiring — TLS-Zertifikat <14 Tage Restzeit (dedupe 12h). cert.renew_failed — ACME-Renewer hat Fails. backup.failed — Scheduled Backup konnte nicht erstellt werden. license.invalid — License-Server liefert valid=false. Mehr Triggers folgen (Backend-Down, Disk-Usage).",
"tabs": { "channels": "Channels", "events": "History" },
"add": "Channel hinzufügen",
"addTitle": "Notification-Channel anlegen",
"editTitle": "Channel bearbeiten",
"test": "Test-Alert",
"testDone": "Test gesendet — {{ok}}/{{total}} Channels erfolgreich",
"emptyChannels": "Keine Channels. Lege einen Webhook oder eine Email an.",
"emptyEvents": "Noch keine Alarme — Triggers haben noch keinen Event gefeuert.",
"noChannels": "kein Channel aktiv",
"confirmDelete": "Channel {{name}} wirklich löschen?",
"col": {
"name": "Name",
"kind": "Typ",
"target": "Ziel",
"targetWebhook": "Webhook-URL",
"targetEmail": "Empfänger-Email",
"active": "Aktiv",
"time": "Zeit",
"severity": "Severity",
"subject": "Betreff",
"delivered": "Gesendet"
}
},
"diag": {
"title": "Diagnose",
"intro": "Operator-Tools direkt aus dem UI: ping, traceroute, DNS, HTTP-Probe, TCP-Connect. Alle Calls laufen authentifiziert auf dieser Box (nicht im Browser).",