From 7d655231edea9b19e9e5c4b746cfea203117514c Mon Sep 17 00:00:00 2001
From: Eratostenes de Gitjabia
Date: Sat, 9 May 2026 12:29:31 +0000
Subject: [PATCH] docs(architecture): overview document

---
 docs/01-architecture/overview.md | 125 +++++++++++++++++++++++++++++++
 1 file changed, 125 insertions(+)
 create mode 100644 docs/01-architecture/overview.md

diff --git a/docs/01-architecture/overview.md b/docs/01-architecture/overview.md
new file mode 100644
index 0000000..5fe7a40
--- /dev/null
+++ b/docs/01-architecture/overview.md
@@ -0,0 +1,125 @@
+# Architecture · Overview
+
+This is the high-level technical view. For specific subsystems, see the sibling files in this folder.
+
+## The four tiers
+
+```
+  ┌──────────────────────────────────────┐
+  │  Hub (EU sovereign bare-metal)       │
+  │  - multi-site control plane          │
+  │  - cross-site forensic search        │
+  │  - operator auth (Keycloak)          │
+  │  - long-term embeddings index        │
+  └────────────────┬─────────────────────┘
+                   │ MQTT bridge over TLS
+                   │ + HTTPS for blob storage
+                   │
+    ╔══════════════╪══════════════╗
+    ║              │              ║  per-site
+    ║    ┌─────────▼─────────┐    ║  boundary
+    ║    │ Router (OpenWrt)  │    ║
+    ║    │ - mosquitto       │    ║
+    ║    │ - tailscale       │    ║
+    ║    │ - GitOps recon.   │    ║
+    ║    │ - SPA host        │    ║
+    ║    │ - reverse proxy   │    ║
+    ║    └────┬─────────┬────┘    ║
+    ║         │         │         ║
+    ║      VLAN-10   VLAN-20      ║
+    ║      cameras   compute      ║
+    ║   ┌──┐ ┌──┐   ┌────────┐    ║
+    ║   │c1│ │c2│   │ Cell   │    ║
+    ║   └──┘ └──┘   │RK3588  │    ║
+    ║   ...         │        │    ║
+    ║               │frigate │    ║
+    ║               │enricher│    ║
+    ║               │re-id   │    ║
+    ║               │healthd │    ║
+    ║               └────────┘    ║
+    ║                   ▲         ║
+    ║                   │         ║
+    ║                   ▼         ║
+    ║            (optional Core)  ║
+    ║            ┌─────────────┐  ║
+    ║            │ Jetson Orin │  ║
+    ║            │ federates   │  ║
+    ║            │ N Cells     │  ║
+    ║            └─────────────┘  ║
+    ╚═════════════════════════════╝
+```
+
+## Data flow
+
+**Capture**: cameras (RTSP) and microphones publish to the Cell.
+
+**Inference**: Frigate runs detection on the streams and generates events. The enricher consumes those events and produces embeddings, hashes, and re-ID vectors.
+
+**Bus**: everything flows over MQTT topics on the local broker. The contract is documented in [`mqtt-contract.md`](mqtt-contract.md).
+
+**Storage**:
+- Raw video clips → Cell's encrypted disk (NVMe hot + HDD cold). Never bridged.
+- Embeddings + metadata → Cell's local index, optionally bridged to the hub.
+- Snapshots → local only.
+- Aggregated health/state → bridged to the hub.
+
+**Console**: SPA hosted by the router. Talks to the router via `/api/router/*` (ubus) and to the Cell via `/api/cell/*` (reverse-proxied by the router).
+
+**Configuration**: GitOps repos cloned by the router. Reconciled every 5 minutes. See [ADR-0004](../../decisions/0004-gitops-como-source-of-truth.md).
+
+**Egress sovereignty**: only what the bridge policy explicitly allows leaves the site. See [`data-sovereignty.md`](data-sovereignty.md).
+
+## Key technologies
+
+| Layer | Choice |
+|---|---|
+| Router OS | OpenWrt 23.05+ |
+| Router hardware | GL.iNet GL-MT6000 (default), Banana Pi BPI-R4 (alternative) |
+| Cell OS | Balena OS on RK3588 |
+| Cell hardware | Banana Pi BPI-W3, Radxa Rock 5B+ |
+| Edge AI engine | Frigate v0.14+ with RKNN |
+| Models | YOLOv8n (detection), CLIP (embeddings), reid models, Whisper-v3 (audio) |
+| Broker | Mosquitto with bridge |
+| Time sync | chrony with NTS |
+| VPN | Tailscale |
+| GitOps | Git + cron-driven reconcile (custom ucode script) |
+| Hub OS | Debian on Hetzner bare-metal |
+| Hub services | Mosquitto, MinIO, Qdrant, TimescaleDB, Keycloak, Caddy |
+
+## Provisioning lifecycle
+
+1. **Image build**: OpenWrt Image Builder produces a router firmware with `luci-app-blocao-console` and dependencies.
+2. **First boot**: router shows the wizard at `http://blocao-router.local/`. Installer goes through 6 steps.
+3. **Cell enrollment**: Cell devices on the local network are auto-discovered (mDNS + MQTT announce). Balena handles their provisioning over the air.
+4. **GitOps repos**: created by the wizard on the hub side. Site repo is initialized with the applied UCI config.
+5. **Hub registration**: router exchanges its enrollment token for an mTLS cert. Bridge starts. Sites Overview on the hub now shows the new site.
+6. **Camera onboarding**: scan VLAN-10, identify, authenticate, test, configure. GitOps commit + Frigate reload.
+7. **Operator login**: console at the router via Tailscale or local network.
+
+## Failure modes considered
+
+| Failure | Mitigation |
+|---|---|
+| WAN down | Site continues operating; events queue locally; bridge resumes when WAN returns |
+| Hub down | Router operates standalone; queue grows; reconnects automatically |
+| Cell crash | Frigate auto-restarts via Balena; events buffered; selftest alerts in <30s |
+| Camera offline | Detected by selftest; alert in SYNOPSIS and CAMS; doesn't block other cameras |
+| Bridge cert expires | Selftest warns 90 days before expiry; auto-renewal planned (currently manual) |
+| GitOps push conflict | Reconcile fails, alerts via `_bridge/status`, last-known-good remains applied |
+| Cell disk full | Retention rotates oldest first; soft alert at 75%, hard alert at 85%; evidence locker has a separate quota |
+| Operator forgets password | Router has local recovery via console port; hub has Keycloak admin recovery |
+
+## Out of scope (for design phase)
+
+- Mobile operator app: explicitly post-MVP.
+- Real-time video streaming to the operator over WAN: bandwidth-prohibitive, on-demand clips only.
+- Automated camera positioning / PTZ control: vendor-specific, deferred.
+- Cross-site real-time correlation (e.g., follow a vehicle across sites in real time): analytics-grade feature, post-MVP.
+
+## See also
+
+- [`tiers.md`](tiers.md): detailed roles per tier.
+- [`mqtt-contract.md`](mqtt-contract.md): MQTT topic schema.
+- [`data-sovereignty.md`](data-sovereignty.md): what stays local, what leaves.
+- [`storage-retention.md`](storage-retention.md): capacity planning.
+- [`network-topology.md`](network-topology.md): VLANs, firewall, segmentation.