diff --git a/docs/01-architecture/mqtt-contract.md b/docs/01-architecture/mqtt-contract.md
new file mode 100644
index 0000000..296aa5e
--- /dev/null
+++ b/docs/01-architecture/mqtt-contract.md
@@ -0,0 +1,196 @@
+# MQTT contract
+
+The MQTT topic schema is the inter-process contract of the platform. All publishers and subscribers must respect this schema. It is versioned as JSON Schema files in `packages/proto-mqtt/` of the future code repo.
+
+See [ADR-0003](../../decisions/0003-mqtt-como-espina-dorsal.md) for why MQTT is the spine.
+
+## Naming conventions
+
+- Lowercase, slash-separated.
+- Site prefix only when bridged to hub: `sites//...`.
+- Local topics (within a site) **don't** include the site prefix.
+- Underscore prefix (`_registry`, `_cmd`, `_bridge`) marks system-level topics, distinct from device topics.
+- `+` is the single-level wildcard, `#` is the multi-level wildcard (MQTT standard).
+
+## Topic catalog
+
+### Device events
+
+```
+hai//events/
+```
+
+- `event_type` ∈ `person | car | truck | motorcycle | bicycle | animal | package | license-plate | custom`.
+- Payload (JSON):
+  ```json
+  {
+    "id": "evt-8a2f",
+    "label": "person",
+    "conf": 0.92,
+    "zone": "entrance",
+    "ts": "2026-05-09T19:34:22.142Z",
+    "ts_nts": "2026-05-09T19:34:22.142Z",
+    "sha_clip": "a1f7...",
+    "sha_keyframe": "b2c8..."
+  }
+  ```
+- QoS: 1.
+- Retained: no.
+- Bridged: yes, with prefix `sites//`.
+
+### Device state
+
+```
+hai//state/
+```
+
+- `key` ∈ `online | fps | npu_pct | last_keepalive | error`.
+- Retained: yes.
+- Bridged: yes (status info needed at hub).
+
+### Snapshots
+
+```
+hai//snapshots/
+```
+
+- Binary JPEG, ~50-100 KB.
+- Retained: yes (last frame).
+- **Never bridged** (sovereignty rule, see [ADR-0005](../../decisions/0005-sovereignty-vs-hyperscaler.md)).
+
+### Telemetry (raw)
+
+```
+hai//telemetry/
+```
+
+- `key` ∈ `fps | bitrate | latency | drops`.
+- High-frequency, low-value individually.
+- Retained: no.
+- **Never bridged** raw; `healthd` aggregates and republishes it.
+
+### Service registry
+
+```
+hai/_registry/announce
+```
+
+- Published by Cell containers and other discoverable services.
+- Payload:
+  ```json
+  {
+    "type": "frigate | enricher | re-id | healthd | hai-console",
+    "id": "cell-01",
+    "host": "192.168.20.10",
+    "port": 5000,
+    "caps": ["frigate", "reid"],
+    "version": "0.4.2",
+    "started_at": "2026-04-25T14:18:00Z"
+  }
+  ```
+- Retained: yes.
+- Bridged: yes (hub needs to know what's deployed where).
+
+```
+hai/_registry/heartbeat/
+```
+
+- Published every 30 s by each registered service.
+- Retained: no.
+- Bridged: no (hub uses bridge state for liveness).
+
+### Health reports
+
+```
+hai/healthd/health
+```
+
+- Aggregated selftest result, published every 5 min.
+- Payload: structured object with results per test.
+- Retained: yes.
+- Bridged: yes (hub needs health for fleet view).
+
+### Bridge status
+
+```
+hai/_bridge/status
+```
+
+- Published by the router to indicate bridge health.
+- Payload: `{ "connected": true, "queue": 0, "rtt_ms": 38, "last_disconnect": "..." }`.
+- Retained: yes.
+- Special: this topic is **published locally by the router** about its own bridge state. Operators see it in the MQTT panel.
+
+### Commands (incoming from hub)
+
+```
+_cmd/
+```
+
+- Published by the hub, consumed by the router or Cell services.
+- Payload includes `requested_by`, `ts`, `correlation_id`.
+- Bridged inbound: yes, with hub-side topic `sites//_cmd/` mapped to local `_cmd/`.
+
+### Configuration changes (incoming from hub)
+
+```
+_config/
+```
+
+- Used by the hub to push GitOps-equivalent config updates.
+- In practice, GitOps does most of this; this topic is reserved for low-latency overrides.
+- Retained: yes.
+- Bridged inbound: yes.
+
+### Evidence chain (post-MVP, see ADR-0007)
+
+```
+hai/_evidence/manifest
+```
+
+- Published whenever a clip is finalized with its signed manifest.
+- Bridged: yes (hub keeps a second copy for hash-chain verification).
+
+## QoS guidelines
+
+- **QoS 0**: telemetry, snapshots, frequent low-value data.
+- **QoS 1**: events, state, registry, health, commands. Default for anything that should not be lost.
+- **QoS 2**: not used. Cost not justified for our use cases.
+
+## ACL guidelines
+
+- Cell containers can `pub/sub` only `hai//...` and `hai/_registry/announce`.
+- Router can `pub/sub` everything locally.
+- Bridge writes only to remapped namespaces and can't publish to any other local topic.
+- Operator console (read-only) can subscribe to `hai/#` but not publish.
+- Future: per-operator ACLs based on Keycloak roles.
+
+## Bridge policy (sovereignty matrix)
+
+What goes UP (site → hub):
+
+```
+hai/+/events/#          → sites//hai/+/events/#        QoS 1
+hai/+/state/#           → sites//hai/+/state/#         QoS 1, retained
+hai/_registry/announce  → sites//_registry/announce    QoS 1, retained
+hai/+/health            → sites//health/+              QoS 1, retained
+hai/_bridge/status      → sites//_bridge/status        QoS 1, retained
+hai/_evidence/manifest  → sites//_evidence/manifest    QoS 1
+```
+
+What goes DOWN (hub → site):
+
+```
+sites//_cmd/#     → _cmd/#      QoS 1
+sites//_config/#  → _config/#   QoS 1, retained
+```
+
+What is **never bridged**:
+
+```
+hai/+/snapshots/#   (binary JPEGs, sovereignty)
+hai/+/telemetry/#   (raw, aggregated only)
+$SYS/#              (broker internals)
+```
+
+This policy lives in `/etc/mosquitto/conf.d/bridge.conf` on the router, version-controlled in the site-config repo. The MQTT panel in the console renders it from the live config.
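
The sovereignty matrix above can be sanity-checked outside the broker with a small script. The following is a minimal Python sketch, not the platform's implementation: the function names `topic_matches` and `bridge_decision`, the device id `cam-01`, and the snapshot key `latest` are hypothetical, and actual enforcement stays in the Mosquitto bridge config referenced above. It assumes standard MQTT filter semantics, where `+` matches exactly one level and a trailing `#` matches the parent level and everything below it.

```python
from typing import List

# Local topic filters copied from the bridge policy; the hub-side remapping is omitted.
NEVER_BRIDGED: List[str] = ["hai/+/snapshots/#", "hai/+/telemetry/#", "$SYS/#"]
BRIDGED_UP: List[str] = [
    "hai/+/events/#",
    "hai/+/state/#",
    "hai/_registry/announce",
    "hai/+/health",
    "hai/_bridge/status",
    "hai/_evidence/manifest",
]


def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if `topic` matches the MQTT filter `pattern`."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            # Multi-level wildcard: matches the parent level and all deeper levels.
            return True
        if i >= len(t_levels):
            return False
        if p != "+" and p != t_levels[i]:
            return False
    # No '#' seen, so the level counts must be equal.
    return len(p_levels) == len(t_levels)


def bridge_decision(topic: str) -> str:
    """Classify a local topic as 'never', 'up', or 'local' (not bridged)."""
    # The deny list is checked first so a sovereignty rule always wins.
    if any(topic_matches(p, topic) for p in NEVER_BRIDGED):
        return "never"
    if any(topic_matches(p, topic) for p in BRIDGED_UP):
        return "up"
    return "local"


if __name__ == "__main__":
    # Hypothetical sample topics, one per policy outcome.
    for t in (
        "hai/cam-01/events/person",
        "hai/cam-01/snapshots/latest",
        "hai/_registry/heartbeat/cell-01",
    ):
        print(t, "->", bridge_decision(t))
```

Checking the deny list before the allow list is a deliberate choice in this sketch: if a broader upstream filter were ever added, snapshots and raw telemetry would still be classified as never bridged.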