# Multi-site fleet deployment

Pattern for customers with 5+ sites managed centrally.

## Topology

```
                ┌──────────────────────────────┐
                │  Blocao Hub (Hetzner DE/FI)  │
                │  - mosquitto                 │
                │  - keycloak                  │
                │  - qdrant + timescaledb      │
                │  - sites overview UI         │
                └────────────┬─────────────────┘
                             │
                   MQTT bridges over TLS
                             │
     ┌───────────────┬───────┴───────┬────────────────┐
     │               │               │                │
┌────▼────┐     ┌────▼────┐     ┌────▼────┐      ┌────▼────┐
│BL-LAB-1 │     │BL-LAB-2 │     │BL-WH-N  │      │BL-WH-S  │
│   R+1   │     │   R+1   │     │   R+2   │      │   R+1   │
└─────────┘     └─────────┘     └─────────┘      └─────────┘
```

Each site is independent: it operates without the hub if WAN goes down. The hub is a **coordination layer**, not a critical-path dependency.

## Two GitOps repos

When fleet management is in play, the hub provisions two repos per customer:

1. **`fleet-config`** (org-wide common settings):
   - Default firewall rules.
   - Default Frigate model versions.
   - Default retention policy.
   - DNS allowlist baseline.
   - Common operator role definitions.
2. **`site-config-`** (per-site overrides):
   - Site identity (BL-...).
   - Camera definitions specific to this site.
   - Retention overrides if different from fleet default.
   - Network specifics.

The router clones both. Reconcile applies fleet first, then site overrides.

This separation means:

- "Update Frigate to v0.15 across the fleet" → one commit to fleet-config, propagates to all sites in the next reconcile.
- "Add a camera to BL-LAB-2" → one commit to site-config-bl-lab-2, only affects that site.

## Operator workflow

A fleet operator (e.g., security ops at headquarters of a 30-store retailer) typically:

1. **Hub Sites Overview**: see all sites with health/alerts.
2. **Drill down**: click a site → Tailscale tunnel opens to that site's console.
3. **Investigation**: query forensics either at the site (single-site context) or at the hub (cross-site context).
4. **Bulk policy**: edit fleet-config repo for org-wide changes.

## Cross-site forensic search

Implemented in Epic 6 (post-MVP).

The hub maintains a consolidated embeddings index. Each site publishes embeddings (with site_id) to its bridge. Hub merges them.

Operator query at hub level fans out:

- Direct lookup in consolidated index for "find vehicle plate L-7234" or "find this face".
- Re-ranking with site-specific context.
- Deep dive into a site's full data via Tailscale tunnel.

Raw video stays at sites. Embeddings + metadata at hub. Sovereignty preserved.

## Sites Overview UI

Separate from the per-site router console. Implemented in `apps/hub` (future code repo).

Mockup not yet created; design TBD post-MVP. Likely:

- Map of sites with status pins.
- Aggregated health panel (% of sites green/warn/err).
- Aggregated alerts panel (active across the fleet).
- Bulk actions (update fleet-config, push command to N sites).

## Pricing model considerations

A 30-site customer should pay more in total than a 1-site customer, while the per-site price drops at higher tiers. Subscription tiers:

| Tier | Sites | Monthly per site |
|---|---|---|
| Starter | 1-5 | €30-50 |
| Standard | 6-25 | €25-40 |
| Fleet | 26-100 | €20-30 |
| Enterprise | 100+ | Custom |

Hardware sold separately. Support tiers add a flat monthly fee.

(All numbers placeholder; finalize with sales lead.)

## Operational considerations

- **Hub HA**: production hub should be at least 2 nodes (active-passive at minimum). For >50 sites, active-active with shared MinIO.
- **Hub backup**: daily snapshots to a second region (OVH France as standard secondary).
- **Site offline handling**: alerts after 5 min of bridge silence. Auto-resolve on reconnect.
- **Cert management**: each site's mTLS cert renews automatically every 6 months. Monitoring alerts at 90 days.

## Customer journey

1. **Pilot**: 1-2 sites, hub provisioned, validate fleet workflow.
2. **Rollout**: phased install of remaining sites.
3. **Optimization**: 3 months in, review which fleet-config defaults to tighten.
4. **Steady state**: ongoing ops, occasional new sites, regular fleet-config updates.

Total time from pilot to 30 sites in production: typically 6-9 months for a customer with established cabling/infra at each site.
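
As an illustration of the reconcile order described under "Two GitOps repos" (fleet first, then site overrides), the layering could look roughly like the sketch below. This is not the actual reconciler: the function name `layer_config`, the config keys, and the retention values are hypothetical, and it assumes both repos have already been parsed into plain dictionaries. It also assumes that nested mappings merge key by key while lists (such as camera definitions) are replaced wholesale by the site value.

```python
from copy import deepcopy
from typing import Any, Mapping


def layer_config(fleet: Mapping[str, Any], site: Mapping[str, Any]) -> dict[str, Any]:
    """Return fleet defaults with site overrides applied on top.

    Nested mappings are merged recursively; any other value (scalars, lists)
    from the site config replaces the fleet value outright. Inputs are not mutated.
    """
    merged = deepcopy(dict(fleet))
    for key, value in site.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = layer_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical parsed contents of fleet-config (org-wide defaults).
fleet_config = {
    "frigate": {"version": "0.15"},
    "retention": {"days": 30},
    "cameras": [],
}

# Hypothetical parsed contents of site-config-bl-lab-2 (per-site overrides).
site_config = {
    "site_id": "BL-LAB-2",
    "retention": {"days": 90},
    "cameras": [{"name": "dock-door"}],
}

effective = layer_config(fleet_config, site_config)
print(effective["frigate"]["version"])  # inherited from the fleet defaults
print(effective["retention"]["days"])   # overridden by the site
```

Under these assumptions, a fleet-wide change is an edit to `fleet_config` alone, and a site sees it on the next reconcile unless it overrides the same key.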