Transcript Forensics · product overview

Pitch

Audio counterpart to Blocao video forensics. Recordings of meetings, phone calls, and depositions are processed on-site by Whisper + pyannote. The console offers natural-language queries ("what was said about the Acme contract, and who was most insistent?") with speaker identification, topic detection, and waveform-anchored playback.

Built on the same platform as Blocao: same router, same Cell hardware (with a microphone or audio ingest interface), same MQTT bus, same GitOps, same hub.
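
Combining Whisper and pyannote means joining two outputs: timed text segments from transcription and timed speaker turns from diarization. A minimal sketch of one common way to join them, assigning each segment to the speaker whose turns overlap it most, might look like the following. The `Segment` and `Turn` types and the `assign_speakers` helper are hypothetical names for illustration, not the product's actual data model or either library's API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A timed piece of transcribed text (hypothetical shape)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A timed speaker turn from diarization (hypothetical shape)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each segment with the speaker whose turns overlap it most."""
    labeled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Segments with no overlapping turn stay unattributed.
        speaker = max(totals, key=totals.get) if totals else "unknown"
        labeled.append((speaker, seg))
    return labeled


segments = [Segment(0.0, 4.0, "We need the contract signed."),
            Segment(4.0, 9.0, "Not before legal reviews it.")]
turns = [Turn(0.0, 3.5, "SPEAKER_00"), Turn(3.5, 10.0, "SPEAKER_01")]
for speaker, seg in assign_speakers(segments, turns):
    print(f"[{seg.start:.1f}-{seg.end:.1f}] {speaker}: {seg.text}")
```

Majority overlap is a simple choice; a segment that spans a speaker change would need to be split at word level for precise attribution.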

What you get

  • Transcript console with chat-style query, waveform with speaker bands, transcript navigable by speaker, and audio player with live captions.
  • Speaker library: voiceprints stored per-customer, automatically matched against new sessions.
  • Topic detection: NER + clustering surfaces what was discussed; per-speaker breakdown.
  • Insistence ranking: ranks speakers by how hard they pushed on each topic, based on intervention rate and repetition.
  • Pin to case: same case management as Blocao video.
  • Cross-product correlation (Year 2): correlate audio transcripts with video events at a site.
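
To make the insistence ranking above concrete, here is a rough sketch of how intervention rate and repetition could be combined into a per-topic score. The formula, the `repetition_weight` parameter, and the `insistence_ranking` function are assumptions chosen for illustration; the actual scoring is not specified in this overview.

```python
from collections import Counter, defaultdict


def insistence_ranking(utterances, repetition_weight=0.5):
    """Rank speakers per topic.

    utterances: list of (speaker, topic) pairs, one per utterance.
    The score is an assumed formula, not the product's definition.
    """
    per_topic = defaultdict(Counter)
    for speaker, topic in utterances:
        per_topic[topic][speaker] += 1

    ranking = {}
    for topic, counts in per_topic.items():
        total = sum(counts.values())
        scores = {}
        for speaker, n in counts.items():
            # Share of all utterances on this topic made by this speaker.
            intervention_rate = n / total
            # Fraction of the speaker's utterances that return to the topic.
            repetition = (n - 1) / n
            scores[speaker] = intervention_rate + repetition_weight * repetition
        ranking[topic] = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranking


utterances = [("Ana", "price"), ("Ben", "price"), ("Ana", "price"),
              ("Ana", "price"), ("Ben", "deadline")]
for topic, ranked in insistence_ranking(utterances).items():
    print(topic, ranked)
```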

Use cases

  1. Meeting forensics for legal/compliance: review of depositions, board meetings, contract negotiations.
  2. Customer service quality: search support call recordings for problematic patterns.
  3. Mediation and arbitration: who said what, when, with audio proof.
  4. Insurance claim review: verify what was promised in pre-purchase calls.
  5. Internal investigations: HR or compliance review of recorded conversations.

Why same platform as Blocao

Transcript Forensics and Blocao share enough infrastructure that running them on separate stacks would be wasteful:

  • Same router, same Cell, same MQTT, same GitOps, same hub.
  • Same console patterns (chat, waveform/timeline, player, case management).
  • Same sovereignty model.
  • Same evidence chain (post-MVP).

The only product-specific pieces are:

  • The Whisper + pyannote container in the Cell stack.
  • The speaker library schema in Qdrant.
  • The transcript-specific console panel.

See shared-stack.md for the breakdown of shared vs distinct.
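
The speaker library matches stored voiceprints against speakers in new sessions. A minimal sketch of that matching step, assuming voiceprints are fixed-length embedding vectors compared by cosine similarity, is shown below in plain Python. The `match_speaker` function, the 0.75 threshold, and the toy three-dimensional vectors are hypothetical; in the product the lookup would presumably be a vector search in Qdrant rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_speaker(embedding, library, threshold=0.75):
    """Return the best-matching known speaker, or None if below threshold.

    library: dict mapping speaker name to stored voiceprint vector.
    threshold: assumed cut-off; a real value would be tuned on data.
    """
    best_name, best_score = None, threshold
    for name, voiceprint in library.items():
        score = cosine_similarity(embedding, voiceprint)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


library = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
print(match_speaker([0.9, 0.1, 0.0], library))  # close to alice's voiceprint
print(match_speaker([0.0, 0.0, 1.0], library))  # matches nobody in the library
```

Returning `None` below the threshold lets the console treat the voice as a new, unnamed speaker instead of forcing a wrong match.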

Differentiators

| Vs | Transcript Forensics differentiator |
| --- | --- |
| Otter.ai / Sonix / Fireflies | On-site processing + sovereignty + speaker library across sessions |
| Microsoft Azure Speech / AWS Transcribe | No vendor lock-in + open-source models + same UX as Blocao |
| Open-source Whisper alone | Fleet-managed + speaker identification + topic detection + UX |

Status

Mockup complete (mockups/transcript-forensics.html). Backlog at backlog/transcript-sprint-backlog.md. Many stories are shared with the Blocao backlog (see backlog/shared-stories.md).

The transcript product launches after the Blocao MVP ships (target: Year 2). Until then, the design lives in this repo to ensure the platform is built with audio in mind from day one.