Day 6 · Student handout

Integrated demo and architecture memo

Learners package the demo flow, architecture memo, known limits, validation hooks, and customer-acceptance evidence.

June 2026 · Canonical in the 7-day tutorial · Full local lesson

Today's goal

Turn the knowledge from the first five days into evidence you can demonstrate.

Demo scope

The demo can be semi-real-time; it does not need to pretend to be production-ready. What matters is measurement, clear boundaries, and reproducibility.

Flow:

audio file or microphone
-> VAD / ASR
-> optional diarization label
-> PII redaction
-> RAG retrieve
-> agent feedback
-> output guardrail
-> TTS or text response
-> audit log
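
The flow above can be sketched as a chain of stage functions that share one turn object and append to an audit log. This is a minimal illustration, not the course's implementation: every stage body is a stand-in (the "audio" is a dict that already carries a transcript, the PII rule masks only email addresses, and the RAG corpus is a one-entry dict), and all names such as `Turn` and `run_turn` are hypothetical. TTS is omitted, so the response stays as text.

```python
import re
import time
from dataclasses import dataclass, field
from typing import Optional

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


@dataclass
class Turn:
    """State carried through one demo turn."""
    audio: dict
    text: str = ""
    speaker: Optional[str] = None
    context: list = field(default_factory=list)
    reply: str = ""
    audit: list = field(default_factory=list)


def asr(turn):
    # Stand-in: a real demo would run VAD and an ASR model on the audio.
    turn.text = turn.audio["transcript"]
    turn.speaker = turn.audio.get("speaker")  # optional diarization label


def redact_pii(turn):
    # Toy rule: mask email addresses only. Real PII policy covers far more.
    turn.text = EMAIL.sub("[EMAIL]", turn.text)


def retrieve(turn):
    # Stand-in for RAG retrieval: keyword match against a tiny corpus.
    corpus = {"refund": "Refunds are processed within 14 days."}
    lowered = turn.text.lower()
    turn.context = [doc for key, doc in corpus.items() if key in lowered]


def agent_feedback(turn):
    cited = turn.context[0] if turn.context else "no matching document"
    turn.reply = f"Feedback based on: {cited}"


def output_guardrail(turn):
    # Re-apply redaction to the output in case PII reappears in the reply.
    turn.reply = EMAIL.sub("[EMAIL]", turn.reply)


STAGES = [asr, redact_pii, retrieve, agent_feedback, output_guardrail]


def run_turn(audio):
    """Run every stage in order and record per-stage timing in the audit log."""
    turn = Turn(audio=audio)
    for stage in STAGES:
        start = time.perf_counter()
        stage(turn)
        elapsed_ms = (time.perf_counter() - start) * 1000
        turn.audit.append({"stage": stage.__name__, "ms": round(elapsed_ms, 3)})
    return turn
```

Keeping the stages in one ordered list makes the demo script and the audit log agree on what ran and in which order, and it gives each component its own latency entry.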

The demo screen or README must list:

hardware
model names
runtime
input audio condition
ASR latency
LLM first token latency
TTS first audio latency
end-to-end latency
known limitations
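
One way to keep that list honest is to check it mechanically before the demo. The sketch below assumes the README values are collected in a plain dict keyed by the field names above; `REQUIRED_FIELDS`, `missing_fields`, and `render` are hypothetical helper names, and no real measurements are implied.

```python
REQUIRED_FIELDS = [
    "hardware",
    "model names",
    "runtime",
    "input audio condition",
    "ASR latency",
    "LLM first token latency",
    "TTS first audio latency",
    "end-to-end latency",
    "known limitations",
]


def missing_fields(report):
    """Return the required fields that are absent or blank in the report."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(report.get(name, "")).strip()
    ]


def render(report):
    """Render the report as README lines, refusing to emit a partial list."""
    missing = missing_fields(report)
    if missing:
        raise ValueError("README is missing: " + ", ".join(missing))
    return "\n".join(f"{name}: {report[name]}" for name in REQUIRED_FIELDS)
```

Calling `render` with a partially filled dict raises an error that names the gaps, so an unmeasured latency cannot be silently left out.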

Architecture memo template

Title:

Enterprise Voice Agent Gateway v0 Architecture Proposal

Paragraph 1: capability

This architecture supports multiple enterprise AI Coach tasks. The shared layer
handles audio intake, model routing, identity, policy, PII, tool permission,
memory scope, audit, evaluation, and red teaming; the adapter layer handles each
task's taxonomy, RAG corpus, output schema, policy rules, tool permissions, and
evaluator.
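
The split between the shared layer and the adapter layer can be sketched as one gateway object plus a per-task configuration record. This is an illustration under assumptions, not a prescribed design: `TaskAdapter` and `Gateway` are hypothetical names, only tool permission and audit are shown from the shared layer, and the `sales_coach` values are made up.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TaskAdapter:
    """Per-task settings; the fields mirror the adapter list in the memo."""
    taxonomy: list
    rag_corpus: str
    output_schema: dict
    policy_rules: list
    tool_permissions: set
    evaluator: Callable[[dict], bool]


class Gateway:
    """Shared layer: only tool permission and audit are sketched here."""

    def __init__(self):
        self.adapters = {}
        self.audit_log = []

    def register(self, task_name, adapter):
        self.adapters[task_name] = adapter

    def authorize_tool(self, user, task_name, tool):
        allowed = tool in self.adapters[task_name].tool_permissions
        # Every decision is logged, whether it was allowed or denied.
        self.audit_log.append(
            {"user": user, "task": task_name, "tool": tool, "allowed": allowed}
        )
        return allowed


# Hypothetical adapter for illustration; every value here is made up.
sales_coach = TaskAdapter(
    taxonomy=["objection_handling", "closing"],
    rag_corpus="sales-playbook",
    output_schema={"score": "int", "feedback": "str"},
    policy_rules=["no_pricing_promises"],
    tool_permissions={"crm_lookup"},
    evaluator=lambda output: "feedback" in output,
)

gateway = Gateway()
gateway.register("sales_coach", sales_coach)
```

Adding a new coaching task then means registering another adapter; the permission check and the audit trail stay in one place.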

Paragraph 2: layers

Audio layer:
  VAD / ASR / diarization / hotword correction / TTS

Knowledge layer:
  parsing / metadata / embedding / vector DB / reranker / citation

Agent layer:
  orchestrator / tool registry / memory / approval queue

Governance layer:
  identity / RBAC / ABAC / PII / policy / audit / red teaming

Deployment layer:
  Docker / K8s / vLLM / observability / rollback

Paragraph 3: validation

Functional:
  Completes the assigned coaching / report / retrieval task

Quality:
  ASR WER/CER, keyword accuracy, RAG hit@k/MRR, answer faithfulness

Latency:
  p50/p95 by component and end-to-end

Security:
  OWASP/NIST mapped red-team pass rate, PII leakage tests

Ops:
  health checks, logs, GPU memory, rollback, acceptance checklist
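
Several of the Quality and Latency metrics above are small enough to compute without a library. The sketch below uses the standard definitions (WER as word-level edit distance over reference length, hit@k, MRR, and nearest-rank percentiles); the function names are hypothetical, and the choice of nearest-rank for p50/p95 is an assumption, since other percentile conventions give slightly different values on small samples.

```python
import math


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hypothesis.split()) / len(ref)


def hit_at_k(ranked, relevant, k):
    """Fraction of queries with at least one relevant doc in the top k."""
    hits = sum(1 for docs, rel in zip(ranked, relevant) if rel & set(docs[:k]))
    return hits / len(ranked)


def mrr(ranked, relevant):
    """Mean reciprocal rank of the first relevant doc (0 when none is found)."""
    total = 0.0
    for docs, rel in zip(ranked, relevant):
        for rank, doc in enumerate(docs, start=1):
            if doc in rel:
                total += 1.0 / rank
                break
    return total / len(ranked)


def percentile(samples, p):
    """Nearest-rank percentile, e.g. p=50 or p=95 over latency samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]
```

CER follows the same pattern by passing character lists to `edit_distance` instead of word lists. Report p50 and p95 per component as well as end-to-end, so a slow stage is visible rather than averaged away.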

Today's deliverables

Create:

demo-script.md
architecture-memo.md
known-limitations.md
acceptance-checklist.md

Write known limitations in positive, scope-control language:

Current scope:
  The demo supports controlled audio-file or short-turn microphone input.

Validation layer:
  Real-time barge-in, overlap-heavy audio, and noisy far-field audio require
  separate latency and accuracy validation before production deployment.