Product Brief Distillate: Quorum

Core Concept

Quorum is an AI-native product development platform where the AI agents ARE the team, not a feature bolted onto a tool
Human is the "conductor," AI agents are the "orchestra" — baton always stays in human's hand
Reversed flow: start with full vision, filter through three pillars (Desirability, Feasibility, Viability), only survivors move forward
Same platform shapeshifts between solo founder (full AI team) and enterprise (humans lead AI agents) — same architecture, different interface
Target experience: "the smartest person on my team who never makes me feel dumb"

AI Agent Team — Named Roster & Behaviors

Canonical names (BMAD skills): John PO, Winston Dev Lead, Mary UX Researcher, Kinsley Product Designer, Luca Motion Designer, Jaymes Frontend, Damien Backend (conditional), Quinn QA, Cipher Security, Bob Scrum Master. Amelia (Developer) remains BMAD story executor alongside external coding-tool orchestration.

John (PO): priorities, stakeholder reports, roadmap tracking, ceremony agendas, auto-generated VP reports, autonomous mode with per-action approval dials — bmad-agent-pm
Winston (Dev Lead): complexity estimates, tech debt tracking, sprint health, iceberg flagging, self-challenge for solo users, estimate accuracy over time — bmad-agent-architect
Mary (UX Researcher): passive/active data collection, synthesis, user segmentation, feedback loop, proficiency detection, contradiction resolution — bmad-agent-analyst
Kinsley (Product Designer): step 1 co-facilitates design-thinking session → Figma Make prompt; 2a tightens prompt + 2b executes in Figma Make; live concept filter updates; post-PRD/journey refinement (high-fid, tokens, DS); IA; spatial UAT — bmad-agent-ux-designer
Luca (Motion Designer): transitions, micro-interactions, loading states; motion at 5.5 after journeys + 5.25 refined designs; feeling + exact values + reduced-motion; temporal UAT — bmad-agent-motion-designer
Jaymes (Frontend Developer, always in build): pixel-perfect UI from Kinsley + Luca via AI coding tools; coordinates with Damien on APIs — agent-jaymes
Damien (Backend Developer, conditional): database, auth, API, deployment via AI coding tools when needed — agent-damien
Quinn (QA Engineer): acceptance tests, automation, motion validation — bmad-agent-qa
Cipher (Security Agent): mandatory post-build audit before ship — agent-cipher
Bob (Scrum Master): sprint cadence, ceremonies, capacity, boards — bmad-agent-sm
BUILD ARCHITECTURE: Quorum does NOT write code. Quorum orchestrates external AI coding tools (Claude Code / Cursor / Windsurf / Copilot) by generating incredibly detailed story files with full context. Developer agents are directors — they know how to instruct coding tools for their domain. Stories are the critical handoff artifact: acceptance criteria, design specs, motion specs, architecture context, file paths, testing requirements. Quorum owns the thinking; external tools own the typing.
Agents challenge EACH OTHER AND the user — push back with backbone, not yes-men. Agent differentiation is structural (different evaluation frameworks, weighted criteria, data sources per agent), not just prompt personalities.
Agents actively seek human input when uncertain — know when to escalate vs handle independently
Solo founder: AI leans harder into challenger role, watches for rubber-stamping patterns
Golden Rule: AI never approves its own work into a sprint — researches, synthesizes, estimates, recommends, but human says "go"
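
The story handoff and the Golden Rule above can be sketched as a single record. This is a minimal illustration, not Quorum's schema: the class name `Story`, its field names (taken from the list of story contents above), and the `approve` / `ready_for_sprint` methods are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Story:
    """Hypothetical handoff artifact; field names are illustrative only."""
    title: str
    acceptance_criteria: List[str]
    design_specs: List[str] = field(default_factory=list)
    motion_specs: List[str] = field(default_factory=list)
    architecture_context: str = ""
    file_paths: List[str] = field(default_factory=list)
    testing_requirements: List[str] = field(default_factory=list)
    approved_by: Optional[str] = None  # the human who said "go"

    def approve(self, approver: str, is_human: bool) -> None:
        # Golden Rule: agents recommend, only a human approves.
        if not is_human:
            raise PermissionError("AI agents cannot approve work into a sprint")
        self.approved_by = approver

    def ready_for_sprint(self) -> bool:
        # Needs a human approval and at least one acceptance criterion.
        return self.approved_by is not None and bool(self.acceptance_criteria)


story = Story(title="Checkout form",
              acceptance_criteria=["Card errors shown inline"])
```

In this sketch an agent calling `story.approve("Winston", is_human=False)` raises an error, while a human approval makes `story.ready_for_sprint()` return True.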

Three-Pillar Filter — Mechanics

Every feature/idea passes through Desirability, Feasibility, and Viability research
Not a checklist — a decision framework the software executes
Inspired by "Continuous Discovery Habits", operationalized in software
Filter is not one-time gate — can be re-invoked as lightweight health check with current data (pillar re-validation)
Discovery findings from production MUST re-enter the three-pillar filter before earning a place in the plan — this is the scope creep firewall
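
A rough sketch of the "only survivors move forward" rule follows. It assumes each pillar's research collapses to a pass/fail outcome, which is a simplification for illustration (the brief describes a decision framework, not a checklist). `Idea`, `passes_filter`, and `survivors` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Pillar(Enum):
    DESIRABILITY = "desirability"
    FEASIBILITY = "feasibility"
    VIABILITY = "viability"


@dataclass
class Idea:
    name: str
    # Assumed shape: research outcome per pillar, True if the pillar held up.
    findings: Dict[Pillar, bool]


def passes_filter(idea: Idea) -> bool:
    # A pillar with no research yet counts as not passed.
    return all(idea.findings.get(p, False) for p in Pillar)


def survivors(ideas: List[Idea]) -> List[Idea]:
    # Only ideas that hold up on all three pillars move forward.
    return [i for i in ideas if passes_filter(i)]
```

Re-validation with current data is then just another call to `passes_filter` after the findings are refreshed, and a production discovery enters as a new `Idea` like any other.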

Feedback Loop Architecture (V2 but fully designed)

Dual-purpose loop: (1) validation (did predictions hold up?) and (2) discovery (what emerged that no prediction could catch?)
Validation findings close loops, discovery findings open NEW loops — different lifecycles
Discovery findings route back through three-pillar filter before entering backlog (scope creep firewall)

Collection Mechanisms

Lightweight passive always-on (analytics events, drop-off, micro-signals — sips, not gulps)
AI-prompted milestone activation at intelligent thresholds (e.g., 60 days live)
Manual on-demand override — human can trigger specific research anytime
Smart nudge with configurable reminders ("remind me in X days / after next sprint / turn off")
AI suggests WHAT to test and WHY based on passive signals — closes gap between "we should research" and "research on what?"

Synthesis

Tier-adaptive: Solo gets single clean AI view; Enterprise gets full orchestrated stack with channel-aware synthesis
Contextual confidence, not scores: "Strong signal" / "Emerging signal" / "In consideration" with caveats
Caveats as a feature — conditional inclusion, not binary show/hide
Segment-aware contradiction resolution — "power users love it, new users confused" = two truths, not one contradiction
User proficiency detection through behavior signals — prevents panic from skewed tester populations
Default lens biases toward least technical users (inclusive design philosophy)
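
Segment-aware contradiction resolution could look roughly like this. The sketch assumes feedback arrives as (segment, sentiment) pairs with sentiment +1 or -1; that input shape and the function names are assumptions, not part of the brief.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed input shape: (segment, sentiment), sentiment is +1 or -1.
Feedback = Tuple[str, int]


def summarize_by_segment(feedback: List[Feedback]) -> Dict[str, str]:
    totals: Dict[str, int] = defaultdict(int)
    for segment, sentiment in feedback:
        totals[segment] += sentiment
    # Report one verdict per segment instead of one blended average.
    return {
        seg: "positive" if total > 0 else "negative" if total < 0 else "mixed"
        for seg, total in totals.items()
    }


def is_segment_split(summary: Dict[str, str]) -> bool:
    # "Two truths": segments disagree, so present both rather than a conflict.
    verdicts = set(summary.values())
    return "positive" in verdicts and "negative" in verdicts
```

With power users positive and new users negative, `is_segment_split` returns True and the synthesis would show both findings side by side.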

Routing & Decision Gates

Smart routing by issue type: design → Kinsley, motion → Luca, functionality → Quinn/dev, strategic → John (PO)
Mandatory complexity scan before approval: shows the team the iceberg before they hit it
"Looks Simple" warning label for high gap between perceived and actual complexity
AI sprint impact analysis before anything touches a sprint — shows COST of adding things
Dual-path impact projection: Path A (act now) vs Path B (defer) with compounding deferral cost visualization
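
The routing table and the dual-path projection can be sketched as follows. The routing map comes straight from the list above. The deferral formula is an assumption made for illustration: it models "compounding" as a fixed growth rate per deferred sprint, which the brief does not specify.

```python
ROUTES = {
    "design": "Kinsley",
    "motion": "Luca",
    "functionality": "Quinn/dev",
    "strategic": "John (PO)",
}


def route(issue_type: str) -> str:
    # Unknown issue types are rejected rather than guessed at.
    if issue_type not in ROUTES:
        raise ValueError(f"no route for issue type: {issue_type}")
    return ROUTES[issue_type]


def deferral_cost(cost_now: float, growth_per_sprint: float,
                  sprints_deferred: int) -> float:
    # Assumed model: cost compounds at a fixed rate for each sprint deferred.
    return cost_now * (1 + growth_per_sprint) ** sprints_deferred


def project_paths(cost_now: float, growth_per_sprint: float,
                  sprints_deferred: int) -> dict:
    # Path A acts now; Path B defers and pays the compounded cost later.
    return {
        "A": cost_now,
        "B": deferral_cost(cost_now, growth_per_sprint, sprints_deferred),
    }
```

For example, a 10-point item growing 50% per sprint costs 22.5 points after two sprints of deferral under this assumed model.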

Ceremony Model

Findings batched into weekly digest, never streamed real-time (newspaper, not Twitter feed)
AI ceremony pre-brief: themes, impact, recommended discussion order — team walks in oriented
Solo: AI standup, 15 min | Small team: lightweight sync, 30 min | Enterprise: structured ceremony, 45-60 min
Crunch mode: quick freeze check, 5 min — delivery freeze, not collection freeze
HiPPO equalizer: every request enters same pipeline regardless of who suggested it

Governance

Enterprise: triad morning check-in (lead engineer, Kinsley or lead designer, John or PO) controls intake
Pause button: stops delivering but keeps collecting, auto-backlogs with "collected during freeze" tag
Sprint boundary as natural reset point — automatic feedback intake checkpoint
AI as advisory fourth member of triad — recommends pause/resume based on capacity, never acts unilaterally
Critical finding escalation during freeze — fire alarm is a recommendation with evidence, not an override
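
The pause button behaviour (stop delivering, keep collecting) can be sketched like this. `Intake`, `Finding`, and their fields are hypothetical names; only the "collected during freeze" tag comes from the brief.

```python
from dataclasses import dataclass, field
from typing import List

FREEZE_TAG = "collected during freeze"


@dataclass
class Finding:
    summary: str
    tags: List[str] = field(default_factory=list)


@dataclass
class Intake:
    frozen: bool = False
    delivered: List[Finding] = field(default_factory=list)
    backlog: List[Finding] = field(default_factory=list)

    def collect(self, finding: Finding) -> None:
        if self.frozen:
            # Delivery is paused, collection is not: tag and auto-backlog.
            finding.tags.append(FREEZE_TAG)
            self.backlog.append(finding)
        else:
            self.delivered.append(finding)
```

Setting `frozen` stays a human decision in this sketch; an advisory agent would only recommend flipping it.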

Finding Lifecycle (Hospital / Morgue / Decomposer)

Approved → estimate review → sprint → shipped → generates new signals (cycle restarts)
Deferred → Hospital: actively monitored, new confirming signals trigger re-evaluation ("this finding is getting healthier")
Killed → Morgue: autopsy record (why killed, supporting data, learnings), searchable institutional knowledge
Decomposer: AI monitors morgue for patterns — similar findings keep dying in same area? Synthesizes dead findings into new living insights (nutrient cycle)
Every state transition logged with who/why/when — full audit trail
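
The lifecycle above reads naturally as a small state machine with an audit log. The sketch below is one possible shape. The `NEW` state and the exact transition table are assumptions (in particular, that a Hospital finding can be re-evaluated into Approved or Morgue); the brief names the states but not every edge.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Tuple


class State(Enum):
    NEW = "new"
    APPROVED = "approved"
    ESTIMATE_REVIEW = "estimate_review"
    SPRINT = "sprint"
    SHIPPED = "shipped"
    HOSPITAL = "hospital"  # deferred, actively monitored
    MORGUE = "morgue"      # killed, autopsy recorded


# Assumed transition table, for illustration only.
ALLOWED = {
    State.NEW: {State.APPROVED, State.HOSPITAL, State.MORGUE},
    State.APPROVED: {State.ESTIMATE_REVIEW},
    State.ESTIMATE_REVIEW: {State.SPRINT},
    State.SPRINT: {State.SHIPPED},
    State.HOSPITAL: {State.APPROVED, State.MORGUE},
    State.SHIPPED: set(),
    State.MORGUE: set(),
}


@dataclass
class Finding:
    summary: str
    state: State = State.NEW
    # Each entry: (who, why, when, from_state, to_state)
    audit: List[Tuple[str, str, datetime, State, State]] = field(default_factory=list)

    def transition(self, to: State, who: str, why: str) -> None:
        if to not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {to.value} is not allowed")
        self.audit.append((who, why, datetime.now(timezone.utc), self.state, to))
        self.state = to
```

Because every call to `transition` appends who, why, and when, the audit trail falls out of the state change itself rather than being a separate step.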

Backlog Health

Backlog health monitor with mandatory intake threshold — when crossed, fixed number of items become mandatory per sprint (PO picks which, but can't skip payment)
Backlog decay scoring: an item that sits in the backlog for 3+ sprints triggers a "still valid?" check, forcing re-validate, archive, or escalate
Roadmap flex budget: 10-15% capacity pre-allocated to backlog items every sprint — baked in, not disruptive
Stakeholder requests get "stakeholder-flagged" status, top of backlog, resurfaced every sprint planning with current impact
Transparent priority queue — stakeholders can SEE their request position and why it hasn't been pulled
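
The flex budget and decay rule reduce to simple arithmetic. A minimal sketch, assuming capacity is measured in story points and decay is measured in whole sprints spent in the backlog (both assumptions; function names are hypothetical):

```python
def flex_budget(capacity_points: float, share: float = 0.10) -> float:
    # The brief pre-allocates 10-15% of sprint capacity to backlog items.
    if not 0.10 <= share <= 0.15:
        raise ValueError("share must be between 0.10 and 0.15")
    return capacity_points * share


def needs_revalidation(sprints_in_backlog: int, threshold: int = 3) -> bool:
    # 3+ sprints in the backlog triggers the "still valid?" check.
    return sprints_in_backlog >= threshold
```

For a 40-point sprint, a 10% share reserves 4 points for backlog items before any new work is planned.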

Dynamic Agent Scaling

When findings exceed processing capacity, AI spawns specialized sub-agents per domain
Sub-agents develop domain expertise across sprints ("4th checkout finding in 3 weeks — synthesized into single recommendation")
Agent hierarchy: Human → AI Lead Agents → AI Sub-Agents → Raw work
Human never manages sub-agents directly — talks only to leadership layer
Carrying capacity is elastic — AI has no WIP limit, only compresses for human bandwidth

Technical Debt Model

Mandatory tech debt sprints at regular cadence (configurable, e.g., every 3rd-4th sprint) — first-class roadmap citizens
Feedback-driven debt tracked separately from organic/legacy debt — feedback loop accountable for its own mess
AI pre-analysis before tech debt sprint: prioritized by risk, blast radius, upcoming dependencies

Decision Records & Traceability

Every decision (act/defer/kill) recorded with rationale and projected impact at time of decision
Revisits show original context: "When you evaluated in Sprint 12, cost was X. Now it's Y"
Roadmap contradictions framed as context signals: "Context changed since this passed the filter — here's what changed and why"
Full traceability from finding → decision → outcome
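
A decision record that can replay its original context might look like the sketch below. The class and field names are hypothetical; the revisit message mirrors the wording quoted above.

```python
from dataclasses import dataclass

VALID_ACTIONS = ("act", "defer", "kill")


@dataclass(frozen=True)  # records are immutable once written
class DecisionRecord:
    finding: str
    action: str
    rationale: str
    sprint: int
    projected_cost: float  # projected impact at the time of the decision

    def __post_init__(self) -> None:
        if self.action not in VALID_ACTIONS:
            raise ValueError(f"action must be one of {VALID_ACTIONS}")

    def revisit(self, current_cost: float) -> str:
        # Show the original context next to today's figure.
        return (f"When you evaluated in Sprint {self.sprint}, "
                f"cost was {self.projected_cost}. Now it's {current_cost}.")


record = DecisionRecord("guest checkout", "defer", "low signal", 12, 3.0)
```

Freezing the dataclass is a design choice in this sketch: the original rationale cannot be edited after the fact, so a revisit always compares against what was actually believed at decision time.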

JIRA / External Tool Integration (Enterprise)

JIRA/Linear/Asana integration is optional enterprise feature, never forced
Bi-directional sync with conflict detection ("Ticket moved to Done in JIRA but AI shows tests haven't passed — resolve?")
Solo users: platform IS the tracker — sprint board, backlog, feedback pipeline, ceremonies all built-in, managed by SM Agent
Ticket auto-creation by finding type: validation-UI needs Kinsley+Winston+John approval, discovery needs pillar scores, stakeholder requests flagged separately
All tickets tagged: source, confidence, complexity, feedback-driven
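
Conflict detection in the bi-directional sync can be illustrated without touching any real tracker API. The sketch below assumes the external status arrives as a plain string and the test result as a boolean; the function name and inputs are hypothetical and no JIRA, Linear, or Asana calls are implied.

```python
from typing import Optional


def detect_conflict(ticket_id: str, external_status: str,
                    tests_passed: bool) -> Optional[str]:
    # Conflict: the tracker says Done but Quorum's own evidence disagrees.
    if external_status == "Done" and not tests_passed:
        return (f"Ticket {ticket_id} moved to Done in JIRA but AI shows "
                f"tests haven't passed. Resolve?")
    return None  # no conflict, sync can proceed
```

The function returns a question for the human rather than reverting the ticket, in line with the partnership model.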

AI Trust & Failure Handling

When AI is wrong: three-question diagnostic (how determined? why not caught earlier? what changed?)
Severity gate: show-stopper (real damage) → pull area out of AI, human-only until fixed | correction (caught before action) → flag, log, learning signal
Human-only override mode per topic area — granular quarantine, not system-wide shutdown
AI confidence disclosure on all estimates with reasoning
Partnership model: decisions always shared, never delegated — no "auto-approve" toggle exists
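
The severity gate and per-topic quarantine can be sketched as a small registry. `TrustRegistry` and its method names are hypothetical; the two severity labels come from the brief.

```python
from typing import List, Set


class TrustRegistry:
    def __init__(self) -> None:
        self.human_only: Set[str] = set()   # quarantined topic areas
        self.learning_log: List[str] = []   # corrections kept as signals

    def report_error(self, topic: str, severity: str) -> None:
        if severity == "show-stopper":
            # Real damage: pull only this area out of AI hands.
            self.human_only.add(topic)
        elif severity == "correction":
            # Caught before action: flag and log, no quarantine.
            self.learning_log.append(topic)
        else:
            raise ValueError(f"unknown severity: {severity}")

    def ai_allowed(self, topic: str) -> bool:
        return topic not in self.human_only

    def restore(self, topic: str) -> None:
        # Called by a human once the underlying problem is fixed.
        self.human_only.discard(topic)
```

Quarantine is keyed by topic, so a show-stopper in one area (say, estimates) leaves the AI available everywhere else.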

Confidence × Complexity Decision Grid

Every finding pre-plotted by AI on 2D grid (confidence: Strong/Emerging/In Consideration × complexity: Simple/Medium/Complex/"Looks Simple")
Used as visual triage tool during ceremonies — "3 fast-tracks, 2 deferrals, 1 iceberg"
AI pre-plots, humans drag to adjust — every adjustment trains future plotting
Strong+Simple = fast-track | Emerging+Complex = backlog | "Looks Simple" = STOP, full scan first
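
The three grid rules above translate directly into a lookup. In this sketch the fallback for cells the brief does not define ("discuss in ceremony") is an assumption, as is the function name.

```python
def triage(confidence: str, complexity: str) -> str:
    # "Looks Simple" overrides confidence: stop and run the full scan first.
    if complexity == "Looks Simple":
        return "stop: full scan first"
    if confidence == "Strong" and complexity == "Simple":
        return "fast-track"
    if confidence == "Emerging" and complexity == "Complex":
        return "backlog"
    # Assumed default for grid cells the brief leaves undefined.
    return "discuss in ceremony"
```

The output is only the AI's pre-plot; a human dragging an item to a different cell would override whatever this function returns.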

Scope Signals for MVP (V1)

Full named team: John, Winston, Mary, Kinsley, Luca, Jaymes, Quinn, Cipher, Bob + Damien (conditional) — V1
Build phase orchestrates external AI coding tools (Claude Code / Cursor as primary V1 integration) — Quorum generates stories, coding tools write code — V1
Mandatory security audit (Cipher) after build, before ship — V1
Step 1 design-thinking session + detailed Figma Make prompt; 2a/2b tighten + execute in Figma Make before filter; tokens/DS deferred to 5.25 after PRD + journeys — V1
Three-pillar filter with live-updating concept visuals — V1
Full pipeline: 1 → 2a → 2b → 2c → 3 → 4 → 5 → 5.25 (refine+tokens+DS) → 5.5 (motion) → 6 → 7 → 8 → 9 → 10a → 10b → 10c (Cipher) → 10d (UAT) → 11 → 12 — V1
PRD deliverables: HTML + Word (.docx) from single source — V1
Built-in sprint board, backlog, roadmap (SM Agent managed) — V1
Solo + enterprise experiences, toggleable for demo — V1
Team room GUI (conversational, not dashboard) — V1
AI challenger behavior — agents push back on the user, not just each other — V1
Feedback loop, Hospital/Morgue/Decomposer, JIRA integration, decision grid, agent-to-agent visible debate, dynamic sub-agent scaling — all V2
Domain-specific agent expertise for regulated industries — V2
Product snapshots and branching (git for product strategy) — V2
Animated prototypes (V1 outputs motion specs, V2 outputs animated previews) — V2

Explicitly Out (All Versions)

Third-party research platform integrations (Userlytics etc.) — future partnership
Mobile app — web-first
Self-hosted / on-premise deployment
Custom agent training by end users

Rejected Ideas & Design Decisions

AI never originates product concepts — human idea first, always (conductor model)
No real-time feedback streaming — batched digests only (prevents reactive decision-making)
No "auto-approve" or "let AI decide" toggle — partnership model by design
No manual WIP limits for AI — AI scales elastically, compresses only for human consumption
Confidence scores rejected in favor of contextual confidence tiers with caveats ("In consideration" > "72% confidence")
No binary permission configuration — governance inferred from team structure

Open Questions

Specific LLM architecture and orchestration approach for multi-agent collaboration (not yet designed — technical research covers LangGraph, CrewAI, AutoGen options)
Pricing model details — "tiered + usage-based" stated but tiers/thresholds undefined. PREMORTEM: model power user cost (45-100+ sessions/month), not average user cost
Data privacy model for enterprise — how are agent conversations and user research data isolated?
Onboarding flow — how does a solo founder go from "I have an idea" to a populated Quorum workspace? PREMORTEM: 10-min activation hook via design-thinking session + Figma Make prompt → concept frames (refinement later)
Design system and visual identity for team room GUI — "conversational, not dashboard" needs concrete UX patterns. Layered artifact model established (spatial + temporal layers on same screen data)
How three-pillar filter research actually executes — what data sources, what methods, what's automated vs guided? PREMORTEM: must show granular work, specific citations, falsifiable claims
Agent personality calibration — how much challenger behavior is right for different user types? SHARK TANK: build distinguishability eval suite
Marketplace for specialized agents (2-3 year vision) — technical and business model implications
Design generation technology stack — concept vs refinement (5.25): what generates exploratory Figma Make output vs production-ready spatial specs + tokens? Core technology investment
Motion spec output format — structured per-screen spec with feeling + technical values + reduced-motion fallback. Future: animated prototypes