Test Design: Epic 2 - The Magic Mirror
Epic: 2 (Ghostwriter & Draft Refinement)
Scope: Epic-Level
Date: 2026-01-25
Author: QA Architect (AI)
1. Risk Assessment
Identified Risks
| Risk ID | Category | Title | Description | Probability (1-3) | Impact (1-3) | Score | Action |
|---|---|---|---|---|---|---|---|
| R-2.1 | BUS | Hallucination / Poor Quality | Ghostwriter generates content unrelated to the user's insight or creates fictional details. | 2 (Possible) | 3 (Critical) | 6 | MITIGATE |
| R-2.2 | TECH | Context Window Overflow | Long chat sessions exceed the token limit for the generation prompt, causing truncation or errors. | 2 (Possible) | 3 (Critical) | 6 | MITIGATE |
| R-2.3 | TECH | State Desynchronization | UI gets stuck in "Drafting" state if the LLM request hangs or fails silently. | 2 (Possible) | 2 (Degraded) | 4 | MONITOR |
| R-2.4 | TECH | Clipboard API Failures | "One-Click Copy" fails on certain mobile browsers due to permission policies. | 2 (Possible) | 2 (Degraded) | 4 | MONITOR |
| R-2.5 | UI | Markdown Rendering Issues | Generated artifacts break layout (e.g., extremely long code blocks, tables on mobile). | 1 (Unlikely) | 1 (Minor) | 1 | DOCUMENT |
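The Score column in the table above is consistent with probability multiplied by impact (2 x 3 = 6, 2 x 2 = 4, 1 x 1 = 1). A minimal sketch of that calculation follows. The function names and the action thresholds are assumptions chosen only to reproduce the rows in this table; the document does not state the actual cut-offs.

```typescript
// Hypothetical helper types; these names are illustrative and not from the project.
type Level = 1 | 2 | 3;
type RiskAction = "MITIGATE" | "MONITOR" | "DOCUMENT";

// Score = probability x impact, matching every row of the risk table.
function riskScore(probability: Level, impact: Level): number {
  return probability * impact;
}

// Assumed thresholds: 6+ -> MITIGATE, 4-5 -> MONITOR, below 4 -> DOCUMENT.
// Only the values 6, 4 and 1 are confirmed by the table itself.
function riskAction(score: number): RiskAction {
  if (score >= 6) return "MITIGATE";
  if (score >= 4) return "MONITOR";
  return "DOCUMENT";
}
```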
Mitigation Strategies (High Risks)
R-2.1: Hallucination / Poor Quality (Score 6)
- Mitigation: Implement specific "Grounding" prompts. Use evals (automated evaluation) to check if output tokens overlap with input "Insight" tokens.
- Owner: Prompt Engineer / Dev
- Validation: Automated Prompt Tests (checking recall of key facts).
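One way the token-overlap check for R-2.1 could look is sketched below. This is an illustrative recall metric, not the project's eval harness: `tokenize` and `insightRecall` are hypothetical names, the tokenizer is a naive ASCII word splitter, and any pass threshold (for example 0.6) would need to be tuned against real drafts.

```typescript
// Naive tokenizer (assumption): lowercase, split on non-alphanumerics,
// and drop tokens of one or two characters to reduce stop-word noise.
function tokenize(text: string): Set<string> {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 2);
  return new Set(words);
}

// Fraction (0 to 1) of distinct insight tokens that also appear in the draft.
function insightRecall(insight: string, draft: string): number {
  const insightTokens = tokenize(insight);
  if (insightTokens.size === 0) return 1; // nothing to recall
  const draftTokens = tokenize(draft);
  let hits = 0;
  insightTokens.forEach((token) => {
    if (draftTokens.has(token)) hits += 1;
  });
  return hits / insightTokens.size;
}
```

A low recall value suggests the draft drifted away from the user's insight. A high value does not prove the draft is free of invented details, so this check complements human review rather than replacing it.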
R-2.2: Context Window Overflow (Score 6)
- Mitigation: Implement strict token counting utility. Summarize or truncate chat history intelligently before sending to Ghostwriter.
- Owner: Dev Team
- Validation: Unit tests for PromptEngine with large mock inputs.
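The truncation half of the R-2.2 mitigation might be sketched as follows. Everything here is an assumption for illustration: `ChatMessage`, `estimateTokens`, and `fitToBudget` are hypothetical names, and the four-characters-per-token estimate is a rough heuristic standing in for the model's real tokenizer. Summarization is not covered.

```typescript
// Hypothetical message shape; the real chat model may differ.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent messages that fit within maxTokens, in original order.
function fitToBudget(history: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest and stop at the first message that does not fit.
  for (const message of history.slice().reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > maxTokens) break;
    kept.unshift(message);
    used += cost;
  }
  return kept;
}
```

Unit tests with large mock inputs can then assert that the estimated token total of the returned messages never exceeds the budget, and that the newest messages are the ones retained.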
2. Test Coverage Plan
Acceptance Criteria Mapping
| Story | ID | Scenario | Level | Priority | Risk Link |
|---|---|---|---|---|---|
| 2.1 | 2.1.1 | Ghostwriter receives correct chat context (Prompt Construction) | Unit | P0 | R-2.1 |
| 2.1 | 2.1.2 | Token limit enforcement (Truncation/Error) | Unit | P0 | R-2.2 |
| 2.1 | 2.1.3 | Generated output is valid Markdown | Unit | P1 | R-2.5 |
| 2.2 | 2.2.1 | Draft Sheet slides up upon completion | Component | P1 | - |
| 2.2 | 2.2.2 | Draft view renders Markdown correctly (Headers, lists) | Component | P2 | R-2.5 |
| 2.3 | 2.3.1 | "Thumbs Down" triggers feedback prompt | Integration | P1 | - |
| 2.3 | 2.3.2 | Regeneration respects user critique | E2E | P0 | R-2.1 |
| 2.4 | 2.4.1 | "Copy" button places text in clipboard | E2E | P0 | R-2.4 |
| 2.4 | 2.4.2 | "Save" marks session as completed in DB | Integration | P0 | - |
Test Levels Strategy
- Unit Tests:
  - PromptEngine: Verify context insertion and token limits.
  - MarkdownParser: Verify safe rendering logic.
- Component Tests:
  - DraftSheet: Verify open/close animations and state binding (Zustand).
  - MarkdownRenderer: Visual regression tests for styles.
- Integration Tests:
  - GhostwriterService: Mock LLM response -> Verify State Update -> Verify DB Update.
- E2E Tests:
  - Full Flow (P0): Chat -> Generate -> Copy to Clipboard.
  - Refinement Flow (P1): Generate -> Critique -> Regenerate.
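The integration path for GhostwriterService (mock LLM response, then state update, then DB update) could be exercised against a function shaped roughly like the sketch below. It is not the project's implementation: `GhostwriterDeps`, `withTimeout`, and `runGhostwriter` are hypothetical names, and the injected callbacks stand in for the real LLM client, the Zustand store, and the Dexie table. The timeout also illustrates one way to keep the UI from staying in "Drafting" when a request hangs (R-2.3).

```typescript
type DraftStatus = "idle" | "drafting" | "ready" | "error";

interface DraftState {
  status: DraftStatus;
  draft: string | null;
}

// Hypothetical dependencies, injected so tests can mock each one.
interface GhostwriterDeps {
  generate: (prompt: string) => Promise<string>; // LLM call
  saveDraft: (draft: string) => Promise<void>;   // DB write
  setState: (next: DraftState) => void;          // store update
  timeoutMs: number;
}

// Reject if the wrapped promise does not settle within ms milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("LLM request timed out")), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}

async function runGhostwriter(prompt: string, deps: GhostwriterDeps): Promise<DraftState> {
  deps.setState({ status: "drafting", draft: null });
  let final: DraftState;
  try {
    const draft = await withTimeout(deps.generate(prompt), deps.timeoutMs);
    await deps.saveDraft(draft);
    final = { status: "ready", draft };
  } catch {
    // Any failure or timeout moves the state out of "drafting".
    final = { status: "error", draft: null };
  }
  deps.setState(final);
  return final;
}
```

An integration test can pass a `generate` mock that resolves, rejects, or never settles, then assert on the sequence of states and on whether `saveDraft` was called.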
3. Execution Plan
Smoke Tests (Pre-Merge)
- Unit: PromptEngine sanity checks.
- E2E: Basic Generation Flow (Mocked LLM).
Regression Suite (Nightly)
- Unit: Token limit edge cases.
- E2E: Clipboard functionality on mobile viewport emulation.
- Prompt Evals: Quality checks on sample inputs.
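For the clipboard checks, it may help if the copy logic has a testable fallback path, since R-2.4 anticipates permission failures on some mobile browsers. The sketch below is one possible shape, not the project's code: `ClipboardDeps` and `copyDraft` are hypothetical names. In a browser, `writeText` would typically wrap `navigator.clipboard.writeText`, and `legacyCopy` would wrap a hidden textarea plus the deprecated `document.execCommand("copy")`.

```typescript
// Injected so the logic can be unit tested without a browser (assumption).
interface ClipboardDeps {
  writeText?: (text: string) => Promise<void>; // async Clipboard API, if available
  legacyCopy: (text: string) => boolean;       // fallback; returns true on success
}

type CopyResult = "async-api" | "legacy" | "failed";

async function copyDraft(text: string, deps: ClipboardDeps): Promise<CopyResult> {
  if (deps.writeText) {
    try {
      await deps.writeText(text);
      return "async-api";
    } catch {
      // Permission denied or insecure context: fall through to the legacy path.
    }
  }
  return deps.legacyCopy(text) ? "legacy" : "failed";
}
```

Returning a result code rather than throwing lets the UI show a "copy failed, select the text manually" hint when both paths fail.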
Resource Estimates
- P0 Scenarios: 5 tests (approx. 5 hours implementation).
- P1 Scenarios: 3 tests (approx. 2 hours implementation).
- P2 Scenarios: 1 test (approx. 0.5 hours implementation).
- Total Effort: ~1 day.
4. Quality Gate Criteria
- Pass Rate: 100% on P0 tests.
- Performance: Generation starts within 5s (mocked latency).
- Mitigation: Token limiter unit tests must pass.
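The three gate criteria above can be expressed as a single check, for example in a CI step. This is a sketch under assumptions: `TestResult`, `GateInput`, and `evaluateGate` are hypothetical names, and how the test results and the generation-start latency are collected is left to the actual pipeline.

```typescript
interface TestResult {
  id: string;
  priority: "P0" | "P1" | "P2";
  passed: boolean;
}

interface GateInput {
  results: TestResult[];
  generationStartMs: number;        // measured against the mocked LLM
  tokenLimiterTestsPassed: boolean; // outcome of the R-2.2 unit tests
}

// Returns a list of gate violations; an empty list means the gate passes.
function evaluateGate(input: GateInput): string[] {
  const failures: string[] = [];
  const failedP0 = input.results.filter((r) => r.priority === "P0" && !r.passed);
  if (failedP0.length > 0) {
    failures.push("P0 failures: " + failedP0.map((r) => r.id).join(", "));
  }
  if (input.generationStartMs > 5000) {
    failures.push("Generation start exceeded 5s: " + input.generationStartMs + "ms");
  }
  if (!input.tokenLimiterTestsPassed) {
    failures.push("Token limiter unit tests did not pass");
  }
  return failures;
}
```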