brachnha-insight/_bmad-output/implementation-artifacts/test-design-epic-2.md


# Test Design: Epic 2 - The Magic Mirror
**Epic:** 2 (Ghostwriter & Draft Refinement)
**Scope:** Epic-Level
**Date:** 2026-01-25
**Author:** QA Architect (AI)
## 1. Risk Assessment
### Identified Risks
| Risk ID | Category | Title | Description | Probability (1-3) | Impact (1-3) | Score | Action |
| :-------- | :------- | :------------------------------- | :------------------------------------------------------------------------------------------------- | :---------------- | :----------- | :---- | :----------- |
| **R-2.1** | BUS | **Hallucination / Poor Quality** | Ghostwriter generates content unrelated to the user's insight or creates fictional details. | 2 (Possible) | 3 (Critical) | **6** | **MITIGATE** |
| **R-2.2** | TECH | **Context Window Overflow** | Long chat sessions exceed the token limit for the generation prompt, causing truncation or errors. | 2 (Possible) | 3 (Critical) | **6** | **MITIGATE** |
| **R-2.3** | TECH | **State Desynchronization** | UI gets stuck in "Drafting" state if the LLM request hangs or fails silently. | 2 (Possible) | 2 (Degraded) | 4 | MONITOR |
| **R-2.4** | TECH | **Clipboard API Failures** | "One-Click Copy" fails on certain mobile browsers due to permission policies. | 2 (Possible) | 2 (Degraded) | 4 | MONITOR |
| **R-2.5** | UI | **Markdown Rendering Issues** | Generated artifacts break layout (e.g., extremely long code blocks, tables on mobile). | 1 (Unlikely) | 1 (Minor) | 1 | DOCUMENT |
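
The Score column is the product of Probability and Impact. The sketch below reproduces that arithmetic. The action thresholds (6 or more is MITIGATE, 3 to 5 is MONITOR, below 3 is DOCUMENT) are assumptions inferred from the five rows above rather than a documented policy, and `riskScore` and `riskAction` are hypothetical names.

```typescript
type Level = 1 | 2 | 3;
type Action = "MITIGATE" | "MONITOR" | "DOCUMENT";

// Score = probability x impact, matching the Score column of the risk table.
function riskScore(probability: Level, impact: Level): number {
  return probability * impact;
}

// Assumed thresholds, chosen only so that they reproduce the table rows.
function riskAction(score: number): Action {
  if (score >= 6) return "MITIGATE";
  if (score >= 3) return "MONITOR";
  return "DOCUMENT";
}
```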
### Mitigation Strategies (High Risks)
**R-2.1: Hallucination / Poor Quality (Score 6)**
* **Mitigation:** Implement specific "Grounding" prompts. Use `evals` (automated evaluations) to check that the output tokens overlap with the tokens of the input "Insight".
* **Owner:** Prompt Engineer / Dev
* **Validation:** Automated Prompt Tests (checking recall of key facts).
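
One possible shape for the token-overlap check is a recall score: the fraction of distinct insight tokens that reappear in the generated draft. This is a minimal sketch, not the project's eval harness. `tokenize` and `insightRecall` are hypothetical names, the tokenizer assumes ASCII English text, and word overlap is only a crude proxy for factual grounding. A pass threshold would have to be tuned on real samples.

```typescript
// Lowercase, split on non-alphanumeric characters, and drop tokens of 1-2 characters.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 2);
}

// Fraction (0..1) of distinct insight tokens that also appear in the draft.
function insightRecall(insight: string, draft: string): number {
  const insightTokens = new Set(tokenize(insight));
  if (insightTokens.size === 0) return 1; // nothing to recall
  const draftTokens = new Set(tokenize(draft));
  let hits = 0;
  insightTokens.forEach((t) => {
    if (draftTokens.has(t)) hits += 1;
  });
  return hits / insightTokens.size;
}
```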
**R-2.2: Context Window Overflow (Score 6)**
* **Mitigation:** Implement strict token counting utility. Summarize or truncate chat history intelligently before sending to Ghostwriter.
* **Owner:** Dev Team
* **Validation:** Unit tests for `PromptEngine` with large mock inputs.
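
The truncation half of this mitigation could look like the sketch below, which keeps the most recent messages that fit a token budget. It is an illustration under stated assumptions: `ChatMessage`, `estimateTokens`, and `truncateHistory` are hypothetical names, and the estimate of roughly 4 characters per token is a heuristic standing in for a real tokenizer. Summarizing older messages, the other option named above, is not shown.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Rough estimate of ~4 characters per token. An assumption, not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walk from newest to oldest and keep messages until the budget would be exceeded.
// The returned array preserves the original chronological order.
function truncateHistory(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (const message of messages.slice().reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > maxTokens) break;
    kept.unshift(message);
    used += cost;
  }
  return kept;
}
```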
---
## 2. Test Coverage Plan
### Acceptance Criteria Mapping
| Story | ID | Scenario | Level | Priority | Risk Link |
| :------ | :---- | :-------------------------------------------------------------- | :---------- | :------- | :-------- |
| **2.1** | 2.1.1 | Ghostwriter receives correct chat context (Prompt Construction) | Unit | **P0** | R-2.1 |
| **2.1** | 2.1.2 | Token limit enforcement (Truncation/Error) | Unit | **P0** | R-2.2 |
| **2.1** | 2.1.3 | Generated output is valid Markdown | Unit | P1 | R-2.5 |
| **2.2** | 2.2.1 | Draft Sheet slides up upon completion | Component | P1 | - |
| **2.2** | 2.2.2 | Draft view renders Markdown correctly (Headers, lists) | Component | P2 | R-2.5 |
| **2.3** | 2.3.1 | "Thumbs Down" triggers feedback prompt | Integration | P1 | - |
| **2.3** | 2.3.2 | Regeneration respects user critique | E2E | **P0** | R-2.1 |
| **2.4** | 2.4.1 | "Copy" button places text in clipboard | E2E | **P0** | R-2.4 |
| **2.4** | 2.4.2 | "Save" marks session as completed in DB | Integration | **P0** | - |
### Test Levels Strategy
* **Unit Tests:**
    * `PromptEngine`: Verify context insertion and token limits.
    * `MarkdownParser`: Verify safe rendering logic.
* **Component Tests:**
    * `DraftSheet`: Verify open/close animations and state binding (Zustand).
    * `MarkdownRenderer`: Visual regression tests for styles.
* **Integration Tests:**
    * `GhostwriterService`: Mock LLM response -> Verify State Update -> Verify DB Update.
* **E2E Tests:**
    * **Full Flow (P0):** Chat -> Generate -> Copy to Clipboard.
    * **Refinement Flow (P0):** Generate -> Critique -> Regenerate.
---
## 3. Execution Plan
### Smoke Tests (Pre-Merge)
1. **Unit:** `PromptEngine` sanity checks.
2. **E2E:** Basic Generation Flow (Mocked LLM).
### Regression Suite (Nightly)
1. **Unit:** Token limit edge cases.
2. **E2E:** Clipboard functionality on mobile viewport emulation.
3. **Prompt Evals:** Quality checks on sample inputs.
### Resource Estimates
* **P0 Scenarios:** 5 tests (approx. 5 hours implementation).
* **P1 Scenarios:** 3 tests (approx. 2 hours implementation).
* **P2 Scenarios:** 1 test (approx. 0.5 hours implementation).
* **Total Effort:** ~7.5 hours (about 1 day).
---
## 4. Quality Gate Criteria
* **Pass Rate:** 100% on P0 tests.
* **Performance:** Generation starts within 5s (mocked latency).
* **Mitigation:** Token limiter unit tests must pass.
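
The three criteria above could be checked mechanically in CI. The following is a sketch under assumptions, not an existing script: `TestResult`, `GateInput`, and `evaluateQualityGate` are hypothetical names, and the inputs are assumed to come from whatever test reporter the project uses. It returns the list of violated criteria, so an empty list means the gate passes.

```typescript
type Priority = "P0" | "P1" | "P2";

interface TestResult {
  id: string;
  priority: Priority;
  passed: boolean;
}

interface GateInput {
  results: TestResult[];
  generationStartMs: number; // measured time until generation starts, with mocked latency
  tokenLimiterTestsPassed: boolean;
}

// Returns the violated criteria; an empty array means the quality gate passes.
function evaluateQualityGate(input: GateInput): string[] {
  const failures: string[] = [];
  const failedP0 = input.results.filter((r) => r.priority === "P0" && !r.passed);
  if (failedP0.length > 0) {
    failures.push("P0 failures: " + failedP0.map((r) => r.id).join(", "));
  }
  if (input.generationStartMs > 5000) {
    failures.push("Generation start exceeded 5s");
  }
  if (!input.tokenLimiterTestsPassed) {
    failures.push("Token limiter unit tests did not pass");
  }
  return failures;
}
```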