# Test Design: Epic 2 - The Magic Mirror

**Epic:** 2 (Ghostwriter & Draft Refinement)
**Scope:** Epic-Level
**Date:** 2026-01-25
**Author:** QA Architect (AI)

## 1. Risk Assessment

### Identified Risks

| Risk ID   | Category | Title                            | Description                                                                                        | Probability (1-3) | Impact (1-3) | Score | Action       |
| :-------- | :------- | :------------------------------- | :------------------------------------------------------------------------------------------------- | :---------------- | :----------- | :---- | :----------- |
| **R-2.1** | BUS      | **Hallucination / Poor Quality** | Ghostwriter generates content unrelated to the user's insight or creates fictional details.        | 2 (Possible)      | 3 (Critical) | **6** | **MITIGATE** |
| **R-2.2** | TECH     | **Context Window Overflow**      | Long chat sessions exceed the token limit for the generation prompt, causing truncation or errors. | 2 (Possible)      | 3 (Critical) | **6** | **MITIGATE** |
| **R-2.3** | TECH     | **State Desynchronization**      | UI gets stuck in "Drafting" state if the LLM request hangs or fails silently.                      | 2 (Possible)      | 2 (Degraded) | 4     | MONITOR      |
| **R-2.4** | TECH     | **Clipboard API Failures**       | "One-Click Copy" fails on certain mobile browsers due to permission policies.                      | 2 (Possible)      | 2 (Degraded) | 4     | MONITOR      |
| **R-2.5** | UI       | **Markdown Rendering Issues**    | Generated artifacts break layout (e.g., extremely long code blocks, tables on mobile).             | 1 (Unlikely)      | 1 (Minor)    | 1     | DOCUMENT     |

### Mitigation Strategies (High Risks)

**R-2.1: Hallucination / Poor Quality (Score 6)**
*   **Mitigation:** Add explicit "Grounding" instructions to the generation prompt. Use `evals` (automated evaluation) to check that output tokens overlap with the input "Insight" tokens.
*   **Owner:** Prompt Engineer / Dev
*   **Validation:** Automated Prompt Tests (checking recall of key facts).
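
The token-overlap check described for R-2.1 could look roughly like the sketch below. It is a minimal illustration, not the project's eval harness: the names `contentTokens`, `insightRecall`, and `isGrounded`, the stop-word list, and the 0.6 threshold are all assumptions chosen for the example.

```typescript
// Hypothetical helpers for a grounding eval; names and threshold are assumptions.
const STOP_WORDS = new Set([
  "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that", "for", "on", "with",
]);

/** Lowercase the text, split on non-alphanumerics, and drop stop words. */
function contentTokens(text: string): Set<string> {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0 && !STOP_WORDS.has(w));
  return new Set(words);
}

/** Fraction of the insight's content tokens that also appear in the draft. */
function insightRecall(insight: string, draft: string): number {
  const insightTokens = contentTokens(insight);
  if (insightTokens.size === 0) return 1; // nothing to recall
  const draftTokens = contentTokens(draft);
  let hits = 0;
  insightTokens.forEach((token) => {
    if (draftTokens.has(token)) hits += 1;
  });
  return hits / insightTokens.size;
}

/** A draft counts as grounded when recall meets the (assumed) threshold. */
function isGrounded(insight: string, draft: string, threshold = 0.6): boolean {
  return insightRecall(insight, draft) >= threshold;
}
```

Lexical overlap only catches drafts that ignore the insight. It does not detect invented details that sit alongside the right keywords, so it would complement rather than replace the key-fact recall tests.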

**R-2.2: Context Window Overflow (Score 6)**
*   **Mitigation:** Implement a strict token-counting utility. Summarize or truncate chat history before sending it to Ghostwriter so that the generation prompt stays within the token limit.
*   **Owner:** Dev Team
*   **Validation:** Unit tests for `PromptEngine` with large mock inputs.
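
As a rough sketch of the truncation half of this mitigation, the function below keeps the most recent messages that fit a token budget. The `ChatMessage` shape, the `estimateTokens` heuristic (about four characters per token), and `fitToBudget` are assumptions for illustration; a real implementation would use the tokenizer that matches the target model.

```typescript
// Hypothetical message shape; the real PromptEngine types may differ.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Rough heuristic (~4 characters per token), standing in for a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

/** Keep the newest messages whose combined estimate fits within maxTokens. */
function fitToBudget(history: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk backwards so the most recent context is preserved first.
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(history[i]); // restore chronological order
    used += cost;
  }
  return kept;
}
```

If even the latest message exceeds the budget, the result is an empty array. A caller would need to treat that as an error case rather than send an empty prompt, which is the kind of edge case the large-input unit tests should cover.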

---

## 2. Test Coverage Plan

### Acceptance Criteria Mapping

| Story   | ID    | Scenario                                                        | Level       | Priority | Risk Link |
| :------ | :---- | :-------------------------------------------------------------- | :---------- | :------- | :-------- |
| **2.1** | 2.1.1 | Ghostwriter receives correct chat context (Prompt Construction) | Unit        | **P0**   | R-2.1     |
| **2.1** | 2.1.2 | Token limit enforcement (Truncation/Error)                      | Unit        | **P0**   | R-2.2     |
| **2.1** | 2.1.3 | Generated output is valid Markdown                              | Unit        | P1       | R-2.5     |
| **2.2** | 2.2.1 | Draft Sheet slides up upon completion                           | Component   | P1       | -         |
| **2.2** | 2.2.2 | Draft view renders Markdown correctly (Headers, lists)          | Component   | P2       | R-2.5     |
| **2.3** | 2.3.1 | "Thumbs Down" triggers feedback prompt                          | Integration | P1       | -         |
| **2.3** | 2.3.2 | Regeneration respects user critique                             | E2E         | **P0**   | R-2.1     |
| **2.4** | 2.4.1 | "Copy" button places text in clipboard                          | E2E         | **P0**   | R-2.4     |
| **2.4** | 2.4.2 | "Save" marks session as completed in DB                         | Integration | **P0**   | -         |

### Test Levels Strategy

*   **Unit Tests:**
    *   `PromptEngine`: Verify context insertion and token limits.
    *   `MarkdownParser`: Verify safe rendering logic.
*   **Component Tests:**
    *   `DraftSheet`: Verify open/close animations and state binding (Zustand).
    *   `MarkdownRenderer`: Visual regression tests for styles.
*   **Integration Tests:**
    *   `GhostwriterService`: Mock LLM response -> Verify State Update -> Verify DB Update.
*   **E2E Tests:**
    *   **Full Flow (P0):** Chat -> Generate -> Copy to Clipboard.
    *   **Refinement Flow (P0):** Generate -> Critique -> Regenerate.
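
The "Verify State Update" step, together with risk R-2.3 (UI stuck in "Drafting"), can be exercised against a pure state-transition function. The sketch below is one possible shape and is not taken from the codebase: the `DraftState` fields, the event names, and `reduceDraft` are assumptions. A Zustand store could delegate to a function like this from its actions.

```typescript
// Hypothetical draft state machine; all names here are illustrative assumptions.
type DraftStatus = "idle" | "drafting" | "ready" | "error";

interface DraftState {
  status: DraftStatus;
  draft: string | null;
  error: string | null;
}

type DraftEvent =
  | { type: "GENERATE_REQUESTED" }
  | { type: "GENERATE_SUCCEEDED"; draft: string }
  | { type: "GENERATE_FAILED"; reason: string }
  | { type: "GENERATE_TIMED_OUT" };

const initialDraftState: DraftState = { status: "idle", draft: null, error: null };

function reduceDraft(state: DraftState, event: DraftEvent): DraftState {
  switch (event.type) {
    case "GENERATE_REQUESTED":
      return { status: "drafting", draft: null, error: null };
    case "GENERATE_SUCCEEDED":
      // Ignore late responses that arrive after a timeout or failure.
      if (state.status !== "drafting") return state;
      return { status: "ready", draft: event.draft, error: null };
    case "GENERATE_FAILED":
      if (state.status !== "drafting") return state;
      return { status: "error", draft: null, error: event.reason };
    case "GENERATE_TIMED_OUT":
      // A hung request always leaves "drafting", so the UI cannot get stuck.
      if (state.status !== "drafting") return state;
      return { status: "error", draft: null, error: "timeout" };
  }
}
```

Keeping the transitions pure means the timeout and silent-failure paths can be unit tested without timers or a mocked LLM. The integration test then only needs to confirm that the service dispatches the right events.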

---

## 3. Execution Plan

### Smoke Tests (Pre-Merge)
1.  **Unit:** `PromptEngine` sanity checks.
2.  **E2E:** Basic Generation Flow (Mocked LLM).

### Regression Suite (Nightly)
1.  **Unit:** Token limit edge cases.
2.  **E2E:** Clipboard functionality on mobile viewport emulation.
3.  **Prompt Evals:** Quality checks on sample inputs.

### Resource Estimates
*   **P0 Scenarios:** 5 tests (approx. 5 hours implementation).
*   **P1 Scenarios:** 3 tests (approx. 2 hours implementation).
*   **P2 Scenarios:** 1 test (approx. 0.5 hours implementation).
*   **Total Effort:** ~7.5 hours (about 1 day).

---

## 4. Quality Gate Criteria

*   **Pass Rate:** 100% on P0 tests.
*   **Performance:** Generation starts within 5 s of the request (measured with mocked LLM latency).
*   **Mitigation:** Token limiter unit tests (R-2.2) must pass.
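
The gate criteria could be checked automatically from a test report, as in the sketch below. The `TestResult` and `GateInput` shapes and the `evaluateGate` function are assumptions for illustration and do not reflect any particular CI tool's report format.

```typescript
// Hypothetical report shapes; adapt to whatever the test runner actually emits.
interface TestResult {
  id: string;
  priority: "P0" | "P1" | "P2";
  passed: boolean;
}

interface GateInput {
  results: TestResult[];
  generationStartMs: number; // time until generation starts, with mocked latency
}

/** Apply the gate: every P0 test passes and generation starts within the limit. */
function evaluateGate(
  input: GateInput,
  maxGenerationStartMs = 5000,
): { passed: boolean; failures: string[] } {
  const failures: string[] = [];
  const p0 = input.results.filter((r) => r.priority === "P0");
  if (p0.length === 0) failures.push("no P0 results reported");
  p0.filter((r) => !r.passed).forEach((r) => failures.push(`P0 test failed: ${r.id}`));
  if (input.generationStartMs > maxGenerationStartMs) {
    failures.push(`generation started after ${input.generationStartMs} ms`);
  }
  return { passed: failures.length === 0, failures };
}
```

Because the token limit scenario (2.1.2) is itself a P0 unit test, the P0 pass-rate check already covers the token limiter criterion in this sketch.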