# System-Level Test Design

## Testability Assessment

- **Controllability: PASS**
  - **Dexie/IndexedDB**: Highly controllable. DB can be seeded, cleared, and inspected programmatically for tests.
  - **State Management (Zustand)**: Store is decoupled from UI, allowing direct state manipulation during component testing.
  - **Environment**: "Local-First" nature reduces dependency on flaky external staging environments for core logic.
  - *Concern*: LLM API nondeterminism. Requires strict mocking/recording (Polly.js or Playwright HAR) for stable regression testing.

- **Observability: PASS**
  - **Client-Side Logs**: Architecture mandates a "Client-Side Transaction Log" which provides excellent visibility into sync states.
  - **Network Interception**: Playwright can easily inspect Vercel Edge Function calls to validate privacy (keys not leaked).
  - *Concern*: Debugging production issues in a PWA requires robust telemetry (Sentry/LogRocket), since a user's local DB cannot be accessed directly.

- **Reliability: PASS**
  - **Service Layer Pattern**: "Logic Sandwich" isolates business logic from UI, enabling reliable unit/integration testing of complex sync logic.
  - **Offline-First**: Core flows do not depend on the network, so flaky network conditions are easy to reproduce in tests by simulating offline mode.
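
As a rough illustration of the controllability and "Logic Sandwich" points above, the sketch below keeps business logic behind a small store interface so a test can seed, inspect, and clear data without rendering any UI. `Vent`, `VentStore`, `InMemoryVentStore`, and `VentService` are hypothetical names, and the in-memory store only stands in for a Dexie table; it does not use the Dexie API.

```typescript
// Hypothetical record shape; the real schema is not specified in this document.
interface Vent {
  id: number;
  text: string;
  synced: boolean;
}

// Minimal storage contract the service depends on (an assumption, not the Dexie API).
interface VentStore {
  add(vent: Vent): void;
  all(): Vent[];
  clear(): void;
}

// In-memory stand-in for a Dexie table, intended for tests only.
class InMemoryVentStore implements VentStore {
  private rows: Vent[] = [];

  add(vent: Vent): void {
    this.rows.push({ ...vent });
  }

  all(): Vent[] {
    // Return copies so callers cannot mutate stored rows by accident.
    return this.rows.map((row) => ({ ...row }));
  }

  clear(): void {
    this.rows = [];
  }
}

// Business logic sits between the UI and the store.
class VentService {
  private readonly store: VentStore;

  constructor(store: VentStore) {
    this.store = store;
  }

  // Returns the vents that still need to be synced.
  pending(): Vent[] {
    return this.store.all().filter((vent) => !vent.synced);
  }
}

// Seed, exercise, inspect, and reset without touching any component.
const store = new InMemoryVentStore();
store.add({ id: 1, text: "first", synced: true });
store.add({ id: 2, text: "second", synced: false });
const service = new VentService(store);
const pendingIds = service.pending().map((vent) => vent.id);
store.clear();
```

Because the service only sees the `VentStore` contract, the same test could later run against a real Dexie-backed implementation without changing the assertions.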

## Architecturally Significant Requirements (ASRs)

| ASR ID    | Requirement                       | Impact   | Testing Challenge                                                                 | Risk Score (P x I) |
| --------- | --------------------------------- | -------- | --------------------------------------------------------------------------------- | ------------------ |
| **ASR-1** | **Local-First / Offline Support** | Critical | Requires simulating network drops, tab closures, and sync resumption.             | 9 (3x3)            |
| **ASR-2** | **Privacy (Zero-Knowledge)**      | Critical | Verify no user data leaves client except to LLM endpoint. Needs negative tests.   | 9 (3x3)            |
| **ASR-3** | **Dual-Agent Pipeline**           | High     | Complex state machine (Venting -> Insight -> Draft). Nondeterministic AI outputs. | 6 (2x3)            |
| **ASR-4** | **Latency (<3s)**                 | Medium   | Requires performance profiling of Edge Functions and Client-Side rendering.       | 4 (2x2)            |
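
Since ASR-3 combines a state machine with nondeterministic AI output, one option is to test the transitions separately from the generated text. The sketch below models the Venting -> Insight -> Draft flow as a transition table. The event names (`insightGenerated`, `draftGenerated`) and the strictly linear flow are assumptions made for illustration; the real pipeline may allow retries or other branches.

```typescript
type Stage = "venting" | "insight" | "draft";

// Hypothetical event names; the real pipeline events are not specified here.
type PipelineEvent = "insightGenerated" | "draftGenerated";

// Allowed transitions per stage. Anything not listed is illegal.
const transitions: Record<Stage, Partial<Record<PipelineEvent, Stage>>> = {
  venting: { insightGenerated: "insight" },
  insight: { draftGenerated: "draft" },
  draft: {},
};

// Returns the next stage, or null when the event is not allowed in this stage.
function nextStage(stage: Stage, event: PipelineEvent): Stage | null {
  const target = transitions[stage][event];
  return target === undefined ? null : target;
}

// Replays a sequence of events from the initial stage and fails on illegal steps.
function run(events: PipelineEvent[]): Stage {
  let stage: Stage = "venting";
  for (const event of events) {
    const target = nextStage(stage, event);
    if (target === null) {
      throw new Error("Illegal transition: " + stage + " on " + event);
    }
    stage = target;
  }
  return stage;
}
```

Tests written against `run` assert which stage is reached, not what the model wrote, so they stay stable even when the AI output changes between runs.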

## Test Levels Strategy

Given the "Local-First PWA" architecture:

- **Unit: 40%**
  - **Focus**: Business logic in `services/`, Zustand selectors, prompt engineering utilities, data transformers.
  - **Rationale**: Core complexity is in state management and data transformation, not server interaction.
  - **Tool**: Vitest.

- **Integration: 30%**
  - **Focus**: `Service <-> Dexie` interactions, Sync Queue processing, API Proxy contracts.
  - **Rationale**: Validating that offline actions are correctly queued and later synced is the critical "integration" path.
  - **Tool**: Vitest (with in-memory Dexie) or Playwright (API/Component).

- **E2E: 30%**
  - **Focus**: Full "Vent to Draft" journey, PWA Installability, Offline-to-Online transitions.
  - **Rationale**: Critical user journeys depend on browser APIs (Service Worker, IndexedDB) that behave differently in real environments.
  - **Tool**: Playwright.
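
The integration path described above (offline actions are queued, then synced later) can be sketched as a small queue that removes an action only after it has been delivered. `SyncQueue`, `QueuedAction`, and the boolean-returning `Send` callback are hypothetical; the real Sync Queue and its transport are not specified in this document, and the transport is synchronous here only to keep the sketch short.

```typescript
// Hypothetical queued action; the real payload shape is not specified here.
interface QueuedAction {
  id: string;
  payload: string;
}

// Delivers one action and reports whether delivery succeeded (assumed contract).
type Send = (action: QueuedAction) => boolean;

class SyncQueue {
  private pendingActions: QueuedAction[] = [];
  private online = false;
  private readonly send: Send;

  constructor(send: Send) {
    this.send = send;
  }

  // Actions are always queued first, so nothing is lost while offline.
  enqueue(action: QueuedAction): void {
    this.pendingActions.push(action);
    this.flush();
  }

  setOnline(online: boolean): void {
    this.online = online;
    this.flush();
  }

  pendingCount(): number {
    return this.pendingActions.length;
  }

  // Delivers in FIFO order; an action is removed only after a successful send.
  private flush(): void {
    while (this.online && this.pendingActions.length > 0) {
      const head = this.pendingActions[0];
      if (head === undefined || !this.send(head)) {
        return; // keep the action queued for the next attempt
      }
      this.pendingActions.shift();
    }
  }
}

// Usage: queue two actions while offline, then reconnect.
const delivered: string[] = [];
const queue = new SyncQueue((action) => {
  delivered.push(action.id);
  return true;
});
queue.enqueue({ id: "a1", payload: "first vent" });
queue.enqueue({ id: "a2", payload: "second vent" });
const pendingWhileOffline = queue.pendingCount();
queue.setOnline(true);
```

An integration test would swap the in-memory array for Dexie and the callback for the API proxy, while keeping the same two assertions: nothing is sent while offline, and everything is sent in order after reconnecting.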

## NFR Testing Approach

- **Security (SEC)**:
  - **Key Leakage**: Use Playwright network interception to inspect `Authorization` headers, and assert that API keys never appear in the DOM or console.
  - **Data Sovereignty**: Automated checks to ensure sensitive "Vents" are NOT in network payloads (except to LLM endpoint).

- **Performance (PERF)**:
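  - **Lighthouse CI**: Automate performance checks on every PR: Core Web Vitals plus a Time to Interactive (TTI) budget of < 1.5s. TTI is a Lighthouse lab metric rather than a Core Web Vital, so it needs its own assertion.
  - **Latency**: Measure a simulated "Time to First Token" in E2E tests.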

- **Reliability (OPS)**:
  - **Chaos Testing**: Simulate 429s (Rate Limits) and 500s from LLM Provider. Verify "Graceful Degradation" UI appears.
  - **Persistence**: Verify data survives browser restart and Service Worker updates.

- **Maintainability (TECH)**:
  - **Strict Boundaries**: ESLint rules to prevent UI components from importing `db` layer directly.
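
The Data Sovereignty check above could be sketched as a pure function over captured requests, independent of how they were captured (Playwright interception, a HAR file, or similar). `CapturedRequest`, `findLeaks`, and the example URLs are hypothetical, and plain substring matching is an assumption; it would miss vent text that has been encoded or transformed before sending.

```typescript
// Hypothetical shape for a request captured during a test run.
interface CapturedRequest {
  url: string;
  body: string;
}

// Returns every request that carries sensitive text to a URL outside the allowlist.
function findLeaks(
  requests: CapturedRequest[],
  sensitiveTexts: string[],
  allowedUrlPrefixes: string[]
): CapturedRequest[] {
  return requests.filter((request) => {
    const allowed = allowedUrlPrefixes.some(
      (prefix) => request.url.indexOf(prefix) === 0
    );
    const carriesSensitiveText = sensitiveTexts.some(
      (text) => request.body.indexOf(text) !== -1
    );
    return carriesSensitiveText && !allowed;
  });
}

// Usage: the vent may go to the LLM endpoint, but nowhere else.
const captured: CapturedRequest[] = [
  { url: "https://app.example/api/llm", body: '{"prompt":"I had a rough day"}' },
  { url: "https://telemetry.example/events", body: '{"msg":"I had a rough day"}' },
  { url: "https://telemetry.example/events", body: '{"event":"page_view"}' },
];
const leaks = findLeaks(
  captured,
  ["I had a rough day"],
  ["https://app.example/api/llm"]
);
```

A negative test then asserts that `leaks` is empty for a real session; in the usage above it contains the telemetry request that echoed the vent text.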

## Test Environment Requirements

- **Local (Dev)**: `localhost` with mocked LLM responses (Fast, Free).
- **CI (GitHub Actions)**: Headless browser support. Matrix testing for Mobile Safari (WebKit) emulation.
- **Staging**: Vercel Preview Deployment. Connected to live (but rate-limited) LLM keys for "Smoke Tests".
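
One possible way to express these environments is a single Playwright config whose base URL comes from the environment and whose projects cover desktop Chromium and emulated Mobile Safari. This is a config fragment under assumptions: the `./e2e` test directory, the `BASE_URL` variable name, the local port, and the project names are all hypothetical.

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  // Assumed location of the E2E specs.
  testDir: './e2e',
  // Retry only in CI, where flakiness is more costly to triage.
  retries: process.env.CI ? 2 : 0,
  use: {
    // Local dev by default; a Vercel Preview URL can be injected for staging runs.
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    // The iPhone device descriptors run on Playwright's WebKit build.
    { name: 'mobile-safari', use: { ...devices['iPhone 13'] } },
  ],
});
```

WebKit emulation approximates Mobile Safari but is not the same as a real iOS device, which is why a manual device check may still be needed.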

## Testability Concerns

- **LLM Cost/Flakiness**: Running E2E tests against real OpenAI/DeepSeek APIs is slow and expensive.
  - *Recommendation*: Use VCR/HAR recording for 90% of E2E runs. Only run "Live" tests on release branches.
- **Mobile Safari Debugging**: PWA bugs are notorious on iOS.
  - *Recommendation*: A manual "Sanity Check" on a real iOS device is required before major releases until automated WebKit testing is proven reliable.
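
The VCR/HAR recommendation above boils down to keying recorded responses by request and failing loudly when a replayed run hits something that was never recorded. The sketch below shows that idea with a plain in-memory "cassette". `Cassette`, `requestKey`, `record`, and `replay` are hypothetical helpers, not the Polly.js or Playwright HAR API, and keying on model plus prompt is an assumption about what determines a response.

```typescript
// A cassette maps a request key to the recorded response text.
type Cassette = Record<string, string>;

// Builds a stable key from the request parts assumed to determine the response.
function requestKey(model: string, prompt: string): string {
  return JSON.stringify([model, prompt]);
}

// Stores a response captured from a live run.
function record(
  cassette: Cassette,
  model: string,
  prompt: string,
  response: string
): void {
  cassette[requestKey(model, prompt)] = response;
}

// Returns the recorded response, or throws so that unrecorded calls fail the test
// instead of silently reaching the live API.
function replay(cassette: Cassette, model: string, prompt: string): string {
  const key = requestKey(model, prompt);
  const hit = cassette[key];
  if (hit === undefined) {
    throw new Error("No recorded response for " + key);
  }
  return hit;
}

// Usage: record once, replay in every later run.
const cassette: Cassette = {};
record(cassette, "model-a", "Summarize my day", "You sounded tired but hopeful.");
const replayed = replay(cassette, "model-a", "Summarize my day");
```

Throwing on a miss is a deliberate choice: it keeps replayed runs free of API cost and makes prompt changes visible, since a changed prompt no longer matches any recording.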

## Recommendations for Remaining Implementation

1.  **Strict Mocking Strategy**: Implement a "MockLLMService" immediately to unblock UI testing without burning API credits.
2.  **Visual Regression**: Add snapshot tests for the "Draft View" typography (Merriweather) since it's the core value artifact.
3.  **Sync Torture Test**: Create a specialized test suite that toggles network status rapidly during a "Venting" session to ensure no data loss.
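
A minimal sketch of the "MockLLMService" from the first recommendation, with scripted failures so the same mock can also drive the 429/500 degradation checks. The `LLMOutcome` and `UiState` shapes and the state names are assumptions, and `complete` is synchronous only to keep the sketch short; a real service would be asynchronous.

```typescript
// Scripted outcome of one LLM call (hypothetical shape).
type LLMOutcome =
  | { kind: "ok"; text: string }
  | { kind: "error"; status: 429 | 500 };

// UI states the app is assumed to render for each outcome.
type UiState = "showResult" | "retryLater" | "providerUnavailable";

class MockLLMService {
  private readonly script: LLMOutcome[];
  private callCount = 0;

  constructor(script: LLMOutcome[]) {
    this.script = script;
  }

  // Returns scripted outcomes in order and repeats the last one when the script runs out.
  // Synchronous for brevity; the prompt is ignored because responses are canned.
  complete(_prompt: string): LLMOutcome {
    const index = Math.min(this.callCount, this.script.length - 1);
    this.callCount += 1;
    const outcome = this.script[index];
    if (outcome === undefined) {
      throw new Error("MockLLMService needs at least one scripted outcome");
    }
    return outcome;
  }
}

// Maps an outcome to the UI state a test would expect to see.
function toUiState(outcome: LLMOutcome): UiState {
  if (outcome.kind === "ok") {
    return "showResult";
  }
  return outcome.status === 429 ? "retryLater" : "providerUnavailable";
}

// Usage: a rate limit on the first call, then a canned success.
const mock = new MockLLMService([
  { kind: "error", status: 429 },
  { kind: "ok", text: "Canned insight" },
]);
const firstState = toUiState(mock.complete("vent text"));
const secondState = toUiState(mock.complete("vent text"));
```

Scripting outcomes per call keeps UI tests deterministic and free of API cost, and lets a single test walk through a failure followed by a recovery.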