# System-Level Test Design

## Testability Assessment

- **Controllability: PASS**
  - **Dexie/IndexedDB**: Highly controllable. The DB can be seeded, cleared, and inspected programmatically in tests.
  - **State Management (Zustand)**: The store is decoupled from the UI, allowing direct state manipulation during component testing.
  - **Environment**: The "Local-First" design reduces dependency on flaky external staging environments for core logic.
  - *Concern*: LLM API nondeterminism. Requires strict mocking/recording (Polly.js or Playwright HAR) for stable regression testing.
- **Observability: PASS**
  - **Client-Side Logs**: The architecture mandates a "Client-Side Transaction Log", which provides direct visibility into sync states.
  - **Network Interception**: Playwright can inspect Vercel Edge Function calls to validate privacy (keys not leaked).
  - *Concern*: Debugging production issues in a PWA requires robust telemetry (Sentry/LogRocket), since we cannot access a user's local DB directly.
- **Reliability: PASS**
  - **Service Layer Pattern**: The "Logic Sandwich" isolates business logic from the UI, enabling reliable unit/integration testing of complex sync logic.
  - **Offline-First**: Flaky network conditions are straightforward to test because offline mode can be simulated directly.

## Architecturally Significant Requirements (ASRs)

| ASR ID    | Requirement                       | Impact   | Testing Challenge                                                                 | Risk Score (P x I) |
| --------- | --------------------------------- | -------- | --------------------------------------------------------------------------------- | ------------------ |
| **ASR-1** | **Local-First / Offline Support** | Critical | Requires simulating network drops, tab closures, and sync resumption.             | 9 (3x3)            |
| **ASR-2** | **Privacy (Zero-Knowledge)**      | Critical | Must verify NO user data leaves client. Negative testing required.                | 9 (3x3)            |
| **ASR-3** | **Dual-Agent Pipeline**           | High     | Complex state machine (Venting -> Insight -> Draft). Nondeterministic AI outputs. | 6 (2x3)            |
| **ASR-4** | **Latency (<3s)**                 | Medium   | Requires performance profiling of Edge Functions and Client-Side rendering.       | 4 (2x2)            |

## Test Levels Strategy

Given the "Local-First PWA" architecture:

- **Unit: 40%**
  - **Focus**: Business logic in `services/`, Zustand selectors, prompt engineering utilities, data transformers.
  - **Rationale**: Core complexity is in state management and data transformation, not server interaction.
  - **Tool**: Vitest.
- **Integration: 30%**
  - **Focus**: `Service <-> Dexie` interactions, Sync Queue processing, API Proxy contracts.
  - **Rationale**: Validating that offline actions are correctly queued and later synced is the critical "integration" path.
  - **Tool**: Vitest (with in-memory Dexie) or Playwright (API/Component).
- **E2E: 30%**
  - **Focus**: Full "Vent to Draft" journey, PWA Installability, Offline-to-Online transitions.
  - **Rationale**: Critical user journeys depend on browser APIs (Service Worker, IndexedDB) that behave differently in real environments.
  - **Tool**: Playwright.

## NFR Testing Approach

- **Security (SEC)**:
  - **Key Leakage**: Playwright network interception to verify `Authorization` headers and ensure API keys never appear in the DOM or console.
  - **Data Sovereignty**: Automated checks to ensure sensitive "Vents" are NOT in network payloads (except to the LLM endpoint).
- **Performance (PERF)**:
  - **Lighthouse CI**: Automate Core Web Vitals checks on every PR, plus a Time to Interactive budget (TTI < 1.5s).
  - **Latency**: Measure simulated "Time to First Token" in E2E tests.
- **Reliability (OPS)**:
  - **Chaos Testing**: Simulate 429s (Rate Limits) and 500s from the LLM Provider. Verify that the "Graceful Degradation" UI appears.
  - **Persistence**: Verify data survives browser restart and Service Worker updates.
- **Maintainability (TECH)**:
  - **Strict Boundaries**: ESLint rules to prevent UI components from importing the `db` layer directly.

## Test Environment Requirements

- **Local (Dev)**: `localhost` with mocked LLM responses (Fast, Free).
- **CI (GitHub Actions)**: Headless browser support. Matrix testing for Mobile Safari (WebKit) emulation.
- **Staging**: Vercel Preview Deployment. Connected to live (but rate-limited) LLM keys for "Smoke Tests".

## Testability Concerns

- **LLM Cost/Flakiness**: Running E2E tests against real OpenAI/DeepSeek APIs is slow and expensive.
  - *Recommendation*: Use VCR/HAR recording for 90% of E2E runs. Only run "Live" tests on release branches.
- **Mobile Safari Debugging**: PWA bugs are notorious on iOS.
  - *Recommendation*: A manual "Sanity Check" on a real iOS device is required before major releases until automated WebKit testing is proven reliable.

## Recommendations for Remaining Implementation

1. **Strict Mocking Strategy**: Implement a "MockLLMService" immediately to unblock UI testing without burning API credits.
2. **Visual Regression**: Add snapshot tests for the "Draft View" typography (Merriweather) since it's the core value artifact.
3. **Sync Torture Test**: Create a specialized test suite that toggles network status rapidly during a "Venting" session to ensure no data loss.
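
As a starting point for the "MockLLMService" recommendation, the sketch below shows one possible shape for a deterministic mock that can also inject provider failures (429s and 500s) for chaos-style tests. It is a minimal illustration under stated assumptions, not the project's actual service contract: the names `Stage`, `LLMService`, `LLMError`, `MockLLMService`, and `completeOrDegrade` are all hypothetical, and the two stages are simply borrowed from the Venting -> Insight -> Draft pipeline described above.

```typescript
// Illustrative sketch only: every name here (Stage, LLMService, LLMError,
// MockLLMService, completeOrDegrade) is hypothetical, not from the codebase.

// The two AI-backed stages of the Venting -> Insight -> Draft pipeline.
type Stage = "insight" | "draft";

interface LLMService {
  complete(stage: Stage, input: string): Promise<string>;
}

// Error carrying the HTTP status a provider would return (e.g. 429 or 500).
class LLMError extends Error {
  readonly status: number;
  constructor(status: number) {
    super("LLM provider responded with HTTP " + status);
    this.status = status;
  }
}

// Structural check, so the guard does not depend on how classes are compiled.
function isLLMError(err: unknown): err is LLMError {
  const candidate = err as { status?: unknown } | null;
  return err instanceof Error && typeof candidate?.status === "number";
}

class MockLLMService implements LLMService {
  // Every request is recorded so tests can assert on what was sent.
  readonly calls: { stage: Stage; input: string }[] = [];
  private readonly canned: Record<Stage, string>;
  private readonly pendingFailures: number[] = [];

  constructor(canned: Record<Stage, string>) {
    this.canned = canned;
  }

  // Queue HTTP statuses; each queued status fails exactly one upcoming call.
  failNextWith(...statuses: number[]): void {
    this.pendingFailures.push(...statuses);
  }

  async complete(stage: Stage, input: string): Promise<string> {
    this.calls.push({ stage, input });
    const status = this.pendingFailures.shift();
    if (status !== undefined) {
      throw new LLMError(status);
    }
    return this.canned[stage];
  }
}

interface CompletionResult {
  ok: boolean;
  text: string;
  status?: number;
}

// Converts provider failures into a fallback result that a UI could render.
async function completeOrDegrade(
  llm: LLMService,
  stage: Stage,
  input: string,
  fallback: string
): Promise<CompletionResult> {
  try {
    return { ok: true, text: await llm.complete(stage, input) };
  } catch (err) {
    if (isLLMError(err)) {
      return { ok: false, text: fallback, status: err.status };
    }
    throw err; // Unknown errors should still fail the test loudly.
  }
}
```

A test could construct the mock with canned text for each stage, call `failNextWith(429)` before triggering a draft, and then assert that the fallback is shown and that `calls` contains only the expected requests. Keeping failure injection as a queue means a single test can script a sequence such as "rate limited, then server error, then success" without timers or network stubs.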