# System-Level Test Design

## Testability Assessment

- **Controllability: PASS**
  - **Dexie/IndexedDB**: Highly controllable. The database can be seeded, cleared, and inspected programmatically in tests.
  - **State Management (Zustand)**: The store is decoupled from the UI, so component tests can manipulate state directly.
  - **Environment**: The local-first design reduces dependence on flaky external staging environments for core logic.
  - *Concern*: LLM API responses are nondeterministic. Stable regression testing requires strict mocking or recording (Polly.js or Playwright HAR).

- **Observability: PASS**
  - **Client-Side Logs**: The architecture mandates a "Client-Side Transaction Log", which gives direct visibility into sync states.
  - **Network Interception**: Playwright can intercept calls to the Vercel Edge Functions to verify that API keys are not leaked.
  - *Concern*: Debugging production issues in a PWA requires robust telemetry (Sentry/LogRocket), since a user's local DB cannot be accessed directly.

- **Reliability: PASS**
  - **Service Layer Pattern**: The "Logic Sandwich" isolates business logic from the UI, enabling reliable unit and integration testing of complex sync logic.
  - **Offline-First**: Flaky network conditions are straightforward to test because offline mode is easy to simulate.

## Architecturally Significant Requirements (ASRs)

| ASR ID    | Requirement                       | Impact   | Testing Challenge                                                                 | Risk Score (P x I) |
| --------- | --------------------------------- | -------- | --------------------------------------------------------------------------------- | ------------------ |
| **ASR-1** | **Local-First / Offline Support** | Critical | Requires simulating network drops, tab closures, and sync resumption.             | 9 (3x3)            |
| **ASR-2** | **Privacy (Zero-Knowledge)**      | Critical | Must verify NO user data leaves client. Negative testing required.                | 9 (3x3)            |
| **ASR-3** | **Dual-Agent Pipeline**           | High     | Complex state machine (Venting -> Insight -> Draft). Nondeterministic AI outputs. | 6 (2x3)            |
| **ASR-4** | **Latency (<3s)**                 | Medium   | Requires performance profiling of Edge Functions and Client-Side rendering.       | 4 (2x2)            |

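
The risk scores in the table can be reproduced with a few lines of TypeScript. This is a minimal sketch that assumes P and I stand for probability and impact, each rated on a 1 to 3 scale (the table does not spell out the scale). The `Asr` type and the helper names are illustrative and not taken from the project.

```typescript
// Assumption: probability and impact are each rated 1 (low) to 3 (high).
type Rating = 1 | 2 | 3;

interface Asr {
  id: string;
  requirement: string;
  probability: Rating;
  impact: Rating;
}

// Ratings copied from the "Risk Score (P x I)" column of the table.
const asrs: Asr[] = [
  { id: "ASR-1", requirement: "Local-First / Offline Support", probability: 3, impact: 3 },
  { id: "ASR-2", requirement: "Privacy (Zero-Knowledge)", probability: 3, impact: 3 },
  { id: "ASR-3", requirement: "Dual-Agent Pipeline", probability: 2, impact: 3 },
  { id: "ASR-4", requirement: "Latency (<3s)", probability: 2, impact: 2 },
];

// Risk score is the product of the two ratings.
function riskScore(asr: Asr): number {
  return asr.probability * asr.impact;
}

// Highest risk first. Array.prototype.sort is stable, so ties keep table order.
function byRiskDescending(items: Asr[]): Asr[] {
  return [...items].sort((a, b) => riskScore(b) - riskScore(a));
}
```

Sorting by score gives the order in which the ASRs deserve test effort: ASR-1 and ASR-2 first, then ASR-3, then ASR-4.
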
## Test Levels Strategy

Given the local-first PWA architecture, test effort is split as follows:

- **Unit: 40%**
  - **Focus**: Business logic in `services/`, Zustand selectors, prompt engineering utilities, data transformers.
  - **Rationale**: Core complexity is in state management and data transformation, not server interaction.
  - **Tool**: Vitest.

- **Integration: 30%**
  - **Focus**: `Service <-> Dexie` interactions, Sync Queue processing, API Proxy contracts.
  - **Rationale**: Validating that offline actions are correctly queued and later synced is the critical integration path.
  - **Tool**: Vitest (with in-memory Dexie) or Playwright (API/Component).

- **E2E: 30%**
  - **Focus**: Full "Vent to Draft" journey, PWA installability, offline-to-online transitions.
  - **Rationale**: Critical user journeys depend on browser APIs (Service Worker, IndexedDB) that behave differently in real browsers than in simulated environments.
  - **Tool**: Playwright.

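
The critical integration path (offline actions are queued, then synced once the network returns) can be sketched without any browser APIs. The `SyncQueue` class, the `QueuedAction` shape, and the `Transport` callback below are hypothetical stand-ins. A real implementation would persist the queue in Dexie and send asynchronously, which this sketch deliberately leaves out.

```typescript
// Hypothetical shapes: none of these names come from the project.
interface QueuedAction {
  id: number;
  type: string;
  payload: unknown;
}

// Sends one action to the server; returns false when the network is unavailable.
// Synchronous here for brevity; a real transport would return a Promise.
type Transport = (action: QueuedAction) => boolean;

class SyncQueue {
  private pending: QueuedAction[] = [];
  private nextId = 1;

  // Called by the service layer for every local write, online or not.
  enqueue(type: string, payload: unknown): QueuedAction {
    const action: QueuedAction = { id: this.nextId++, type, payload };
    this.pending.push(action);
    return action;
  }

  // Sends actions in FIFO order and stops at the first failure, so an
  // action leaves the queue only after it has been delivered.
  flush(send: Transport): number {
    let delivered = 0;
    while (this.pending.length > 0) {
      const next = this.pending[0];
      if (next === undefined || !send(next)) break;
      this.pending.shift();
      delivered++;
    }
    return delivered;
  }

  get size(): number {
    return this.pending.length;
  }
}

// Example: two writes made offline are delivered once the network returns.
const queue = new SyncQueue();
let online = false;
const sent: number[] = [];
const send: Transport = (action) => {
  if (!online) return false;
  sent.push(action.id);
  return true;
};

queue.enqueue("vent.create", { text: "first entry" });
queue.enqueue("vent.update", { text: "edited entry" });
queue.flush(send); // offline: nothing is delivered, both actions stay queued
online = true;
queue.flush(send); // online: both actions are delivered in order
```

An integration test built on this idea would assert two things: nothing is lost while offline, and delivery order matches the order of the local writes.
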
## NFR Testing Approach

- **Security (SEC)**:
  - **Key Leakage**: Use Playwright network interception to inspect `Authorization` headers, and assert that API keys never appear in the DOM or console.
  - **Data Sovereignty**: Automated checks to ensure sensitive "Vents" are NOT in network payloads (except those sent to the LLM endpoint).

- **Performance (PERF)**:
  - **Lighthouse CI**: Run on every PR to check Core Web Vitals and a Time to Interactive (TTI) budget of under 1.5s.
  - **Latency**: Measure simulated "Time to First Token" in E2E tests.

- **Reliability (OPS)**:
  - **Chaos Testing**: Simulate 429 (rate limit) and 500 responses from the LLM provider, and verify that the "Graceful Degradation" UI appears.
  - **Persistence**: Verify data survives browser restart and Service Worker updates.

- **Maintainability (TECH)**:
  - **Strict Boundaries**: Add ESLint rules that prevent UI components from importing the `db` layer directly.

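
The key-leakage and data-sovereignty checks both reduce to scanning captured traffic for values that should not be there. The sketch below assumes requests have already been collected into a plain `CapturedRequest` array, for example from Playwright's `page.on('request')` handler. `findLeaks`, the type names, and the rule that provider keys must never appear in any browser request are assumptions, not project code.

```typescript
// Hypothetical shape of a request captured during a test run.
interface CapturedRequest {
  url: string;
  headers: Record<string, string>;
  body: string | null;
}

interface LeakPolicy {
  ventTexts: string[]; // sensitive user text typed during the test
  apiKeys: string[];   // provider keys that are assumed to stay server-side
  llmHosts: string[];  // hosts allowed to receive vent text (the LLM endpoint)
}

interface Leak {
  url: string;
  kind: "api-key" | "vent-text";
}

function findLeaks(requests: CapturedRequest[], policy: LeakPolicy): Leak[] {
  const leaks: Leak[] = [];
  for (const req of requests) {
    // Search everything the browser sent: URL, header values, and body.
    const sentData = [req.url, ...Object.values(req.headers), req.body ?? ""].join("\n");

    // Assumption: API keys must not appear in any browser-originated request.
    if (policy.apiKeys.some((key) => key !== "" && sentData.includes(key))) {
      leaks.push({ url: req.url, kind: "api-key" });
    }

    // Vent text may only travel to an allowlisted LLM host.
    const host = new URL(req.url).hostname;
    const allowed = policy.llmHosts.includes(host);
    if (!allowed && policy.ventTexts.some((text) => text !== "" && sentData.includes(text))) {
      leaks.push({ url: req.url, kind: "vent-text" });
    }
  }
  return leaks;
}
```

A test would type a known marker string into the app, collect the traffic, and assert that `findLeaks` returns an empty array. Plain substring matching misses encoded payloads (URL-encoded or base64, for instance), so a marker made only of letters and digits is the safer choice.
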
## Test Environment Requirements

- **Local (Dev)**: `localhost` with mocked LLM responses (fast and free).
- **CI (GitHub Actions)**: Headless browsers, with a test matrix that includes Mobile Safari emulation via WebKit.
- **Staging**: Vercel Preview Deployment, connected to live (but rate-limited) LLM keys for smoke tests.

## Testability Concerns

- **LLM Cost/Flakiness**: Running E2E tests against real OpenAI/DeepSeek APIs is slow, expensive, and nondeterministic.
  - *Recommendation*: Use VCR/HAR recordings for 90% of E2E runs, and run "Live" tests only on release branches.
- **Mobile Safari Debugging**: PWA bugs on iOS are notoriously hard to debug.
  - *Recommendation*: A manual sanity check on a real iOS device is required before major releases until automated WebKit testing is proven reliable.

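
The first recommendation combines two decisions: when to hit the live API, and how to answer from recordings otherwise. The sketch below illustrates both in-process. It is not how Polly.js or Playwright HAR work internally, and the `Cassette` class, the `llmModeForBranch` helper, and the `release/` branch naming convention are all assumptions.

```typescript
type LlmMode = "replay" | "live";

// Assumption: release branches are named "release/<version>".
function llmModeForBranch(branch: string): LlmMode {
  return branch.startsWith("release/") ? "live" : "replay";
}

// A cassette stores one recorded response per (model, prompt) pair.
class Cassette {
  private entries = new Map<string, string>();

  // Collapse whitespace so cosmetic prompt edits still match a recording.
  private static key(model: string, prompt: string): string {
    return `${model}\n${prompt.trim().replace(/\s+/g, " ")}`;
  }

  // Called during a live run to capture the provider's response.
  record(model: string, prompt: string, response: string): void {
    this.entries.set(Cassette.key(model, prompt), response);
  }

  // Throws on a miss so a replay run can never silently fall through
  // to the paid API.
  replay(model: string, prompt: string): string {
    const hit = this.entries.get(Cassette.key(model, prompt));
    if (hit === undefined) {
      throw new Error(`No recording for model "${model}"; re-record in live mode`);
    }
    return hit;
  }
}
```

In GitHub Actions, push-triggered workflows expose the branch name as the `GITHUB_REF_NAME` environment variable, which could feed `llmModeForBranch`. Failing hard on a missing recording keeps replay runs deterministic and free.
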
## Recommendations for Remaining Implementation

1. **Strict Mocking Strategy**: Implement a `MockLLMService` immediately to unblock UI testing without burning API credits.
2. **Visual Regression**: Add snapshot tests for the "Draft View" typography (Merriweather), since the draft is the product's core value artifact.
3. **Sync Torture Test**: Create a specialized test suite that toggles network status rapidly during a "Venting" session to ensure no data is lost.

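
One possible shape for the `MockLLMService` in recommendation 1 is a scripted fake: tests queue the replies or HTTP failures they want, and the service plays them back in order. Everything here apart from the name `MockLLMService` is an assumption, including the `LLMService` interface, its `complete` method, and the `LLMHttpError` class.

```typescript
// Hypothetical error type carrying the provider's HTTP status (e.g. 429, 500).
class LLMHttpError extends Error {
  status: number;
  constructor(status: number) {
    super(`LLM provider responded with HTTP ${status}`);
    this.name = "LLMHttpError";
    this.status = status;
  }
}

// Hypothetical interface the real and mock services would share.
interface LLMService {
  complete(prompt: string): Promise<string>;
}

type ScriptedStep =
  | { kind: "text"; text: string }
  | { kind: "error"; status: number };

class MockLLMService implements LLMService {
  // Every prompt received, in order, so tests can assert on what was sent.
  prompts: string[] = [];
  private script: ScriptedStep[] = [];

  // Queue a successful reply.
  willReply(text: string): this {
    this.script.push({ kind: "text", text });
    return this;
  }

  // Queue a failure, e.g. 429 to exercise the "Graceful Degradation" UI.
  willFail(status: number): this {
    this.script.push({ kind: "error", status });
    return this;
  }

  // Synchronous core: consume the next scripted step for this prompt.
  next(prompt: string): string {
    this.prompts.push(prompt);
    const step = this.script.shift();
    if (step === undefined) {
      throw new Error(`MockLLMService: nothing scripted for call #${this.prompts.length}`);
    }
    if (step.kind === "error") throw new LLMHttpError(step.status);
    return step.text;
  }

  // Async wrapper matching the shared interface; rejects when next() throws.
  async complete(prompt: string): Promise<string> {
    return this.next(prompt);
  }
}
```

Because the script is explicit, the same mock covers both the happy path and the chaos cases (429s and 500s) without touching the network. An unscripted call throws, which surfaces unexpected LLM traffic in a test.
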