Test Design: Epic 2 - The Magic Mirror

Epic: 2 (Ghostwriter & Draft Refinement)
Scope: Epic-Level
Date: 2026-01-25
Author: QA Architect (AI)

1. Risk Assessment

Identified Risks

Risk ID	Category	Title	Description	Probability (1-3)	Impact (1-3)	Score	Action
R-2.1	BUS	Hallucination / Poor Quality	Ghostwriter generates content unrelated to the user's insight or creates fictional details.	2 (Possible)	3 (Critical)	6	MITIGATE
R-2.2	TECH	Context Window Overflow	Long chat sessions exceed the token limit for the generation prompt, causing truncation or errors.	2 (Possible)	3 (Critical)	6	MITIGATE
R-2.3	TECH	State Desynchronization	UI gets stuck in "Drafting" state if the LLM request hangs or fails silently.	2 (Possible)	2 (Degraded)	4	MONITOR
R-2.4	TECH	Clipboard API Failures	"One-Click Copy" fails on certain mobile browsers due to permission policies.	2 (Possible)	2 (Degraded)	4	MONITOR
R-2.5	UI	Markdown Rendering Issues	Generated artifacts break layout (e.g., extremely long code blocks, tables on mobile).	1 (Unlikely)	1 (Minor)	1	DOCUMENT
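
The Score column in the table above is the product of Probability and Impact. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the action cut-offs (6 and 4) are an assumption chosen only because they reproduce the Action column; the document does not state the actual thresholds.

```typescript
type Level = 1 | 2 | 3;
type Action = "MITIGATE" | "MONITOR" | "DOCUMENT";

/** Score = Probability x Impact, as in the risk table. */
function riskScore(probability: Level, impact: Level): number {
  return probability * impact;
}

/**
 * Maps a score to an action.
 * ASSUMPTION: the cut-offs below are inferred from the table rows
 * (6 -> MITIGATE, 4 -> MONITOR, 1 -> DOCUMENT), not stated in the plan.
 */
function riskAction(score: number): Action {
  if (score >= 6) return "MITIGATE";
  if (score >= 4) return "MONITOR";
  return "DOCUMENT";
}

// Example: R-2.2 (Probability 2, Impact 3)
const contextOverflowScore = riskScore(2, 3);
const contextOverflowAction = riskAction(contextOverflowScore);
```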

Mitigation Strategies (High Risks)

R-2.1: Hallucination / Poor Quality (Score 6)

Mitigation: Add explicit "Grounding" instructions to the Ghostwriter prompt. Use evals (automated evaluation) to measure how many of the input "Insight" tokens reappear in the output.
Owner: Prompt Engineer / Dev
Validation: Automated Prompt Tests (checking recall of key facts).
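
The token-overlap check for R-2.1 could look roughly like the sketch below. It measures recall: the fraction of distinct insight tokens that also appear in the generated draft. All names (`tokenize`, `insightRecall`, `isGrounded`) and the 0.6 threshold are assumptions for illustration, not part of the project's eval harness, and word overlap is only a coarse proxy for grounding.

```typescript
/**
 * Lowercases the text, splits on non-alphanumeric characters, and drops
 * tokens of two characters or fewer. ASCII-only: a simplification.
 */
function tokenize(text: string): Set<string> {
  const words = text.toLowerCase().split(/[^a-z0-9]+/);
  return new Set(words.filter((w) => w.length > 2));
}

/** Fraction (0..1) of distinct insight tokens that also occur in the draft. */
function insightRecall(insight: string, draft: string): number {
  const insightTokens = tokenize(insight);
  if (insightTokens.size === 0) return 1; // nothing to recall
  const draftTokens = tokenize(draft);
  let hits = 0;
  insightTokens.forEach((token) => {
    if (draftTokens.has(token)) hits += 1;
  });
  return hits / insightTokens.size;
}

// ASSUMPTION: placeholder threshold; tune it against real sample drafts.
const MIN_RECALL = 0.6;

function isGrounded(insight: string, draft: string): boolean {
  return insightRecall(insight, draft) >= MIN_RECALL;
}
```

A check like this catches drafts that drift entirely off topic, but it cannot detect invented details that happen to reuse the insight's vocabulary, so it would complement rather than replace the prompt tests on key-fact recall.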

R-2.2: Context Window Overflow (Score 6)

Mitigation: Implement a strict token-counting utility. Summarize or truncate chat history to fit the token budget before sending it to the Ghostwriter.
Owner: Dev Team
Validation: Unit tests for PromptEngine with large mock inputs.
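
As a rough illustration of the truncation half of this mitigation, the sketch below keeps the most recent messages that fit a token budget. The `ChatMessage` shape, the function names, and the 4-characters-per-token estimate are all assumptions; a real implementation would use the tokenizer of the target model, and the actual PromptEngine may work differently.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// ASSUMPTION: crude heuristic of ~4 characters per token.
// Replace with the model's real tokenizer for accurate counts.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

/**
 * Returns the longest suffix of `messages` whose estimated token total
 * stays within `budget`, preserving the original order.
 */
function truncateHistory(messages: ChatMessage[], budget: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest and stop at the first message that does not fit.
  for (const message of messages.slice().reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > budget) break;
    kept.unshift(message);
    used += cost;
  }
  return kept;
}
```

Dropping the oldest messages first is the simplest policy. Summarizing the dropped portion instead, as the mitigation also suggests, would preserve more context at the cost of an extra LLM call.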

2. Test Coverage Plan

Acceptance Criteria Mapping

Story	ID	Scenario	Level	Priority	Risk Link
2.1	2.1.1	Ghostwriter receives correct chat context (Prompt Construction)	Unit	P0	R-2.1
2.1	2.1.2	Token limit enforcement (Truncation/Error)	Unit	P0	R-2.2
2.1	2.1.3	Generated draft is valid Markdown	Unit	P1	R-2.5
2.2	2.2.1	Draft Sheet slides up upon completion	Component	P1	-
2.2	2.2.2	Draft view renders Markdown correctly (Headers, lists)	Component	P2	R-2.5
2.3	2.3.1	"Thumbs Down" triggers feedback prompt	Integration	P1	-
2.3	2.3.2	Regeneration respects user critique	E2E	P0	R-2.1
2.4	2.4.1	"Copy" button places text in clipboard	E2E	P0	R-2.4
2.4	2.4.2	"Save" marks session as completed in DB	Integration	P0	-

Test Levels Strategy

Unit Tests:
- PromptEngine: Verify context insertion and token limits.
- MarkdownParser: Verify safe rendering logic.
Component Tests:
- DraftSheet: Verify open/close animations and state binding (Zustand).
- MarkdownRenderer: Visual regression tests for styles.
Integration Tests:
- GhostwriterService: Mock LLM response -> Verify State Update -> Verify DB Update.
E2E Tests:
- Full Flow (P0): Chat -> Generate -> Copy to Clipboard.
- Refinement Flow (P0): Generate -> Critique -> Regenerate.
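
The state-update step that the integration and component tests verify, and the stuck "Drafting" state described in R-2.3, can be sketched as a pure transition function. This is a hypothetical model, not the project's Zustand store: the state shape, event names, and the timeout event are assumptions. Because the function is pure, tests can cover it without a UI or a real LLM.

```typescript
type DraftStatus = "idle" | "drafting" | "ready" | "error";

interface DraftState {
  status: DraftStatus;
  draft: string | null;
  error: string | null;
}

// Hypothetical events; a real store may name and shape these differently.
type DraftEvent =
  | { type: "GENERATE_REQUESTED" }
  | { type: "GENERATE_SUCCEEDED"; markdown: string }
  | { type: "GENERATE_FAILED"; reason: string }
  | { type: "GENERATE_TIMED_OUT" };

const initialDraftState: DraftState = { status: "idle", draft: null, error: null };

function draftReducer(state: DraftState, event: DraftEvent): DraftState {
  switch (event.type) {
    case "GENERATE_REQUESTED":
      return { status: "drafting", draft: null, error: null };
    case "GENERATE_SUCCEEDED":
      // Ignore late responses that arrive after a failure or timeout.
      if (state.status !== "drafting") return state;
      return { status: "ready", draft: event.markdown, error: null };
    case "GENERATE_FAILED":
      if (state.status !== "drafting") return state;
      return { status: "error", draft: null, error: event.reason };
    case "GENERATE_TIMED_OUT":
      // Guarantees an exit from "drafting" even if the request hangs.
      if (state.status !== "drafting") return state;
      return { status: "error", draft: null, error: "timeout" };
  }
}
```

In this model every path out of "drafting" ends in either "ready" or "error", so a hung or silently failing request surfaces as an error once the caller dispatches the timeout event.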

3. Execution Plan

Smoke Tests (Pre-Merge)

Unit: PromptEngine sanity checks.
E2E: Basic Generation Flow (Mocked LLM).

Regression Suite (Nightly)

Unit: Token limit edge cases.
E2E: Clipboard functionality on mobile viewport emulation.
Prompt Evals: Quality checks on sample inputs.

Resource Estimates

P0 Scenarios: 5 tests (approx. 5 hours implementation).
P1 Scenarios: 3 tests (approx. 2 hours implementation).
P2 Scenarios: 1 test (approx. 0.5 hours implementation).
Total Effort: approx. 7.5 hours (~1 day).

4. Quality Gate Criteria

Pass Rate: 100% on P0 tests.
Performance: Generation starts within 5s (mocked latency).
Mitigation: Token limiter unit tests must pass.
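
The three gate criteria above could be combined into a single automated check along the lines of the sketch below. The input shape and function name are hypothetical, and how the generation start time is measured in CI is left open; only the 100% P0 pass rate, the 5s limit, and the token limiter requirement come from the criteria themselves.

```typescript
type Priority = "P0" | "P1" | "P2";

interface TestResult {
  id: string; // scenario ID, e.g. "2.1.1"
  priority: Priority;
  passed: boolean;
}

// Hypothetical input shape for a CI gate step.
interface GateInput {
  results: TestResult[];
  generationStartMs: number; // time until generation starts, with mocked latency
  tokenLimiterTestsPassed: boolean;
}

interface GateVerdict {
  pass: boolean;
  failures: string[];
}

const MAX_GENERATION_START_MS = 5000; // "within 5s"

function evaluateQualityGate(input: GateInput): GateVerdict {
  const failures: string[] = [];

  // Pass Rate: every P0 test must pass.
  const failedP0 = input.results.filter((r) => r.priority === "P0" && !r.passed);
  if (failedP0.length > 0) {
    failures.push("P0 failures: " + failedP0.map((r) => r.id).join(", "));
  }

  // Performance: generation must start within the limit.
  if (input.generationStartMs > MAX_GENERATION_START_MS) {
    failures.push("Generation start exceeded " + MAX_GENERATION_START_MS + "ms");
  }

  // Mitigation: token limiter unit tests must pass.
  if (!input.tokenLimiterTestsPassed) {
    failures.push("Token limiter unit tests failed");
  }

  return { pass: failures.length === 0, failures };
}
```

Under these criteria P1 and P2 failures do not block the gate, so the sketch reports only the conditions the plan lists as blocking.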
