Ignore and untrack BMad directories

Max
2026-01-26 15:49:36 +07:00
parent 7b732372e3
commit 6b113e0392
525 changed files with 2 additions and 112645 deletions

View File

@@ -1,364 +0,0 @@
# ATDD Checklist - Epic {epic_num}, Story {story_num}: {story_title}
**Date:** {date}
**Author:** {user_name}
**Primary Test Level:** {primary_level}
---
## Story Summary
{Brief 2-3 sentence summary of the user story}
**As a** {user_role}
**I want** {feature_description}
**So that** {business_value}
---
## Acceptance Criteria
{List all testable acceptance criteria from the story}
1. {Acceptance criterion 1}
2. {Acceptance criterion 2}
3. {Acceptance criterion 3}
---
## Failing Tests Created (RED Phase)
### E2E Tests ({e2e_test_count} tests)
**File:** `{e2e_test_file_path}` ({line_count} lines)
{List each E2E test with its current status and expected failure reason}
- **Test:** {test_name}
  - **Status:** RED - {failure_reason}
  - **Verifies:** {what_this_test_validates}
### API Tests ({api_test_count} tests)
**File:** `{api_test_file_path}` ({line_count} lines)
{List each API test with its current status and expected failure reason}
- **Test:** {test_name}
  - **Status:** RED - {failure_reason}
  - **Verifies:** {what_this_test_validates}
### Component Tests ({component_test_count} tests)
**File:** `{component_test_file_path}` ({line_count} lines)
{List each component test with its current status and expected failure reason}
- **Test:** {test_name}
  - **Status:** RED - {failure_reason}
  - **Verifies:** {what_this_test_validates}
---
## Data Factories Created
{List all data factory files created with their exports}
### {Entity} Factory
**File:** `tests/support/factories/{entity}.factory.ts`
**Exports:**
- `create{Entity}(overrides?)` - Create single entity with optional overrides
- `create{Entity}s(count)` - Create array of entities
**Example Usage:**
```typescript
const user = createUser({ email: 'specific@example.com' });
const users = createUsers(5); // Generate 5 random users
```
---
## Fixtures Created
{List all test fixture files created with their fixture names and descriptions}
### {Feature} Fixtures
**File:** `tests/support/fixtures/{feature}.fixture.ts`
**Fixtures:**
- `{fixtureName}` - {description_of_what_fixture_provides}
- **Setup:** {what_setup_does}
- **Provides:** {what_test_receives}
- **Cleanup:** {what_cleanup_does}
**Example Usage:**
```typescript
import { test } from './fixtures/{feature}.fixture';
test('should do something', async ({ {fixtureName} }) => {
// {fixtureName} is ready to use with auto-cleanup
});
```
---
## Mock Requirements
{Document external services that need mocking and their requirements}
### {Service Name} Mock
**Endpoint:** `{HTTP_METHOD} {endpoint_url}`
**Success Response:**
```json
{
{success_response_example}
}
```
**Failure Response:**
```json
{
{failure_response_example}
}
```
**Notes:** {any_special_mock_requirements}
---
## Required data-testid Attributes
{List all data-testid attributes required in UI implementation for test stability}
### {Page or Component Name}
- `{data-testid-name}` - {description_of_element}
- `{data-testid-name}` - {description_of_element}
**Implementation Example:**
```tsx
<button data-testid="login-button">Log In</button>
<input data-testid="email-input" type="email" />
<div data-testid="error-message">{errorText}</div>
```
---
## Implementation Checklist
{Map each failing test to concrete implementation tasks that will make it pass}
### Test: {test_name_1}
**File:** `{test_file_path}`
**Tasks to make this test pass:**
- [ ] {Implementation task 1}
- [ ] {Implementation task 2}
- [ ] {Implementation task 3}
- [ ] Add required data-testid attributes: {list_of_testids}
- [ ] Run test: `{test_execution_command}`
- [ ] ✅ Test passes (green phase)
**Estimated Effort:** {effort_estimate} hours
---
### Test: {test_name_2}
**File:** `{test_file_path}`
**Tasks to make this test pass:**
- [ ] {Implementation task 1}
- [ ] {Implementation task 2}
- [ ] {Implementation task 3}
- [ ] Add required data-testid attributes: {list_of_testids}
- [ ] Run test: `{test_execution_command}`
- [ ] ✅ Test passes (green phase)
**Estimated Effort:** {effort_estimate} hours
---
## Running Tests
```bash
# Run all failing tests for this story
{test_command_all}
# Run specific test file
{test_command_specific_file}
# Run tests in headed mode (see browser)
{test_command_headed}
# Debug specific test
{test_command_debug}
# Run tests with coverage
{test_command_coverage}
```
---
## Red-Green-Refactor Workflow
### RED Phase (Complete) ✅
**TEA Agent Responsibilities:**
- ✅ All tests written and failing
- ✅ Fixtures and factories created with auto-cleanup
- ✅ Mock requirements documented
- ✅ data-testid requirements listed
- ✅ Implementation checklist created
**Verification:**
- All tests run and fail as expected
- Failure messages are clear and actionable
- Tests fail due to missing implementation, not test bugs
---
### GREEN Phase (DEV Team - Next Steps)
**DEV Agent Responsibilities:**
1. **Pick one failing test** from implementation checklist (start with highest priority)
2. **Read the test** to understand expected behavior
3. **Implement minimal code** to make that specific test pass
4. **Run the test** to verify it now passes (green)
5. **Check off the task** in implementation checklist
6. **Move to next test** and repeat
**Key Principles:**
- One test at a time (don't try to fix all at once)
- Minimal implementation (don't over-engineer)
- Run tests frequently (immediate feedback)
- Use implementation checklist as roadmap
**Progress Tracking:**
- Check off tasks as you complete them
- Share progress in daily standup
- Mark story as IN PROGRESS in `bmm-workflow-status.md`
---
### REFACTOR Phase (DEV Team - After All Tests Pass)
**DEV Agent Responsibilities:**
1. **Verify all tests pass** (green phase complete)
2. **Review code for quality** (readability, maintainability, performance)
3. **Extract duplications** (DRY principle)
4. **Optimize performance** (if needed)
5. **Ensure tests still pass** after each refactor
6. **Update documentation** (if API contracts change)
**Key Principles:**
- Tests provide safety net (refactor with confidence)
- Make small refactors (easier to debug if tests fail)
- Run tests after each change
- Don't change test behavior (only implementation)
**Completion:**
- All tests pass
- Code quality meets team standards
- No duplications or code smells
- Ready for code review and story approval
---
## Next Steps
1. **Share this checklist and failing tests** with the dev workflow (manual handoff)
2. **Review this checklist** with team in standup or planning
3. **Run failing tests** to confirm RED phase: `{test_command_all}`
4. **Begin implementation** using implementation checklist as guide
5. **Work one test at a time** (red → green for each)
6. **Share progress** in daily standup
7. **When all tests pass**, refactor code for quality
8. **When refactoring complete**, manually update story status to 'done' in sprint-status.yaml
---
## Knowledge Base References Applied
This ATDD workflow consulted the following knowledge fragments:
- **fixture-architecture.md** - Test fixture patterns with setup/teardown and auto-cleanup using Playwright's `test.extend()`
- **data-factories.md** - Factory patterns using `@faker-js/faker` for random test data generation with overrides support
- **component-tdd.md** - Component test strategies using Playwright Component Testing
- **network-first.md** - Route interception patterns (intercept BEFORE navigation to prevent race conditions)
- **test-quality.md** - Test design principles (Given-When-Then, one assertion per test, determinism, isolation)
- **test-levels-framework.md** - Test level selection framework (E2E vs API vs Component vs Unit)
See `tea-index.csv` for complete knowledge fragment mapping.
---
## Test Execution Evidence
### Initial Test Run (RED Phase Verification)
**Command:** `{test_command_all}`
**Results:**
```
{paste_test_run_output_showing_all_tests_failing}
```
**Summary:**
- Total tests: {total_test_count}
- Passing: 0 (expected)
- Failing: {total_test_count} (expected)
- Status: ✅ RED phase verified
**Expected Failure Messages:**
{list_expected_failure_messages_for_each_test}
---
## Notes
{Any additional notes, context, or special considerations for this story}
- {Note 1}
- {Note 2}
- {Note 3}
---
## Contact
**Questions or Issues?**
- Ask in team standup
- Tag @{tea_agent_username} in Slack/Discord
- Refer to `./bmm/docs/tea-README.md` for workflow documentation
- Consult `./bmm/testarch/knowledge` for testing best practices
---
**Generated by BMad TEA Agent** - {date}

View File

@@ -1,374 +0,0 @@
# ATDD Workflow Validation Checklist
Use this checklist to validate that the ATDD workflow has been executed correctly and all deliverables meet quality standards.
## Prerequisites
Before starting this workflow, verify:
- [ ] Story approved with clear acceptance criteria (AC must be testable)
- [ ] Development sandbox/environment ready
- [ ] Framework scaffolding exists (run `framework` workflow if missing)
- [ ] Test framework configuration available (playwright.config.ts or cypress.config.ts)
- [ ] Package.json has test dependencies installed (Playwright or Cypress)
**Halt if missing:** Framework scaffolding or story acceptance criteria
---
## Step 1: Story Context and Requirements
- [ ] Story markdown file loaded and parsed successfully
- [ ] All acceptance criteria identified and extracted
- [ ] Affected systems and components identified
- [ ] Technical constraints documented
- [ ] Framework configuration loaded (playwright.config.ts or cypress.config.ts)
- [ ] Test directory structure identified from config
- [ ] Existing fixture patterns reviewed for consistency
- [ ] Similar test patterns searched and found in `{test_dir}`
- [ ] Knowledge base fragments loaded:
- [ ] `fixture-architecture.md`
- [ ] `data-factories.md`
- [ ] `component-tdd.md`
- [ ] `network-first.md`
- [ ] `test-quality.md`
---
## Step 2: Test Level Selection and Strategy
- [ ] Each acceptance criterion analyzed for appropriate test level
- [ ] Test level selection framework applied (E2E vs API vs Component vs Unit)
- [ ] E2E tests: Critical user journeys and multi-system integration identified
- [ ] API tests: Business logic and service contracts identified
- [ ] Component tests: UI component behavior and interactions identified
- [ ] Unit tests: Pure logic and edge cases identified (if applicable)
- [ ] Duplicate coverage avoided (same behavior not tested at multiple levels unnecessarily)
- [ ] Tests prioritized using P0-P3 framework (if test-design document exists)
- [ ] Primary test level set in `primary_level` variable (typically E2E or API)
- [ ] Test levels documented in ATDD checklist
---
## Step 3: Failing Tests Generated
### Test File Structure Created
- [ ] Test files organized in appropriate directories:
- [ ] `tests/e2e/` for end-to-end tests
- [ ] `tests/api/` for API tests
- [ ] `tests/component/` for component tests
- [ ] `tests/support/` for infrastructure (fixtures, factories, helpers)
### E2E Tests (If Applicable)
- [ ] E2E test files created in `tests/e2e/`
- [ ] All tests follow Given-When-Then format
- [ ] Tests use `data-testid` selectors (not CSS classes or fragile selectors)
- [ ] One assertion per test (atomic test design)
- [ ] No hard waits or sleeps (explicit waits only)
- [ ] Network-first pattern applied (route interception BEFORE navigation)
- [ ] Tests fail initially (RED phase verified by local test run)
- [ ] Failure messages are clear and actionable
### API Tests (If Applicable)
- [ ] API test files created in `tests/api/`
- [ ] Tests follow Given-When-Then format
- [ ] API contracts validated (request/response structure)
- [ ] HTTP status codes verified
- [ ] Response body validation includes all required fields
- [ ] Error cases tested (400, 401, 403, 404, 500)
- [ ] Tests fail initially (RED phase verified)
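For reference, a minimal error-case sketch in the expected Given-When-Then style, reusing the `/api/users` endpoint from the workflow examples (the exact contract is an assumption):
```typescript
import { test, expect } from '@playwright/test';

test.describe('User API - error cases', () => {
  test('POST /api/users - should return 400 when email is missing', async ({ request }) => {
    // GIVEN: A payload that violates the contract (email is assumed to be required)
    const invalidUser = { name: 'No Email' };
    // WHEN: Creating the user via the API
    const response = await request.post('/api/users', { data: invalidUser });
    // THEN: The request is rejected with a 400 (single, atomic assertion)
    expect(response.status()).toBe(400);
  });
});
```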
### Component Tests (If Applicable)
- [ ] Component test files created in `tests/component/`
- [ ] Tests follow Given-When-Then format
- [ ] Component mounting works correctly
- [ ] Interaction testing covers user actions (click, hover, keyboard)
- [ ] State management within component validated
- [ ] Props and events tested
- [ ] Tests fail initially (RED phase verified)
### Test Quality Validation
- [ ] All tests use Given-When-Then structure with clear comments
- [ ] All tests have descriptive names explaining what they test
- [ ] No duplicate tests (same behavior tested multiple times)
- [ ] No flaky patterns (race conditions, timing issues)
- [ ] No test interdependencies (tests can run in any order)
- [ ] Tests are deterministic (same input always produces same result)
---
## Step 4: Data Infrastructure Built
### Data Factories Created
- [ ] Factory files created in `tests/support/factories/`
- [ ] All factories use `@faker-js/faker` for random data generation (no hardcoded values)
- [ ] Factories support overrides for specific test scenarios
- [ ] Factories generate complete valid objects matching API contracts
- [ ] Helper functions for bulk creation provided (e.g., `createUsers(count)`)
- [ ] Factory exports are properly typed (TypeScript)
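For reference, one possible shape for a typed factory; the `User` interface here is an assumption standing in for the real API contract:
```typescript
// tests/support/factories/user.factory.ts (typed sketch)
import { faker } from '@faker-js/faker';

export interface User {
  id: number;
  email: string;
  name: string;
  createdAt: string;
}

// Overrides let individual tests pin specific fields while the rest stays random
export const createUser = (overrides: Partial<User> = {}): User => ({
  id: faker.number.int(),
  email: faker.internet.email(),
  name: faker.person.fullName(),
  createdAt: faker.date.recent().toISOString(),
  ...overrides,
});

// Bulk helper with a typed return value
export const createUsers = (count: number): User[] =>
  Array.from({ length: count }, () => createUser());
```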
### Test Fixtures Created
- [ ] Fixture files created in `tests/support/fixtures/`
- [ ] All fixtures use Playwright's `test.extend()` pattern
- [ ] Fixtures have setup phase (arrange test preconditions)
- [ ] Fixtures provide data to tests via `await use(data)`
- [ ] Fixtures have teardown phase with auto-cleanup (delete created data)
- [ ] Fixtures are composable (can use other fixtures if needed)
- [ ] Fixtures are isolated (each test gets fresh data)
- [ ] Fixtures are type-safe (TypeScript types defined)
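For the composability item, a sketch using Playwright's `mergeTests` to combine independent fixture files (the second fixture file is hypothetical):
```typescript
// tests/support/fixtures/merged.fixture.ts (illustrative composition sketch)
import { mergeTests } from '@playwright/test';
import { test as authTest } from './auth.fixture';
import { test as dataTest } from './data.fixture'; // hypothetical second fixture file

// Tests that import this combined `test` receive fixtures from both files,
// each keeping its own setup and auto-cleanup.
export const test = mergeTests(authTest, dataTest);
export { expect } from '@playwright/test';
```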
### Mock Requirements Documented
- [ ] External service mocking requirements identified
- [ ] Mock endpoints documented with URLs and methods
- [ ] Success response examples provided
- [ ] Failure response examples provided
- [ ] Mock requirements documented in ATDD checklist for DEV team
### data-testid Requirements Listed
- [ ] All required data-testid attributes identified from E2E tests
- [ ] data-testid list organized by page or component
- [ ] Each data-testid has clear description of element it targets
- [ ] data-testid list included in ATDD checklist for DEV team
---
## Step 5: Implementation Checklist Created
- [ ] Implementation checklist created with clear structure
- [ ] Each failing test mapped to concrete implementation tasks
- [ ] Tasks include:
- [ ] Route/component creation
- [ ] Business logic implementation
- [ ] API integration
- [ ] data-testid attribute additions
- [ ] Error handling
- [ ] Test execution command
- [ ] Completion checkbox
- [ ] Red-Green-Refactor workflow documented in checklist
- [ ] RED phase marked as complete (TEA responsibility)
- [ ] GREEN phase tasks listed for DEV team
- [ ] REFACTOR phase guidance provided
- [ ] Execution commands provided:
- [ ] Run all tests: `npm run test:e2e`
- [ ] Run specific test file
- [ ] Run in headed mode
- [ ] Debug specific test
- [ ] Estimated effort included (hours or story points)
---
## Step 6: Deliverables Generated
### ATDD Checklist Document Created
- [ ] Output file created at `{output_folder}/atdd-checklist-{story_id}.md`
- [ ] Document follows template structure from `atdd-checklist-template.md`
- [ ] Document includes all required sections:
- [ ] Story summary
- [ ] Acceptance criteria breakdown
- [ ] Failing tests created (paths and line counts)
- [ ] Data factories created
- [ ] Fixtures created
- [ ] Mock requirements
- [ ] Required data-testid attributes
- [ ] Implementation checklist
- [ ] Red-green-refactor workflow
- [ ] Execution commands
- [ ] Next steps for DEV team
- [ ] Output shared with DEV workflow (manual handoff; not auto-consumed)
### All Tests Verified to Fail (RED Phase)
- [ ] Full test suite run locally before finalizing
- [ ] All tests fail as expected (RED phase confirmed)
- [ ] No tests passing before implementation (if passing, test is invalid)
- [ ] Failure messages documented in ATDD checklist
- [ ] Failures are due to missing implementation, not test bugs
- [ ] Test run output captured for reference
### Summary Provided
- [ ] Summary includes:
- [ ] Story ID
- [ ] Primary test level
- [ ] Test counts (E2E, API, Component)
- [ ] Test file paths
- [ ] Factory count
- [ ] Fixture count
- [ ] Mock requirements count
- [ ] data-testid count
- [ ] Implementation task count
- [ ] Estimated effort
- [ ] Next steps for DEV team
- [ ] Output file path
- [ ] Knowledge base references applied
---
## Quality Checks
### Test Design Quality
- [ ] Tests are readable (clear Given-When-Then structure)
- [ ] Tests are maintainable (use factories and fixtures, not hardcoded data)
- [ ] Tests are isolated (no shared state between tests)
- [ ] Tests are deterministic (no race conditions or flaky patterns)
- [ ] Tests are atomic (one assertion per test)
- [ ] Tests are fast (no unnecessary waits or delays)
### Knowledge Base Integration
- [ ] fixture-architecture.md patterns applied to all fixtures
- [ ] data-factories.md patterns applied to all factories
- [ ] network-first.md patterns applied to E2E tests with network requests
- [ ] component-tdd.md patterns applied to component tests
- [ ] test-quality.md principles applied to all test design
### Code Quality
- [ ] All TypeScript types are correct and complete
- [ ] No linting errors in generated test files
- [ ] Consistent naming conventions followed
- [ ] Imports are organized and correct
- [ ] Code follows project style guide
---
## Integration Points
### With DEV Agent
- [ ] ATDD checklist provides clear implementation guidance
- [ ] Implementation tasks are granular and actionable
- [ ] data-testid requirements are complete and clear
- [ ] Mock requirements include all necessary details
- [ ] Execution commands work correctly
### With Story Workflow
- [ ] Story ID correctly referenced in output files
- [ ] Acceptance criteria from story accurately reflected in tests
- [ ] Technical constraints from story considered in test design
### With Framework Workflow
- [ ] Test framework configuration correctly detected and used
- [ ] Directory structure matches framework setup
- [ ] Fixtures and helpers follow established patterns
- [ ] Naming conventions consistent with framework standards
### With test-design Workflow (If Available)
- [ ] P0 scenarios from test-design prioritized in ATDD
- [ ] Risk assessment from test-design considered in test coverage
- [ ] Coverage strategy from test-design aligned with ATDD tests
---
## Completion Criteria
All of the following must be true before marking this workflow as complete:
- [ ] **Story acceptance criteria analyzed** and mapped to appropriate test levels
- [ ] **Failing tests created** at all appropriate levels (E2E, API, Component)
- [ ] **Given-When-Then format** used consistently across all tests
- [ ] **RED phase verified** by local test run (all tests failing as expected)
- [ ] **Network-first pattern** applied to E2E tests with network requests
- [ ] **Data factories created** using faker (no hardcoded test data)
- [ ] **Fixtures created** with auto-cleanup in teardown
- [ ] **Mock requirements documented** for external services
- [ ] **data-testid attributes listed** for DEV team
- [ ] **Implementation checklist created** mapping tests to code tasks
- [ ] **Red-green-refactor workflow documented** in ATDD checklist
- [ ] **Execution commands provided** and verified to work
- [ ] **ATDD checklist document created** and saved to correct location
- [ ] **Output file formatted correctly** using template structure
- [ ] **Knowledge base references applied** and documented in summary
- [ ] **No test quality issues** (flaky patterns, race conditions, hardcoded data)
---
## Common Issues and Resolutions
### Issue: Tests pass before implementation
**Problem:** A test passes even though no implementation code exists yet.
**Resolution:**
- Review test to ensure it's testing actual behavior, not mocked/stubbed behavior
- Check if test is accidentally using existing functionality
- Verify test assertions are correct and meaningful
- Rewrite test to fail until implementation is complete
### Issue: Network-first pattern not applied
**Problem:** Route interception happens after navigation, causing race conditions.
**Resolution:**
- Move `await page.route()` calls BEFORE `await page.goto()`
- Review `network-first.md` knowledge fragment
- Update all E2E tests to follow network-first pattern
### Issue: Hardcoded test data in tests
**Problem:** Tests use hardcoded strings/numbers instead of factories.
**Resolution:**
- Replace all hardcoded data with factory function calls
- Use `faker` for all random data generation
- Update data-factories to support all required test scenarios
### Issue: Fixtures missing auto-cleanup
**Problem:** Fixtures create data but don't clean it up in teardown.
**Resolution:**
- Add cleanup logic after `await use(data)` in fixture
- Call deletion/cleanup functions in teardown
- Verify cleanup works by checking database/storage after test run
### Issue: Tests have multiple assertions
**Problem:** Tests verify multiple behaviors in single test (not atomic).
**Resolution:**
- Split into separate tests (one assertion per test)
- Each test should verify exactly one behavior
- Use descriptive test names to clarify what each test verifies
### Issue: Tests depend on execution order
**Problem:** Tests fail when run in isolation or different order.
**Resolution:**
- Remove shared state between tests
- Each test should create its own test data
- Use fixtures for consistent setup across tests
- Verify tests can run with `.only` flag
---
## Notes for TEA Agent
- **Preflight halt is critical:** Do not proceed if story has no acceptance criteria or framework is missing
- **RED phase verification is mandatory:** Tests must fail before sharing with DEV team
- **Network-first pattern:** Route interception BEFORE navigation prevents race conditions
- **One assertion per test:** Atomic tests provide clear failure diagnosis
- **Auto-cleanup is non-negotiable:** Every fixture must clean up data in teardown
- **Use knowledge base:** Load relevant fragments (fixture-architecture, data-factories, network-first, component-tdd, test-quality) for guidance
- **Share with DEV agent:** ATDD checklist provides implementation roadmap from red to green

View File

@@ -1,806 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Acceptance Test-Driven Development (ATDD)
**Workflow ID**: `_bmad/bmm/testarch/atdd`
**Version**: 4.0 (BMad v6)
---
## Overview
Generates failing acceptance tests BEFORE implementation following TDD's red-green-refactor cycle. This workflow creates comprehensive test coverage at appropriate levels (E2E, API, Component) with supporting infrastructure (fixtures, factories, mocks) and provides an implementation checklist to guide development.
**Core Principle**: Tests fail first (red phase), then guide development to green, then enable confident refactoring.
---
## Preflight Requirements
**Critical:** Verify these requirements before proceeding. If any fail, HALT and notify the user.
- ✅ Story approved with clear acceptance criteria
- ✅ Development sandbox/environment ready
- ✅ Framework scaffolding exists (run `framework` workflow if missing)
- ✅ Test framework configuration available (playwright.config.ts or cypress.config.ts)
---
## Step 1: Load Story Context and Requirements
### Actions
1. **Read Story Markdown**
- Load story file from `{story_file}` variable
- Extract acceptance criteria (all testable requirements)
- Identify affected systems and components
- Note any technical constraints or dependencies
2. **Load Framework Configuration**
- Read framework config (playwright.config.ts or cypress.config.ts)
- Identify test directory structure
- Check existing fixture patterns
- Note test runner capabilities
3. **Load Existing Test Patterns**
- Search `{test_dir}` for similar tests
- Identify reusable fixtures and helpers
- Check data factory patterns
- Note naming conventions
4. **Check Playwright Utils Flag**
Read `{config_source}` and check `config.tea_use_playwright_utils`.
5. **Load Knowledge Base Fragments**
**Critical:** Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` to load:
**Core Patterns (Always load):**
- `data-factories.md` - Factory patterns using faker (override patterns, nested factories, API seeding, 498 lines, 5 examples)
- `component-tdd.md` - Component test strategies (red-green-refactor, provider isolation, accessibility, visual regression, 480 lines, 4 examples)
- `test-quality.md` - Test design principles (deterministic tests, isolated with cleanup, explicit assertions, length limits, execution time optimization, 658 lines, 5 examples)
- `test-healing-patterns.md` - Common failure patterns and healing strategies (stale selectors, race conditions, dynamic data, network errors, hard waits, 648 lines, 5 examples)
- `selector-resilience.md` - Selector best practices (data-testid > ARIA > text > CSS hierarchy, dynamic patterns, anti-patterns, 541 lines, 4 examples)
- `timing-debugging.md` - Race condition prevention and async debugging (network-first, deterministic waiting, anti-patterns, 370 lines, 3 examples)
**If `config.tea_use_playwright_utils: true` (All Utilities):**
- `overview.md` - Playwright utils for ATDD patterns
- `api-request.md` - API test examples with schema validation
- `network-recorder.md` - HAR record/playback for UI acceptance tests
- `auth-session.md` - Auth setup for acceptance tests
- `intercept-network-call.md` - Network interception in ATDD scenarios
- `recurse.md` - Polling for async acceptance criteria
- `log.md` - Logging in ATDD tests
- `file-utils.md` - File download validation in acceptance tests
- `network-error-monitor.md` - Catch silent failures in ATDD
- `fixtures-composition.md` - Composing utilities for ATDD
**If `config.tea_use_playwright_utils: false`:**
- `fixture-architecture.md` - Test fixture patterns with auto-cleanup (pure function → fixture → mergeTests composition, 406 lines, 5 examples)
- `network-first.md` - Route interception patterns (intercept before navigate, HAR capture, deterministic waiting, 489 lines, 5 examples)
**Halt Condition:** If story has no acceptance criteria or framework is missing, HALT with message: "ATDD requires clear acceptance criteria and test framework setup"
---
## Step 1.5: Generation Mode Selection (NEW - Phase 2.5)
### Actions
1. **Detect Generation Mode**
Determine mode based on scenario complexity:
**AI Generation Mode (DEFAULT)**:
- Clear acceptance criteria with standard patterns
- Uses: AI-generated tests from requirements
- Appropriate for: CRUD, auth, navigation, API tests
- Fastest approach
**Recording Mode (OPTIONAL - Complex UI)**:
- Complex UI interactions (drag-drop, wizards, multi-page flows)
- Uses: Interactive test recording with Playwright MCP
- Appropriate for: Visual workflows, unclear requirements
- Only if config.tea_use_mcp_enhancements is true AND MCP available
2. **AI Generation Mode (DEFAULT - Continue to Step 2)**
For standard scenarios:
- Continue with existing workflow (Step 2: Select Test Levels and Strategy)
- AI generates tests based on acceptance criteria from Step 1
- Use knowledge base patterns for test structure
3. **Recording Mode (OPTIONAL - Complex UI Only)**
For complex UI scenarios AND config.tea_use_mcp_enhancements is true:
**A. Check MCP Availability**
If Playwright MCP tools are available in your IDE:
- Use MCP recording mode (Step 3.B)
If MCP unavailable:
- Fall back to AI generation mode (silent, automatic)
- Continue to Step 2
**B. Interactive Test Recording (MCP-Based)**
Use Playwright MCP test-generator tools:
**Setup:**
```
1. Use generator_setup_page to initialize recording session
2. Navigate to application starting URL (from story context)
3. Ready to record user interactions
```
**Recording Process (Per Acceptance Criterion):**
```
4. Read acceptance criterion from story
5. Manually execute test scenario using browser_* tools:
- browser_navigate: Navigate to pages
- browser_click: Click buttons, links, elements
- browser_type: Fill form fields
- browser_select: Select dropdown options
- browser_check: Check/uncheck checkboxes
6. Add verification steps using browser_verify_* tools:
- browser_verify_text: Verify text content
- browser_verify_visible: Verify element visibility
- browser_verify_url: Verify URL navigation
7. Capture interaction log with generator_read_log
8. Generate test file with generator_write_test
9. Repeat for next acceptance criterion
```
**Post-Recording Enhancement:**
```
10. Review generated test code
11. Enhance with knowledge base patterns:
- Add Given-When-Then comments
- Replace recorded selectors with data-testid (if needed)
- Add network-first interception (from network-first.md)
- Add fixtures for auth/data setup (from fixture-architecture.md)
- Use factories for test data (from data-factories.md)
12. Verify tests fail (missing implementation)
13. Continue to Step 4 (Build Data Infrastructure)
```
**When to Use Recording Mode:**
- ✅ Complex UI interactions (drag-drop, multi-step forms, wizards)
- ✅ Visual workflows (modals, dialogs, animations)
- ✅ Unclear requirements (exploratory, discovering expected behavior)
- ✅ Multi-page flows (checkout, registration, onboarding)
- ❌ NOT for simple CRUD (AI generation faster)
- ❌ NOT for API-only tests (no UI to record)
**When to Use AI Generation (Default):**
- ✅ Clear acceptance criteria available
- ✅ Standard patterns (login, CRUD, navigation)
- ✅ Need many tests quickly
- ✅ API/backend tests (no UI interaction)
4. **Proceed to Test Level Selection**
After mode selection:
- AI Generation: Continue to Step 2 (Select Test Levels and Strategy)
- Recording: Skip to Step 4 (Build Data Infrastructure) - tests already generated
---
## Step 2: Select Test Levels and Strategy
### Actions
1. **Analyze Acceptance Criteria**
For each acceptance criterion, determine:
- Does it require full user journey? → E2E test
- Does it test business logic/API contract? → API test
- Does it validate UI component behavior? → Component test
- Can it be unit tested? → Unit test
2. **Apply Test Level Selection Framework**
**Knowledge Base Reference**: `test-levels-framework.md`
**E2E (End-to-End)**:
- Critical user journeys (login, checkout, core workflow)
- Multi-system integration
- User-facing acceptance criteria
- **Characteristics**: High confidence, slow execution, brittle
**API (Integration)**:
- Business logic validation
- Service contracts
- Data transformations
- **Characteristics**: Fast feedback, good balance, stable
**Component**:
- UI component behavior (buttons, forms, modals)
- Interaction testing
- Visual regression
- **Characteristics**: Fast, isolated, granular
**Unit**:
- Pure business logic
- Edge cases
- Error handling
- **Characteristics**: Fastest, most granular
3. **Avoid Duplicate Coverage**
Don't test same behavior at multiple levels unless necessary:
- Use E2E for critical happy path only
- Use API tests for complex business logic variations
- Use component tests for UI interaction edge cases
- Use unit tests for pure logic edge cases
4. **Prioritize Tests**
If test-design document exists, align with priority levels:
- P0 scenarios → Must cover in failing tests
- P1 scenarios → Should cover if time permits
- P2/P3 scenarios → Optional for this iteration
**Decision Point:** Set `primary_level` variable to main test level for this story (typically E2E or API)
---
## Step 3: Generate Failing Tests
### Actions
1. **Create Test File Structure**
```
tests/
├── e2e/
│ └── {feature-name}.spec.ts # E2E acceptance tests
├── api/
│ └── {feature-name}.api.spec.ts # API contract tests
├── component/
│ └── {ComponentName}.test.tsx # Component tests
└── support/
├── fixtures/ # Test fixtures
├── factories/ # Data factories
└── helpers/ # Utility functions
```
2. **Write Failing E2E Tests (If Applicable)**
**Use Given-When-Then format:**
```typescript
import { test, expect } from '@playwright/test';
test.describe('User Login', () => {
test('should display error for invalid credentials', async ({ page }) => {
// GIVEN: User is on login page
await page.goto('/login');
// WHEN: User submits invalid credentials
await page.fill('[data-testid="email-input"]', 'invalid@example.com');
await page.fill('[data-testid="password-input"]', 'wrongpassword');
await page.click('[data-testid="login-button"]');
// THEN: Error message is displayed
await expect(page.locator('[data-testid="error-message"]')).toHaveText('Invalid email or password');
});
});
```
**Critical patterns:**
- One assertion per test (atomic tests)
- Explicit waits (no hard waits/sleeps)
- Network-first approach (route interception before navigation)
- data-testid selectors for stability
- Clear Given-When-Then structure
3. **Apply Network-First Pattern**
**Knowledge Base Reference**: `network-first.md`
```typescript
test('should load user dashboard after login', async ({ page }) => {
// CRITICAL: Intercept routes BEFORE navigation
await page.route('**/api/user', (route) =>
route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({ id: 1, name: 'Test User' }),
}),
);
// NOW navigate
await page.goto('/dashboard');
await expect(page.locator('[data-testid="user-name"]')).toHaveText('Test User');
});
```
4. **Write Failing API Tests (If Applicable)**
```typescript
import { test, expect } from '@playwright/test';
test.describe('User API', () => {
test('POST /api/users - should create new user', async ({ request }) => {
// GIVEN: Valid user data
const userData = {
email: 'newuser@example.com',
name: 'New User',
};
// WHEN: Creating user via API
const response = await request.post('/api/users', {
data: userData,
});
// THEN: User is created successfully
expect(response.status()).toBe(201);
const body = await response.json();
expect(body).toMatchObject({
email: userData.email,
name: userData.name,
id: expect.any(Number),
});
});
});
```
5. **Write Failing Component Tests (If Applicable)**
**Knowledge Base Reference**: `component-tdd.md`
```typescript
import { test, expect } from '@playwright/experimental-ct-react';
import { LoginForm } from './LoginForm';
test.describe('LoginForm Component', () => {
test('should disable submit button when fields are empty', async ({ mount }) => {
// GIVEN: LoginForm is mounted
const component = await mount(<LoginForm />);
// WHEN: Form is initially rendered
const submitButton = component.locator('button[type="submit"]');
// THEN: Submit button is disabled
await expect(submitButton).toBeDisabled();
});
});
```
6. **Verify Tests Fail Initially**
**Critical verification:**
- Run tests locally to confirm they fail
- Failure should be due to missing implementation, not test errors
- Failure messages should be clear and actionable
- All tests must be in RED phase before sharing with DEV
**Important:** Tests MUST fail initially. If a test passes before implementation, it's not a valid acceptance test.
---
## Step 4: Build Data Infrastructure
### Actions
1. **Create Data Factories**
**Knowledge Base Reference**: `data-factories.md`
```typescript
// tests/support/factories/user.factory.ts
import { faker } from '@faker-js/faker';
export const createUser = (overrides = {}) => ({
id: faker.number.int(),
email: faker.internet.email(),
name: faker.person.fullName(),
createdAt: faker.date.recent().toISOString(),
...overrides,
});
export const createUsers = (count: number) => Array.from({ length: count }, () => createUser());
```
**Factory principles:**
- Use faker for random data (no hardcoded values)
- Support overrides for specific scenarios
- Generate complete valid objects
- Include helper functions for bulk creation
2. **Create Test Fixtures**
**Knowledge Base Reference**: `fixture-architecture.md`
```typescript
// tests/support/fixtures/auth.fixture.ts
import { test as base } from '@playwright/test';
import { createUser } from '../factories/user.factory';
// deleteUser: cleanup helper expected alongside the factories (not shown here)
export const test = base.extend({
authenticatedUser: async ({ page }, use) => {
// Setup: Create and authenticate user
const user = await createUser();
await page.goto('/login');
await page.fill('[data-testid="email"]', user.email);
await page.fill('[data-testid="password"]', 'password123');
await page.click('[data-testid="login-button"]');
await page.waitForURL('/dashboard');
// Provide to test
await use(user);
// Cleanup: Delete user
await deleteUser(user.id);
},
});
```
**Fixture principles:**
- Auto-cleanup (always delete created data)
- Composable (fixtures can use other fixtures)
- Isolated (each test gets fresh data)
- Type-safe
3. **Document Mock Requirements**
If external services need mocking, document requirements:
```markdown
### Mock Requirements for DEV Team
**Payment Gateway Mock**:
- Endpoint: `POST /api/payments`
- Success response: `{ status: 'success', transactionId: '123' }`
- Failure response: `{ status: 'failed', error: 'Insufficient funds' }`
**Email Service Mock**:
- Should not send real emails in test environment
- Log email contents for verification
```
4. **List Required data-testid Attributes**
```markdown
### Required data-testid Attributes
**Login Page**:
- `email-input` - Email input field
- `password-input` - Password input field
- `login-button` - Submit button
- `error-message` - Error message container
**Dashboard Page**:
- `user-name` - User name display
- `logout-button` - Logout button
```
---
## Step 5: Create Implementation Checklist
### Actions
1. **Map Tests to Implementation Tasks**
For each failing test, create corresponding implementation task:
```markdown
## Implementation Checklist
### Epic X - User Authentication
#### Test: User Login with Valid Credentials
- [ ] Create `/login` route
- [ ] Implement login form component
- [ ] Add email/password validation
- [ ] Integrate authentication API
- [ ] Add `data-testid` attributes: `email-input`, `password-input`, `login-button`
- [ ] Implement error handling
- [ ] Run test: `npm run test:e2e -- login.spec.ts`
- [ ] ✅ Test passes (green phase)
#### Test: Display Error for Invalid Credentials
- [ ] Add error state management
- [ ] Display error message UI
- [ ] Add `data-testid="error-message"`
- [ ] Run test: `npm run test:e2e -- login.spec.ts`
- [ ] ✅ Test passes (green phase)
```
2. **Include Red-Green-Refactor Guidance**
```markdown
## Red-Green-Refactor Workflow
**RED Phase** (Complete):
- ✅ All tests written and failing
- ✅ Fixtures and factories created
- ✅ Mock requirements documented
**GREEN Phase** (DEV Team):
1. Pick one failing test
2. Implement minimal code to make it pass
3. Run test to verify green
4. Move to next test
5. Repeat until all tests pass
**REFACTOR Phase** (DEV Team):
1. All tests passing (green)
2. Improve code quality
3. Extract duplications
4. Optimize performance
5. Ensure tests still pass
```
3. **Add Execution Commands**
````markdown
## Running Tests
```bash
# Run all failing tests
npm run test:e2e
# Run specific test file
npm run test:e2e -- login.spec.ts
# Run tests in headed mode (see browser)
npm run test:e2e -- --headed
# Debug specific test
npm run test:e2e -- login.spec.ts --debug
```
````
---
## Step 6: Generate Deliverables
### Actions
1. **Create ATDD Checklist Document**
Use template structure at `{installed_path}/atdd-checklist-template.md`:
- Story summary
- Acceptance criteria breakdown
- Test files created (with paths)
- Data factories created
- Fixtures created
- Mock requirements
- Required data-testid attributes
- Implementation checklist
- Red-green-refactor workflow
- Execution commands
2. **Verify All Tests Fail**
Before finalizing:
- Run full test suite locally
- Confirm all tests in RED phase
- Document expected failure messages
- Ensure failures are due to missing implementation, not test bugs
3. **Write to Output File**
Save to `{output_folder}/atdd-checklist-{story_id}.md`
---
## Important Notes
### Red-Green-Refactor Cycle
**RED Phase** (TEA responsibility):
- Write failing tests first
- Tests define expected behavior
- Tests must fail for the right reason (missing implementation)
**GREEN Phase** (DEV responsibility):
- Implement minimal code to pass tests
- One test at a time
- Don't over-engineer
**REFACTOR Phase** (DEV responsibility):
- Improve code quality with confidence
- Tests provide safety net
- Extract duplications, optimize
### Given-When-Then Structure
**GIVEN** (Setup):
- Arrange test preconditions
- Create necessary data
- Navigate to starting point
**WHEN** (Action):
- Execute the behavior being tested
- Single action per test
**THEN** (Assertion):
- Verify expected outcome
- One assertion per test (atomic)
### Network-First Testing
**Critical pattern:**
```typescript
// ✅ CORRECT: Intercept BEFORE navigation
await page.route('**/api/data', handler);
await page.goto('/page');
// ❌ WRONG: Navigate then intercept (race condition)
await page.goto('/page');
await page.route('**/api/data', handler); // Too late!
```
### Data Factory Best Practices
**Use faker for all test data:**
```typescript
// ✅ CORRECT: Random data
email: faker.internet.email();
// ❌ WRONG: Hardcoded data (collisions, maintenance burden)
email: 'test@example.com';
```
**Auto-cleanup principle:**
- Every factory that creates data must provide cleanup
- Fixtures automatically cleanup in teardown
- No manual cleanup in test code
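The fixture example above calls `deleteUser(user.id)`; a minimal sketch of such a cleanup helper, assuming a `DELETE /api/users/:id` endpoint and a base URL from the environment (both assumptions - adapt to the real contract):
```typescript
// tests/support/helpers/cleanup.ts (illustrative sketch)
import { request } from '@playwright/test';

// Assumed base URL - in a real project this comes from config or environment
const BASE_URL = process.env.BASE_URL ?? 'http://localhost:3000';

export async function deleteUser(userId: number): Promise<void> {
  const api = await request.newContext({ baseURL: BASE_URL });
  try {
    // Assumed endpoint - replace with the project's real delete route
    await api.delete(`/api/users/${userId}`);
  } finally {
    await api.dispose();
  }
}
```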
### One Assertion Per Test
**Atomic test design:**
```typescript
// ✅ CORRECT: One assertion
test('should display user name', async ({ page }) => {
await expect(page.locator('[data-testid="user-name"]')).toHaveText('John');
});
// ❌ WRONG: Multiple assertions (not atomic)
test('should display user info', async ({ page }) => {
await expect(page.locator('[data-testid="user-name"]')).toHaveText('John');
await expect(page.locator('[data-testid="user-email"]')).toHaveText('john@example.com');
});
```
**Why?** If the first assertion fails, the test stops there and you never learn whether the remaining behavior works; single-assertion tests give unambiguous failure diagnosis.
### Component Test Strategy
**When to use component tests:**
- Complex UI interactions (drag-drop, keyboard nav)
- Form validation logic
- State management within component
- Visual edge cases
**When NOT to use:**
- Simple rendering (snapshot tests are sufficient)
- Integration with backend (use E2E or API tests)
- Full user journeys (use E2E tests)
### Knowledge Base Integration
**Core Fragments (Auto-loaded in Step 1):**
- `fixture-architecture.md` - Pure function → fixture → mergeTests patterns (406 lines, 5 examples)
- `data-factories.md` - Factory patterns with faker, overrides, API seeding (498 lines, 5 examples)
- `component-tdd.md` - Red-green-refactor, provider isolation, accessibility, visual regression (480 lines, 4 examples)
- `network-first.md` - Intercept before navigate, HAR capture, deterministic waiting (489 lines, 5 examples)
- `test-quality.md` - Deterministic tests, cleanup, explicit assertions, length/time limits (658 lines, 5 examples)
- `test-healing-patterns.md` - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
- `selector-resilience.md` - Selector hierarchy (data-testid > ARIA > text > CSS), dynamic patterns, anti-patterns (541 lines, 4 examples)
- `timing-debugging.md` - Race condition prevention, deterministic waiting, async debugging (370 lines, 3 examples)
**Reference for Test Level Selection:**
- `test-levels-framework.md` - E2E vs API vs Component vs Unit decision framework (467 lines, 4 examples)
**Manual Reference (Optional):**
- Use `tea-index.csv` to find additional specialized fragments as needed
---
## Output Summary
After completing this workflow, provide a summary:
```markdown
## ATDD Complete - Tests in RED Phase
**Story**: {story_id}
**Primary Test Level**: {primary_level}
**Failing Tests Created**:
- E2E tests: {e2e_count} tests in {e2e_files}
- API tests: {api_count} tests in {api_files}
- Component tests: {component_count} tests in {component_files}
**Supporting Infrastructure**:
- Data factories: {factory_count} factories created
- Fixtures: {fixture_count} fixtures with auto-cleanup
- Mock requirements: {mock_count} services documented
**Implementation Checklist**:
- Total tasks: {task_count}
- Estimated effort: {effort_estimate} hours
**Required data-testid Attributes**: {data_testid_count} attributes documented
**Next Steps for DEV Team**:
1. Run failing tests: `npm run test:e2e`
2. Review implementation checklist
3. Implement one test at a time (RED → GREEN)
4. Refactor with confidence (tests provide safety net)
5. Share progress in daily standup
**Output File**: {output_file}
**Manual Handoff**: Share `{output_file}` and failing tests with the dev workflow (not auto-consumed).
**Knowledge Base References Applied**:
- Fixture architecture patterns
- Data factory patterns with faker
- Network-first route interception
- Component TDD strategies
- Test quality principles
```
---
## Validation
After completing all steps, verify:
- [ ] Story acceptance criteria analyzed and mapped to tests
- [ ] Appropriate test levels selected (E2E, API, Component)
- [ ] All tests written in Given-When-Then format
- [ ] All tests fail initially (RED phase verified)
- [ ] Network-first pattern applied (route interception before navigation)
- [ ] Data factories created with faker
- [ ] Fixtures created with auto-cleanup
- [ ] Mock requirements documented for DEV team
- [ ] Required data-testid attributes listed
- [ ] Implementation checklist created with clear tasks
- [ ] Red-green-refactor workflow documented
- [ ] Execution commands provided
- [ ] Output file created and formatted correctly
Refer to `checklist.md` for comprehensive validation criteria.

View File

@@ -1,45 +0,0 @@
# Test Architect workflow: atdd
name: testarch-atdd
description: "Generate failing acceptance tests before implementation using TDD red-green-refactor cycle"
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/atdd"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
template: "{installed_path}/atdd-checklist-template.md"
# Variables and inputs
variables:
test_dir: "{project-root}/tests" # Root test directory
# Output configuration
default_output_file: "{output_folder}/atdd-checklist-{story_id}.md"
# Required tools
required_tools:
- read_file # Read story markdown, framework config
- write_file # Create test files, checklist, factory stubs
- create_directory # Create test directories
- list_files # Find existing fixtures and helpers
- search_repo # Search for similar test patterns
tags:
- qa
- atdd
- test-architect
- tdd
- red-green-refactor
execution_hints:
interactive: false # Minimize prompts
autonomous: true # Proceed without user input unless blocked
iterative: true

View File

@@ -1,582 +0,0 @@
# Automate Workflow Validation Checklist
Use this checklist to validate that the automate workflow has been executed correctly and all deliverables meet quality standards.
## Prerequisites
Before starting this workflow, verify:
- [ ] Framework scaffolding configured (playwright.config.ts or cypress.config.ts exists)
- [ ] Test directory structure exists (tests/ folder with subdirectories)
- [ ] Package.json has test framework dependencies installed
**Halt only if:** Framework scaffolding is completely missing (run `framework` workflow first)
**Note:** BMad artifacts (story, tech-spec, PRD) are OPTIONAL - workflow can run without them
**Note:** `automate` generates tests; it does not run `*atdd` or `*test-review`. If ATDD outputs exist, use them as input and avoid duplicate coverage.
---
## Step 1: Execution Mode Determination and Context Loading
### Mode Detection
- [ ] Execution mode correctly determined:
- [ ] BMad-Integrated Mode (story_file variable set) OR
- [ ] Standalone Mode (target_feature or target_files set) OR
- [ ] Auto-discover Mode (no targets specified)
### BMad Artifacts (If Available - OPTIONAL)
- [ ] Story markdown loaded (if `{story_file}` provided)
- [ ] Acceptance criteria extracted from story (if available)
- [ ] Tech-spec.md loaded (if `{use_tech_spec}` true and file exists)
- [ ] Test-design.md loaded (if `{use_test_design}` true and file exists)
- [ ] PRD.md loaded (if `{use_prd}` true and file exists)
- [ ] **Note**: Absence of BMad artifacts does NOT halt workflow
### Framework Configuration
- [ ] Test framework config loaded (playwright.config.ts or cypress.config.ts)
- [ ] Test directory structure identified from `{test_dir}`
- [ ] Existing test patterns reviewed
- [ ] Test runner capabilities noted (parallel execution, fixtures, etc.)
### Coverage Analysis
- [ ] Existing test files searched in `{test_dir}` (if `{analyze_coverage}` true)
- [ ] Tested features vs untested features identified
- [ ] Coverage gaps mapped (tests to source files)
- [ ] Existing fixture and factory patterns checked
### Knowledge Base Fragments Loaded
- [ ] `test-levels-framework.md` - Test level selection
- [ ] `test-priorities.md` - Priority classification (P0-P3)
- [ ] `fixture-architecture.md` - Fixture patterns with auto-cleanup
- [ ] `data-factories.md` - Factory patterns using faker
- [ ] `selective-testing.md` - Targeted test execution strategies
- [ ] `ci-burn-in.md` - Flaky test detection patterns
- [ ] `test-quality.md` - Test design principles
---
## Step 2: Automation Targets Identification
### Target Determination
**BMad-Integrated Mode (if story available):**
- [ ] Acceptance criteria mapped to test scenarios
- [ ] Features implemented in story identified
- [ ] Existing ATDD tests checked (if any)
- [ ] Expansion beyond ATDD planned (edge cases, negative paths)
**Standalone Mode (if no story):**
- [ ] Specific feature analyzed (if `{target_feature}` specified)
- [ ] Specific files analyzed (if `{target_files}` specified)
- [ ] Features auto-discovered (if `{auto_discover_features}` true)
- [ ] Features prioritized by:
- [ ] No test coverage (highest priority)
- [ ] Complex business logic
- [ ] External integrations (API, database, auth)
- [ ] Critical user paths (login, checkout, etc.)
### Test Level Selection
- [ ] Test level selection framework applied (from `test-levels-framework.md`)
- [ ] E2E tests identified: Critical user journeys, multi-system integration
- [ ] API tests identified: Business logic, service contracts, data transformations
- [ ] Component tests identified: UI behavior, interactions, state management
- [ ] Unit tests identified: Pure logic, edge cases, error handling
### Duplicate Coverage Avoidance
- [ ] Same behavior NOT tested at multiple levels unnecessarily
- [ ] E2E used for critical happy path only
- [ ] API tests used for business logic variations
- [ ] Component tests used for UI interaction edge cases
- [ ] Unit tests used for pure logic edge cases
### Priority Assignment
- [ ] Test priorities assigned using `test-priorities.md` framework
- [ ] P0 tests: Critical paths, security-critical, data integrity
- [ ] P1 tests: Important features, integration points, error handling
- [ ] P2 tests: Edge cases, less-critical variations, performance
- [ ] P3 tests: Nice-to-have, rarely-used features, exploratory
- [ ] Priority variables respected:
- [ ] `{include_p0}` = true (always include)
- [ ] `{include_p1}` = true (high priority)
- [ ] `{include_p2}` = true (medium priority)
- [ ] `{include_p3}` = false (low priority, skip by default)
### Coverage Plan Created
- [ ] Test coverage plan documented
- [ ] What will be tested at each level listed
- [ ] Priorities assigned to each test
- [ ] Coverage strategy clear (critical-paths, comprehensive, or selective)
---
## Step 3: Test Infrastructure Generated
### Fixture Architecture
- [ ] Existing fixtures checked in `tests/support/fixtures/`
- [ ] Fixture architecture created/enhanced (if `{generate_fixtures}` true)
- [ ] All fixtures use Playwright's `test.extend()` pattern
- [ ] All fixtures have auto-cleanup in teardown
- [ ] Common fixtures created/enhanced:
- [ ] authenticatedUser (with auto-delete)
- [ ] apiRequest (authenticated client)
- [ ] mockNetwork (external service mocking)
- [ ] testDatabase (with auto-cleanup)
### Data Factories
- [ ] Existing factories checked in `tests/support/factories/`
- [ ] Factory architecture created/enhanced (if `{generate_factories}` true)
- [ ] All factories use `@faker-js/faker` for random data (no hardcoded values)
- [ ] All factories support overrides for specific scenarios
- [ ] Common factories created/enhanced:
- [ ] User factory (email, password, name, role)
- [ ] Product factory (name, price, SKU)
- [ ] Order factory (items, total, status)
- [ ] Cleanup helpers provided (e.g., deleteUser(), deleteProduct())
### Helper Utilities
- [ ] Existing helpers checked in `tests/support/helpers/` (if `{update_helpers}` true)
- [ ] Common utilities created/enhanced:
- [ ] waitFor (polling for complex conditions)
- [ ] retry (retry helper for flaky operations)
- [ ] testData (test data generation)
- [ ] assertions (custom assertion helpers)
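A sketch of what the `waitFor` and `retry` helpers above might look like; the names and signatures are assumptions, not an established API:
```typescript
// tests/support/helpers/async.ts (illustrative sketch)
export async function waitFor(
  condition: () => Promise<boolean> | boolean,
  { timeoutMs = 10_000, intervalMs = 250 } = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await condition()) return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`waitFor: condition not met within ${timeoutMs}ms`);
}

export async function retry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error; // keep the most recent failure for the final throw
    }
  }
  throw lastError;
}
```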
---
## Step 4: Test Files Generated
### Test File Structure
- [ ] Test files organized correctly:
- [ ] `tests/e2e/` for E2E tests
- [ ] `tests/api/` for API tests
- [ ] `tests/component/` for component tests
- [ ] `tests/unit/` for unit tests
- [ ] `tests/support/` for fixtures/factories/helpers
### E2E Tests (If Applicable)
- [ ] E2E test files created in `tests/e2e/`
- [ ] All tests follow Given-When-Then format
- [ ] All tests have priority tags ([P0], [P1], [P2], [P3]) in test name
- [ ] All tests use data-testid selectors (not CSS classes)
- [ ] One assertion per test (atomic design)
- [ ] No hard waits or sleeps (explicit waits only)
- [ ] Network-first pattern applied (route interception BEFORE navigation)
- [ ] Clear Given-When-Then comments in test code
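For reference, a sketch of the priority-tag naming convention and the matching selective run (selector names are assumptions):
```typescript
import { test, expect } from '@playwright/test';

// Selective execution example: npx playwright test --grep "\[P0\]"
test('[P0] should show the login form on the login page', async ({ page }) => {
  // GIVEN/WHEN: User navigates to the login page
  await page.goto('/login');
  // THEN: The login button is visible (single, atomic assertion)
  await expect(page.getByTestId('login-button')).toBeVisible();
});
```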
### API Tests (If Applicable)
- [ ] API test files created in `tests/api/`
- [ ] All tests follow Given-When-Then format
- [ ] All tests have priority tags in test name
- [ ] API contracts validated (request/response structure)
- [ ] HTTP status codes verified
- [ ] Response body validation includes required fields
- [ ] Error cases tested (400, 401, 403, 404, 500)
- [ ] JWT token format validated (if auth tests)
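For the JWT item, a shape-only check on the token returned by a hypothetical login endpoint (endpoint, credentials, and response field are assumptions):
```typescript
import { test, expect } from '@playwright/test';

test('[P1] POST /api/login - should return a well-formed JWT', async ({ request }) => {
  // GIVEN: Valid credentials (in a real suite these would come from a factory/fixture)
  const credentials = { email: 'user@example.com', password: 'correct-password' };
  // WHEN: Logging in via the API (assumed endpoint)
  const response = await request.post('/api/login', { data: credentials });
  // THEN: The token has the three dot-separated segments of a JWT
  const { token } = await response.json();
  expect(token).toMatch(/^[\w-]+\.[\w-]+\.[\w-]+$/);
});
```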
### Component Tests (If Applicable)
- [ ] Component test files created in `tests/component/`
- [ ] All tests follow Given-When-Then format
- [ ] All tests have priority tags in test name
- [ ] Component mounting works correctly
- [ ] Interaction testing covers user actions (click, hover, keyboard)
- [ ] State management validated
- [ ] Props and events tested
### Unit Tests (If Applicable)
- [ ] Unit test files created in `tests/unit/`
- [ ] All tests follow Given-When-Then format
- [ ] All tests have priority tags in test name
- [ ] Pure logic tested (no dependencies)
- [ ] Edge cases covered
- [ ] Error handling tested
### Quality Standards Enforced
- [ ] All tests use Given-When-Then format with clear comments
- [ ] All tests have descriptive names with priority tags
- [ ] No duplicate tests (same behavior tested multiple times)
- [ ] No flaky patterns (race conditions, timing issues)
- [ ] No test interdependencies (tests can run in any order)
- [ ] Tests are deterministic (same input always produces same result)
- [ ] All tests use data-testid selectors (E2E tests)
- [ ] No hard waits: `await page.waitForTimeout()` (forbidden)
- [ ] No conditional flow: `if (await element.isVisible())` (forbidden)
- [ ] No try-catch for test logic (only for cleanup)
- [ ] No hardcoded test data (use factories with faker)
- [ ] No page object classes (tests are direct and simple)
- [ ] No shared state between tests
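A sketch contrasting the forbidden patterns above with deterministic replacements, assuming data-testid attributes and an `/api/search` endpoint (both illustrative):
```typescript
import { test, expect } from '@playwright/test';

test('should show results after a search', async ({ page }) => {
  await page.goto('/search');

  // ❌ Forbidden: hard wait and conditional flow
  // await page.waitForTimeout(3000);
  // if (await page.getByTestId('results').isVisible()) { ... }

  // ✅ Deterministic: wait for the response that drives the UI, then assert
  const resultsResponse = page.waitForResponse('**/api/search**');
  await page.fill('[data-testid="search-input"]', 'playwright');
  await page.click('[data-testid="search-button"]');
  await resultsResponse;

  // Web-first assertion retries until the element appears or the timeout is hit
  await expect(page.getByTestId('results')).toBeVisible();
});
```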
### Network-First Pattern Applied
- [ ] Route interception set up BEFORE navigation (E2E tests with network requests)
- [ ] `page.route()` called before `page.goto()` to prevent race conditions
- [ ] Network-first pattern verified in all E2E tests that make API calls
---
## Step 5: Test Validation and Healing (NEW - Phase 2.5)
### Healing Configuration
- [ ] Healing configuration checked:
- [ ] `{auto_validate}` setting noted (default: true)
- [ ] `{auto_heal_failures}` setting noted (default: false)
- [ ] `{max_healing_iterations}` setting noted (default: 3)
- [ ] `{use_mcp_healing}` setting noted (default: true)
### Healing Knowledge Fragments Loaded (If Healing Enabled)
- [ ] `test-healing-patterns.md` loaded (common failure patterns and fixes)
- [ ] `selector-resilience.md` loaded (selector refactoring guide)
- [ ] `timing-debugging.md` loaded (race condition fixes)
### Test Execution and Validation
- [ ] Generated tests executed (if `{auto_validate}` true)
- [ ] Test results captured:
- [ ] Total tests run
- [ ] Passing tests count
- [ ] Failing tests count
- [ ] Error messages and stack traces captured
### Healing Loop (If Enabled and Tests Failed)
- [ ] Healing loop entered (if `{auto_heal_failures}` true AND tests failed)
- [ ] For each failing test:
- [ ] Failure pattern identified (selector, timing, data, network, hard wait)
- [ ] Appropriate healing strategy applied:
- [ ] Stale selector → Replaced with data-testid or ARIA role
- [ ] Race condition → Added network-first interception or state waits
- [ ] Dynamic data → Replaced hardcoded values with regex/dynamic generation
- [ ] Network error → Added route mocking
- [ ] Hard wait → Replaced with event-based wait
- [ ] Healed test re-run to validate fix
- [ ] Iteration count tracked (max 3 attempts)
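A sketch of what a healed test might look like when the failure was a position-based CSS selector plus a hard wait; the route and test IDs are illustrative:

```typescript
import { test, expect } from '@playwright/test';

test('[P1] submits the checkout form', async ({ page }) => {
  await page.goto('/checkout');

  // Before healing (flaky): hard wait plus a position-based CSS selector
  //   await page.waitForTimeout(3000);
  //   await page.click('.btn.btn-primary:nth-child(2)');

  // After healing: data-testid selector (ARIA role as fallback) and an event-based wait
  await page.getByTestId('submit-order').click();
  await expect(page.getByRole('heading', { name: /order confirmed/i })).toBeVisible();
});
```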
### Unfixable Tests Handling
- [ ] Tests that couldn't be healed after 3 iterations marked with `test.fixme()` (if `{mark_unhealable_as_fixme}` true)
- [ ] Detailed comment added to test.fixme() tests:
- [ ] What failure occurred
- [ ] What healing was attempted (3 iterations)
- [ ] Why healing failed
- [ ] Manual investigation steps needed
- [ ] Original test logic preserved in comments
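A sketch of the expected `test.fixme()` annotation for an unhealable test; the scenario and comment contents are illustrative:

```typescript
import { test } from '@playwright/test';

/**
 * UNHEALABLE after 3 healing iterations (illustrative example):
 * - Failure: intermittent 502 from the payments sandbox during checkout
 * - Healing attempted: route mocking, event-based waits, selector refactor (3 iterations)
 * - Why healing failed: upstream sandbox instability, not test code
 * - Manual investigation: confirm sandbox availability/SLA, then re-enable
 */
test.fixme('[P2] completes checkout with a saved card', async ({ page }) => {
  // Original test logic preserved for manual investigation
  await page.goto('/checkout');
  await page.getByTestId('saved-card-option').click();
  await page.getByTestId('pay-button').click();
});
```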
### Healing Report Generated
- [ ] Healing report generated (if healing attempted)
- [ ] Report includes:
- [ ] Auto-heal enabled status
- [ ] Healing mode (MCP-assisted or Pattern-based)
- [ ] Iterations allowed (max_healing_iterations)
- [ ] Validation results (total, passing, failing)
- [ ] Successfully healed tests (count, file:line, fix applied)
- [ ] Unable to heal tests (count, file:line, reason)
- [ ] Healing patterns applied (selector fixes, timing fixes, data fixes)
- [ ] Knowledge base references used
---
## Step 6: Documentation and Scripts Updated
### Test README Updated
- [ ] `tests/README.md` created or updated (if `{update_readme}` true)
- [ ] Test suite structure overview included
- [ ] Test execution instructions provided (all, specific files, by priority)
- [ ] Fixture usage examples provided
- [ ] Factory usage examples provided
- [ ] Priority tagging convention explained ([P0], [P1], [P2], [P3])
- [ ] How to write new tests documented
- [ ] Common patterns documented
- [ ] Anti-patterns documented (what to avoid)
### package.json Scripts Updated
- [ ] package.json scripts added/updated (if `{update_package_scripts}` true)
- [ ] `test:e2e` script for all E2E tests
- [ ] `test:e2e:p0` script for P0 tests only
- [ ] `test:e2e:p1` script for P0 + P1 tests
- [ ] `test:api` script for API tests
- [ ] `test:component` script for component tests
- [ ] `test:unit` script for unit tests (if applicable)
### Test Suite Executed
- [ ] Test suite run locally (if `{run_tests_after_generation}` true)
- [ ] Test results captured (passing/failing counts)
- [ ] No flaky patterns detected (tests are deterministic)
- [ ] Setup requirements documented (if any)
- [ ] Known issues documented (if any)
---
## Step 7: Automation Summary Generated
### Automation Summary Document
- [ ] Output file created at `{output_summary}`
- [ ] Document includes execution mode (BMad-Integrated, Standalone, Auto-discover)
- [ ] Feature analysis included (source files, coverage gaps) - Standalone mode
- [ ] Tests created listed (E2E, API, Component, Unit) with counts and paths
- [ ] Infrastructure created listed (fixtures, factories, helpers)
- [ ] Test execution instructions provided
- [ ] Coverage analysis included:
- [ ] Total test count
- [ ] Priority breakdown (P0, P1, P2, P3 counts)
- [ ] Test level breakdown (E2E, API, Component, Unit counts)
- [ ] Coverage percentage (if calculated)
- [ ] Coverage status (acceptance criteria covered, gaps identified)
- [ ] Definition of Done checklist included
- [ ] Next steps provided
- [ ] Recommendations included (if Standalone mode)
### Summary Provided to User
- [ ] Concise summary output provided
- [ ] Total tests created across test levels
- [ ] Priority breakdown (P0, P1, P2, P3 counts)
- [ ] Infrastructure counts (fixtures, factories, helpers)
- [ ] Test execution command provided
- [ ] Output file path provided
- [ ] Next steps listed
---
## Quality Checks
### Test Design Quality
- [ ] Tests are readable (clear Given-When-Then structure)
- [ ] Tests are maintainable (use factories/fixtures, not hardcoded data)
- [ ] Tests are isolated (no shared state between tests)
- [ ] Tests are deterministic (no race conditions or flaky patterns)
- [ ] Tests are atomic (one assertion per test)
- [ ] Tests are fast (no unnecessary waits or delays)
- [ ] Tests are lean (files under {max_file_lines} lines)
### Knowledge Base Integration
- [ ] Test level selection framework applied (from `test-levels-framework.md`)
- [ ] Priority classification applied (from `test-priorities.md`)
- [ ] Fixture architecture patterns applied (from `fixture-architecture.md`)
- [ ] Data factory patterns applied (from `data-factories.md`)
- [ ] Selective testing strategies considered (from `selective-testing.md`)
- [ ] Flaky test detection patterns considered (from `ci-burn-in.md`)
- [ ] Test quality principles applied (from `test-quality.md`)
### Code Quality
- [ ] All TypeScript types are correct and complete
- [ ] No linting errors in generated test files
- [ ] Consistent naming conventions followed
- [ ] Imports are organized and correct
- [ ] Code follows project style guide
- [ ] No console.log or debug statements in test code
---
## Integration Points
### With Framework Workflow
- [ ] Test framework configuration detected and used
- [ ] Directory structure matches framework setup
- [ ] Fixtures and helpers follow established patterns
- [ ] Naming conventions consistent with framework standards
### With BMad Workflows (If Available - OPTIONAL)
**With Story Workflow:**
- [ ] Story ID correctly referenced in output (if story available)
- [ ] Acceptance criteria from story reflected in tests (if story available)
- [ ] Technical constraints from story considered (if story available)
**With test-design Workflow:**
- [ ] P0 scenarios from test-design prioritized (if test-design available)
- [ ] Risk assessment from test-design considered (if test-design available)
- [ ] Coverage strategy aligned with test-design (if test-design available)
**With atdd Workflow:**
- [ ] ATDD artifacts provided or located (manual handoff; `atdd` not auto-run)
- [ ] Existing ATDD tests checked (if story had ATDD workflow run)
- [ ] Expansion beyond ATDD planned (edge cases, negative paths)
- [ ] No duplicate coverage with ATDD tests
### With CI Pipeline
- [ ] Tests can run in CI environment
- [ ] Tests are parallelizable (no shared state)
- [ ] Tests have appropriate timeouts
- [ ] Tests clean up their data (no CI environment pollution)
---
## Completion Criteria
All of the following must be true before marking this workflow as complete:
- [ ] **Execution mode determined** (BMad-Integrated, Standalone, or Auto-discover)
- [ ] **Framework configuration loaded** and validated
- [ ] **Coverage analysis completed** (gaps identified if analyze_coverage true)
- [ ] **Automation targets identified** (what needs testing)
- [ ] **Test levels selected** appropriately (E2E, API, Component, Unit)
- [ ] **Duplicate coverage avoided** (same behavior not tested at multiple levels)
- [ ] **Test priorities assigned** (P0, P1, P2, P3)
- [ ] **Fixture architecture created/enhanced** with auto-cleanup
- [ ] **Data factories created/enhanced** using faker (no hardcoded data)
- [ ] **Helper utilities created/enhanced** (if needed)
- [ ] **Test files generated** at appropriate levels (E2E, API, Component, Unit)
- [ ] **Given-When-Then format used** consistently across all tests
- [ ] **Priority tags added** to all test names ([P0], [P1], [P2], [P3])
- [ ] **data-testid selectors used** in E2E tests (not CSS classes)
- [ ] **Network-first pattern applied** (route interception before navigation)
- [ ] **Quality standards enforced** (no hard waits, no flaky patterns, self-cleaning, deterministic)
- [ ] **Test README updated** with execution instructions and patterns
- [ ] **package.json scripts updated** with test execution commands
- [ ] **Test suite run locally** (if run_tests_after_generation true)
- [ ] **Tests validated** (if auto_validate enabled)
- [ ] **Failures healed** (if auto_heal_failures enabled and tests failed)
- [ ] **Healing report generated** (if healing attempted)
- [ ] **Unfixable tests marked** with test.fixme() and detailed comments (if any)
- [ ] **Automation summary created** and saved to correct location
- [ ] **Output file formatted correctly**
- [ ] **Knowledge base references applied** and documented (including healing fragments if used)
- [ ] **No test quality issues** (flaky patterns, race conditions, hardcoded data, page objects)
---
## Common Issues and Resolutions
### Issue: BMad artifacts not found
**Problem:** Story, tech-spec, or PRD files not found when variables are set.
**Resolution:**
- **automate does NOT require BMad artifacts** - they are OPTIONAL enhancements
- If files not found, switch to Standalone Mode automatically
- Analyze source code directly without BMad context
- Continue workflow without halting
### Issue: Framework configuration not found
**Problem:** No playwright.config.ts or cypress.config.ts found.
**Resolution:**
- **HALT workflow** - framework is required
- Message: "Framework scaffolding required. Run `bmad tea *framework` first."
- User must run framework workflow before automate
### Issue: No automation targets identified
**Problem:** Neither story, target_feature, nor target_files specified, and auto-discover finds nothing.
**Resolution:**
- Check if source_dir variable is correct
- Verify source code exists in project
- Ask user to specify target_feature or target_files explicitly
- Provide examples: `target_feature: "src/auth/"` or `target_files: "src/auth/login.ts,src/auth/session.ts"`
### Issue: Duplicate coverage detected
**Problem:** Same behavior tested at multiple levels (E2E + API + Component).
**Resolution:**
- Review test level selection framework (test-levels-framework.md)
- Use E2E for critical happy path ONLY
- Use API for business logic variations
- Use Component for UI edge cases
- Remove redundant tests that duplicate coverage
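As a rough illustration, keep the UI happy path as a single [P0] E2E test and push input-validation variations down to the API level (the `/api/auth/login` endpoint shown here is an assumption):

```typescript
import { test, expect } from '@playwright/test';

// The [P0] E2E test covers the happy path once (see tests/e2e/).
// Validation variations are cheaper and more stable at the API level:
test('[P1] rejects login with a malformed email', async ({ request }) => {
  const response = await request.post('/api/auth/login', {
    data: { email: 'not-an-email', password: 'irrelevant' },
  });

  expect(response.status()).toBe(400);
});
```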
### Issue: Tests have hardcoded data
**Problem:** Tests use hardcoded email addresses, passwords, or other data.
**Resolution:**
- Replace all hardcoded data with factory function calls
- Use faker for all random data generation
- Update data-factories to support all required test scenarios
- Example: `createUser({ email: faker.internet.email() })`
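A minimal factory sketch using faker; the field names and `TestUser` shape are assumptions to adapt to the project's real model:

```typescript
import { faker } from '@faker-js/faker';

export interface TestUser {
  email: string;
  password: string;
  name: string;
}

// Every field is generated; overrides stay explicit at the call site
export function createUser(overrides: Partial<TestUser> = {}): TestUser {
  return {
    email: faker.internet.email(),
    password: faker.internet.password({ length: 16 }),
    name: faker.person.fullName(),
    ...overrides,
  };
}
```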
### Issue: Tests are flaky
**Problem:** Tests fail intermittently, pass on retry.
**Resolution:**
- Remove all hard waits (`page.waitForTimeout()`)
- Use explicit waits (`page.waitForSelector()`)
- Apply network-first pattern (route interception before navigation)
- Remove conditional flow (`if (await element.isVisible())`)
- Ensure tests are deterministic (no race conditions)
- Run burn-in loop (10 iterations) to detect flakiness
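A sketch of the deterministic replacement, assuming a hypothetical `/api/search` endpoint: register the response wait before the action that triggers it, then await it instead of sleeping:

```typescript
import { test, expect } from '@playwright/test';

test('[P1] shows search results without hard waits', async ({ page }) => {
  await page.goto('/search');

  // Flaky (forbidden): await page.waitForTimeout(5000);

  // Deterministic: wait on the real signal — the search response
  const searchResponse = page.waitForResponse('**/api/search**');
  await page.getByTestId('search-input').fill('playwright');
  await page.getByTestId('search-button').click();
  await searchResponse;

  await expect(page.getByTestId('result-list')).toBeVisible();
});
```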
### Issue: Fixtures don't clean up data
**Problem:** Test data persists after test run, causing test pollution.
**Resolution:**
- Ensure all fixtures have cleanup in teardown phase
- Cleanup happens AFTER `await use(data)`
- Call deletion/cleanup functions (deleteUser, deleteProduct, etc.)
- Verify cleanup works by checking database/storage after test run
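A fixture sketch with cleanup placed after `await use()`; the seed/delete helpers are placeholders for the project's real API calls:

```typescript
import { test as base, expect } from '@playwright/test';
import { createUser } from '../factories/user.factory'; // hypothetical factory

// Placeholder helpers — swap in the project's real seed/delete calls
async function seedUser(user: ReturnType<typeof createUser>): Promise<string> {
  return `seeded-${user.email}`;
}
async function deleteUser(id: string): Promise<void> {
  // delete the seeded record here
}

export const test = base.extend<{ seededUser: ReturnType<typeof createUser> }>({
  seededUser: async ({}, use) => {
    const user = createUser();
    const id = await seedUser(user); // setup

    await use(user); // the test body runs here

    await deleteUser(id); // cleanup runs AFTER use(), so no data is left behind
  },
});
export { expect };
```

Because cleanup lives in the fixture, every test that uses `seededUser` is self-cleaning without extra teardown code.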
### Issue: Tests too slow
**Problem:** Tests take longer than 90 seconds (max_test_duration).
**Resolution:**
- Remove unnecessary waits and delays
- Use parallel execution where possible
- Mock external services (don't make real API calls)
- Use API tests instead of E2E for business logic
- Optimize test data creation (use in-memory database, etc.)
---
## Notes for TEA Agent
- **automate is flexible:** Can work with or without BMad artifacts (story, tech-spec, PRD are OPTIONAL)
- **Standalone mode is powerful:** Analyze any codebase and generate tests independently
- **Auto-discover mode:** Scan codebase for features needing tests when no targets specified
- **Framework is the ONLY hard requirement:** HALT if framework config missing, otherwise proceed
- **Avoid duplicate coverage:** E2E for critical paths only, API/Component for variations
- **Priority tagging enables selective execution:** P0 tests run on every commit, P1 on PR, P2 nightly
- **Network-first pattern prevents race conditions:** Route interception BEFORE navigation
- **No page objects:** Keep tests simple, direct, and maintainable
- **Use knowledge base:** Load relevant fragments (test-levels, test-priorities, fixture-architecture, data-factories, healing patterns) for guidance
- **Deterministic tests only:** No hard waits, no conditional flow, no flaky patterns allowed
- **Optional healing:** auto_heal_failures disabled by default (opt-in for automatic test healing)
- **Graceful degradation:** Healing works without Playwright MCP (pattern-based fallback)
- **Unfixable tests handled:** Mark with test.fixme() and detailed comments (not silently broken)

File diff suppressed because it is too large.


@@ -1,52 +0,0 @@
# Test Architect workflow: automate
name: testarch-automate
description: "Expand test automation coverage after implementation or analyze existing codebase to generate comprehensive test suite"
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/automate"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
template: false
# Variables and inputs
variables:
# Execution mode and targeting
standalone_mode: true # Can work without BMad artifacts (true) or integrate with BMad (false)
coverage_target: "critical-paths" # critical-paths, comprehensive, selective
# Directory paths
test_dir: "{project-root}/tests" # Root test directory
source_dir: "{project-root}/src" # Source code directory
# Output configuration
default_output_file: "{output_folder}/automation-summary.md"
# Required tools
required_tools:
- read_file # Read source code, existing tests, BMad artifacts
- write_file # Create test files, fixtures, factories, summaries
- create_directory # Create test directories
- list_files # Discover features and existing tests
- search_repo # Find coverage gaps and patterns
- glob # Find test files and source files
tags:
- qa
- automation
- test-architect
- regression
- coverage
execution_hints:
interactive: false # Minimize prompts
autonomous: true # Proceed without user input unless blocked
iterative: true


@@ -1,248 +0,0 @@
# CI/CD Pipeline Setup - Validation Checklist
## Prerequisites
- [ ] Git repository initialized (`.git/` exists)
- [ ] Git remote configured (`git remote -v` shows origin)
- [ ] Test framework configured (`playwright.config.*` or `cypress.config.*`)
- [ ] Local tests pass (`npm run test:e2e` succeeds)
- [ ] Team agrees on CI platform
- [ ] Access to CI platform settings (if updating)
Note: CI setup is typically a one-time task per repo and can be run any time after the test framework is configured.
## Process Steps
### Step 1: Preflight Checks
- [ ] Git repository validated
- [ ] Framework configuration detected
- [ ] Local test execution successful
- [ ] CI platform detected or selected
- [ ] Node version identified (.nvmrc or default)
- [ ] No blocking issues found
### Step 2: CI Pipeline Configuration
- [ ] CI configuration file created (`.github/workflows/test.yml` or `.gitlab-ci.yml`)
- [ ] File is syntactically valid (no YAML errors)
- [ ] Correct framework commands configured
- [ ] Node version matches project
- [ ] Test directory paths correct
### Step 3: Parallel Sharding
- [ ] Matrix strategy configured (4 shards default)
- [ ] Shard syntax correct for framework
- [ ] fail-fast set to false
- [ ] Shard count appropriate for test suite size
### Step 4: Burn-In Loop
- [ ] Burn-in job created
- [ ] 10 iterations configured
- [ ] Proper exit on failure (`|| exit 1`)
- [ ] Runs on appropriate triggers (PR, cron)
- [ ] Failure artifacts uploaded
### Step 5: Caching Configuration
- [ ] Dependency cache configured (npm/yarn)
- [ ] Cache key uses lockfile hash
- [ ] Browser cache configured (Playwright/Cypress)
- [ ] Restore-keys defined for fallback
- [ ] Cache paths correct for platform
### Step 6: Artifact Collection
- [ ] Artifacts upload on failure only
- [ ] Correct artifact paths (test-results/, traces/, etc.)
- [ ] Retention days set (30 default)
- [ ] Artifact names unique per shard
- [ ] No sensitive data in artifacts
### Step 7: Retry Logic
- [ ] Retry action/strategy configured
- [ ] Max attempts: 2-3
- [ ] Timeout appropriate (30 min)
- [ ] Retry only on transient errors
### Step 8: Helper Scripts
- [ ] `scripts/test-changed.sh` created
- [ ] `scripts/ci-local.sh` created
- [ ] `scripts/burn-in.sh` created (optional)
- [ ] Scripts are executable (`chmod +x`)
- [ ] Scripts use correct test commands
- [ ] Shebang present (`#!/bin/bash`)
### Step 9: Documentation
- [ ] `docs/ci.md` created with pipeline guide
- [ ] `docs/ci-secrets-checklist.md` created
- [ ] Required secrets documented
- [ ] Setup instructions clear
- [ ] Troubleshooting section included
- [ ] Badge URLs provided (optional)
## Output Validation
### Configuration Validation
- [ ] CI file loads without errors
- [ ] All paths resolve correctly
- [ ] No hardcoded values (use env vars)
- [ ] Triggers configured (push, pull_request, schedule)
- [ ] Platform-specific syntax correct
### Execution Validation
- [ ] First CI run triggered (push to remote)
- [ ] Pipeline starts without errors
- [ ] All jobs appear in CI dashboard
- [ ] Caching works (check logs for cache hit)
- [ ] Tests execute in parallel
- [ ] Artifacts collected on failure
### Performance Validation
- [ ] Lint stage: <2 minutes
- [ ] Test stage (per shard): <10 minutes
- [ ] Burn-in stage: <30 minutes
- [ ] Total pipeline: <45 minutes
- [ ] Cache reduces install time by 2-5 minutes
## Quality Checks
### Best Practices Compliance
- [ ] Burn-in loop follows production patterns
- [ ] Parallel sharding configured optimally
- [ ] Failure-only artifact collection
- [ ] Selective testing enabled (optional)
- [ ] Retry logic handles transient failures only
- [ ] No secrets in configuration files
### Knowledge Base Alignment
- [ ] Burn-in pattern matches `ci-burn-in.md`
- [ ] Selective testing matches `selective-testing.md`
- [ ] Artifact collection matches `visual-debugging.md`
- [ ] Test quality matches `test-quality.md`
### Security Checks
- [ ] No credentials in CI configuration
- [ ] Secrets use platform secret management
- [ ] Environment variables for sensitive data
- [ ] Artifact retention appropriate (not too long)
- [ ] No debug output exposing secrets
## Integration Points
### Status File Integration
- [ ] `bmm-workflow-status.md` exists
- [ ] CI setup logged in Quality & Testing Progress section
- [ ] Status updated with completion timestamp
- [ ] Platform and configuration noted
### Knowledge Base Integration
- [ ] Relevant knowledge fragments loaded
- [ ] Patterns applied from knowledge base
- [ ] Documentation references knowledge base
- [ ] Knowledge base references in README
### Workflow Dependencies
- [ ] `framework` workflow completed first
- [ ] Can proceed to `atdd` workflow after CI setup
- [ ] Can proceed to `automate` workflow
- [ ] CI integrates with `gate` workflow
## Completion Criteria
**All must be true:**
- [ ] All prerequisites met
- [ ] All process steps completed
- [ ] All output validations passed
- [ ] All quality checks passed
- [ ] All integration points verified
- [ ] First CI run successful
- [ ] Performance targets met
- [ ] Documentation complete
## Post-Workflow Actions
**User must complete:**
1. [ ] Commit CI configuration
2. [ ] Push to remote repository
3. [ ] Configure required secrets in CI platform
4. [ ] Open PR to trigger first CI run
5. [ ] Monitor and verify pipeline execution
6. [ ] Adjust parallelism if needed (based on actual run times)
7. [ ] Set up notifications (optional)
**Recommended next workflows:**
1. [ ] Run `atdd` workflow for test generation
2. [ ] Run `automate` workflow for coverage expansion
3. [ ] Run `gate` workflow for quality gates
## Rollback Procedure
If workflow fails:
1. [ ] Delete CI configuration file
2. [ ] Remove helper scripts directory
3. [ ] Remove documentation (docs/ci.md, etc.)
4. [ ] Clear CI platform secrets (if added)
5. [ ] Review error logs
6. [ ] Fix issues and retry workflow
## Notes
### Common Issues
**Issue**: CI file syntax errors
- **Solution**: Validate YAML syntax online or with linter
**Issue**: Tests fail in CI but pass locally
- **Solution**: Use `scripts/ci-local.sh` to mirror CI environment
**Issue**: Caching not working
- **Solution**: Check cache key formula, verify paths
**Issue**: Burn-in too slow
- **Solution**: Reduce iterations or run on cron only
### Platform-Specific
**GitHub Actions:**
- Secrets: Repository Settings → Secrets and variables → Actions
- Runners: Ubuntu latest recommended
- Concurrency limits: 20 jobs for free tier
**GitLab CI:**
- Variables: Project Settings → CI/CD → Variables
- Runners: Shared or project-specific
- Pipeline quota: 400 minutes/month free tier
---
**Checklist Complete**: Sign off when all items validated.
**Completed by:** {name}
**Date:** {date}
**Platform:** {GitHub Actions, GitLab CI, Other}
**Notes:** {notes}


@@ -1,198 +0,0 @@
# GitHub Actions CI/CD Pipeline for Test Execution
# Generated by BMad TEA Agent - Test Architect Module
# Optimized for: Playwright/Cypress, Parallel Sharding, Burn-In Loop
name: Test Pipeline
on:
push:
branches: [main, develop]
pull_request:
branches: [main, develop]
schedule:
# Weekly burn-in on Sundays at 2 AM UTC
- cron: "0 2 * * 0"
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
# Lint stage - Code quality checks
lint:
name: Lint
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- uses: actions/checkout@v4
- name: Determine Node version
id: node-version
run: |
if [ -f .nvmrc ]; then
echo "value=$(cat .nvmrc)" >> "$GITHUB_OUTPUT"
echo "Using Node from .nvmrc"
else
echo "value=24" >> "$GITHUB_OUTPUT"
echo "Using default Node 24 (current LTS)"
fi
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ steps.node-version.outputs.value }}
cache: "npm"
- name: Install dependencies
run: npm ci
- name: Run linter
run: npm run lint
# Test stage - Parallel execution with sharding
test:
name: Test (Shard ${{ matrix.shard }})
runs-on: ubuntu-latest
timeout-minutes: 30
needs: lint
strategy:
fail-fast: false
matrix:
shard: [1, 2, 3, 4]
steps:
- uses: actions/checkout@v4
- name: Determine Node version
id: node-version
run: |
if [ -f .nvmrc ]; then
echo "value=$(cat .nvmrc)" >> "$GITHUB_OUTPUT"
echo "Using Node from .nvmrc"
else
echo "value=22" >> "$GITHUB_OUTPUT"
echo "Using default Node 22 (current LTS)"
fi
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ steps.node-version.outputs.value }}
cache: "npm"
- name: Cache Playwright browsers
uses: actions/cache@v4
with:
path: ~/.cache/ms-playwright
key: ${{ runner.os }}-playwright-${{ hashFiles('**/package-lock.json') }}
restore-keys: |
${{ runner.os }}-playwright-
- name: Install dependencies
run: npm ci
- name: Install Playwright browsers
run: npx playwright install --with-deps chromium
- name: Run tests (shard ${{ matrix.shard }}/4)
run: npm run test:e2e -- --shard=${{ matrix.shard }}/4
- name: Upload test results
if: failure()
uses: actions/upload-artifact@v4
with:
name: test-results-${{ matrix.shard }}
path: |
test-results/
playwright-report/
retention-days: 30
# Burn-in stage - Flaky test detection
burn-in:
name: Burn-In (Flaky Detection)
runs-on: ubuntu-latest
timeout-minutes: 60
needs: test
# Only run burn-in on PRs to main/develop or on schedule
if: github.event_name == 'pull_request' || github.event_name == 'schedule'
steps:
- uses: actions/checkout@v4
- name: Determine Node version
id: node-version
run: |
if [ -f .nvmrc ]; then
echo "value=$(cat .nvmrc)" >> "$GITHUB_OUTPUT"
echo "Using Node from .nvmrc"
else
echo "value=22" >> "$GITHUB_OUTPUT"
echo "Using default Node 22 (current LTS)"
fi
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ steps.node-version.outputs.value }}
cache: "npm"
- name: Cache Playwright browsers
uses: actions/cache@v4
with:
path: ~/.cache/ms-playwright
key: ${{ runner.os }}-playwright-${{ hashFiles('**/package-lock.json') }}
- name: Install dependencies
run: npm ci
- name: Install Playwright browsers
run: npx playwright install --with-deps chromium
- name: Run burn-in loop (10 iterations)
run: |
echo "🔥 Starting burn-in loop - detecting flaky tests"
for i in {1..10}; do
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🔥 Burn-in iteration $i/10"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
npm run test:e2e || exit 1
done
echo "✅ Burn-in complete - no flaky tests detected"
- name: Upload burn-in failure artifacts
if: failure()
uses: actions/upload-artifact@v4
with:
name: burn-in-failures
path: |
test-results/
playwright-report/
retention-days: 30
# Report stage - Aggregate and publish results
report:
name: Test Report
runs-on: ubuntu-latest
needs: [test, burn-in]
if: always()
steps:
- name: Download all artifacts
uses: actions/download-artifact@v4
with:
path: artifacts
- name: Generate summary
run: |
echo "## Test Execution Summary" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "- **Status**: ${{ needs.test.result }}" >> $GITHUB_STEP_SUMMARY
echo "- **Burn-in**: ${{ needs.burn-in.result }}" >> $GITHUB_STEP_SUMMARY
echo "- **Shards**: 4" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
if [ "${{ needs.burn-in.result }}" == "failure" ]; then
echo "⚠️ **Flaky tests detected** - Review burn-in artifacts" >> $GITHUB_STEP_SUMMARY
fi


@@ -1,149 +0,0 @@
# GitLab CI/CD Pipeline for Test Execution
# Generated by BMad TEA Agent - Test Architect Module
# Optimized for: Playwright/Cypress, Parallel Sharding, Burn-In Loop
stages:
- lint
- test
- burn-in
- report
variables:
# Disable git depth for accurate change detection
GIT_DEPTH: 0
# Use npm ci for faster, deterministic installs
npm_config_cache: "$CI_PROJECT_DIR/.npm"
# Playwright browser cache
PLAYWRIGHT_BROWSERS_PATH: "$CI_PROJECT_DIR/.cache/ms-playwright"
# Default Node version when .nvmrc is missing
DEFAULT_NODE_VERSION: "24"
# Caching configuration
cache:
key:
files:
- package-lock.json
paths:
- .npm/
- .cache/ms-playwright/
- node_modules/
# Lint stage - Code quality checks
lint:
stage: lint
image: node:$DEFAULT_NODE_VERSION
before_script:
- |
NODE_VERSION=$(cat .nvmrc 2>/dev/null || echo "$DEFAULT_NODE_VERSION")
echo "Using Node $NODE_VERSION"
npm install -g n
n "$NODE_VERSION"
node -v
- npm ci
script:
- npm run lint
timeout: 5 minutes
# Test stage - Parallel execution with sharding
.test-template: &test-template
stage: test
image: node:$DEFAULT_NODE_VERSION
needs:
- lint
before_script:
- |
NODE_VERSION=$(cat .nvmrc 2>/dev/null || echo "$DEFAULT_NODE_VERSION")
echo "Using Node $NODE_VERSION"
npm install -g n
n "$NODE_VERSION"
node -v
- npm ci
- npx playwright install --with-deps chromium
artifacts:
when: on_failure
paths:
- test-results/
- playwright-report/
expire_in: 30 days
timeout: 30 minutes
test:shard-1:
<<: *test-template
script:
- npm run test:e2e -- --shard=1/4
test:shard-2:
<<: *test-template
script:
- npm run test:e2e -- --shard=2/4
test:shard-3:
<<: *test-template
script:
- npm run test:e2e -- --shard=3/4
test:shard-4:
<<: *test-template
script:
- npm run test:e2e -- --shard=4/4
# Burn-in stage - Flaky test detection
burn-in:
stage: burn-in
image: node:$DEFAULT_NODE_VERSION
needs:
- test:shard-1
- test:shard-2
- test:shard-3
- test:shard-4
# Only run burn-in on merge requests to main/develop or on schedule
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
- if: '$CI_PIPELINE_SOURCE == "schedule"'
before_script:
- |
NODE_VERSION=$(cat .nvmrc 2>/dev/null || echo "$DEFAULT_NODE_VERSION")
echo "Using Node $NODE_VERSION"
npm install -g n
n "$NODE_VERSION"
node -v
- npm ci
- npx playwright install --with-deps chromium
script:
- |
echo "🔥 Starting burn-in loop - detecting flaky tests"
for i in {1..10}; do
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🔥 Burn-in iteration $i/10"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
npm run test:e2e || exit 1
done
echo "✅ Burn-in complete - no flaky tests detected"
artifacts:
when: on_failure
paths:
- test-results/
- playwright-report/
expire_in: 30 days
timeout: 60 minutes
# Report stage - Aggregate results
report:
stage: report
image: alpine:latest
needs:
- test:shard-1
- test:shard-2
- test:shard-3
- test:shard-4
- burn-in
when: always
script:
- |
echo "## Test Execution Summary"
echo ""
echo "- Pipeline: $CI_PIPELINE_ID"
echo "- Shards: 4"
echo "- Branch: $CI_COMMIT_REF_NAME"
echo ""
echo "View detailed results in job artifacts"


@@ -1,536 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# CI/CD Pipeline Setup
**Workflow ID**: `_bmad/bmm/testarch/ci`
**Version**: 4.0 (BMad v6)
---
## Overview
Scaffolds a production-ready CI/CD quality pipeline with test execution, burn-in loops for flaky test detection, parallel sharding, artifact collection, and notification configuration. This workflow creates platform-specific CI configuration optimized for fast feedback and reliable test execution.
Note: This is typically a one-time setup per repo; run it any time after the test framework exists, ideally before feature work starts.
---
## Preflight Requirements
**Critical:** Verify these requirements before proceeding. If any fail, HALT and notify the user.
- ✅ Git repository is initialized (`.git/` directory exists)
- ✅ Local test suite passes (`npm run test:e2e` succeeds)
- ✅ Test framework is configured (from `framework` workflow)
- ✅ Team agrees on target CI platform (GitHub Actions, GitLab CI, Circle CI, etc.)
- ✅ Access to CI platform settings/secrets available (if updating existing pipeline)
---
## Step 1: Run Preflight Checks
### Actions
1. **Verify Git Repository**
- Check for `.git/` directory
- Confirm remote repository configured (`git remote -v`)
- If not initialized, HALT with message: "Git repository required for CI/CD setup"
2. **Validate Test Framework**
- Look for `playwright.config.*` or `cypress.config.*`
- Read framework configuration to extract:
- Test directory location
- Test command
- Reporter configuration
- Timeout settings
- If not found, HALT with message: "Run `framework` workflow first to set up test infrastructure"
3. **Run Local Tests**
- Execute `npm run test:e2e` (or equivalent from package.json)
- Ensure tests pass before CI setup
- If tests fail, HALT with message: "Fix failing tests before setting up CI/CD"
4. **Detect CI Platform**
- Check for existing CI configuration:
- `.github/workflows/*.yml` (GitHub Actions)
- `.gitlab-ci.yml` (GitLab CI)
- `.circleci/config.yml` (Circle CI)
- `Jenkinsfile` (Jenkins)
- If found, ask user: "Update existing CI configuration or create new?"
- If not found, detect platform from git remote:
- `github.com` → GitHub Actions (default)
- `gitlab.com` → GitLab CI
- Ask user if unable to auto-detect
5. **Read Environment Configuration**
- Use `.nvmrc` for Node version if present
- If missing, default to a current LTS (Node 24) or newer instead of a fixed old version
- Read `package.json` to identify dependencies (affects caching strategy)
**Halt Condition:** If preflight checks fail, stop immediately and report which requirement failed.
---
## Step 2: Scaffold CI Pipeline
### Actions
1. **Select CI Platform Template**
Based on detection or user preference, use the appropriate template:
**GitHub Actions** (`.github/workflows/test.yml`):
- Most common platform
- Excellent caching and matrix support
- Free for public repos, generous free tier for private
**GitLab CI** (`.gitlab-ci.yml`):
- Integrated with GitLab
- Built-in registry and runners
- Powerful pipeline features
**Circle CI** (`.circleci/config.yml`):
- Fast execution with parallelism
- Docker-first approach
- Enterprise features
**Jenkins** (`Jenkinsfile`):
- Self-hosted option
- Maximum customization
- Requires infrastructure management
2. **Generate Pipeline Configuration**
Use templates from `{installed_path}/` directory:
- `github-actions-template.yml`
- `gitlab-ci-template.yml`
**Key pipeline stages:**
```yaml
stages:
- lint # Code quality checks
- test # Test execution (parallel shards)
- burn-in # Flaky test detection
- report # Aggregate results and publish
```
3. **Configure Test Execution**
**Parallel Sharding:**
```yaml
strategy:
fail-fast: false
matrix:
shard: [1, 2, 3, 4]
steps:
- name: Run tests
run: npm run test:e2e -- --shard=${{ matrix.shard }}/${{ strategy.job-total }}
```
**Purpose:** Splits tests into N parallel jobs for faster execution (target: <10 min per shard)
4. **Add Burn-In Loop**
**Critical pattern from production systems:**
```yaml
burn-in:
name: Flaky Test Detection
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node
uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
- name: Install dependencies
run: npm ci
- name: Run burn-in loop (10 iterations)
run: |
for i in {1..10}; do
echo "🔥 Burn-in iteration $i/10"
npm run test:e2e || exit 1
done
- name: Upload failure artifacts
if: failure()
uses: actions/upload-artifact@v4
with:
name: burn-in-failures
path: test-results/
retention-days: 30
```
**Purpose:** Runs tests multiple times to catch non-deterministic failures before they reach main branch.
**When to run:**
- On pull requests to main/develop
- Weekly on cron schedule
- After significant test infrastructure changes
5. **Configure Caching**
**Node modules cache:**
```yaml
- name: Cache dependencies
uses: actions/cache@v4
with:
path: ~/.npm
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
restore-keys: |
${{ runner.os }}-node-
```
**Browser binaries cache (Playwright):**
```yaml
- name: Cache Playwright browsers
uses: actions/cache@v4
with:
path: ~/.cache/ms-playwright
key: ${{ runner.os }}-playwright-${{ hashFiles('**/package-lock.json') }}
```
**Purpose:** Reduces CI execution time by 2-5 minutes per run.
6. **Configure Artifact Collection**
**Failure artifacts only:**
```yaml
- name: Upload test results
if: failure()
uses: actions/upload-artifact@v4
with:
name: test-results-${{ matrix.shard }}
path: |
test-results/
playwright-report/
retention-days: 30
```
**Artifacts to collect:**
- Traces (Playwright) - full debugging context
- Screenshots - visual evidence of failures
- Videos - interaction playback
- HTML reports - detailed test results
- Console logs - error messages and warnings
7. **Add Retry Logic**
```yaml
- name: Run tests with retries
uses: nick-invision/retry@v2
with:
timeout_minutes: 30
max_attempts: 3
retry_on: error
command: npm run test:e2e
```
**Purpose:** Handles transient failures (network issues, race conditions)
8. **Configure Notifications** (Optional)
If `notify_on_failure` is enabled:
```yaml
- name: Notify on failure
if: failure()
uses: 8398a7/action-slack@v3
with:
status: ${{ job.status }}
text: 'Test failures detected in PR #${{ github.event.pull_request.number }}'
webhook_url: ${{ secrets.SLACK_WEBHOOK }}
```
9. **Generate Helper Scripts**
**Selective testing script** (`scripts/test-changed.sh`):
```bash
#!/bin/bash
# Run only tests for changed files
CHANGED_FILES=$(git diff --name-only HEAD~1)
if echo "$CHANGED_FILES" | grep -q "src/.*\.ts$"; then
echo "Running affected tests..."
npm run test:e2e -- --grep="$(echo $CHANGED_FILES | sed 's/src\///g' | sed 's/\.ts//g')"
else
echo "No test-affecting changes detected"
fi
```
**Local mirror script** (`scripts/ci-local.sh`):
```bash
#!/bin/bash
# Mirror CI execution locally for debugging
echo "🔍 Running CI pipeline locally..."
# Lint
npm run lint || exit 1
# Tests
npm run test:e2e || exit 1
# Burn-in (reduced iterations)
for i in {1..3}; do
echo "🔥 Burn-in $i/3"
npm run test:e2e || exit 1
done
echo "✅ Local CI pipeline passed"
```
10. **Generate Documentation**
**CI README** (`docs/ci.md`):
- Pipeline stages and purpose
- How to run locally
- Debugging failed CI runs
- Secrets and environment variables needed
- Notification setup
- Badge URLs for README
**Secrets checklist** (`docs/ci-secrets-checklist.md`):
- Required secrets list (SLACK_WEBHOOK, etc.)
- Where to configure in CI platform
- Security best practices
---
## Step 3: Deliverables
### Primary Artifacts Created
1. **CI Configuration File**
- `.github/workflows/test.yml` (GitHub Actions)
- `.gitlab-ci.yml` (GitLab CI)
- `.circleci/config.yml` (Circle CI)
2. **Pipeline Stages**
- **Lint**: Code quality checks (ESLint, Prettier)
- **Test**: Parallel test execution (4 shards)
- **Burn-in**: Flaky test detection (10 iterations)
- **Report**: Result aggregation and publishing
3. **Helper Scripts**
- `scripts/test-changed.sh` - Selective testing
- `scripts/ci-local.sh` - Local CI mirror
- `scripts/burn-in.sh` - Standalone burn-in execution
4. **Documentation**
- `docs/ci.md` - CI pipeline guide
- `docs/ci-secrets-checklist.md` - Required secrets
- Inline comments in CI configuration
5. **Optimization Features**
- Dependency caching (npm, browser binaries)
- Parallel sharding (4 jobs default)
- Retry logic (2 retries on failure)
- Failure-only artifact upload
### Performance Targets
- **Lint stage**: <2 minutes
- **Test stage** (per shard): <10 minutes
- **Burn-in stage**: <30 minutes (10 iterations)
- **Total pipeline**: <45 minutes
**Speedup:** 20× faster than sequential execution through parallelism and caching.
---
## Important Notes
### Knowledge Base Integration
**Critical:** Check configuration and load appropriate fragments.
Read `{config_source}` and check `config.tea_use_playwright_utils`.
**Core CI Patterns (Always load):**
- `ci-burn-in.md` - Burn-in loop patterns: 10-iteration detection, GitHub Actions workflow, shard orchestration, selective execution (678 lines, 4 examples)
- `selective-testing.md` - Changed test detection strategies: tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
- `visual-debugging.md` - Artifact collection best practices: trace viewer, HAR recording, custom artifacts, accessibility integration (522 lines, 5 examples)
- `test-quality.md` - CI-specific test quality criteria: deterministic tests, isolated with cleanup, explicit assertions, length/time optimization (658 lines, 5 examples)
- `playwright-config.md` - CI-optimized configuration: parallelization, artifact output, project dependencies, sharding (722 lines, 5 examples)
**If `config.tea_use_playwright_utils: true`:**
Load playwright-utils CI-relevant fragments:
- `burn-in.md` - Smart test selection with git diff analysis (very important for CI optimization)
- `network-error-monitor.md` - Automatic HTTP 4xx/5xx detection (recommend in CI pipelines)
Recommend:
- Add burn-in script for pull request validation
- Enable network-error-monitor in merged fixtures for catching silent failures
- Reference full docs in `*framework` and `*automate` workflows
### CI Platform-Specific Guidance
**GitHub Actions:**
- Use `actions/cache` for caching
- Matrix strategy for parallelism
- Secrets in repository settings
- Free 2000 minutes/month for private repos
**GitLab CI:**
- Use `.gitlab-ci.yml` in root
- `cache:` directive for caching
- Parallel execution with `parallel: 4`
- Variables in project CI/CD settings
**Circle CI:**
- Use `.circleci/config.yml`
- Docker executors recommended
- Parallelism with `parallelism: 4`
- Context for shared secrets
### Burn-In Loop Strategy
**When to run:**
- ✅ On PRs to main/develop branches
- ✅ Weekly on schedule (cron)
- ✅ After test infrastructure changes
- ❌ Not on every commit (too slow)
**Iterations:**
- **10 iterations** for thorough detection
- **3 iterations** for quick feedback
- **100 iterations** for high-confidence stability
**Failure threshold:**
- Even ONE failure in burn-in → tests are flaky
- Must fix before merging
### Artifact Retention
**Failure artifacts only:**
- Saves storage costs
- Maintains debugging capability
- 30-day retention default
**Artifact types:**
- Traces (Playwright) - 5-10 MB per test
- Screenshots - 100-500 KB per screenshot
- Videos - 2-5 MB per test
- HTML reports - 1-2 MB per run
### Selective Testing
**Detect changed files:**
```bash
git diff --name-only HEAD~1
```
**Run affected tests only:**
- Faster feedback for small changes
- Full suite still runs on main branch
- Reduces CI time by 50-80% for focused PRs
**Trade-off:**
- May miss integration issues
- Run full suite at least on merge
### Local CI Mirror
**Purpose:** Debug CI failures locally
**Usage:**
```bash
./scripts/ci-local.sh
```
**Mirrors CI environment:**
- Same Node version
- Same test command
- Same stages (lint → test → burn-in)
- Reduced burn-in iterations (3 vs 10)
---
## Output Summary
After completing this workflow, provide a summary:
```markdown
## CI/CD Pipeline Complete
**Platform**: GitHub Actions (or GitLab CI, etc.)
**Artifacts Created**:
- ✅ Pipeline configuration: .github/workflows/test.yml
- ✅ Burn-in loop: 10 iterations for flaky detection
- ✅ Parallel sharding: 4 jobs for fast execution
- ✅ Caching: Dependencies + browser binaries
- ✅ Artifact collection: Failure-only traces/screenshots/videos
- ✅ Helper scripts: test-changed.sh, ci-local.sh, burn-in.sh
- ✅ Documentation: docs/ci.md, docs/ci-secrets-checklist.md
**Performance:**
- Lint: <2 min
- Test (per shard): <10 min
- Burn-in: <30 min
- Total: <45 min (20× speedup vs sequential)
**Next Steps**:
1. Commit CI configuration: `git add .github/workflows/test.yml && git commit -m "ci: add test pipeline"`
2. Push to remote: `git push`
3. Configure required secrets in CI platform settings (see docs/ci-secrets-checklist.md)
4. Open a PR to trigger first CI run
5. Monitor pipeline execution and adjust parallelism if needed
**Knowledge Base References Applied**:
- Burn-in loop pattern (ci-burn-in.md)
- Selective testing strategy (selective-testing.md)
- Artifact collection (visual-debugging.md)
- Test quality criteria (test-quality.md)
```
---
## Validation
After completing all steps, verify:
- [ ] CI configuration file created and syntactically valid
- [ ] Burn-in loop configured (10 iterations)
- [ ] Parallel sharding enabled (4 jobs)
- [ ] Caching configured (dependencies + browsers)
- [ ] Artifact collection on failure only
- [ ] Helper scripts created and executable (`chmod +x`)
- [ ] Documentation complete (ci.md, secrets checklist)
- [ ] No errors or warnings during scaffold
Refer to `checklist.md` for comprehensive validation criteria.


@@ -1,45 +0,0 @@
# Test Architect workflow: ci
name: testarch-ci
description: "Scaffold CI/CD quality pipeline with test execution, burn-in loops, and artifact collection"
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/ci"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
# Variables and inputs
variables:
ci_platform: "auto" # auto, github-actions, gitlab-ci, circle-ci, jenkins - user can override
test_dir: "{project-root}/tests" # Root test directory
# Output configuration
default_output_file: "{project-root}/.github/workflows/test.yml" # GitHub Actions default
# Required tools
required_tools:
- read_file # Read .nvmrc, package.json, framework config
- write_file # Create CI config, scripts, documentation
- create_directory # Create .github/workflows/ or .gitlab-ci/ directories
- list_files # Detect existing CI configuration
- search_repo # Find test files for selective testing
tags:
- qa
- ci-cd
- test-architect
- pipeline
- automation
execution_hints:
interactive: false # Minimize prompts, auto-detect when possible
autonomous: true # Proceed without user input unless blocked
iterative: true


@@ -1,321 +0,0 @@
# Test Framework Setup - Validation Checklist
This checklist ensures the framework workflow completes successfully and all deliverables meet quality standards.
---
## Prerequisites
Before starting the workflow:
- [ ] Project root contains valid `package.json`
- [ ] No existing modern E2E framework detected (`playwright.config.*`, `cypress.config.*`)
- [ ] Project type identifiable (React, Vue, Angular, Next.js, Node, etc.)
- [ ] Bundler identifiable (Vite, Webpack, Rollup, esbuild) or not applicable
- [ ] User has write permissions to create directories and files
---
## Process Steps
### Step 1: Preflight Checks
- [ ] package.json successfully read and parsed
- [ ] Project type extracted correctly
- [ ] Bundler identified (or marked as N/A for backend projects)
- [ ] No framework conflicts detected
- [ ] Architecture documents located (if available)
### Step 2: Framework Selection
- [ ] Framework auto-detection logic executed
- [ ] Framework choice justified (Playwright vs Cypress)
- [ ] Framework preference respected (if explicitly set)
- [ ] User notified of framework selection and rationale
### Step 3: Directory Structure
- [ ] `tests/` root directory created
- [ ] `tests/e2e/` directory created (or user's preferred structure)
- [ ] `tests/support/` directory created (critical pattern)
- [ ] `tests/support/fixtures/` directory created
- [ ] `tests/support/fixtures/factories/` directory created
- [ ] `tests/support/helpers/` directory created
- [ ] `tests/support/page-objects/` directory created (if applicable)
- [ ] All directories have correct permissions
**Note**: Test organization is flexible (e2e/, api/, integration/). The **support/** folder is the key pattern.
### Step 4: Configuration Files
- [ ] Framework config file created (`playwright.config.ts` or `cypress.config.ts`)
- [ ] Config file uses TypeScript (if `use_typescript: true`)
- [ ] Timeouts configured correctly (action: 15s, navigation: 30s, test: 60s)
- [ ] Base URL configured with environment variable fallback
- [ ] Trace/screenshot/video set to retain-on-failure
- [ ] Multiple reporters configured (HTML + JUnit + console)
- [ ] Parallel execution enabled
- [ ] CI-specific settings configured (retries, workers)
- [ ] Config file is syntactically valid (no compilation errors)
### Step 5: Environment Configuration
- [ ] `.env.example` created in project root
- [ ] `TEST_ENV` variable defined
- [ ] `BASE_URL` variable defined with default
- [ ] `API_URL` variable defined (if applicable)
- [ ] Authentication variables defined (if applicable)
- [ ] Feature flag variables defined (if applicable)
- [ ] `.nvmrc` created with appropriate Node version
### Step 6: Fixture Architecture
- [ ] `tests/support/fixtures/index.ts` created
- [ ] Base fixture extended from Playwright/Cypress
- [ ] Type definitions for fixtures created
- [ ] mergeTests pattern implemented (if multiple fixtures)
- [ ] Auto-cleanup logic included in fixtures
- [ ] Fixture architecture follows knowledge base patterns
### Step 7: Data Factories
- [ ] At least one factory created (e.g., UserFactory)
- [ ] Factories use @faker-js/faker for realistic data
- [ ] Factories track created entities (for cleanup)
- [ ] Factories implement `cleanup()` method
- [ ] Factories integrate with fixtures
- [ ] Factories follow knowledge base patterns
### Step 8: Sample Tests
- [ ] Example test file created (`tests/e2e/example.spec.ts`)
- [ ] Test uses fixture architecture
- [ ] Test demonstrates data factory usage
- [ ] Test uses proper selector strategy (data-testid)
- [ ] Test follows Given-When-Then structure
- [ ] Test includes proper assertions
- [ ] Network interception demonstrated (if applicable)
### Step 9: Helper Utilities
- [ ] API helper created (if API testing needed)
- [ ] Network helper created (if network mocking needed)
- [ ] Auth helper created (if authentication needed)
- [ ] Helpers follow functional patterns
- [ ] Helpers have proper error handling
### Step 10: Documentation
- [ ] `tests/README.md` created
- [ ] Setup instructions included
- [ ] Running tests section included
- [ ] Architecture overview section included
- [ ] Best practices section included
- [ ] CI integration section included
- [ ] Knowledge base references included
- [ ] Troubleshooting section included
### Step 11: Package.json Updates
- [ ] Minimal test script added to package.json: `test:e2e`
- [ ] Test framework dependency added (if not already present)
- [ ] Type definitions added (if TypeScript)
- [ ] Users can extend with additional scripts as needed
---
## Output Validation
### Configuration Validation
- [ ] Config file loads without errors
- [ ] Config file passes linting (if linter configured)
- [ ] Config file uses correct syntax for chosen framework
- [ ] All paths in config resolve correctly
- [ ] Reporter output directories exist or are created on test run
### Test Execution Validation
- [ ] Sample test runs successfully
- [ ] Test execution produces expected output (pass/fail)
- [ ] Test artifacts generated correctly (traces, screenshots, videos)
- [ ] Test report generated successfully
- [ ] No console errors or warnings during test run
### Directory Structure Validation
- [ ] All required directories exist
- [ ] Directory structure matches framework conventions
- [ ] No duplicate or conflicting directories
- [ ] Directories accessible with correct permissions
### File Integrity Validation
- [ ] All generated files are syntactically correct
- [ ] No placeholder text left in files (e.g., "TODO", "FIXME")
- [ ] All imports resolve correctly
- [ ] No hardcoded credentials or secrets in files
- [ ] All file paths use correct separators for OS
---
## Quality Checks
### Code Quality
- [ ] Generated code follows project coding standards
- [ ] TypeScript types are complete and accurate (no `any` unless necessary)
- [ ] No unused imports or variables
- [ ] Consistent code formatting (matches project style)
- [ ] No linting errors in generated files
### Best Practices Compliance
- [ ] Fixture architecture follows pure function → fixture → mergeTests pattern
- [ ] Data factories implement auto-cleanup
- [ ] Network interception occurs before navigation
- [ ] Selectors use data-testid strategy
- [ ] Artifacts only captured on failure
- [ ] Tests follow Given-When-Then structure
- [ ] No hard-coded waits or sleeps
### Knowledge Base Alignment
- [ ] Fixture pattern matches `fixture-architecture.md`
- [ ] Data factories match `data-factories.md`
- [ ] Network handling matches `network-first.md`
- [ ] Config follows `playwright-config.md` or `test-config.md`
- [ ] Test quality matches `test-quality.md`
### Security Checks
- [ ] No credentials in configuration files
- [ ] .env.example contains placeholders, not real values
- [ ] Sensitive test data handled securely
- [ ] API keys and tokens use environment variables
- [ ] No secrets committed to version control
---
## Integration Points
### Status File Integration
- [ ] `bmm-workflow-status.md` exists
- [ ] Framework initialization logged in Quality & Testing Progress section
- [ ] Status file updated with completion timestamp
- [ ] Status file shows framework: Playwright or Cypress
### Knowledge Base Integration
- [ ] Relevant knowledge fragments identified from tea-index.csv
- [ ] Knowledge fragments successfully loaded
- [ ] Patterns from knowledge base applied correctly
- [ ] Knowledge base references included in documentation
### Workflow Dependencies
- [ ] Can proceed to `ci` workflow after completion
- [ ] Can proceed to `test-design` workflow after completion
- [ ] Can proceed to `atdd` workflow after completion
- [ ] Framework setup compatible with downstream workflows
---
## Completion Criteria
**All of the following must be true:**
- [ ] All prerequisite checks passed
- [ ] All process steps completed without errors
- [ ] All output validations passed
- [ ] All quality checks passed
- [ ] All integration points verified
- [ ] Sample test executes successfully
- [ ] User can run `npm run test:e2e` without errors
- [ ] Documentation is complete and accurate
- [ ] No critical issues or blockers identified
---
## Post-Workflow Actions
**User must complete:**
1. [ ] Copy `.env.example` to `.env`
2. [ ] Fill in environment-specific values in `.env`
3. [ ] Run `npm install` to install test dependencies
4. [ ] Run `npm run test:e2e` to verify setup
5. [ ] Review `tests/README.md` for project-specific guidance
**Recommended next workflows:**
1. [ ] Run `ci` workflow to set up CI/CD pipeline
2. [ ] Run `test-design` workflow to plan test coverage
3. [ ] Run `atdd` workflow when ready to develop stories
---
## Rollback Procedure
If workflow fails and needs to be rolled back:
1. [ ] Delete `tests/` directory
2. [ ] Remove test scripts from package.json
3. [ ] Delete `.env.example` (if created)
4. [ ] Delete `.nvmrc` (if created)
5. [ ] Delete framework config file
6. [ ] Remove test dependencies from package.json (if added)
7. [ ] Run `npm install` to clean up node_modules
---
## Notes
### Common Issues
**Issue**: Config file has TypeScript errors
- **Solution**: Ensure `@playwright/test` or `cypress` types are installed
**Issue**: Sample test fails to run
- **Solution**: Check BASE_URL in .env, ensure app is running
**Issue**: Fixture cleanup not working
- **Solution**: Verify cleanup() is called in fixture teardown
**Issue**: Network interception not working
- **Solution**: Ensure route setup occurs before page.goto()
### Framework-Specific Considerations
**Playwright:**
- Requires Node.js 18+
- Browser binaries auto-installed on first run
- Trace viewer requires running `npx playwright show-trace`
**Cypress:**
- Requires Node.js 18+
- Cypress app opens on first run
- Component testing requires additional setup
### Version Compatibility
- [ ] Node.js version matches .nvmrc
- [ ] Framework version compatible with Node.js version
- [ ] TypeScript version compatible with framework
- [ ] All peer dependencies satisfied
---
**Checklist Complete**: Sign off when all items checked and validated.
**Completed by:** {name}
**Date:** {date}
**Framework:** {Playwright, Cypress, Other}
**Notes:** {notes}


@@ -1,481 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Test Framework Setup
**Workflow ID**: `_bmad/bmm/testarch/framework`
**Version**: 4.0 (BMad v6)
---
## Overview
Initialize a production-ready test framework architecture (Playwright or Cypress) with fixtures, helpers, configuration, and best practices. This workflow scaffolds the complete testing infrastructure for modern web applications.
---
## Preflight Requirements
**Critical:** Verify these requirements before proceeding. If any fail, HALT and notify the user.
- ✅ `package.json` exists in project root
- ✅ No modern E2E test harness is already configured (check for existing `playwright.config.*` or `cypress.config.*`)
- ✅ Architectural/stack context available (project type, bundler, dependencies)
---
## Step 1: Run Preflight Checks
### Actions
1. **Validate package.json**
- Read `{project-root}/package.json`
- Extract project type (React, Vue, Angular, Next.js, Node, etc.)
- Identify bundler (Vite, Webpack, Rollup, esbuild)
- Note existing test dependencies
2. **Check for Existing Framework**
- Search for `playwright.config.*`, `cypress.config.*`, `cypress.json`
- Check `package.json` for `@playwright/test` or `cypress` dependencies
- If found, HALT with message: "Existing test framework detected. Use workflow `upgrade-framework` instead."
3. **Gather Context**
- Look for architecture documents (`architecture.md`, `tech-spec*.md`)
- Check for API documentation or endpoint lists
- Identify authentication requirements
**Halt Condition:** If preflight checks fail, stop immediately and report which requirement failed.
---
## Step 2: Scaffold Framework
### Actions
1. **Framework Selection**
**Default Logic:**
- **Playwright** (recommended for):
- Large repositories (100+ files)
- Performance-critical applications
- Multi-browser support needed
- Complex user flows requiring video/trace debugging
- Projects requiring worker parallelism
- **Cypress** (recommended for):
- Small teams prioritizing developer experience
- Component testing focus
- Real-time reloading during test development
- Simpler setup requirements
**Detection Strategy:**
- Check `package.json` for existing preference
- Consider `project_size` variable from workflow config
- Use `framework_preference` variable if set
- Default to **Playwright** if uncertain
2. **Create Directory Structure**
```
{project-root}/
├── tests/ # Root test directory
│ ├── e2e/ # Test files (users organize as needed)
│ ├── support/ # Framework infrastructure (key pattern)
│ │ ├── fixtures/ # Test fixtures (data, mocks)
│ │ ├── helpers/ # Utility functions
│ │ └── page-objects/ # Page object models (optional)
│ └── README.md # Test suite documentation
```
**Note**: Users organize test files (e2e/, api/, integration/, component/) as needed. The **support/** folder is the critical pattern for fixtures and helpers used across tests.
3. **Generate Configuration File**
**For Playwright** (`playwright.config.ts` or `playwright.config.js`):
```typescript
import { defineConfig, devices } from '@playwright/test';
export default defineConfig({
testDir: './tests/e2e',
fullyParallel: true,
forbidOnly: !!process.env.CI,
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 1 : undefined,
timeout: 60 * 1000, // Test timeout: 60s
expect: {
timeout: 15 * 1000, // Assertion timeout: 15s
},
use: {
baseURL: process.env.BASE_URL || 'http://localhost:3000',
trace: 'retain-on-failure',
screenshot: 'only-on-failure',
video: 'retain-on-failure',
actionTimeout: 15 * 1000, // Action timeout: 15s
navigationTimeout: 30 * 1000, // Navigation timeout: 30s
},
reporter: [['html', { outputFolder: 'test-results/html' }], ['junit', { outputFile: 'test-results/junit.xml' }], ['list']],
projects: [
{ name: 'chromium', use: { ...devices['Desktop Chrome'] } },
{ name: 'firefox', use: { ...devices['Desktop Firefox'] } },
{ name: 'webkit', use: { ...devices['Desktop Safari'] } },
],
});
```
**For Cypress** (`cypress.config.ts` or `cypress.config.js`):
```typescript
import { defineConfig } from 'cypress';
export default defineConfig({
e2e: {
baseUrl: process.env.BASE_URL || 'http://localhost:3000',
specPattern: 'tests/e2e/**/*.cy.{js,jsx,ts,tsx}',
supportFile: 'tests/support/e2e.ts',
video: false,
screenshotOnRunFailure: true,
setupNodeEvents(on, config) {
// implement node event listeners here
},
},
retries: {
runMode: 2,
openMode: 0,
},
defaultCommandTimeout: 15000,
requestTimeout: 30000,
responseTimeout: 30000,
pageLoadTimeout: 60000,
});
```
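The Cypress config above points `supportFile` at `tests/support/e2e.ts`, which the scaffold should also create; a minimal sketch (contents are an assumption, extend per project) could be:
```typescript
// tests/support/e2e.ts - loaded before every spec file (see supportFile in cypress.config.ts).
// Register custom commands and global hooks here.

Cypress.on('uncaught:exception', () => {
  // Returning false keeps unrelated application errors from failing every spec.
  // Keep this narrowly scoped in real suites so genuine defects are not hidden.
  return false;
});
```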
4. **Generate Environment Configuration**
Create `.env.example`:
```bash
# Test Environment Configuration
TEST_ENV=local
BASE_URL=http://localhost:3000
API_URL=http://localhost:3001/api
# Authentication (if applicable)
TEST_USER_EMAIL=test@example.com
TEST_USER_PASSWORD=
# Feature Flags (if applicable)
FEATURE_FLAG_NEW_UI=true
# API Keys (if applicable)
TEST_API_KEY=
```
5. **Generate Node Version File**
Create `.nvmrc`:
```
20.11.0
```
(Use Node version from existing `.nvmrc` or default to current LTS)
6. **Implement Fixture Architecture**
**Knowledge Base Reference**: `testarch/knowledge/fixture-architecture.md`
Create `tests/support/fixtures/index.ts`:
```typescript
import { test as base } from '@playwright/test';
import { UserFactory } from './factories/user-factory';
type TestFixtures = {
userFactory: UserFactory;
};
export const test = base.extend<TestFixtures>({
userFactory: async ({}, use) => {
const factory = new UserFactory();
await use(factory);
await factory.cleanup(); // Auto-cleanup
},
});
export { expect } from '@playwright/test';
```
7. **Implement Data Factories**
**Knowledge Base Reference**: `testarch/knowledge/data-factories.md`
Create `tests/support/fixtures/factories/user-factory.ts`:
```typescript
import { faker } from '@faker-js/faker';
export class UserFactory {
private createdUsers: string[] = [];
async createUser(overrides = {}) {
const user = {
email: faker.internet.email(),
name: faker.person.fullName(),
password: faker.internet.password({ length: 12 }),
...overrides,
};
// API call to create user
const response = await fetch(`${process.env.API_URL}/users`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(user),
});
const created = await response.json();
this.createdUsers.push(created.id);
return created;
}
async cleanup() {
// Delete all created users
for (const userId of this.createdUsers) {
await fetch(`${process.env.API_URL}/users/${userId}`, {
method: 'DELETE',
});
}
this.createdUsers = [];
}
}
```
8. **Generate Sample Tests**
Create `tests/e2e/example.spec.ts`:
```typescript
import { test, expect } from '../support/fixtures';
test.describe('Example Test Suite', () => {
test('should load homepage', async ({ page }) => {
await page.goto('/');
await expect(page).toHaveTitle(/Home/i);
});
test('should create user and login', async ({ page, userFactory }) => {
// Create test user
const user = await userFactory.createUser();
// Login
await page.goto('/login');
await page.fill('[data-testid="email-input"]', user.email);
await page.fill('[data-testid="password-input"]', user.password);
await page.click('[data-testid="login-button"]');
// Assert login success
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible();
});
});
```
9. **Update package.json Scripts**
Add minimal test script to `package.json`:
```json
{
"scripts": {
"test:e2e": "playwright test"
}
}
```
**Note**: Users can add additional scripts as needed (e.g., `--ui`, `--headed`, `--debug`, `show-report`).
10. **Generate Documentation**
Create `tests/README.md` with setup instructions (see Step 3 deliverables).
---
## Step 3: Deliverables
### Primary Artifacts Created
1. **Configuration File**
- `playwright.config.ts` or `cypress.config.ts`
- Timeouts: action 15s, navigation 30s, test 60s
- Reporters: HTML + JUnit XML
2. **Directory Structure**
- `tests/` with `e2e/` and `support/` subdirectories (users add `api/`, `integration/`, etc. as needed)
- `support/fixtures/` for test fixtures
- `support/helpers/` for utility functions
3. **Environment Configuration**
- `.env.example` with `TEST_ENV`, `BASE_URL`, `API_URL`
- `.nvmrc` with Node version
4. **Test Infrastructure**
- Fixture architecture (`mergeTests` pattern)
- Data factories (faker-based, with auto-cleanup)
- Sample tests demonstrating patterns
5. **Documentation**
- `tests/README.md` with setup instructions
- Comments in config files explaining options
### README Contents
The generated `tests/README.md` should include:
- **Setup Instructions**: How to install dependencies, configure environment
- **Running Tests**: Commands for local execution, headed mode, debug mode
- **Architecture Overview**: Fixture pattern, data factories, page objects
- **Best Practices**: Selector strategy (data-testid), test isolation, cleanup
- **CI Integration**: How tests run in CI/CD pipeline
- **Knowledge Base References**: Links to relevant TEA knowledge fragments
---
## Important Notes
### Knowledge Base Integration
**Critical:** Check configuration and load appropriate fragments.
Read `{config_source}` and check `config.tea_use_playwright_utils`.
**If `config.tea_use_playwright_utils: true` (Playwright Utils Integration):**
Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` and load:
- `overview.md` - Playwright utils installation and design principles
- `fixtures-composition.md` - mergeTests composition with playwright-utils
- `auth-session.md` - Token persistence setup (if auth needed)
- `api-request.md` - API testing utilities (if API tests planned)
- `burn-in.md` - Smart test selection for CI (recommend during framework setup)
- `network-error-monitor.md` - Automatic HTTP error detection (recommend in merged fixtures)
- `data-factories.md` - Factory patterns with faker (498 lines, 5 examples)
Recommend installing playwright-utils:
```bash
npm install -D @seontechnologies/playwright-utils
```
Recommend adding burn-in and network-error-monitor to merged fixtures for enhanced reliability.
**If `config.tea_use_playwright_utils: false` (Traditional Patterns):**
Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` and load:
- `fixture-architecture.md` - Pure function → fixture → `mergeTests` composition with auto-cleanup (406 lines, 5 examples)
- `data-factories.md` - Faker-based factories with overrides, nested factories, API seeding, auto-cleanup (498 lines, 5 examples)
- `network-first.md` - Network-first testing safeguards: intercept before navigate, HAR capture, deterministic waiting (489 lines, 5 examples)
- `playwright-config.md` - Playwright-specific configuration: environment-based, timeout standards, artifact output, parallelization, project config (722 lines, 5 examples)
- `test-quality.md` - Test design principles: deterministic, isolated with cleanup, explicit assertions, length/time limits (658 lines, 5 examples)
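A minimal sketch of the `mergeTests` composition those fragments describe (the `api-client` fixture file is an assumed sibling of the user-factory fixture):
```typescript
// tests/support/fixtures/merged.ts
import { mergeTests } from '@playwright/test';
import { test as userFactoryTest } from './index';
import { test as apiClientTest } from './api-client'; // assumed additional fixture file

// Compose independent fixture files into a single test object; specs import from here.
export const test = mergeTests(userFactoryTest, apiClientTest);
export { expect } from '@playwright/test';
```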
### Framework-Specific Guidance
**Playwright Advantages:**
- Worker parallelism (significantly faster for large suites)
- Trace viewer (powerful debugging with screenshots, network, console)
- Multi-language support (TypeScript, JavaScript, Python, C#, Java)
- Built-in API testing capabilities
- Better handling of multiple browser contexts
**Cypress Advantages:**
- Superior developer experience (real-time reloading)
- Excellent for component testing (Cypress CT or use Vitest)
- Simpler setup for small teams
- Better suited for watch mode during development
**Avoid Cypress when:**
- API chains are heavy and complex
- Multi-tab/window scenarios are common
- Worker parallelism is critical for CI performance
### Selector Strategy
**Always recommend**:
- `data-testid` attributes for UI elements
- `data-cy` attributes if Cypress is chosen
- Avoid brittle CSS selectors or XPath
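In Playwright specs this typically looks like the sketch below (selectors are illustrative); `getByTestId` resolves `data-testid` by default and can be pointed at `data-cy` via `testIdAttribute` in the config:
```typescript
import { test, expect } from '@playwright/test';

test('logs in via stable test ids', async ({ page }) => {
  await page.goto('/login');
  // Prefer getByTestId / getByRole over brittle CSS or XPath selectors.
  await page.getByTestId('email-input').fill('user@example.com');
  await page.getByTestId('password-input').fill('secret');
  await page.getByTestId('login-button').click();
  await expect(page.getByTestId('user-menu')).toBeVisible();
});
```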
### Contract Testing
For microservices architectures, **recommend Pact** for consumer-driven contract testing alongside E2E tests.
### Failure Artifacts
Configure **failure-only** capture:
- Screenshots: only on failure
- Videos: retain on failure (delete on success)
- Traces: retain on failure (Playwright)
This reduces storage overhead while maintaining debugging capability.
---
## Output Summary
After completing this workflow, provide a summary:
```markdown
## Framework Scaffold Complete
**Framework Selected**: Playwright (or Cypress)
**Artifacts Created**:
- ✅ Configuration file: `playwright.config.ts`
- ✅ Directory structure: `tests/e2e/`, `tests/support/`
- ✅ Environment config: `.env.example`
- ✅ Node version: `.nvmrc`
- ✅ Fixture architecture: `tests/support/fixtures/`
- ✅ Data factories: `tests/support/fixtures/factories/`
- ✅ Sample tests: `tests/e2e/example.spec.ts`
- ✅ Documentation: `tests/README.md`
**Next Steps**:
1. Copy `.env.example` to `.env` and fill in environment variables
2. Run `npm install` to install test dependencies
3. Run `npm run test:e2e` to execute sample tests
4. Review `tests/README.md` for detailed setup instructions
**Knowledge Base References Applied**:
- Fixture architecture pattern (pure functions + mergeTests)
- Data factories with auto-cleanup (faker-based)
- Network-first testing safeguards
- Failure-only artifact capture
```
---
## Validation
After completing all steps, verify:
- [ ] Configuration file created and valid
- [ ] Directory structure exists
- [ ] Environment configuration generated
- [ ] Sample tests run successfully
- [ ] Documentation complete and accurate
- [ ] No errors or warnings during scaffold
Refer to `checklist.md` for comprehensive validation criteria.

View File

@@ -1,47 +0,0 @@
# Test Architect workflow: framework
name: testarch-framework
description: "Initialize production-ready test framework architecture (Playwright or Cypress) with fixtures, helpers, and configuration"
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/framework"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
# Variables and inputs
variables:
test_dir: "{project-root}/tests" # Root test directory
use_typescript: true # Prefer TypeScript configuration
framework_preference: "auto" # auto, playwright, cypress - user can override auto-detection
project_size: "auto" # auto, small, large - influences framework recommendation
# Output configuration
default_output_file: "{test_dir}/README.md" # Main deliverable is test setup README
# Required tools
required_tools:
- read_file # Read package.json, existing configs
- write_file # Create config files, helpers, fixtures, tests
- create_directory # Create test directory structure
- list_files # Check for existing framework
- search_repo # Find architecture docs
tags:
- qa
- setup
- test-architect
- framework
- initialization
execution_hints:
interactive: false # Minimize prompts; auto-detect when possible
autonomous: true # Proceed without user input unless blocked
iterative: true

View File

@@ -1,407 +0,0 @@
# Non-Functional Requirements Assessment - Validation Checklist
**Workflow:** `testarch-nfr`
**Purpose:** Ensure comprehensive and evidence-based NFR assessment with actionable recommendations
---
Note: `nfr-assess` evaluates existing evidence; it does not run tests or CI workflows.
## Prerequisites Validation
- [ ] Implementation is deployed and accessible for evaluation
- [ ] Evidence sources are available (test results, metrics, logs, CI results)
- [ ] NFR categories are determined (performance, security, reliability, maintainability, custom)
- [ ] Evidence directories exist and are accessible (`test_results_dir`, `metrics_dir`, `logs_dir`)
- [ ] Knowledge base is loaded (nfr-criteria, ci-burn-in, test-quality)
---
## Context Loading
- [ ] Tech-spec.md loaded successfully (if available)
- [ ] PRD.md loaded (if available)
- [ ] Story file loaded (if applicable)
- [ ] Relevant knowledge fragments loaded from `tea-index.csv`:
- [ ] `nfr-criteria.md`
- [ ] `ci-burn-in.md`
- [ ] `test-quality.md`
- [ ] `playwright-config.md` (if using Playwright)
---
## NFR Categories and Thresholds
### Performance
- [ ] Response time threshold defined or marked as UNKNOWN
- [ ] Throughput threshold defined or marked as UNKNOWN
- [ ] Resource usage thresholds defined or marked as UNKNOWN
- [ ] Scalability requirements defined or marked as UNKNOWN
### Security
- [ ] Authentication requirements defined or marked as UNKNOWN
- [ ] Authorization requirements defined or marked as UNKNOWN
- [ ] Data protection requirements defined or marked as UNKNOWN
- [ ] Vulnerability management thresholds defined or marked as UNKNOWN
- [ ] Compliance requirements identified (GDPR, HIPAA, PCI-DSS, etc.)
### Reliability
- [ ] Availability (uptime) threshold defined or marked as UNKNOWN
- [ ] Error rate threshold defined or marked as UNKNOWN
- [ ] MTTR (Mean Time To Recovery) threshold defined or marked as UNKNOWN
- [ ] Fault tolerance requirements defined or marked as UNKNOWN
- [ ] Disaster recovery requirements defined (RTO, RPO) or marked as UNKNOWN
### Maintainability
- [ ] Test coverage threshold defined or marked as UNKNOWN
- [ ] Code quality threshold defined or marked as UNKNOWN
- [ ] Technical debt threshold defined or marked as UNKNOWN
- [ ] Documentation completeness threshold defined or marked as UNKNOWN
### Custom NFR Categories (if applicable)
- [ ] Custom NFR category 1: Thresholds defined or marked as UNKNOWN
- [ ] Custom NFR category 2: Thresholds defined or marked as UNKNOWN
- [ ] Custom NFR category 3: Thresholds defined or marked as UNKNOWN
---
## Evidence Gathering
### Performance Evidence
- [ ] Load test results collected (JMeter, k6, Gatling, etc.)
- [ ] Application metrics collected (response times, throughput, resource usage)
- [ ] APM data collected (New Relic, Datadog, Dynatrace, etc.)
- [ ] Lighthouse reports collected (if web app)
- [ ] Playwright performance traces collected (if applicable)
### Security Evidence
- [ ] SAST results collected (SonarQube, Checkmarx, Veracode, etc.)
- [ ] DAST results collected (OWASP ZAP, Burp Suite, etc.)
- [ ] Dependency scanning results collected (Snyk, Dependabot, npm audit)
- [ ] Penetration test reports collected (if available)
- [ ] Security audit logs collected
- [ ] Compliance audit results collected (if applicable)
### Reliability Evidence
- [ ] Uptime monitoring data collected (Pingdom, UptimeRobot, StatusCake)
- [ ] Error logs collected
- [ ] Error rate metrics collected
- [ ] CI burn-in results collected (stability over time)
- [ ] Chaos engineering test results collected (if available)
- [ ] Failover/recovery test results collected (if available)
- [ ] Incident reports and postmortems collected (if applicable)
### Maintainability Evidence
- [ ] Code coverage reports collected (Istanbul, NYC, c8, JaCoCo)
- [ ] Static analysis results collected (ESLint, SonarQube, CodeClimate)
- [ ] Technical debt metrics collected
- [ ] Documentation audit results collected
- [ ] Test review report collected (from test-review workflow, if available)
- [ ] Git metrics collected (code churn, commit frequency, etc.)
---
## NFR Assessment with Deterministic Rules
### Performance Assessment
- [ ] Response time assessed against threshold
- [ ] Throughput assessed against threshold
- [ ] Resource usage assessed against threshold
- [ ] Scalability assessed against requirements
- [ ] Status classified (PASS/CONCERNS/FAIL) with justification
- [ ] Evidence source documented (file path, metric name)
### Security Assessment
- [ ] Authentication strength assessed against requirements
- [ ] Authorization controls assessed against requirements
- [ ] Data protection assessed against requirements
- [ ] Vulnerability management assessed against thresholds
- [ ] Compliance assessed against requirements
- [ ] Status classified (PASS/CONCERNS/FAIL) with justification
- [ ] Evidence source documented (file path, scan result)
### Reliability Assessment
- [ ] Availability (uptime) assessed against threshold
- [ ] Error rate assessed against threshold
- [ ] MTTR assessed against threshold
- [ ] Fault tolerance assessed against requirements
- [ ] Disaster recovery assessed against requirements (RTO, RPO)
- [ ] CI burn-in assessed (stability over time)
- [ ] Status classified (PASS/CONCERNS/FAIL) with justification
- [ ] Evidence source documented (file path, monitoring data)
### Maintainability Assessment
- [ ] Test coverage assessed against threshold
- [ ] Code quality assessed against threshold
- [ ] Technical debt assessed against threshold
- [ ] Documentation completeness assessed against threshold
- [ ] Test quality assessed (from test-review, if available)
- [ ] Status classified (PASS/CONCERNS/FAIL) with justification
- [ ] Evidence source documented (file path, coverage report)
### Custom NFR Assessment (if applicable)
- [ ] Custom NFR 1 assessed against threshold with justification
- [ ] Custom NFR 2 assessed against threshold with justification
- [ ] Custom NFR 3 assessed against threshold with justification
---
## Status Classification Validation
### PASS Criteria Verified
- [ ] Evidence exists for PASS status
- [ ] Evidence meets or exceeds threshold
- [ ] No concerns flagged in evidence
- [ ] Quality is acceptable
### CONCERNS Criteria Verified
- [ ] Threshold is UNKNOWN (documented) OR
- [ ] Evidence is MISSING or INCOMPLETE (documented) OR
- [ ] Evidence is close to threshold (within 10%, documented) OR
- [ ] Evidence shows intermittent issues (documented)
### FAIL Criteria Verified
- [ ] Evidence exists BUT does not meet threshold (documented) OR
- [ ] Critical evidence is MISSING (documented) OR
- [ ] Evidence shows consistent failures (documented) OR
- [ ] Quality is unacceptable (documented)
### No Threshold Guessing
- [ ] All thresholds are either defined or marked as UNKNOWN
- [ ] No thresholds were guessed or inferred
- [ ] All UNKNOWN thresholds result in CONCERNS status
---
## Quick Wins and Recommended Actions
### Quick Wins Identified
- [ ] Low-effort, high-impact improvements identified for CONCERNS/FAIL
- [ ] Configuration changes (no code changes) identified
- [ ] Optimization opportunities identified (caching, indexing, compression)
- [ ] Monitoring additions identified (detect issues before failures)
### Recommended Actions
- [ ] Specific remediation steps provided (not generic advice)
- [ ] Priority assigned (CRITICAL, HIGH, MEDIUM, LOW)
- [ ] Estimated effort provided (hours, days)
- [ ] Owner suggestions provided (dev, ops, security)
### Monitoring Hooks
- [ ] Performance monitoring suggested (APM, synthetic monitoring)
- [ ] Error tracking suggested (Sentry, Rollbar, error logs)
- [ ] Security monitoring suggested (intrusion detection, audit logs)
- [ ] Alerting thresholds suggested (notify before breach)
### Fail-Fast Mechanisms
- [ ] Circuit breakers suggested for reliability
- [ ] Rate limiting suggested for performance
- [ ] Validation gates suggested for security
- [ ] Smoke tests suggested for maintainability
---
## Deliverables Generated
### NFR Assessment Report
- [ ] File created at `{output_folder}/nfr-assessment.md`
- [ ] Template from `nfr-report-template.md` used
- [ ] Executive summary included (overall status, critical issues)
- [ ] Assessment by category included (performance, security, reliability, maintainability)
- [ ] Evidence for each NFR documented
- [ ] Status classifications documented (PASS/CONCERNS/FAIL)
- [ ] Findings summary included (PASS count, CONCERNS count, FAIL count)
- [ ] Quick wins section included
- [ ] Recommended actions section included
- [ ] Evidence gaps checklist included
### Gate YAML Snippet (if enabled)
- [ ] YAML snippet generated
- [ ] Date included
- [ ] Categories status included (performance, security, reliability, maintainability)
- [ ] Overall status included (PASS/CONCERNS/FAIL)
- [ ] Issue counts included (critical, high, medium, concerns)
- [ ] Blockers flag included (true/false)
- [ ] Recommendations included
### Evidence Checklist (if enabled)
- [ ] All NFRs with MISSING or INCOMPLETE evidence listed
- [ ] Owners assigned for evidence collection
- [ ] Suggested evidence sources provided
- [ ] Deadlines set for evidence collection
### Updated Story File (if enabled and requested)
- [ ] "NFR Assessment" section added to story markdown
- [ ] Link to NFR assessment report included
- [ ] Overall status and critical issues included
- [ ] Gate status included
---
## Quality Assurance
### Accuracy Checks
- [ ] All NFR categories assessed (none skipped)
- [ ] All thresholds documented (defined or UNKNOWN)
- [ ] All evidence sources documented (file paths, metric names)
- [ ] Status classifications are deterministic and consistent
- [ ] No false positives (status correctly assigned)
- [ ] No false negatives (all issues identified)
### Completeness Checks
- [ ] All NFR categories covered (performance, security, reliability, maintainability, custom)
- [ ] All evidence sources checked (test results, metrics, logs, CI results)
- [ ] All status types used appropriately (PASS, CONCERNS, FAIL)
- [ ] All NFRs with CONCERNS/FAIL have recommendations
- [ ] All evidence gaps have owners and deadlines
### Actionability Checks
- [ ] Recommendations are specific (not generic)
- [ ] Remediation steps are clear and actionable
- [ ] Priorities are assigned (CRITICAL, HIGH, MEDIUM, LOW)
- [ ] Effort estimates are provided (hours, days)
- [ ] Owners are suggested (dev, ops, security)
---
## Integration with BMad Artifacts
### With tech-spec.md
- [ ] Tech spec loaded for NFR requirements and thresholds
- [ ] Performance targets extracted
- [ ] Security requirements extracted
- [ ] Reliability SLAs extracted
- [ ] Architectural decisions considered
### With test-design.md
- [ ] Test design loaded for NFR test plan
- [ ] Test priorities referenced (P0/P1/P2/P3)
- [ ] Assessment aligned with planned NFR validation
### With PRD.md
- [ ] PRD loaded for product-level NFR context
- [ ] User experience goals considered
- [ ] Unstated requirements checked
- [ ] Product-level SLAs referenced
---
## Quality Gates Validation
### Release Blocker (FAIL)
- [ ] Critical NFR status checked (security, reliability)
- [ ] Performance failures assessed for user impact
- [ ] Release blocker flagged if critical NFR has FAIL status
### PR Blocker (HIGH CONCERNS)
- [ ] High-priority NFR status checked
- [ ] Multiple CONCERNS assessed
- [ ] PR blocker flagged if HIGH priority issues exist
### Warning (CONCERNS)
- [ ] Any NFR with CONCERNS status flagged
- [ ] Missing or incomplete evidence documented
- [ ] Warning issued to address before next release
### Pass (PASS)
- [ ] All NFRs have PASS status
- [ ] No blockers or concerns exist
- [ ] Ready for release confirmed
---
## Non-Prescriptive Validation
- [ ] NFR categories adapted to team needs
- [ ] Thresholds appropriate for project context
- [ ] Assessment criteria customized as needed
- [ ] Teams can extend with custom NFR categories
- [ ] Integration with external tools supported (New Relic, Datadog, SonarQube, JIRA)
---
## Documentation and Communication
- [ ] NFR assessment report is readable and well-formatted
- [ ] Tables render correctly in markdown
- [ ] Code blocks have proper syntax highlighting
- [ ] Links are valid and accessible
- [ ] Recommendations are clear and prioritized
- [ ] Overall status is prominent and unambiguous
- [ ] Executive summary provides quick understanding
---
## Final Validation
- [ ] All prerequisites met
- [ ] All NFR categories assessed with evidence (or gaps documented)
- [ ] No thresholds were guessed (all defined or UNKNOWN)
- [ ] Status classifications are deterministic and justified
- [ ] Quick wins identified for all CONCERNS/FAIL
- [ ] Recommended actions are specific and actionable
- [ ] Evidence gaps documented with owners and deadlines
- [ ] NFR assessment report generated and saved
- [ ] Gate YAML snippet generated (if enabled)
- [ ] Evidence checklist generated (if enabled)
- [ ] Workflow completed successfully
---
## Sign-Off
**NFR Assessment Status:**
- [ ] ✅ PASS - All NFRs meet requirements, ready for release
- [ ] ⚠️ CONCERNS - Some NFRs have concerns, address before next release
- [ ] ❌ FAIL - Critical NFRs not met, BLOCKER for release
**Next Actions:**
- If PASS ✅: Proceed to `*gate` workflow or release
- If CONCERNS ⚠️: Address HIGH/CRITICAL issues, re-run `*nfr-assess`
- If FAIL ❌: Resolve FAIL status NFRs, re-run `*nfr-assess`
**Critical Issues:** {COUNT}
**High Priority Issues:** {COUNT}
**Concerns:** {COUNT}
---
<!-- Powered by BMAD-CORE™ -->

View File

@@ -1,722 +0,0 @@
# Non-Functional Requirements Assessment - Instructions v4.0
**Workflow:** `testarch-nfr`
**Purpose:** Assess non-functional requirements (performance, security, reliability, maintainability) before release with evidence-based validation
**Agent:** Test Architect (TEA)
**Format:** Pure Markdown v4.0 (no XML blocks)
---
## Overview
This workflow performs a comprehensive assessment of non-functional requirements (NFRs) to validate that the implementation meets performance, security, reliability, and maintainability standards before release. It uses evidence-based validation with deterministic PASS/CONCERNS/FAIL rules and provides actionable recommendations for remediation.
**Key Capabilities:**
- Assess multiple NFR categories (performance, security, reliability, maintainability, custom)
- Validate NFRs against defined thresholds from tech specs, PRD, or defaults
- Classify status deterministically (PASS/CONCERNS/FAIL) based on evidence
- Never guess thresholds - mark as CONCERNS if unknown
- Generate gate-ready YAML snippets for CI/CD integration
- Provide quick wins and recommended actions for remediation
- Create evidence checklists for gaps
---
## Prerequisites
**Required:**
- Implementation deployed locally or accessible for evaluation
- Evidence sources available (test results, metrics, logs, CI results)
**Recommended:**
- NFR requirements defined in tech-spec.md, PRD.md, or story
- Test results from performance, security, reliability tests
- Application metrics (response times, error rates, throughput)
- CI/CD pipeline results for burn-in validation
**Halt Conditions:**
- If NFR targets are undefined and cannot be obtained, halt and request definition
- If implementation is not accessible for evaluation, halt and request deployment
---
## Workflow Steps
### Step 1: Load Context and Knowledge Base
**Actions:**
1. Load relevant knowledge fragments from `{project-root}/_bmad/bmm/testarch/tea-index.csv`:
- `nfr-criteria.md` - Non-functional requirements criteria and thresholds (security, performance, reliability, maintainability with code examples, 658 lines, 4 examples)
- `ci-burn-in.md` - CI/CD burn-in patterns for reliability validation (10-iteration detection, sharding, selective execution, 678 lines, 4 examples)
- `test-quality.md` - Test quality expectations for maintainability (deterministic, isolated, explicit assertions, length/time limits, 658 lines, 5 examples)
- `playwright-config.md` - Performance configuration patterns: parallelization, timeout standards, artifact output (722 lines, 5 examples)
- `error-handling.md` - Reliability validation patterns: scoped exceptions, retry validation, telemetry logging, graceful degradation (736 lines, 4 examples)
2. Read story file (if provided):
- Extract NFR requirements
- Identify specific thresholds or SLAs
- Note any custom NFR categories
3. Read related BMad artifacts (if available):
- `tech-spec.md` - Technical NFR requirements and targets
- `PRD.md` - Product-level NFR context (user expectations)
- `test-design.md` - NFR test plan and priorities
**Output:** Complete understanding of NFR targets, evidence sources, and validation criteria
---
### Step 2: Identify NFR Categories and Thresholds
**Actions:**
1. Determine which NFR categories to assess (default: performance, security, reliability, maintainability):
- **Performance**: Response time, throughput, resource usage
- **Security**: Authentication, authorization, data protection, vulnerability scanning
- **Reliability**: Error handling, recovery, availability, fault tolerance
- **Maintainability**: Code quality, test coverage, documentation, technical debt
2. Add custom NFR categories if specified (e.g., accessibility, internationalization, compliance)
3. Gather thresholds for each NFR:
- From tech-spec.md (primary source)
- From PRD.md (product-level SLAs)
- From story file (feature-specific requirements)
- From workflow variables (default thresholds)
- Mark thresholds as UNKNOWN if not defined
4. Never guess thresholds - if a threshold is unknown, mark the NFR as CONCERNS
**Output:** Complete list of NFRs to assess with defined (or UNKNOWN) thresholds
---
### Step 3: Gather Evidence
**Actions:**
1. For each NFR category, discover evidence sources:
**Performance Evidence:**
- Load test results (JMeter, k6, Lighthouse)
- Application metrics (response times, throughput, resource usage)
- Performance monitoring data (New Relic, Datadog, APM)
- Playwright performance traces (if applicable)
**Security Evidence:**
- Security scan results (SAST, DAST, dependency scanning)
- Authentication/authorization test results
- Penetration test reports
- Vulnerability assessment reports
- Compliance audit results
**Reliability Evidence:**
- Error logs and error rates
- Uptime monitoring data
- Chaos engineering test results
- Failover/recovery test results
- CI burn-in results (stability over time)
**Maintainability Evidence:**
- Code coverage reports (Istanbul, NYC, c8)
- Static analysis results (ESLint, SonarQube)
- Technical debt metrics
- Documentation completeness
- Test quality assessment (from test-review workflow)
2. Read relevant files from evidence directories:
- `{test_results_dir}` for test execution results
- `{metrics_dir}` for application metrics
- `{logs_dir}` for application logs
- CI/CD pipeline results (if `include_ci_results` is true)
3. Mark NFRs without evidence as "NO EVIDENCE" - never infer or assume
**Output:** Comprehensive evidence inventory for each NFR
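As an illustration of the "never infer" rule (directory layout and category names are assumptions), the evidence inventory can record missing sources explicitly instead of guessing:
```typescript
import { existsSync, readdirSync } from 'node:fs';

type EvidenceStatus = 'FOUND' | 'NO EVIDENCE';

interface EvidenceEntry {
  category: 'performance' | 'security' | 'reliability' | 'maintainability';
  source: string; // directory or file expected to hold the evidence
  status: EvidenceStatus;
  files: string[];
}

// Record only what actually exists; anything absent is marked NO EVIDENCE, never inferred.
export function inventoryEvidence(category: EvidenceEntry['category'], dir: string): EvidenceEntry {
  const files = existsSync(dir) ? readdirSync(dir) : [];
  return { category, source: dir, status: files.length > 0 ? 'FOUND' : 'NO EVIDENCE', files };
}
```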
---
### Step 4: Assess NFRs with Deterministic Rules
**Actions:**
1. For each NFR, apply deterministic PASS/CONCERNS/FAIL rules:
**PASS Criteria:**
- Evidence exists AND meets defined threshold
- No concerns flagged in evidence
- Example: Response time is 350ms (threshold: 500ms) → PASS
**CONCERNS Criteria:**
- Threshold is UNKNOWN (not defined)
- Evidence is MISSING or INCOMPLETE
- Evidence is close to threshold (within 10%)
- Evidence shows intermittent issues
- Example: Response time is 480ms (threshold: 500ms, 96% of threshold) → CONCERNS
**FAIL Criteria:**
- Evidence exists BUT does not meet threshold
- Critical evidence is MISSING
- Evidence shows consistent failures
- Example: Response time is 750ms (threshold: 500ms) → FAIL
2. Document findings for each NFR:
- Status (PASS/CONCERNS/FAIL)
- Evidence source (file path, test name, metric name)
- Actual value vs threshold
- Justification for status classification
3. Classify severity based on category:
- **CRITICAL**: Security failures, reliability failures (affect users immediately)
- **HIGH**: Performance failures, maintainability failures (affect users soon)
- **MEDIUM**: Concerns without failures (may affect users eventually)
- **LOW**: Missing evidence for non-critical NFRs
**Output:** Complete NFR assessment with deterministic status classifications
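A sketch of how these rules can be applied mechanically, assuming a "lower is better" metric such as p95 response time (adapt the comparison for metrics where higher is better):
```typescript
type NfrStatus = 'PASS' | 'CONCERNS' | 'FAIL';

interface NfrCheck {
  threshold?: number; // undefined means UNKNOWN - never guessed
  actual?: number; // undefined means evidence is MISSING
}

// Deterministic classification for "lower is better" metrics (e.g. p95 response time in ms).
export function classifyNfr({ threshold, actual }: NfrCheck): NfrStatus {
  if (threshold === undefined) return 'CONCERNS'; // threshold UNKNOWN
  if (actual === undefined) return 'CONCERNS'; // evidence MISSING
  if (actual > threshold) return 'FAIL'; // does not meet threshold
  if (actual >= threshold * 0.9) return 'CONCERNS'; // within 10% of threshold
  return 'PASS';
}

// From the examples above: 350ms vs 500ms -> PASS, 480ms -> CONCERNS, 750ms -> FAIL.
```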
---
### Step 5: Identify Quick Wins and Recommended Actions
**Actions:**
1. For each NFR with CONCERNS or FAIL status, identify quick wins:
- Low-effort, high-impact improvements
- Configuration changes (no code changes needed)
- Optimization opportunities (caching, indexing, compression)
- Monitoring additions (detect issues before they become failures)
2. Provide recommended actions for each issue:
- Specific steps to remediate (not generic advice)
- Priority (CRITICAL, HIGH, MEDIUM, LOW)
- Estimated effort (hours, days)
- Owner suggestion (dev, ops, security)
3. Suggest monitoring hooks for gaps:
- Add performance monitoring (APM, synthetic monitoring)
- Add error tracking (Sentry, Rollbar, error logs)
- Add security monitoring (intrusion detection, audit logs)
- Add alerting thresholds (notify before thresholds are breached)
4. Suggest fail-fast mechanisms:
- Add circuit breakers for reliability
- Add rate limiting for performance
- Add validation gates for security
- Add smoke tests for maintainability
**Output:** Actionable remediation plan with prioritized recommendations
---
### Step 6: Generate Deliverables
**Actions:**
1. Create NFR assessment markdown file:
- Use template from `nfr-report-template.md`
- Include executive summary (overall status, critical issues)
- Add NFR-by-NFR assessment (status, evidence, thresholds)
- Add findings summary (PASS count, CONCERNS count, FAIL count)
- Add quick wins section
- Add recommended actions section
- Add evidence gaps checklist
- Save to `{output_folder}/nfr-assessment.md`
2. Generate gate YAML snippet (if enabled):
```yaml
nfr_assessment:
date: '2025-10-14'
categories:
performance: 'PASS'
security: 'CONCERNS'
reliability: 'PASS'
maintainability: 'PASS'
overall_status: 'CONCERNS'
critical_issues: 0
high_priority_issues: 1
concerns: 2
blockers: false
```
3. Generate evidence checklist (if enabled):
- List all NFRs with MISSING or INCOMPLETE evidence
- Assign owners for evidence collection
- Suggest evidence sources (tests, metrics, logs)
- Set deadlines for evidence collection
4. Update story file (if enabled and requested):
- Add "NFR Assessment" section to story markdown
- Link to NFR assessment report
- Include overall status and critical issues
- Add gate status
**Output:** Complete NFR assessment documentation ready for review and CI/CD integration
---
## Non-Prescriptive Approach
**Minimal Examples:** This workflow provides principles and patterns, not rigid templates. Teams should adapt NFR categories, thresholds, and assessment criteria to their needs.
**Key Patterns to Follow:**
- Use evidence-based validation (no guessing or inference)
- Apply deterministic rules (consistent PASS/CONCERNS/FAIL classification)
- Never guess thresholds (mark as CONCERNS if unknown)
- Provide actionable recommendations (specific steps, not generic advice)
- Generate gate-ready artifacts (YAML snippets for CI/CD)
**Extend as Needed:**
- Add custom NFR categories (accessibility, internationalization, compliance)
- Integrate with external tools (New Relic, Datadog, SonarQube, JIRA)
- Add custom thresholds and rules
- Link to external assessment systems
---
## NFR Categories and Criteria
### Performance
**Criteria:**
- Response time (p50, p95, p99 percentiles)
- Throughput (requests per second, transactions per second)
- Resource usage (CPU, memory, disk, network)
- Scalability (horizontal, vertical)
**Thresholds (Default):**
- Response time p95: 500ms
- Throughput: 100 RPS
- CPU usage: < 70% average
- Memory usage: < 80% max
**Evidence Sources:**
- Load test results (JMeter, k6, Gatling)
- APM data (New Relic, Datadog, Dynatrace)
- Lighthouse reports (for web apps)
- Playwright performance traces
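For reference, a percentile such as p95 is simply the value below which 95% of samples fall; a quick nearest-rank sketch (load-test tools report this directly, so this is purely illustrative):
```typescript
// Nearest-rank percentile: sort samples, take the value at ceil(p/100 * n), 1-indexed.
export function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// e.g. percentile(responseTimesMs, 95) yields the p95 latency to compare against the 500ms default threshold.
```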
---
### Security
**Criteria:**
- Authentication (login security, session management)
- Authorization (access control, permissions)
- Data protection (encryption, PII handling)
- Vulnerability management (SAST, DAST, dependency scanning)
- Compliance (GDPR, HIPAA, PCI-DSS)
**Thresholds (Default):**
- Security score: >= 85/100
- Critical vulnerabilities: 0
- High vulnerabilities: < 3
- Authentication strength: MFA enabled
**Evidence Sources:**
- SAST results (SonarQube, Checkmarx, Veracode)
- DAST results (OWASP ZAP, Burp Suite)
- Dependency scanning (Snyk, Dependabot, npm audit)
- Penetration test reports
- Security audit logs
---
### Reliability
**Criteria:**
- Availability (uptime percentage)
- Error handling (graceful degradation, error recovery)
- Fault tolerance (redundancy, failover)
- Disaster recovery (backup, restore, RTO/RPO)
- Stability (CI burn-in, chaos engineering)
**Thresholds (Default):**
- Uptime: >= 99.9% (three nines)
- Error rate: < 0.1% (1 in 1000 requests)
- MTTR (Mean Time To Recovery): < 15 minutes
- CI burn-in: 100 consecutive successful runs
**Evidence Sources:**
- Uptime monitoring (Pingdom, UptimeRobot, StatusCake)
- Error logs and error rates
- CI burn-in results (see `ci-burn-in.md`)
- Chaos engineering test results (Chaos Monkey, Gremlin)
- Incident reports and postmortems
---
### Maintainability
**Criteria:**
- Code quality (complexity, duplication, code smells)
- Test coverage (unit, integration, E2E)
- Documentation (code comments, README, architecture docs)
- Technical debt (debt ratio, code churn)
- Test quality (from test-review workflow)
**Thresholds (Default):**
- Test coverage: >= 80%
- Code quality score: >= 85/100
- Technical debt ratio: < 5%
- Documentation completeness: >= 90%
**Evidence Sources:**
- Coverage reports (Istanbul, NYC, c8, JaCoCo)
- Static analysis (ESLint, SonarQube, CodeClimate)
- Documentation audit (manual or automated)
- Test review report (from test-review workflow)
- Git metrics (code churn, commit frequency)
---
## Deterministic Assessment Rules
### PASS Rules
- Evidence exists
- Evidence meets or exceeds threshold
- No concerns flagged
- Quality is acceptable
**Example:**
```markdown
NFR: Response Time p95
Threshold: 500ms
Evidence: Load test result shows 350ms p95
Status: PASS ✅
```
---
### CONCERNS Rules
- Threshold is UNKNOWN
- Evidence is MISSING or INCOMPLETE
- Evidence is close to threshold (within 10%)
- Evidence shows intermittent issues
- Quality is marginal
**Example:**
```markdown
NFR: Response Time p95
Threshold: 500ms
Evidence: Load test result shows 480ms p95 (96% of threshold)
Status: CONCERNS ⚠️
Recommendation: Optimize before production - very close to threshold
```
---
### FAIL Rules
- Evidence exists BUT does not meet threshold
- Critical evidence is MISSING
- Evidence shows consistent failures
- Quality is unacceptable
**Example:**
```markdown
NFR: Response Time p95
Threshold: 500ms
Evidence: Load test result shows 750ms p95 (150% of threshold)
Status: FAIL ❌
Recommendation: BLOCKER - optimize performance before release
```
---
## Integration with BMad Artifacts
### With tech-spec.md
- Primary source for NFR requirements and thresholds
- Load performance targets, security requirements, reliability SLAs
- Use architectural decisions to understand NFR trade-offs
### With test-design.md
- Understand NFR test plan and priorities
- Reference test priorities (P0/P1/P2/P3) for severity classification
- Align assessment with planned NFR validation
### With PRD.md
- Understand product-level NFR expectations
- Verify NFRs align with user experience goals
- Check for unstated NFR requirements (implied by product goals)
---
## Quality Gates
### Release Blocker (FAIL)
- Critical NFR has FAIL status (security, reliability)
- Performance failure affects user experience severely
- Do not release until FAIL is resolved
### PR Blocker (HIGH CONCERNS)
- High-priority NFR has FAIL status
- Multiple CONCERNS exist
- Block PR merge until addressed
### Warning (CONCERNS)
- Any NFR has CONCERNS status
- Evidence is missing or incomplete
- Address before next release
### Pass (PASS)
- All NFRs have PASS status
- No blockers or concerns
- Ready for release
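A sketch of how these gate rules can roll category statuses up to a single decision (treating security and reliability as the critical categories per the rules above; everything else is an assumption):
```typescript
type NfrStatus = 'PASS' | 'CONCERNS' | 'FAIL';

interface GateDecision {
  overall_status: NfrStatus;
  blockers: boolean;
}

const CRITICAL_CATEGORIES = ['security', 'reliability'];

export function decideGate(categories: Record<string, NfrStatus>): GateDecision {
  const entries = Object.entries(categories);
  // Release blocker: any critical category with FAIL status.
  const blockers = entries.some(([name, status]) => status === 'FAIL' && CRITICAL_CATEGORIES.includes(name));
  const anyFail = entries.some(([, status]) => status === 'FAIL');
  const anyConcerns = entries.some(([, status]) => status === 'CONCERNS');

  const overall_status: NfrStatus = anyFail ? 'FAIL' : anyConcerns ? 'CONCERNS' : 'PASS';
  return { overall_status, blockers };
}
```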
---
## Example NFR Assessment
````markdown
# NFR Assessment - Story 1.3
**Feature:** User Authentication
**Date:** 2025-10-14
**Overall Status:** CONCERNS ⚠️ (1 HIGH issue)
## Executive Summary
**Assessment:** 3 PASS, 1 CONCERNS, 0 FAIL
**Blockers:** None
**High Priority Issues:** 1 (Security - MFA not enforced)
**Recommendation:** Address security concern before release
## Performance Assessment
### Response Time (p95)
- **Status:** PASS ✅
- **Threshold:** 500ms
- **Actual:** 320ms (64% of threshold)
- **Evidence:** Load test results (test-results/load-2025-10-14.json)
- **Findings:** Response time well below threshold across all percentiles
### Throughput
- **Status:** PASS ✅
- **Threshold:** 100 RPS
- **Actual:** 250 RPS (250% of threshold)
- **Evidence:** Load test results (test-results/load-2025-10-14.json)
- **Findings:** System handles 2.5x target load without degradation
## Security Assessment
### Authentication Strength
- **Status:** CONCERNS ⚠️
- **Threshold:** MFA enabled for all users
- **Actual:** MFA optional (not enforced)
- **Evidence:** Security audit (security-audit-2025-10-14.md)
- **Findings:** MFA is implemented but not enforced by default
- **Recommendation:** HIGH - Enforce MFA for all new accounts, provide migration path for existing users
### Data Protection
- **Status:** PASS ✅
- **Threshold:** PII encrypted at rest and in transit
- **Actual:** AES-256 at rest, TLS 1.3 in transit
- **Evidence:** Security scan (security-scan-2025-10-14.json)
- **Findings:** All PII properly encrypted
## Reliability Assessment
### Uptime
- **Status:** PASS ✅
- **Threshold:** 99.9% (three nines)
- **Actual:** 99.95% over 30 days
- **Evidence:** Uptime monitoring (uptime-report-2025-10-14.csv)
- **Findings:** Exceeds target with margin
### Error Rate
- **Status:** PASS ✅
- **Threshold:** < 0.1% (1 in 1000)
- **Actual:** 0.05% (1 in 2000)
- **Evidence:** Error logs (logs/errors-2025-10.log)
- **Findings:** Error rate well below threshold
## Maintainability Assessment
### Test Coverage
- **Status:** PASS ✅
- **Threshold:** >= 80%
- **Actual:** 87%
- **Evidence:** Coverage report (coverage/lcov-report/index.html)
- **Findings:** Coverage exceeds threshold with good distribution
### Code Quality
- **Status:** PASS ✅
- **Threshold:** >= 85/100
- **Actual:** 92/100
- **Evidence:** SonarQube analysis (sonarqube-report-2025-10-14.pdf)
- **Findings:** High code quality score with low technical debt
## Quick Wins
1. **Enforce MFA (Security)** - HIGH - 4 hours
- Add configuration flag to enforce MFA for new accounts
- No code changes needed, only config adjustment
## Recommended Actions
### Immediate (Before Release)
1. **Enforce MFA for all new accounts** - HIGH - 4 hours - Security Team
- Add `ENFORCE_MFA=true` to production config
- Update user onboarding flow to require MFA setup
- Test MFA enforcement in staging environment
### Short-term (Next Sprint)
1. **Migrate existing users to MFA** - MEDIUM - 3 days - Product + Engineering
- Design migration UX (prompt, incentives, deadline)
- Implement migration flow with grace period
- Communicate migration to existing users
## Evidence Gaps
- [ ] Chaos engineering test results (reliability)
- Owner: DevOps Team
- Deadline: 2025-10-21
- Suggested evidence: Run chaos monkey tests in staging
- [ ] Penetration test report (security)
- Owner: Security Team
- Deadline: 2025-10-28
- Suggested evidence: Schedule third-party pentest
## Gate YAML Snippet
```yaml
nfr_assessment:
date: '2025-10-14'
story_id: '1.3'
categories:
performance: 'PASS'
security: 'CONCERNS'
reliability: 'PASS'
maintainability: 'PASS'
overall_status: 'CONCERNS'
critical_issues: 0
high_priority_issues: 1
medium_priority_issues: 0
concerns: 1
blockers: false
recommendations:
- 'Enforce MFA for all new accounts (HIGH - 4 hours)'
evidence_gaps: 2
```
## Recommendations Summary
- **Release Blocker:** None ✅
- **High Priority:** 1 (Enforce MFA before release)
- **Medium Priority:** 1 (Migrate existing users to MFA)
- **Next Steps:** Address HIGH priority item, then proceed to gate workflow
````
---
## Validation Checklist
Before completing this workflow, verify:
- ✅ All NFR categories assessed (performance, security, reliability, maintainability, custom)
- ✅ Thresholds defined or marked as UNKNOWN
- ✅ Evidence gathered for each NFR (or marked as MISSING)
- ✅ Status classified deterministically (PASS/CONCERNS/FAIL)
- ✅ No thresholds were guessed (marked as CONCERNS if unknown)
- ✅ Quick wins identified for CONCERNS/FAIL
- ✅ Recommended actions are specific and actionable
- ✅ Evidence gaps documented with owners and deadlines
- ✅ NFR assessment report generated and saved
- ✅ Gate YAML snippet generated (if enabled)
- ✅ Evidence checklist generated (if enabled)
---
## Notes
- **Never Guess Thresholds:** If a threshold is unknown, mark as CONCERNS and recommend defining it
- **Evidence-Based:** Every assessment must be backed by evidence (tests, metrics, logs, CI results)
- **Deterministic Rules:** Use consistent PASS/CONCERNS/FAIL classification based on evidence
- **Actionable Recommendations:** Provide specific steps, not generic advice
- **Gate Integration:** Generate YAML snippets that can be consumed by CI/CD pipelines
---
## Troubleshooting
### "NFR thresholds not defined"
- Check tech-spec.md for NFR requirements
- Check PRD.md for product-level SLAs
- Check story file for feature-specific requirements
- If thresholds truly unknown, mark as CONCERNS and recommend defining them
### "No evidence found"
- Check evidence directories (test-results, metrics, logs)
- Check CI/CD pipeline for test results
- If evidence truly missing, mark NFR as "NO EVIDENCE" and recommend generating it
### "CONCERNS status but no threshold exceeded"
- CONCERNS is correct when threshold is UNKNOWN or evidence is MISSING/INCOMPLETE
- CONCERNS is also correct when evidence is close to threshold (within 10%)
- Document why CONCERNS was assigned
### "FAIL status blocks release"
- This is intentional - FAIL means critical NFR not met
- Recommend remediation actions with specific steps
- Re-run assessment after remediation
---
## Related Workflows
- **testarch-test-design** - Define NFR requirements and test plan
- **testarch-framework** - Set up performance/security testing frameworks
- **testarch-ci** - Configure CI/CD for NFR validation
- **testarch-gate** - Use NFR assessment as input for quality gate decisions
- **testarch-test-review** - Review test quality (maintainability NFR)
---
<!-- Powered by BMAD-CORE™ -->

View File

@@ -1,445 +0,0 @@
# NFR Assessment - {FEATURE_NAME}
**Date:** {DATE}
**Story:** {STORY_ID} (if applicable)
**Overall Status:** {OVERALL_STATUS} {STATUS_ICON}
---
Note: This assessment summarizes existing evidence; it does not run tests or CI workflows.
## Executive Summary
**Assessment:** {PASS_COUNT} PASS, {CONCERNS_COUNT} CONCERNS, {FAIL_COUNT} FAIL
**Blockers:** {BLOCKER_COUNT} {BLOCKER_DESCRIPTION}
**High Priority Issues:** {HIGH_PRIORITY_COUNT} {HIGH_PRIORITY_DESCRIPTION}
**Recommendation:** {OVERALL_RECOMMENDATION}
---
## Performance Assessment
### Response Time (p95)
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE}
- **Actual:** {ACTUAL_VALUE}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
### Throughput
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE}
- **Actual:** {ACTUAL_VALUE}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
### Resource Usage
- **CPU Usage**
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE}
- **Actual:** {ACTUAL_VALUE}
- **Evidence:** {EVIDENCE_SOURCE}
- **Memory Usage**
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE}
- **Actual:** {ACTUAL_VALUE}
- **Evidence:** {EVIDENCE_SOURCE}
### Scalability
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION}
- **Actual:** {ACTUAL_DESCRIPTION}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
---
## Security Assessment
### Authentication Strength
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION}
- **Actual:** {ACTUAL_DESCRIPTION}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
- **Recommendation:** {RECOMMENDATION} (if CONCERNS or FAIL)
### Authorization Controls
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION}
- **Actual:** {ACTUAL_DESCRIPTION}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
### Data Protection
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION}
- **Actual:** {ACTUAL_DESCRIPTION}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
### Vulnerability Management
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION} (e.g., "0 critical, <3 high vulnerabilities")
- **Actual:** {ACTUAL_DESCRIPTION} (e.g., "0 critical, 1 high, 5 medium vulnerabilities")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Snyk scan results - scan-2025-10-14.json")
- **Findings:** {FINDINGS_DESCRIPTION}
### Compliance (if applicable)
- **Status:** {STATUS} {STATUS_ICON}
- **Standards:** {COMPLIANCE_STANDARDS} (e.g., "GDPR, HIPAA, PCI-DSS")
- **Actual:** {ACTUAL_COMPLIANCE_STATUS}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
---
## Reliability Assessment
### Availability (Uptime)
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE} (e.g., "99.9%")
- **Actual:** {ACTUAL_VALUE} (e.g., "99.95%")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Uptime monitoring - uptime-report-2025-10-14.csv")
- **Findings:** {FINDINGS_DESCRIPTION}
### Error Rate
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE} (e.g., "<0.1%")
- **Actual:** {ACTUAL_VALUE} (e.g., "0.05%")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Error logs - logs/errors-2025-10.log")
- **Findings:** {FINDINGS_DESCRIPTION}
### MTTR (Mean Time To Recovery)
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE} (e.g., "<15 minutes")
- **Actual:** {ACTUAL_VALUE} (e.g., "12 minutes")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Incident reports - incidents/")
- **Findings:** {FINDINGS_DESCRIPTION}
### Fault Tolerance
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION}
- **Actual:** {ACTUAL_DESCRIPTION}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
### CI Burn-In (Stability)
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE} (e.g., "100 consecutive successful runs")
- **Actual:** {ACTUAL_VALUE} (e.g., "150 consecutive successful runs")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "CI burn-in results - ci-burn-in-2025-10-14.log")
- **Findings:** {FINDINGS_DESCRIPTION}
### Disaster Recovery (if applicable)
- **RTO (Recovery Time Objective)**
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE}
- **Actual:** {ACTUAL_VALUE}
- **Evidence:** {EVIDENCE_SOURCE}
- **RPO (Recovery Point Objective)**
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE}
- **Actual:** {ACTUAL_VALUE}
- **Evidence:** {EVIDENCE_SOURCE}
---
## Maintainability Assessment
### Test Coverage
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE} (e.g., ">=80%")
- **Actual:** {ACTUAL_VALUE} (e.g., "87%")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Coverage report - coverage/lcov-report/index.html")
- **Findings:** {FINDINGS_DESCRIPTION}
### Code Quality
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE} (e.g., ">=85/100")
- **Actual:** {ACTUAL_VALUE} (e.g., "92/100")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "SonarQube analysis - sonarqube-report-2025-10-14.pdf")
- **Findings:** {FINDINGS_DESCRIPTION}
### Technical Debt
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE} (e.g., "<5% debt ratio")
- **Actual:** {ACTUAL_VALUE} (e.g., "3.2% debt ratio")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "CodeClimate analysis - codeclimate-2025-10-14.json")
- **Findings:** {FINDINGS_DESCRIPTION}
### Documentation Completeness
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_VALUE} (e.g., ">=90%")
- **Actual:** {ACTUAL_VALUE} (e.g., "95%")
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Documentation audit - docs-audit-2025-10-14.md")
- **Findings:** {FINDINGS_DESCRIPTION}
### Test Quality (from test-review, if available)
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION}
- **Actual:** {ACTUAL_DESCRIPTION}
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Test review report - test-review-2025-10-14.md")
- **Findings:** {FINDINGS_DESCRIPTION}
---
## Custom NFR Assessments (if applicable)
### {CUSTOM_NFR_NAME_1}
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION}
- **Actual:** {ACTUAL_DESCRIPTION}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
### {CUSTOM_NFR_NAME_2}
- **Status:** {STATUS} {STATUS_ICON}
- **Threshold:** {THRESHOLD_DESCRIPTION}
- **Actual:** {ACTUAL_DESCRIPTION}
- **Evidence:** {EVIDENCE_SOURCE}
- **Findings:** {FINDINGS_DESCRIPTION}
---
## Quick Wins
{QUICK_WIN_COUNT} quick wins identified for immediate implementation:
1. **{QUICK_WIN_TITLE_1}** ({NFR_CATEGORY}) - {PRIORITY} - {ESTIMATED_EFFORT}
- {QUICK_WIN_DESCRIPTION}
- No code changes needed / Minimal code changes
2. **{QUICK_WIN_TITLE_2}** ({NFR_CATEGORY}) - {PRIORITY} - {ESTIMATED_EFFORT}
- {QUICK_WIN_DESCRIPTION}
---
## Recommended Actions
### Immediate (Before Release) - CRITICAL/HIGH Priority
1. **{ACTION_TITLE_1}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
- {ACTION_DESCRIPTION}
- {SPECIFIC_STEPS}
- {VALIDATION_CRITERIA}
2. **{ACTION_TITLE_2}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
- {ACTION_DESCRIPTION}
- {SPECIFIC_STEPS}
- {VALIDATION_CRITERIA}
### Short-term (Next Sprint) - MEDIUM Priority
1. **{ACTION_TITLE_3}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
- {ACTION_DESCRIPTION}
2. **{ACTION_TITLE_4}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
- {ACTION_DESCRIPTION}
### Long-term (Backlog) - LOW Priority
1. **{ACTION_TITLE_5}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
- {ACTION_DESCRIPTION}
---
## Monitoring Hooks
{MONITORING_HOOK_COUNT} monitoring hooks recommended to detect issues before failures:
### Performance Monitoring
- [ ] {MONITORING_TOOL_1} - {MONITORING_DESCRIPTION}
- **Owner:** {OWNER}
- **Deadline:** {DEADLINE}
- [ ] {MONITORING_TOOL_2} - {MONITORING_DESCRIPTION}
- **Owner:** {OWNER}
- **Deadline:** {DEADLINE}
### Security Monitoring
- [ ] {MONITORING_TOOL_3} - {MONITORING_DESCRIPTION}
- **Owner:** {OWNER}
- **Deadline:** {DEADLINE}
### Reliability Monitoring
- [ ] {MONITORING_TOOL_4} - {MONITORING_DESCRIPTION}
- **Owner:** {OWNER}
- **Deadline:** {DEADLINE}
### Alerting Thresholds
- [ ] {ALERT_DESCRIPTION} - Notify when {THRESHOLD_CONDITION}
- **Owner:** {OWNER}
- **Deadline:** {DEADLINE}
---
## Fail-Fast Mechanisms
{FAIL_FAST_COUNT} fail-fast mechanisms recommended to prevent failures:
### Circuit Breakers (Reliability)
- [ ] {CIRCUIT_BREAKER_DESCRIPTION}
- **Owner:** {OWNER}
- **Estimated Effort:** {EFFORT}
### Rate Limiting (Performance)
- [ ] {RATE_LIMITING_DESCRIPTION}
- **Owner:** {OWNER}
- **Estimated Effort:** {EFFORT}
### Validation Gates (Security)
- [ ] {VALIDATION_GATE_DESCRIPTION}
- **Owner:** {OWNER}
- **Estimated Effort:** {EFFORT}
### Smoke Tests (Maintainability)
- [ ] {SMOKE_TEST_DESCRIPTION}
- **Owner:** {OWNER}
- **Estimated Effort:** {EFFORT}
---
## Evidence Gaps
{EVIDENCE_GAP_COUNT} evidence gaps identified - action required:
- [ ] **{NFR_NAME_1}** ({NFR_CATEGORY})
- **Owner:** {OWNER}
- **Deadline:** {DEADLINE}
- **Suggested Evidence:** {SUGGESTED_EVIDENCE_SOURCE}
- **Impact:** {IMPACT_DESCRIPTION}
- [ ] **{NFR_NAME_2}** ({NFR_CATEGORY})
- **Owner:** {OWNER}
- **Deadline:** {DEADLINE}
- **Suggested Evidence:** {SUGGESTED_EVIDENCE_SOURCE}
- **Impact:** {IMPACT_DESCRIPTION}
---
## Findings Summary
| Category | PASS | CONCERNS | FAIL | Overall Status |
| --------------- | ---------------- | -------------------- | ---------------- | ----------------------------------- |
| Performance | {P_PASS_COUNT} | {P_CONCERNS_COUNT} | {P_FAIL_COUNT} | {P_STATUS} {P_ICON} |
| Security | {S_PASS_COUNT} | {S_CONCERNS_COUNT} | {S_FAIL_COUNT} | {S_STATUS} {S_ICON} |
| Reliability | {R_PASS_COUNT} | {R_CONCERNS_COUNT} | {R_FAIL_COUNT} | {R_STATUS} {R_ICON} |
| Maintainability | {M_PASS_COUNT} | {M_CONCERNS_COUNT} | {M_FAIL_COUNT} | {M_STATUS} {M_ICON} |
| **Total** | **{TOTAL_PASS}** | **{TOTAL_CONCERNS}** | **{TOTAL_FAIL}** | **{OVERALL_STATUS} {OVERALL_ICON}** |
---
## Gate YAML Snippet
```yaml
nfr_assessment:
date: '{DATE}'
story_id: '{STORY_ID}'
feature_name: '{FEATURE_NAME}'
categories:
performance: '{PERFORMANCE_STATUS}'
security: '{SECURITY_STATUS}'
reliability: '{RELIABILITY_STATUS}'
maintainability: '{MAINTAINABILITY_STATUS}'
overall_status: '{OVERALL_STATUS}'
critical_issues: {CRITICAL_COUNT}
high_priority_issues: {HIGH_COUNT}
medium_priority_issues: {MEDIUM_COUNT}
concerns: {CONCERNS_COUNT}
blockers: {BLOCKER_BOOLEAN} # true/false
quick_wins: {QUICK_WIN_COUNT}
evidence_gaps: {EVIDENCE_GAP_COUNT}
recommendations:
- '{RECOMMENDATION_1}'
- '{RECOMMENDATION_2}'
- '{RECOMMENDATION_3}'
```
---
## Related Artifacts
- **Story File:** {STORY_FILE_PATH} (if applicable)
- **Tech Spec:** {TECH_SPEC_PATH} (if available)
- **PRD:** {PRD_PATH} (if available)
- **Test Design:** {TEST_DESIGN_PATH} (if available)
- **Evidence Sources:**
- Test Results: {TEST_RESULTS_DIR}
- Metrics: {METRICS_DIR}
- Logs: {LOGS_DIR}
- CI Results: {CI_RESULTS_PATH}
---
## Recommendations Summary
**Release Blocker:** {RELEASE_BLOCKER_SUMMARY}
**High Priority:** {HIGH_PRIORITY_SUMMARY}
**Medium Priority:** {MEDIUM_PRIORITY_SUMMARY}
**Next Steps:** {NEXT_STEPS_DESCRIPTION}
---
## Sign-Off
**NFR Assessment:**
- Overall Status: {OVERALL_STATUS} {OVERALL_ICON}
- Critical Issues: {CRITICAL_COUNT}
- High Priority Issues: {HIGH_COUNT}
- Concerns: {CONCERNS_COUNT}
- Evidence Gaps: {EVIDENCE_GAP_COUNT}
**Gate Status:** {GATE_STATUS} {GATE_ICON}
**Next Actions:**
- If PASS ✅: Proceed to `*gate` workflow or release
- If CONCERNS ⚠️: Address HIGH/CRITICAL issues, re-run `*nfr-assess`
- If FAIL ❌: Resolve FAIL status NFRs, re-run `*nfr-assess`
**Generated:** {DATE}
**Workflow:** testarch-nfr v4.0
---
<!-- Powered by BMAD-CORE™ -->

View File

@@ -1,47 +0,0 @@
# Test Architect workflow: nfr-assess
name: testarch-nfr
description: "Assess non-functional requirements (performance, security, reliability, maintainability) before release with evidence-based validation"
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/nfr-assess"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
template: "{installed_path}/nfr-report-template.md"
# Variables and inputs
variables:
# NFR category assessment (defaults to all categories)
custom_nfr_categories: "" # Optional additional categories beyond standard (security, performance, reliability, maintainability)
# Output configuration
default_output_file: "{output_folder}/nfr-assessment.md"
# Required tools
required_tools:
- read_file # Read story, test results, metrics, logs, BMad artifacts
- write_file # Create NFR assessment, gate YAML, evidence checklist
- list_files # Discover test results, metrics, logs
- search_repo # Find NFR-related tests and evidence
- glob # Find result files matching patterns
tags:
- qa
- nfr
- test-architect
- performance
- security
- reliability
execution_hints:
interactive: false # Minimize prompts
autonomous: true # Proceed without user input unless blocked
iterative: true

View File

@@ -1,235 +0,0 @@
# Test Design and Risk Assessment - Validation Checklist
## Prerequisites
- [ ] Story markdown with clear acceptance criteria exists
- [ ] PRD or epic documentation available
- [ ] Architecture documents available (optional)
- [ ] Requirements are testable and unambiguous
## Process Steps
### Step 1: Context Loading
- [ ] PRD.md read and requirements extracted
- [ ] Epics.md or specific epic documentation loaded
- [ ] Story markdown with acceptance criteria analyzed
- [ ] Architecture documents reviewed (if available)
- [ ] Existing test coverage analyzed
- [ ] Knowledge base fragments loaded (risk-governance, probability-impact, test-levels, test-priorities)
### Step 2: Risk Assessment
- [ ] Genuine risks identified (not just features)
- [ ] Risks classified by category (TECH/SEC/PERF/DATA/BUS/OPS)
- [ ] Probability scored (1-3 for each risk)
- [ ] Impact scored (1-3 for each risk)
- [ ] Risk scores calculated (probability × impact)
- [ ] High-priority risks (score ≥6) flagged
- [ ] Mitigation plans defined for high-priority risks
- [ ] Owners assigned for each mitigation
- [ ] Timelines set for mitigations
- [ ] Residual risk documented
### Step 3: Coverage Design
- [ ] Acceptance criteria broken into atomic scenarios
- [ ] Test levels selected (E2E/API/Component/Unit)
- [ ] No duplicate coverage across levels
- [ ] Priority levels assigned (P0/P1/P2/P3)
- [ ] P0 scenarios meet strict criteria (blocks core + high risk + no workaround)
- [ ] Data prerequisites identified
- [ ] Tooling requirements documented
- [ ] Execution order defined (smoke → P0 → P1 → P2/P3)
### Step 4: Deliverables Generation
- [ ] Risk assessment matrix created
- [ ] Coverage matrix created
- [ ] Execution order documented
- [ ] Resource estimates calculated
- [ ] Quality gate criteria defined
- [ ] Output file written to correct location
- [ ] Output file uses template structure
## Output Validation
### Risk Assessment Matrix
- [ ] All risks have unique IDs (R-001, R-002, etc.)
- [ ] Each risk has category assigned
- [ ] Probability values are 1, 2, or 3
- [ ] Impact values are 1, 2, or 3
- [ ] Scores calculated correctly (P × I)
- [ ] High-priority risks (≥6) clearly marked
- [ ] Mitigation strategies specific and actionable
### Coverage Matrix
- [ ] All requirements mapped to test levels
- [ ] Priorities assigned to all scenarios
- [ ] Risk linkage documented
- [ ] Test counts realistic
- [ ] Owners assigned where applicable
- [ ] No duplicate coverage (same behavior at multiple levels)
### Execution Order
- [ ] Smoke tests defined (<5 min target)
- [ ] P0 tests listed (<10 min target)
- [ ] P1 tests listed (<30 min target)
- [ ] P2/P3 tests listed (<60 min target)
- [ ] Order optimizes for fast feedback
### Resource Estimates
- [ ] P0 hours calculated (count × 2 hours)
- [ ] P1 hours calculated (count × 1 hour)
- [ ] P2 hours calculated (count × 0.5 hours)
- [ ] P3 hours calculated (count × 0.25 hours)
- [ ] Total hours summed
- [ ] Days estimate provided (hours / 8)
- [ ] Estimates include setup time
### Quality Gate Criteria
- [ ] P0 pass rate threshold defined (should be 100%)
- [ ] P1 pass rate threshold defined (typically 95%)
- [ ] High-risk mitigation completion required
- [ ] Coverage targets specified (≥80% recommended)
## Quality Checks
### Evidence-Based Assessment
- [ ] Risk assessment based on documented evidence
- [ ] No speculation on business impact
- [ ] Assumptions clearly documented
- [ ] Clarifications requested where needed
- [ ] Historical data referenced where available
### Risk Classification Accuracy
- [ ] TECH risks are architecture/integration issues
- [ ] SEC risks are security vulnerabilities
- [ ] PERF risks are performance/scalability concerns
- [ ] DATA risks are data integrity issues
- [ ] BUS risks are business/revenue impacts
- [ ] OPS risks are deployment/operational issues
### Priority Assignment Accuracy
- [ ] P0: Truly blocks core functionality
- [ ] P0: High-risk (score ≥6)
- [ ] P0: No workaround exists
- [ ] P1: Important but not blocking
- [ ] P2/P3: Nice-to-have or edge cases
### Test Level Selection
- [ ] E2E used only for critical paths
- [ ] API tests cover complex business logic
- [ ] Component tests for UI interactions
- [ ] Unit tests for edge cases and algorithms
- [ ] No redundant coverage
## Integration Points
### Knowledge Base Integration
- [ ] risk-governance.md consulted
- [ ] probability-impact.md applied
- [ ] test-levels-framework.md referenced
- [ ] test-priorities-matrix.md used
- [ ] Additional fragments loaded as needed
### Status File Integration
- [ ] bmm-workflow-status.md exists
- [ ] Test design logged in Quality & Testing Progress
- [ ] Epic number and scope documented
- [ ] Completion timestamp recorded
### Workflow Dependencies
- [ ] Can proceed to `*atdd` workflow with P0 scenarios
- [ ] `*atdd` is a separate workflow and must be run explicitly (not auto-run)
- [ ] Can proceed to `*automate` workflow with full coverage plan
- [ ] Risk assessment informs `*gate` workflow criteria
- [ ] Integrates with `*ci` workflow execution order
## Completion Criteria
**All must be true:**
- [ ] All prerequisites met
- [ ] All process steps completed
- [ ] All output validations passed
- [ ] All quality checks passed
- [ ] All integration points verified
- [ ] Output file complete and well-formatted
- [ ] Team review scheduled (if required)
## Post-Workflow Actions
**User must complete:**
1. [ ] Review risk assessment with team
2. [ ] Prioritize mitigation for high-priority risks (score ≥6)
3. [ ] Allocate resources per estimates
4. [ ] Run `*atdd` workflow to generate P0 tests (separate workflow; not auto-run)
5. [ ] Set up test data factories and fixtures
6. [ ] Schedule team review of test design document
**Recommended next workflows:**
1. [ ] Run `*atdd` workflow for P0 test generation
2. [ ] Run `*framework` workflow if not already done
3. [ ] Run `*ci` workflow to configure pipeline stages
## Rollback Procedure
If workflow fails:
1. [ ] Delete output file
2. [ ] Review error logs
3. [ ] Fix missing context (PRD, architecture docs)
4. [ ] Clarify ambiguous requirements
5. [ ] Retry workflow
## Notes
### Common Issues
**Issue**: Too many P0 tests
- **Solution**: Apply strict P0 criteria - must block core AND high risk AND no workaround
**Issue**: Risk scores all high
- **Solution**: Differentiate between critical (3) and degraded (2) impact ratings
**Issue**: Duplicate coverage across levels
- **Solution**: Use test pyramid - E2E for critical paths only
**Issue**: Resource estimates too high
- **Solution**: Invest in fixtures/factories to reduce per-test setup time
### Best Practices
- Base risk assessment on evidence, not assumptions
- High-priority risks (≥6) require immediate mitigation
- P0 tests should cover <10% of total scenarios
- Avoid testing same behavior at multiple levels
- Include smoke tests (P0 subset) for fast feedback
---
**Checklist Complete**: Sign off when all items validated.
**Completed by:** {name}
**Date:** {date}
**Epic:** {epic title}
**Notes:** {additional notes}

View File

@@ -1,788 +0,0 @@
<!-- Powered by BMAD-CORE™ -->
# Test Design and Risk Assessment
**Workflow ID**: `_bmad/bmm/testarch/test-design`
**Version**: 4.0 (BMad v6)
---
## Overview
Plans comprehensive test coverage strategy with risk assessment, priority classification, and execution ordering. This workflow operates in **two modes**:
- **System-Level Mode (Phase 3)**: Testability review of architecture before solutioning gate check
- **Epic-Level Mode (Phase 4)**: Per-epic test planning with risk assessment (current behavior)
The workflow auto-detects which mode to use based on project phase.
---
## Preflight: Detect Mode and Load Context
**Critical:** Determine mode before proceeding.
### Mode Detection
1. **Check for sprint-status.yaml**
- If `{implementation_artifacts}/sprint-status.yaml` exists → **Epic-Level Mode** (Phase 4)
- If NOT exists → Check workflow status
2. **Check workflow-status.yaml**
- Read `{planning_artifacts}/bmm-workflow-status.yaml`
- If `implementation-readiness: required` or `implementation-readiness: recommended` → **System-Level Mode** (Phase 3)
- Otherwise → **Epic-Level Mode** (Phase 4 without sprint status yet)
3. **Mode-Specific Requirements**
**System-Level Mode (Phase 3 - Testability Review):**
- ✅ Architecture document exists (architecture.md or tech-spec)
- ✅ PRD exists with functional and non-functional requirements
- ✅ Epics documented (epics.md)
- ⚠️ Output: `{output_folder}/test-design-system.md`
**Epic-Level Mode (Phase 4 - Per-Epic Planning):**
- ✅ Story markdown with acceptance criteria available
- ✅ PRD or epic documentation exists for context
- ✅ Architecture documents available (optional but recommended)
- ✅ Requirements are clear and testable
- ⚠️ Output: `{output_folder}/test-design-epic-{epic_num}.md`
**Halt Condition:** If mode cannot be determined or required files missing, HALT and notify user with missing prerequisites.
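A minimal sketch of the detection logic in steps 1-2 above, assuming Node's `fs` and the `js-yaml` package (the file paths and YAML shape are illustrative, not prescribed by this workflow):
```typescript
import * as fs from 'fs';
import * as yaml from 'js-yaml';

type Mode = 'epic-level' | 'system-level';

// Paths stand in for {implementation_artifacts}/sprint-status.yaml and
// {planning_artifacts}/bmm-workflow-status.yaml.
function detectMode(sprintStatusPath: string, workflowStatusPath: string): Mode {
  // 1. sprint-status.yaml present → Phase 4, epic-level planning
  if (fs.existsSync(sprintStatusPath)) return 'epic-level';
  // 2. Otherwise check the implementation-readiness flag in workflow status
  if (fs.existsSync(workflowStatusPath)) {
    const status = yaml.load(fs.readFileSync(workflowStatusPath, 'utf8')) as {
      'implementation-readiness'?: string;
    };
    const readiness = status?.['implementation-readiness'];
    if (readiness === 'required' || readiness === 'recommended') return 'system-level';
  }
  // 3. Default: epic-level without sprint status yet
  return 'epic-level';
}
```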
---
## Step 1: Load Context (Mode-Aware)
**Mode-Specific Loading:**
### System-Level Mode (Phase 3)
1. **Read Architecture Documentation**
- Load architecture.md or tech-spec (REQUIRED)
- Load PRD.md for functional and non-functional requirements
- Load epics.md for feature scope
- Identify technology stack decisions (frameworks, databases, deployment targets)
- Note integration points and external system dependencies
- Extract NFR requirements (performance SLOs, security requirements, etc.)
2. **Check Playwright Utils Flag**
Read `{config_source}` and check `config.tea_use_playwright_utils`.
If true, note that `@seontechnologies/playwright-utils` provides utilities for test implementation. Reference in test design where relevant.
3. **Load Knowledge Base Fragments (System-Level)**
**Critical:** Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` to load:
- `nfr-criteria.md` - NFR validation approach (security, performance, reliability, maintainability)
- `test-levels-framework.md` - Test levels strategy guidance
- `risk-governance.md` - Testability risk identification
- `test-quality.md` - Quality standards and Definition of Done
4. **Analyze Existing Test Setup (if brownfield)**
- Search for existing test directories
- Identify current test framework (if any)
- Note testability concerns in existing codebase
### Epic-Level Mode (Phase 4)
1. **Read Requirements Documentation**
- Load PRD.md for high-level product requirements
- Read epics.md or specific epic for feature scope
- Read story markdown for detailed acceptance criteria
- Identify all testable requirements
2. **Load Architecture Context**
- Read architecture.md for system design
- Read tech-spec for implementation details
- Read test-design-system.md (if exists from Phase 3)
- Identify technical constraints and dependencies
- Note integration points and external systems
3. **Analyze Existing Test Coverage**
- Search for existing test files in `{test_dir}`
- Identify coverage gaps
- Note areas with insufficient testing
- Check for flaky or outdated tests
4. **Load Knowledge Base Fragments (Epic-Level)**
**Critical:** Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` to load:
- `risk-governance.md` - Risk classification framework (6 categories: TECH, SEC, PERF, DATA, BUS, OPS), automated scoring, gate decision engine, owner tracking (625 lines, 4 examples)
- `probability-impact.md` - Risk scoring methodology (probability × impact matrix, automated classification, dynamic re-assessment, gate integration, 604 lines, 4 examples)
- `test-levels-framework.md` - Test level selection guidance (E2E vs API vs Component vs Unit with decision matrix, characteristics, when to use each, 467 lines, 4 examples)
- `test-priorities-matrix.md` - P0-P3 prioritization criteria (automated priority calculation, risk-based mapping, tagging strategy, time budgets, 389 lines, 2 examples)
**Halt Condition (Epic-Level only):** If story data or acceptance criteria are missing, check whether brownfield exploration is possible. If neither requirements nor exploration is available, HALT with message: "Epic-level test design requires clear requirements, acceptance criteria, or brownfield app URL for exploration"
---
## Step 1.5: System-Level Testability Review (Phase 3 Only)
**Skip this step if Epic-Level Mode.** This step only executes in System-Level Mode.
### Actions
1. **Review Architecture for Testability**
Evaluate architecture against these criteria:
**Controllability:**
- Can we control system state for testing? (API seeding, factories, database reset)
- Are external dependencies mockable? (interfaces, dependency injection)
- Can we trigger error conditions? (chaos engineering, fault injection)
**Observability:**
- Can we inspect system state? (logging, metrics, traces)
- Are test results deterministic? (no race conditions, clear success/failure)
- Can we validate NFRs? (performance metrics, security audit logs)
**Reliability:**
- Are tests isolated? (parallel-safe, stateless, cleanup discipline)
- Can we reproduce failures? (deterministic waits, HAR capture, seed data)
- Are components loosely coupled? (mockable, testable boundaries)
2. **Identify Architecturally Significant Requirements (ASRs)**
From PRD NFRs and architecture decisions, identify quality requirements that:
- Drive architecture decisions (e.g., "Must handle 10K concurrent users" → caching architecture)
- Pose testability challenges (e.g., "Sub-second response time" → performance test infrastructure)
- Require special test environments (e.g., "Multi-region deployment" → regional test instances)
Score each ASR using risk matrix (probability × impact).
3. **Define Test Levels Strategy**
Based on architecture (mobile, web, API, microservices, monolith):
- Recommend unit/integration/E2E split (e.g., 70/20/10 for API-heavy, 40/30/30 for UI-heavy)
- Identify test environment needs (local, staging, ephemeral, production-like)
- Define testing approach per technology (Playwright for web, Maestro for mobile, k6 for performance)
4. **Assess NFR Testing Approach**
For each NFR category:
- **Security**: Auth/authz tests, OWASP validation, secret handling (Playwright E2E + security tools)
- **Performance**: Load/stress/spike testing with k6, SLO/SLA thresholds
- **Reliability**: Error handling, retries, circuit breakers, health checks (Playwright + API tests)
- **Maintainability**: Coverage targets, code quality gates, observability validation
5. **Flag Testability Concerns**
Identify architecture decisions that harm testability:
- ❌ Tight coupling (no interfaces, hard dependencies)
- ❌ No dependency injection (can't mock external services)
- ❌ Hardcoded configurations (can't test different envs)
- ❌ Missing observability (can't validate NFRs)
- ❌ Stateful designs (can't parallelize tests)
**Critical:** If testability concerns are blockers (e.g., "Architecture makes performance testing impossible"), document as CONCERNS or FAIL recommendation for gate check.
6. **Output System-Level Test Design**
Write to `{output_folder}/test-design-system.md` containing:
```markdown
# System-Level Test Design
## Testability Assessment
- Controllability: [PASS/CONCERNS/FAIL with details]
- Observability: [PASS/CONCERNS/FAIL with details]
- Reliability: [PASS/CONCERNS/FAIL with details]
## Architecturally Significant Requirements (ASRs)
[Risk-scored quality requirements]
## Test Levels Strategy
- Unit: [X%] - [Rationale]
- Integration: [Y%] - [Rationale]
- E2E: [Z%] - [Rationale]
## NFR Testing Approach
- Security: [Approach with tools]
- Performance: [Approach with tools]
- Reliability: [Approach with tools]
- Maintainability: [Approach with tools]
## Test Environment Requirements
[Infrastructure needs based on deployment architecture]
## Testability Concerns (if any)
[Blockers or concerns that should inform solutioning gate check]
## Recommendations for Sprint 0
[Specific actions for *framework and *ci workflows]
```
**After System-Level Mode:** Skip to Step 4 (Generate Deliverables) - Steps 2-3 are epic-level only.
---
## Step 1.6: Exploratory Mode Selection (Epic-Level Only)
### Actions
1. **Detect Planning Mode**
Determine mode based on context:
**Requirements-Based Mode (DEFAULT)**:
- Have clear story/PRD with acceptance criteria
- Uses: Existing workflow (Steps 2-4)
- Appropriate for: Documented features, greenfield projects
**Exploratory Mode (OPTIONAL - Brownfield)**:
- Missing/incomplete requirements AND brownfield application exists
- Uses: UI exploration to discover functionality
- Appropriate for: Undocumented brownfield apps, legacy systems
2. **Requirements-Based Mode (DEFAULT - Skip to Step 2)**
If requirements are clear:
- Continue with existing workflow (Step 2: Assess and Classify Risks)
- Use loaded requirements from Step 1
- Proceed with risk assessment based on documented requirements
3. **Exploratory Mode (OPTIONAL - Brownfield Apps)**
If exploring brownfield application:
**A. Check MCP Availability**
If config.tea_use_mcp_enhancements is true AND Playwright MCP tools available:
- Use MCP-assisted exploration (Step 3.B)
If MCP unavailable OR config.tea_use_mcp_enhancements is false:
- Use manual exploration fallback (Step 3.C)
**B. MCP-Assisted Exploration (If MCP Tools Available)**
Use Playwright MCP browser tools to explore UI:
**Setup:**
```
1. Use planner_setup_page to initialize browser
2. Navigate to {exploration_url}
3. Capture initial state with browser_snapshot
```
**Exploration Process:**
```
4. Use browser_navigate to explore different pages
5. Use browser_click to interact with buttons, links, forms
6. Use browser_hover to reveal hidden menus/tooltips
7. Capture browser_snapshot at each significant state
8. Take browser_screenshot for documentation
9. Monitor browser_console_messages for JavaScript errors
10. Track browser_network_requests to identify API calls
11. Map user flows and interactive elements
12. Document discovered functionality
```
**Discovery Documentation:**
- Create list of discovered features (pages, workflows, forms)
- Identify user journeys (navigation paths)
- Map API endpoints (from network requests)
- Note error states (from console messages)
- Capture screenshots for visual reference
**Convert to Test Scenarios:**
- Transform discoveries into testable requirements
- Prioritize based on user flow criticality
- Identify risks from discovered functionality
- Continue with Step 2 (Assess and Classify Risks) using discovered requirements
**C. Manual Exploration Fallback (If MCP Unavailable)**
If Playwright MCP is not available:
**Notify User:**
```markdown
Exploratory mode enabled but Playwright MCP unavailable.
**Manual exploration required:**
1. Open application at: {exploration_url}
2. Explore all pages, workflows, and features
3. Document findings in markdown:
- List of pages/features discovered
- User journeys identified
- API endpoints observed (DevTools Network tab)
- JavaScript errors noted (DevTools Console)
- Critical workflows mapped
4. Provide exploration findings to continue workflow
**Alternative:** Disable exploratory_mode and provide requirements documentation
```
Wait for user to provide exploration findings, then:
- Parse user-provided discovery documentation
- Convert to testable requirements
- Continue with Step 2 (risk assessment)
4. **Proceed to Risk Assessment**
After mode selection (Requirements-Based OR Exploratory):
- Continue to Step 2: Assess and Classify Risks
- Use requirements from documentation (Requirements-Based) OR discoveries (Exploratory)
---
## Step 2: Assess and Classify Risks
### Actions
1. **Identify Genuine Risks**
Filter requirements to isolate actual risks (not just features):
- Unresolved technical gaps
- Security vulnerabilities
- Performance bottlenecks
- Data loss or corruption potential
- Business impact failures
- Operational deployment issues
2. **Classify Risks by Category**
Use these standard risk categories:
**TECH** (Technical/Architecture):
- Architecture flaws
- Integration failures
- Scalability issues
- Technical debt
**SEC** (Security):
- Missing access controls
- Authentication bypass
- Data exposure
- Injection vulnerabilities
**PERF** (Performance):
- SLA violations
- Response time degradation
- Resource exhaustion
- Scalability limits
**DATA** (Data Integrity):
- Data loss
- Data corruption
- Inconsistent state
- Migration failures
**BUS** (Business Impact):
- User experience degradation
- Business logic errors
- Revenue impact
- Compliance violations
**OPS** (Operations):
- Deployment failures
- Configuration errors
- Monitoring gaps
- Rollback issues
3. **Score Risk Probability**
Rate likelihood (1-3):
- **1 (Unlikely)**: <10% chance, edge case
- **2 (Possible)**: 10-50% chance, known scenario
- **3 (Likely)**: >50% chance, common occurrence
4. **Score Risk Impact**
Rate severity (1-3):
- **1 (Minor)**: Cosmetic, workaround exists, limited users
- **2 (Degraded)**: Feature impaired, workaround difficult, affects many users
- **3 (Critical)**: System failure, data loss, no workaround, blocks usage
5. **Calculate Risk Score**
```
Risk Score = Probability × Impact
Scores:
1-2: Low risk (monitor)
3-4: Medium risk (plan mitigation)
6-9: High risk (immediate mitigation required)
```
6. **Highlight High-Priority Risks**
Flag all risks with score ≥6 for immediate attention.
7. **Request Clarification**
If evidence is missing or assumptions required:
- Document assumptions clearly
- Request user clarification
- Do NOT speculate on business impact
8. **Plan Mitigations**
For each high-priority risk:
- Define mitigation strategy
- Assign owner (dev, QA, ops)
- Set timeline
- Update residual risk expectation
---
## Step 3: Design Test Coverage
### Actions
1. **Break Down Acceptance Criteria**
Convert each acceptance criterion into atomic test scenarios:
- One scenario per testable behavior
- Scenarios are independent
- Scenarios are repeatable
- Scenarios tie back to risk mitigations
2. **Select Appropriate Test Levels**
**Knowledge Base Reference**: `test-levels-framework.md`
Map requirements to optimal test levels (avoid duplication):
**E2E (End-to-End)**:
- Critical user journeys
- Multi-system integration
- Production-like environment
- Highest confidence, slowest execution
**API (Integration)**:
- Service contracts
- Business logic validation
- Fast feedback
- Good for complex scenarios
**Component**:
- UI component behavior
- Interaction testing
- Visual regression
- Fast, isolated
**Unit**:
- Business logic
- Edge cases
- Error handling
- Fastest, most granular
**Avoid duplicate coverage**: Don't test same behavior at multiple levels unless necessary.
3. **Assign Priority Levels**
**Knowledge Base Reference**: `test-priorities-matrix.md`
**P0 (Critical)**:
- Blocks core user journey
- High-risk areas (score ≥6)
- Revenue-impacting
- Security-critical
- **Run on every commit**
**P1 (High)**:
- Important user features
- Medium-risk areas (score 3-4)
- Common workflows
- **Run on PR to main**
**P2 (Medium)**:
- Secondary features
- Low-risk areas (score 1-2)
- Edge cases
- **Run nightly or weekly**
**P3 (Low)**:
- Nice-to-have
- Exploratory
- Performance benchmarks
- **Run on-demand**
4. **Outline Data and Tooling Prerequisites**
For each test scenario, identify:
- Test data requirements (factories, fixtures)
- External services (mocks, stubs)
- Environment setup
- Tools and dependencies
5. **Define Execution Order**
Recommend test execution sequence:
1. **Smoke tests** (P0 subset, <5 min)
2. **P0 tests** (critical paths, <10 min)
3. **P1 tests** (important features, <30 min)
4. **P2/P3 tests** (full regression, <60 min)
---
## Step 4: Generate Deliverables
### Actions
1. **Create Risk Assessment Matrix**
Use template structure:
```markdown
| Risk ID | Category | Description | Probability | Impact | Score | Mitigation |
| ------- | -------- | ----------- | ----------- | ------ | ----- | --------------- |
| R-001 | SEC | Auth bypass | 2 | 3 | 6 | Add authz check |
```
2. **Create Coverage Matrix**
```markdown
| Requirement | Test Level | Priority | Risk Link | Test Count | Owner |
| ----------- | ---------- | -------- | --------- | ---------- | ----- |
| Login flow | E2E | P0 | R-001 | 3 | QA |
```
3. **Document Execution Order**
```markdown
### Smoke Tests (<5 min)
- Login successful
- Dashboard loads
### P0 Tests (<10 min)
- [Full P0 list]
### P1 Tests (<30 min)
- [Full P1 list]
```
4. **Include Resource Estimates**
```markdown
### Test Effort Estimates
- P0 scenarios: 15 tests × 2 hours = 30 hours
- P1 scenarios: 25 tests × 1 hour = 25 hours
- P2 scenarios: 40 tests × 0.5 hour = 20 hours
- **Total:** 75 hours (~10 days)
```
5. **Add Gate Criteria**
```markdown
### Quality Gate Criteria
- All P0 tests pass (100%)
- P1 tests pass rate ≥95%
- No high-risk (score ≥6) items unmitigated
- Test coverage ≥80% for critical paths
```
6. **Write to Output File**
Save to `{output_folder}/test-design-epic-{epic_num}.md` using template structure.
---
## Important Notes
### Risk Category Definitions
**TECH** (Technical/Architecture):
- Architecture flaws or technical debt
- Integration complexity
- Scalability concerns
**SEC** (Security):
- Missing security controls
- Authentication/authorization gaps
- Data exposure risks
**PERF** (Performance):
- SLA risk or performance degradation
- Resource constraints
- Scalability bottlenecks
**DATA** (Data Integrity):
- Data loss or corruption potential
- State consistency issues
- Migration risks
**BUS** (Business Impact):
- User experience harm
- Business logic errors
- Revenue or compliance impact
**OPS** (Operations):
- Deployment or runtime failures
- Configuration issues
- Monitoring/observability gaps
### Risk Scoring Methodology
**Probability × Impact = Risk Score**
Examples:
- High likelihood (3) × Critical impact (3) = **Score 9** (highest priority)
- Possible (2) × Critical (3) = **Score 6** (high priority threshold)
- Unlikely (1) × Minor (1) = **Score 1** (low priority)
**Threshold**: Scores ≥6 require immediate mitigation.
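The arithmetic and thresholds above, expressed as a small sketch (the types and names are illustrative):
```typescript
type Probability = 1 | 2 | 3; // unlikely, possible, likely
type Impact = 1 | 2 | 3; // minor, degraded, critical

function riskScore(probability: Probability, impact: Impact): number {
  return probability * impact; // possible values: 1, 2, 3, 4, 6, 9
}

function classify(score: number): 'low' | 'medium' | 'high' {
  if (score >= 6) return 'high'; // immediate mitigation required
  if (score >= 3) return 'medium'; // plan mitigation
  return 'low'; // monitor
}

classify(riskScore(2, 3)); // possible × critical = 6 → 'high'
```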
### Test Level Selection Strategy
**Avoid duplication:**
- Don't test same behavior at E2E and API level
- Use E2E for critical paths only
- Use API tests for complex business logic
- Use unit tests for edge cases
**Tradeoffs:**
- E2E: High confidence, slow execution, brittle
- API: Good balance, fast, stable
- Unit: Fastest feedback, narrow scope
### Priority Assignment Guidelines
**P0 criteria** (all must be true):
- Blocks core functionality
- High-risk (score ≥6)
- No workaround exists
- Affects majority of users
**P1 criteria**:
- Important feature
- Medium risk (score 3-5)
- Workaround exists but difficult
**P2/P3**: Everything else, prioritized by value
### Knowledge Base Integration
**Core Fragments (Auto-loaded in Step 1):**
- `risk-governance.md` - Risk classification (6 categories), automated scoring, gate decision engine, coverage traceability, owner tracking (625 lines, 4 examples)
- `probability-impact.md` - Probability × impact matrix, automated classification thresholds, dynamic re-assessment, gate integration (604 lines, 4 examples)
- `test-levels-framework.md` - E2E vs API vs Component vs Unit decision framework with characteristics matrix (467 lines, 4 examples)
- `test-priorities-matrix.md` - P0-P3 automated priority calculation, risk-based mapping, tagging strategy, time budgets (389 lines, 2 examples)
**Reference for Test Planning:**
- `selective-testing.md` - Execution strategy: tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
- `fixture-architecture.md` - Data setup patterns: pure function → fixture → mergeTests, auto-cleanup (406 lines, 5 examples)
**Manual Reference (Optional):**
- Use `tea-index.csv` to find additional specialized fragments as needed
### Evidence-Based Assessment
**Critical principle:** Base risk assessment on evidence, not speculation.
**Evidence sources:**
- PRD and user research
- Architecture documentation
- Historical bug data
- User feedback
- Security audit results
**Avoid:**
- Guessing business impact
- Assuming user behavior
- Inventing requirements
**When uncertain:** Document assumptions and request clarification from user.
---
## Output Summary
After completing this workflow, provide a summary:
```markdown
## Test Design Complete
**Epic**: {epic_num}
**Scope**: {design_level}
**Risk Assessment**:
- Total risks identified: {count}
- High-priority risks (≥6): {high_count}
- Categories: {categories}
**Coverage Plan**:
- P0 scenarios: {p0_count} ({p0_hours} hours)
- P1 scenarios: {p1_count} ({p1_hours} hours)
- P2/P3 scenarios: {p2p3_count} ({p2p3_hours} hours)
- **Total effort**: {total_hours} hours (~{total_days} days)
**Test Levels**:
- E2E: {e2e_count}
- API: {api_count}
- Component: {component_count}
- Unit: {unit_count}
**Quality Gate Criteria**:
- P0 pass rate: 100%
- P1 pass rate: ≥95%
- High-risk mitigations: 100%
- Coverage: ≥80%
**Output File**: {output_file}
**Next Steps**:
1. Review risk assessment with team
2. Prioritize mitigation for high-risk items (score ≥6)
3. Run `*atdd` to generate failing tests for P0 scenarios (separate workflow; not auto-run by `*test-design`)
4. Allocate resources per effort estimates
5. Set up test data factories and fixtures
```
---
## Validation
After completing all steps, verify:
- [ ] Risk assessment complete with all categories
- [ ] All risks scored (probability × impact)
- [ ] High-priority risks (≥6) flagged
- [ ] Coverage matrix maps requirements to test levels
- [ ] Priority levels assigned (P0-P3)
- [ ] Execution order defined
- [ ] Resource estimates provided
- [ ] Quality gate criteria defined
- [ ] Output file created and formatted correctly
Refer to `checklist.md` for comprehensive validation criteria.

View File

@@ -1,294 +0,0 @@
# Test Design: Epic {epic_num} - {epic_title}
**Date:** {date}
**Author:** {user_name}
**Status:** Draft / Approved
---
## Executive Summary
**Scope:** {design_level} test design for Epic {epic_num}
**Risk Summary:**
- Total risks identified: {total_risks}
- High-priority risks (≥6): {high_priority_count}
- Critical categories: {top_categories}
**Coverage Summary:**
- P0 scenarios: {p0_count} ({p0_hours} hours)
- P1 scenarios: {p1_count} ({p1_hours} hours)
- P2/P3 scenarios: {p2p3_count} ({p2p3_hours} hours)
- **Total effort**: {total_hours} hours (~{total_days} days)
---
## Risk Assessment
### High-Priority Risks (Score ≥6)
| Risk ID | Category | Description | Probability | Impact | Score | Mitigation | Owner | Timeline |
| ------- | -------- | ------------- | ----------- | ------ | ----- | ------------ | ------- | -------- |
| R-001 | SEC | {description} | 2 | 3 | 6 | {mitigation} | {owner} | {date} |
| R-002 | PERF | {description} | 3 | 2 | 6 | {mitigation} | {owner} | {date} |
### Medium-Priority Risks (Score 3-4)
| Risk ID | Category | Description | Probability | Impact | Score | Mitigation | Owner |
| ------- | -------- | ------------- | ----------- | ------ | ----- | ------------ | ------- |
| R-003 | TECH | {description} | 2 | 2 | 4 | {mitigation} | {owner} |
| R-004 | DATA | {description} | 1 | 3 | 3 | {mitigation} | {owner} |
### Low-Priority Risks (Score 1-2)
| Risk ID | Category | Description | Probability | Impact | Score | Action |
| ------- | -------- | ------------- | ----------- | ------ | ----- | ------- |
| R-005 | OPS | {description} | 1 | 2 | 2 | Monitor |
| R-006 | BUS | {description} | 1 | 1 | 1 | Monitor |
### Risk Category Legend
- **TECH**: Technical/Architecture (flaws, integration, scalability)
- **SEC**: Security (access controls, auth, data exposure)
- **PERF**: Performance (SLA violations, degradation, resource limits)
- **DATA**: Data Integrity (loss, corruption, inconsistency)
- **BUS**: Business Impact (UX harm, logic errors, revenue)
- **OPS**: Operations (deployment, config, monitoring)
---
## Test Coverage Plan
### P0 (Critical) - Run on every commit
**Criteria**: Blocks core journey + High risk (≥6) + No workaround
| Requirement | Test Level | Risk Link | Test Count | Owner | Notes |
| ------------- | ---------- | --------- | ---------- | ----- | ------- |
| {requirement} | E2E | R-001 | 3 | QA | {notes} |
| {requirement} | API | R-002 | 5 | QA | {notes} |
**Total P0**: {p0_count} tests, {p0_hours} hours
### P1 (High) - Run on PR to main
**Criteria**: Important features + Medium risk (3-4) + Common workflows
| Requirement | Test Level | Risk Link | Test Count | Owner | Notes |
| ------------- | ---------- | --------- | ---------- | ----- | ------- |
| {requirement} | API | R-003 | 4 | QA | {notes} |
| {requirement} | Component | - | 6 | DEV | {notes} |
**Total P1**: {p1_count} tests, {p1_hours} hours
### P2 (Medium) - Run nightly/weekly
**Criteria**: Secondary features + Low risk (1-2) + Edge cases
| Requirement | Test Level | Risk Link | Test Count | Owner | Notes |
| ------------- | ---------- | --------- | ---------- | ----- | ------- |
| {requirement} | API | R-004 | 8 | QA | {notes} |
| {requirement} | Unit | - | 15 | DEV | {notes} |
**Total P2**: {p2_count} tests, {p2_hours} hours
### P3 (Low) - Run on-demand
**Criteria**: Nice-to-have + Exploratory + Performance benchmarks
| Requirement | Test Level | Test Count | Owner | Notes |
| ------------- | ---------- | ---------- | ----- | ------- |
| {requirement} | E2E | 2 | QA | {notes} |
| {requirement} | Unit | 8 | DEV | {notes} |
**Total P3**: {p3_count} tests, {p3_hours} hours
---
## Execution Order
### Smoke Tests (<5 min)
**Purpose**: Fast feedback, catch build-breaking issues
- [ ] {scenario} (30s)
- [ ] {scenario} (45s)
- [ ] {scenario} (1min)
**Total**: {smoke_count} scenarios
### P0 Tests (<10 min)
**Purpose**: Critical path validation
- [ ] {scenario} (E2E)
- [ ] {scenario} (API)
- [ ] {scenario} (API)
**Total**: {p0_count} scenarios
### P1 Tests (<30 min)
**Purpose**: Important feature coverage
- [ ] {scenario} (API)
- [ ] {scenario} (Component)
**Total**: {p1_count} scenarios
### P2/P3 Tests (<60 min)
**Purpose**: Full regression coverage
- [ ] {scenario} (Unit)
- [ ] {scenario} (API)
**Total**: {p2p3_count} scenarios
---
## Resource Estimates
### Test Development Effort
| Priority | Count | Hours/Test | Total Hours | Notes |
| --------- | ----------------- | ---------- | ----------------- | ----------------------- |
| P0 | {p0_count} | 2.0 | {p0_hours} | Complex setup, security |
| P1 | {p1_count} | 1.0 | {p1_hours} | Standard coverage |
| P2 | {p2_count} | 0.5 | {p2_hours} | Simple scenarios |
| P3 | {p3_count} | 0.25 | {p3_hours} | Exploratory |
| **Total** | **{total_count}** | **-** | **{total_hours}** | **~{total_days} days** |
### Prerequisites
**Test Data:**
- {factory_name} factory (faker-based, auto-cleanup)
- {fixture_name} fixture (setup/teardown)
**Tooling:**
- {tool} for {purpose}
- {tool} for {purpose}
**Environment:**
- {env_requirement}
- {env_requirement}
---
## Quality Gate Criteria
### Pass/Fail Thresholds
- **P0 pass rate**: 100% (no exceptions)
- **P1 pass rate**: ≥95% (waivers required for failures)
- **P2/P3 pass rate**: ≥90% (informational)
- **High-risk mitigations**: 100% complete or approved waivers
### Coverage Targets
- **Critical paths**: ≥80%
- **Security scenarios**: 100%
- **Business logic**: ≥70%
- **Edge cases**: ≥50%
### Non-Negotiable Requirements
- [ ] All P0 tests pass
- [ ] No high-risk (≥6) items unmitigated
- [ ] Security tests (SEC category) pass 100%
- [ ] Performance targets met (PERF category)
---
## Mitigation Plans
### R-001: {Risk Description} (Score: 6)
**Mitigation Strategy:** {detailed_mitigation}
**Owner:** {owner}
**Timeline:** {date}
**Status:** Planned / In Progress / Complete
**Verification:** {how_to_verify}
### R-002: {Risk Description} (Score: 6)
**Mitigation Strategy:** {detailed_mitigation}
**Owner:** {owner}
**Timeline:** {date}
**Status:** Planned / In Progress / Complete
**Verification:** {how_to_verify}
---
## Assumptions and Dependencies
### Assumptions
1. {assumption}
2. {assumption}
3. {assumption}
### Dependencies
1. {dependency} - Required by {date}
2. {dependency} - Required by {date}
### Risks to Plan
- **Risk**: {risk_to_plan}
- **Impact**: {impact}
- **Contingency**: {contingency}
---
## Follow-on Workflows (Manual)
- Run `*atdd` to generate failing P0 tests (separate workflow; not auto-run).
- Run `*automate` for broader coverage once implementation exists.
---
## Approval
**Test Design Approved By:**
- [ ] Product Manager: {name} Date: {date}
- [ ] Tech Lead: {name} Date: {date}
- [ ] QA Lead: {name} Date: {date}
**Comments:**
---
## Appendix
### Knowledge Base References
- `risk-governance.md` - Risk classification framework
- `probability-impact.md` - Risk scoring methodology
- `test-levels-framework.md` - Test level selection
- `test-priorities-matrix.md` - P0-P3 prioritization
### Related Documents
- PRD: {prd_link}
- Epic: {epic_link}
- Architecture: {arch_link}
- Tech Spec: {tech_spec_link}
---
**Generated by**: BMad TEA Agent - Test Architect Module
**Workflow**: `_bmad/bmm/testarch/test-design`
**Version**: 4.0 (BMad v6)

View File

@@ -1,54 +0,0 @@
# Test Architect workflow: test-design
name: testarch-test-design
description: "Dual-mode workflow: (1) System-level testability review in Solutioning phase, or (2) Epic-level test planning in Implementation phase. Auto-detects mode based on project phase."
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/test-design"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
template: "{installed_path}/test-design-template.md"
# Variables and inputs
variables:
design_level: "full" # full, targeted, minimal - scope of design effort
mode: "auto-detect" # auto-detect (default), system-level, epic-level
# Output configuration
# Note: Actual output file determined dynamically based on mode detection
# Declared outputs for new workflow format
outputs:
- id: system-level
description: "System-level testability review (Phase 3)"
path: "{output_folder}/test-design-system.md"
- id: epic-level
description: "Epic-level test plan (Phase 4)"
path: "{output_folder}/test-design-epic-{epic_num}.md"
default_output_file: "{output_folder}/test-design-epic-{epic_num}.md"
# Required tools
required_tools:
- read_file # Read PRD, epics, stories, architecture docs
- write_file # Create test design document
- list_files # Find related documentation
- search_repo # Search for existing tests and patterns
tags:
- qa
- planning
- test-architect
- risk-assessment
- coverage
execution_hints:
interactive: false # Minimize prompts
autonomous: true # Proceed without user input unless blocked
iterative: true

View File

@@ -1,472 +0,0 @@
# Test Quality Review - Validation Checklist
Use this checklist to validate that the test quality review workflow completed successfully and all quality criteria were properly evaluated.
---
## Prerequisites
Note: `test-review` is optional and only audits existing tests; it does not generate tests.
### Test File Discovery
- [ ] Test file(s) identified for review (single/directory/suite scope)
- [ ] Test files exist and are readable
- [ ] Test framework detected (Playwright, Jest, Cypress, Vitest, etc.)
- [ ] Test framework configuration found (playwright.config.ts, jest.config.js, etc.)
### Knowledge Base Loading
- [ ] tea-index.csv loaded successfully
- [ ] `test-quality.md` loaded (Definition of Done)
- [ ] `fixture-architecture.md` loaded (Pure function → Fixture patterns)
- [ ] `network-first.md` loaded (Route intercept before navigate)
- [ ] `data-factories.md` loaded (Factory patterns)
- [ ] `test-levels-framework.md` loaded (E2E vs API vs Component vs Unit)
- [ ] All other enabled fragments loaded successfully
### Context Gathering
- [ ] Story file discovered or explicitly provided (if available)
- [ ] Test design document discovered or explicitly provided (if available)
- [ ] Acceptance criteria extracted from story (if available)
- [ ] Priority context (P0/P1/P2/P3) extracted from test-design (if available)
---
## Process Steps
### Step 1: Context Loading
- [ ] Review scope determined (single/directory/suite)
- [ ] Test file paths collected
- [ ] Related artifacts discovered (story, test-design)
- [ ] Knowledge base fragments loaded successfully
- [ ] Quality criteria flags read from workflow variables
### Step 2: Test File Parsing
**For Each Test File:**
- [ ] File read successfully
- [ ] File size measured (lines, KB)
- [ ] File structure parsed (describe blocks, it blocks)
- [ ] Test IDs extracted (if present)
- [ ] Priority markers extracted (if present)
- [ ] Imports analyzed
- [ ] Dependencies identified
**Test Structure Analysis:**
- [ ] Describe block count calculated
- [ ] It/test block count calculated
- [ ] BDD structure identified (Given-When-Then)
- [ ] Fixture usage detected
- [ ] Data factory usage detected
- [ ] Network interception patterns identified
- [ ] Assertions counted
- [ ] Waits and timeouts cataloged
- [ ] Conditionals (if/else) detected
- [ ] Try/catch blocks detected
- [ ] Shared state or globals detected
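One simplified way these structural counts could be gathered, shown as a regex-based sketch (heuristics only, not the workflow's actual parser):
```typescript
import { readFileSync } from 'fs';

// Rough structural metrics for a single spec file.
function analyzeSpec(path: string) {
  const source = readFileSync(path, 'utf8');
  return {
    lines: source.split('\n').length,
    describeBlocks: (source.match(/\bdescribe(\.\w+)?\(/g) ?? []).length,
    testBlocks: (source.match(/\b(it|test)(\.\w+)?\(/g) ?? []).length,
    hardWaits: (source.match(/waitForTimeout\(|\bsleep\(/g) ?? []).length,
    conditionals: (source.match(/\bif\s*\(|\bswitch\s*\(/g) ?? []).length,
    tryCatchBlocks: (source.match(/\btry\s*\{/g) ?? []).length,
  };
}
```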
### Step 3: Quality Criteria Validation
**For Each Enabled Criterion:**
#### BDD Format (if `check_given_when_then: true`)
- [ ] Given-When-Then structure evaluated
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with line numbers
- [ ] Examples of good/bad patterns noted
#### Test IDs (if `check_test_ids: true`)
- [ ] Test ID presence validated
- [ ] Test ID format checked (e.g., 1.3-E2E-001)
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Missing IDs cataloged
#### Priority Markers (if `check_priority_markers: true`)
- [ ] P0/P1/P2/P3 classification validated
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Missing priorities cataloged
#### Hard Waits (if `check_hard_waits: true`)
- [ ] sleep(), waitForTimeout(), hardcoded delays detected
- [ ] Justification comments checked
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with line numbers and recommended fixes
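For reference, the fix usually recommended for a flagged hard wait is a web-first assertion (Playwright sketch; the selector and URL are illustrative):
```typescript
import { test, expect } from '@playwright/test';

test('submits without hard waits', async ({ page }) => {
  await page.goto('https://example.test/form');
  // Anti-pattern this check flags: await page.waitForTimeout(3000);
  // Preferred: assert on observable state, then act.
  await expect(page.getByTestId('submit-button')).toBeEnabled();
  await page.getByTestId('submit-button').click();
});
```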
#### Determinism (if `check_determinism: true`)
- [ ] Conditionals (if/else/switch) detected
- [ ] Try/catch abuse detected
- [ ] Random values (Math.random, Date.now) detected
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with recommended fixes
#### Isolation (if `check_isolation: true`)
- [ ] Cleanup hooks (afterEach/afterAll) validated
- [ ] Shared state detected
- [ ] Global variable mutations detected
- [ ] Resource cleanup verified
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with recommended fixes
#### Fixture Patterns (if `check_fixture_patterns: true`)
- [ ] Fixtures detected (test.extend)
- [ ] Pure functions validated
- [ ] mergeTests usage checked
- [ ] beforeEach complexity analyzed
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with recommended fixes
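The fixture shape these checks look for: `test.extend` wiring a pure setup function with cleanup after `use` (a sketch; the endpoint and entity are illustrative):
```typescript
import { test as base } from '@playwright/test';

// Pure setup function kept separate from fixture wiring.
async function seedTodo(apiURL: string): Promise<{ id: string }> {
  const res = await fetch(`${apiURL}/todos`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ title: 'example' }),
  });
  return res.json();
}

export const test = base.extend<{ todo: { id: string } }>({
  todo: async ({ baseURL }, use) => {
    const todo = await seedTodo(baseURL!); // setup via API
    await use(todo); // hand the entity to the test
    await fetch(`${baseURL}/todos/${todo.id}`, { method: 'DELETE' }); // auto-cleanup
  },
});
```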
#### Data Factories (if `check_data_factories: true`)
- [ ] Factory functions detected
- [ ] Hardcoded data (magic strings/numbers) detected
- [ ] Faker.js or similar usage validated
- [ ] API-first setup pattern checked
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with recommended fixes
#### Network-First (if `check_network_first: true`)
- [ ] page.route() before page.goto() validated
- [ ] Race conditions detected (route after navigate)
- [ ] waitForResponse patterns checked
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with recommended fixes
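A sketch of the route-before-navigate ordering this criterion validates (Playwright; the endpoint and payload are illustrative):
```typescript
import { test, expect } from '@playwright/test';

test('renders profile from stubbed API', async ({ page }) => {
  // Register the intercept BEFORE navigation so the first request cannot race past it.
  await page.route('**/api/profile', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ name: 'Ada' }),
    }),
  );
  await page.goto('/profile');
  await expect(page.getByTestId('profile-name')).toHaveText('Ada');
});
```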
#### Assertions (if `check_assertions: true`)
- [ ] Explicit assertions counted
- [ ] Implicit waits without assertions detected
- [ ] Assertion specificity validated
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with recommended fixes
#### Test Length (if `check_test_length: true`)
- [ ] File line count calculated
- [ ] Threshold comparison (≤300 lines ideal)
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Splitting recommendations generated (if >300 lines)
#### Test Duration (if `check_test_duration: true`)
- [ ] Test complexity analyzed (as proxy for duration if no execution data)
- [ ] Threshold comparison (≤1.5 min target)
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Optimization recommendations generated
#### Flakiness Patterns (if `check_flakiness_patterns: true`)
- [ ] Tight timeouts detected (e.g., { timeout: 1000 })
- [ ] Race conditions detected
- [ ] Timing-dependent assertions detected
- [ ] Retry logic detected
- [ ] Environment-dependent assumptions detected
- [ ] Status assigned (PASS/WARN/FAIL)
- [ ] Violations recorded with recommended fixes
---
### Step 4: Quality Score Calculation
**Violation Counting:**
- [ ] Critical (P0) violations counted
- [ ] High (P1) violations counted
- [ ] Medium (P2) violations counted
- [ ] Low (P3) violations counted
- [ ] Violation breakdown by criterion recorded
**Score Calculation:**
- [ ] Starting score: 100
- [ ] Critical violations deducted (-10 each)
- [ ] High violations deducted (-5 each)
- [ ] Medium violations deducted (-2 each)
- [ ] Low violations deducted (-1 each)
- [ ] Bonus points added (max +30):
- [ ] Excellent BDD structure (+5 if applicable)
- [ ] Comprehensive fixtures (+5 if applicable)
- [ ] Comprehensive data factories (+5 if applicable)
- [ ] Network-first pattern (+5 if applicable)
- [ ] Perfect isolation (+5 if applicable)
- [ ] All test IDs present (+5 if applicable)
- [ ] Final score calculated: max(0, min(100, Starting - Violations + Bonus))
**Quality Grade:**
- [ ] Grade assigned based on score:
- 90-100: A+ (Excellent)
- 80-89: A (Good)
- 70-79: B (Acceptable)
- 60-69: C (Needs Improvement)
- <60: F (Critical Issues)
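The scoring and grading rules above as a small helper (a sketch mirroring this checklist's arithmetic; names are illustrative):
```typescript
interface ViolationCounts {
  critical: number;
  high: number;
  medium: number;
  low: number;
}

function qualityScore(v: ViolationCounts, bonus: number): { score: number; grade: string } {
  const deductions = v.critical * 10 + v.high * 5 + v.medium * 2 + v.low * 1;
  const score = Math.max(0, Math.min(100, 100 - deductions + Math.min(bonus, 30)));
  const grade = score >= 90 ? 'A+' : score >= 80 ? 'A' : score >= 70 ? 'B' : score >= 60 ? 'C' : 'F';
  return { score, grade };
}

// Example: 1 critical + 2 high violations with +10 bonus → 100 - 20 + 10 = 90 → 'A+'
qualityScore({ critical: 1, high: 2, medium: 0, low: 0 }, 10);
```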
---
### Step 5: Review Report Generation
**Report Sections Created:**
- [ ] **Header Section**:
- [ ] Test file(s) reviewed listed
- [ ] Review date recorded
- [ ] Review scope noted (single/directory/suite)
- [ ] Quality score and grade displayed
- [ ] **Executive Summary**:
- [ ] Overall assessment (Excellent/Good/Needs Improvement/Critical)
- [ ] Key strengths listed (3-5 bullet points)
- [ ] Key weaknesses listed (3-5 bullet points)
- [ ] Recommendation stated (Approve/Approve with comments/Request changes/Block)
- [ ] **Quality Criteria Assessment**:
- [ ] Table with all criteria evaluated
- [ ] Status for each criterion (PASS/WARN/FAIL)
- [ ] Violation count per criterion
- [ ] **Critical Issues (Must Fix)**:
- [ ] P0/P1 violations listed
- [ ] Code location provided for each (file:line)
- [ ] Issue explanation clear
- [ ] Recommended fix provided with code example
- [ ] Knowledge base reference provided
- [ ] **Recommendations (Should Fix)**:
- [ ] P2/P3 violations listed
- [ ] Code location provided for each (file:line)
- [ ] Issue explanation clear
- [ ] Recommended improvement provided with code example
- [ ] Knowledge base reference provided
- [ ] **Best Practices Examples** (if good patterns found):
- [ ] Good patterns highlighted from tests
- [ ] Knowledge base fragments referenced
- [ ] Examples provided for others to follow
- [ ] **Knowledge Base References**:
- [ ] All fragments consulted listed
- [ ] Links to detailed guidance provided
---
### Step 6: Optional Outputs Generation
**Inline Comments** (if `generate_inline_comments: true`):
- [ ] Inline comments generated at violation locations
- [ ] Comment format: `// TODO (TEA Review): [Issue] - See test-review-{filename}.md`
- [ ] Comments added to test files (no logic changes)
- [ ] Test files remain valid and executable
**Quality Badge** (if `generate_quality_badge: true`):
- [ ] Badge created with quality score (e.g., "Test Quality: 87/100 (A)")
- [ ] Badge format suitable for README or documentation
- [ ] Badge saved to output folder
**Story Update** (if `append_to_story: true` and story file exists):
- [ ] "Test Quality Review" section created
- [ ] Quality score included
- [ ] Critical issues summarized
- [ ] Link to full review report provided
- [ ] Story file updated successfully
---
### Step 7: Save and Notify
**Outputs Saved:**
- [ ] Review report saved to `{output_file}`
- [ ] Inline comments written to test files (if enabled)
- [ ] Quality badge saved (if enabled)
- [ ] Story file updated (if enabled)
- [ ] All outputs are valid and readable
**Summary Message Generated:**
- [ ] Quality score and grade included
- [ ] Critical issue count stated
- [ ] Recommendation provided (Approve/Request changes/Block)
- [ ] Next steps clarified
- [ ] Message displayed to user
---
## Output Validation
### Review Report Completeness
- [ ] All required sections present
- [ ] No placeholder text or TODOs in report
- [ ] All code locations are accurate (file:line)
- [ ] All code examples are valid and demonstrate fix
- [ ] All knowledge base references are correct
### Review Report Accuracy
- [ ] Quality score matches violation breakdown
- [ ] Grade matches score range
- [ ] Violations correctly categorized by severity (P0/P1/P2/P3)
- [ ] Violations correctly attributed to quality criteria
- [ ] No false positives (violations are legitimate issues)
- [ ] No false negatives (critical issues not missed)
### Review Report Clarity
- [ ] Executive summary is clear and actionable
- [ ] Issue explanations are understandable
- [ ] Recommended fixes are implementable
- [ ] Code examples are correct and runnable
- [ ] Recommendation (Approve/Request changes) is clear
---
## Quality Checks
### Knowledge-Based Validation
- [ ] All feedback grounded in knowledge base fragments
- [ ] Recommendations follow proven patterns
- [ ] No arbitrary or opinion-based feedback
- [ ] Knowledge fragment references accurate and relevant
### Actionable Feedback
- [ ] Every issue includes recommended fix
- [ ] Every fix includes code example
- [ ] Code examples demonstrate correct pattern
- [ ] Fixes reference knowledge base for more detail
### Severity Classification
- [ ] Critical (P0) issues are genuinely critical (hard waits, race conditions, no assertions)
- [ ] High (P1) issues impact maintainability/reliability (missing IDs, hardcoded data)
- [ ] Medium (P2) issues are nice-to-have improvements (long files, missing priorities)
- [ ] Low (P3) issues are minor style/preference (verbose tests)
### Context Awareness
- [ ] Review considers project context (some patterns may be justified)
- [ ] Violations with justification comments noted as acceptable
- [ ] Edge cases acknowledged
- [ ] Recommendations are pragmatic, not dogmatic
---
## Integration Points
### Story File Integration
- [ ] Story file discovered correctly (if available)
- [ ] Acceptance criteria extracted and used for context
- [ ] Test quality section appended to story (if enabled)
- [ ] Link to review report added to story
### Test Design Integration
- [ ] Test design document discovered correctly (if available)
- [ ] Priority context (P0/P1/P2/P3) extracted and used
- [ ] Review validates tests align with prioritization
- [ ] Misalignment flagged (e.g., P0 scenario missing tests)
### Knowledge Base Integration
- [ ] tea-index.csv loaded successfully
- [ ] All required fragments loaded
- [ ] Fragments applied correctly to validation
- [ ] Fragment references in report are accurate
---
## Edge Cases and Special Situations
### Empty or Minimal Tests
- [ ] If test file is empty, report notes "No tests found"
- [ ] If test file has only boilerplate, report notes "No meaningful tests"
- [ ] Score reflects lack of content appropriately
### Legacy Tests
- [ ] Legacy tests acknowledged in context
- [ ] Review provides practical recommendations for improvement
- [ ] Recognizes that complete refactor may not be feasible
- [ ] Prioritizes critical issues (flakiness) over style
### Test Framework Variations
- [ ] Review adapts to test framework (Playwright vs Jest vs Cypress)
- [ ] Framework-specific patterns recognized (e.g., Playwright fixtures)
- [ ] Framework-specific violations detected (e.g., Cypress anti-patterns)
- [ ] Knowledge fragments applied appropriately for framework
### Justified Violations
- [ ] Violations with justification comments in code noted as acceptable
- [ ] Justifications evaluated for legitimacy
- [ ] Report acknowledges justified patterns
- [ ] Score not penalized for justified violations
---
## Final Validation
### Review Completeness
- [ ] All enabled quality criteria evaluated
- [ ] All test files in scope reviewed
- [ ] All violations cataloged
- [ ] All recommendations provided
- [ ] Review report is comprehensive
### Review Accuracy
- [ ] Quality score is accurate
- [ ] Violations are correct (no false positives)
- [ ] Critical issues not missed (no false negatives)
- [ ] Code locations are correct
- [ ] Knowledge base references are accurate
### Review Usefulness
- [ ] Feedback is actionable
- [ ] Recommendations are implementable
- [ ] Code examples are correct
- [ ] Review helps developer improve tests
- [ ] Review educates on best practices
### Workflow Complete
- [ ] All checklist items completed
- [ ] All outputs validated and saved
- [ ] User notified with summary
- [ ] Review ready for developer consumption
- [ ] Follow-up actions identified (if any)
---
## Notes
Record any issues, observations, or important context during workflow execution:
- **Test Framework**: [Playwright, Jest, Cypress, etc.]
- **Review Scope**: [single file, directory, full suite]
- **Quality Score**: [0-100 score, letter grade]
- **Critical Issues**: [Count of P0/P1 violations]
- **Recommendation**: [Approve / Approve with comments / Request changes / Block]
- **Special Considerations**: [Legacy code, justified patterns, edge cases]
- **Follow-up Actions**: [Re-review after fixes, pair programming, etc.]

View File

@@ -1,628 +0,0 @@
# Test Quality Review - Instructions v4.0
**Workflow:** `testarch-test-review`
**Purpose:** Review test quality using TEA's comprehensive knowledge base and validate against best practices for maintainability, determinism, isolation, and flakiness prevention
**Agent:** Test Architect (TEA)
**Format:** Pure Markdown v4.0 (no XML blocks)
---
## Overview
This workflow performs comprehensive test quality reviews using TEA's knowledge base of best practices. It validates tests against proven patterns for fixture architecture, network-first safeguards, data factories, determinism, isolation, and flakiness prevention. The review generates actionable feedback with quality scoring.
**Key Capabilities:**
- **Knowledge-Based Review**: Applies patterns from tea-index.csv fragments
- **Quality Scoring**: 0-100 score based on violations and best practices
- **Multi-Scope**: Review single file, directory, or entire test suite
- **Pattern Detection**: Identifies flaky patterns, hard waits, race conditions
- **Best Practice Validation**: BDD format, test IDs, priorities, assertions
- **Actionable Feedback**: Critical issues (must fix) vs recommendations (should fix)
- **Integration**: Works with story files, test-design, acceptance criteria
---
## Prerequisites
**Required:**
- Test file(s) to review (auto-discovered or explicitly provided)
- Test framework configuration (playwright.config.ts, jest.config.js, etc.)
**Recommended:**
- Story file with acceptance criteria (for context)
- Test design document (for priority context)
- Knowledge base fragments available in tea-index.csv
**Halt Conditions:**
- If test file path is invalid or file doesn't exist, halt and request correction
- If test_dir is empty (no tests found), halt and notify user
---
## Workflow Steps
### Step 1: Load Context and Knowledge Base
**Actions:**
1. Check playwright-utils flag:
- Read `{config_source}` and check `config.tea_use_playwright_utils`
2. Load relevant knowledge fragments from `{project-root}/_bmad/bmm/testarch/tea-index.csv`:
**Core Patterns (Always load):**
- `test-quality.md` - Definition of Done: deterministic tests, isolated with cleanup, explicit assertions, <300 lines, <1.5 min (658 lines, 5 examples)
- `data-factories.md` - Factory functions with faker: overrides, nested factories, API-first setup (498 lines, 5 examples)
- `test-levels-framework.md` - E2E vs API vs Component vs Unit appropriateness with decision matrix (467 lines, 4 examples)
- `selective-testing.md` - Duplicate coverage detection with tag-based, spec filter, diff-based selection (727 lines, 4 examples)
- `test-healing-patterns.md` - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
- `selector-resilience.md` - Selector best practices (data-testid > ARIA > text > CSS hierarchy, anti-patterns, 541 lines, 4 examples)
- `timing-debugging.md` - Race condition prevention and async debugging techniques (370 lines, 3 examples)
**If `config.tea_use_playwright_utils: true` (All Utilities):**
- `overview.md` - Playwright utils best practices
- `api-request.md` - Validate apiRequest usage patterns
- `network-recorder.md` - Review HAR record/playback implementation
- `auth-session.md` - Check auth token management
- `intercept-network-call.md` - Validate network interception
- `recurse.md` - Review polling patterns
- `log.md` - Check logging best practices
- `file-utils.md` - Validate file operation patterns
- `burn-in.md` - Review burn-in configuration
- `network-error-monitor.md` - Check error monitoring setup
- `fixtures-composition.md` - Validate mergeTests usage
**If `config.tea_use_playwright_utils: false`:**
- `fixture-architecture.md` - Pure function → Fixture → mergeTests composition with auto-cleanup (406 lines, 5 examples)
- `network-first.md` - Route intercept before navigate to prevent race conditions (489 lines, 5 examples)
- `playwright-config.md` - Environment-based configuration with fail-fast validation (722 lines, 5 examples)
- `component-tdd.md` - Red-Green-Refactor patterns with provider isolation (480 lines, 4 examples)
- `ci-burn-in.md` - Flaky test detection with 10-iteration burn-in loop (678 lines, 4 examples)
3. Determine review scope:
- **single**: Review one test file (`test_file_path` provided)
- **directory**: Review all tests in directory (`test_dir` provided)
- **suite**: Review entire test suite (discover all test files)
4. Auto-discover related artifacts (if `auto_discover_story: true`):
- Extract test ID from filename (e.g., `1.3-E2E-001.spec.ts` → story 1.3)
- Search for story file (`story-1.3.md`)
- Search for test design (`test-design-story-1.3.md` or `test-design-epic-1.md`)
5. Read story file for context (if available):
- Extract acceptance criteria
- Extract priority classification
- Extract expected test IDs
**Output:** Complete knowledge base loaded, review scope determined, context gathered
---
### Step 2: Discover and Parse Test Files
**Actions:**
1. **Discover test files** based on scope:
- **single**: Use `test_file_path` variable
- **directory**: Use `glob` to find all test files in `test_dir` (e.g., `*.spec.ts`, `*.test.js`)
- **suite**: Use `glob` to find all test files recursively from project root
2. **Parse test file metadata**:
- File path and name
- File size (warn if >15 KB or >300 lines)
- Test framework detected (Playwright, Jest, Cypress, Vitest, etc.)
- Imports and dependencies
- Test structure (describe/context/it blocks)
3. **Extract test structure**:
- Count of describe blocks (test suites)
- Count of it/test blocks (individual tests)
- Test IDs (if present, e.g., `test.describe('1.3-E2E-001')`)
- Priority markers (if present, e.g., `test.describe.only` for P0)
- BDD structure (Given-When-Then comments or steps)
4. **Identify test patterns**:
- Fixtures used
- Data factories used
- Network interception patterns
- Assertions used (expect, assert, toHaveText, etc.)
- Waits and timeouts (page.waitFor, sleep, hardcoded delays)
- Conditionals (if/else, switch, ternary)
- Try/catch blocks
- Shared state or globals
**Output:** Complete test file inventory with structure and pattern analysis
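As a rough illustration of this discovery and parsing step, a scanner along the following lines could produce that inventory. The glob pattern, the describe/it counting regexes, and the test-ID regex are assumptions made for the sketch, not part of the workflow contract.

```typescript
import { glob } from 'glob';
import { readFileSync } from 'node:fs';

interface TestFileSummary {
  path: string;
  lines: number;
  describeBlocks: number;
  testCases: number;
  hasTestIds: boolean;
}

// Assumed conventions: *.spec.ts / *.test.ts files with Playwright/Jest-style describe/it/test blocks
export async function discoverTests(testDir: string): Promise<TestFileSummary[]> {
  const files = await glob(`${testDir}/**/*.{spec,test}.ts`);
  return files.map((path) => {
    const source = readFileSync(path, 'utf8');
    return {
      path,
      lines: source.split('\n').length,
      describeBlocks: (source.match(/\bdescribe(\.\w+)?\s*\(/g) ?? []).length,
      testCases: (source.match(/\b(?:it|test)(?:\.(?:only|skip|fixme))?\s*\(/g) ?? []).length,
      // Test ID convention used by this workflow, e.g. "1.3-E2E-001"
      hasTestIds: /\d+\.\d+-(?:E2E|API|COMPONENT|UNIT)-\d{3}/.test(source),
    };
  });
}
```

The counts are heuristics; a production implementation would parse the AST rather than match regexes.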
---
### Step 3: Validate Against Quality Criteria
**Actions:**
For each test file, validate against quality criteria (configurable via workflow variables):
#### 1. BDD Format Validation (if `check_given_when_then: true`)
- ✅ **PASS**: Tests use Given-When-Then structure (comments or step organization)
- ⚠️ **WARN**: Tests have some structure but not explicit GWT
- ❌ **FAIL**: Tests lack clear structure, hard to understand intent
**Knowledge Fragment**: test-quality.md, tdd-cycles.md
---
#### 2. Test ID Conventions (if `check_test_ids: true`)
- ✅ **PASS**: Test IDs present and follow convention (e.g., `1.3-E2E-001`, `2.1-API-005`)
- ⚠️ **WARN**: Some test IDs missing or inconsistent
- ❌ **FAIL**: No test IDs, can't trace tests to requirements
**Knowledge Fragment**: traceability.md, test-quality.md
---
#### 3. Priority Markers (if `check_priority_markers: true`)
- ✅ **PASS**: Tests classified as P0/P1/P2/P3 (via markers or test-design reference)
- ⚠️ **WARN**: Some priority classifications missing
- ❌ **FAIL**: No priority classification, can't determine criticality
**Knowledge Fragment**: test-priorities.md, risk-governance.md
---
#### 4. Hard Waits Detection (if `check_hard_waits: true`)
- ✅ **PASS**: No hard waits detected (no `sleep()`, `wait(5000)`, hardcoded delays)
- ⚠️ **WARN**: Some hard waits used but with justification comments
- ❌ **FAIL**: Hard waits detected without justification (flakiness risk)
**Patterns to detect:**
- `sleep(1000)`, `setTimeout()`, `delay()`
- `page.waitForTimeout(5000)` without explicit reason
- `await new Promise(resolve => setTimeout(resolve, 3000))`
**Knowledge Fragment**: test-quality.md, network-first.md
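For illustration, a simple line scan for these forms might look like the sketch below. The regexes and the `justified`/`TEA-allow` comment markers are assumptions made for the example and will not catch every variant.

```typescript
// Heuristic patterns for the hard-wait forms listed above (illustrative, not exhaustive)
const HARD_WAIT_PATTERNS: Array<{ name: string; pattern: RegExp }> = [
  { name: 'sleep()/delay() with a literal duration', pattern: /\b(?:sleep|delay)\s*\(\s*\d+\s*\)/ },
  { name: 'page.waitForTimeout()', pattern: /page\.waitForTimeout\s*\(\s*\d+\s*\)/ },
  { name: 'setTimeout wrapped in a Promise', pattern: /new Promise\s*\(\s*\(?\s*resolve\s*\)?\s*=>\s*setTimeout\s*\(/ },
];

export function findHardWaits(source: string): Array<{ line: number; issue: string }> {
  const findings: Array<{ line: number; issue: string }> = [];
  source.split('\n').forEach((text, index) => {
    // Hypothetical justification marker: a commented reason on the same line downgrades the finding
    if (/\/\/.*(?:justified|TEA-allow)/i.test(text)) return;
    for (const { name, pattern } of HARD_WAIT_PATTERNS) {
      if (pattern.test(text)) findings.push({ line: index + 1, issue: name });
    }
  });
  return findings;
}
```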
---
#### 5. Determinism Check (if `check_determinism: true`)
- ✅ **PASS**: Tests are deterministic (no conditionals, no try/catch abuse, no random values)
- ⚠️ **WARN**: Some conditionals but with clear justification
- ❌ **FAIL**: Tests use if/else, switch, or try/catch to control flow (flakiness risk)
**Patterns to detect:**
- `if (condition) { test logic }` - tests should work deterministically
- `try { test } catch { fallback }` - tests shouldn't swallow errors
- `Math.random()`, `Date.now()` without factory abstraction
**Knowledge Fragment**: test-quality.md, data-factories.md
---
#### 6. Isolation Validation (if `check_isolation: true`)
- ✅ **PASS**: Tests clean up resources, no shared state, can run in any order
- ⚠️ **WARN**: Some cleanup missing but isolated enough
- ❌ **FAIL**: Tests share state, depend on execution order, leave resources behind
**Patterns to check:**
- afterEach/afterAll cleanup hooks present
- No global variables mutated
- Database/API state cleaned up after tests
- Test data deleted or marked inactive
**Knowledge Fragment**: test-quality.md, data-factories.md
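A minimal sketch of a self-cleaning Playwright test that would satisfy these checks. The `/api/users` endpoints and data-testid values are hypothetical placeholders, and a configured `baseURL` is assumed.

```typescript
import { test, expect } from '@playwright/test';

test.describe('profile update', () => {
  let createdUserId: string | undefined;

  test.afterEach(async ({ request }) => {
    // Clean up any data this test created so the suite can run in any order
    if (createdUserId) {
      await request.delete(`/api/users/${createdUserId}`); // hypothetical endpoint
      createdUserId = undefined;
    }
  });

  test('user can update display name', async ({ page, request }) => {
    // API-first setup: create the record via API, exercise it via UI
    const response = await request.post('/api/users', { data: { name: 'Temp User' } }); // hypothetical endpoint
    createdUserId = (await response.json()).id;

    await page.goto(`/users/${createdUserId}/edit`);
    await page.getByTestId('display-name-input').fill('Renamed User');
    await page.getByTestId('save-button').click();
    await expect(page.getByTestId('profile-name')).toHaveText('Renamed User');
  });
});
```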
---
#### 7. Fixture Patterns (if `check_fixture_patterns: true`)
- ✅ **PASS**: Uses pure function → Fixture → mergeTests pattern
- ⚠️ **WARN**: Some fixtures used but not consistently
- ❌ **FAIL**: No fixtures, tests repeat setup code (maintainability risk)
**Patterns to check:**
- Fixtures defined (e.g., `test.extend({ customFixture: async ({}, use) => { ... }})`)
- Pure functions used for fixture logic
- mergeTests used to combine fixtures
- No beforeEach with complex setup (should be in fixtures)
**Knowledge Fragment**: fixture-architecture.md
---
#### 8. Data Factories (if `check_data_factories: true`)
- ✅ **PASS**: Uses factory functions with overrides, API-first setup
- ⚠️ **WARN**: Some factories used but also hardcoded data
- ❌ **FAIL**: Hardcoded test data, magic strings/numbers (maintainability risk)
**Patterns to check:**
- Factory functions defined (e.g., `createUser()`, `generateInvoice()`)
- Factories use faker.js or similar for realistic data
- Factories accept overrides (e.g., `createUser({ email: 'custom@example.com' })`)
- API-first setup (create via API, test via UI)
**Knowledge Fragment**: data-factories.md
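A sketch of what a factory meeting these checks could look like, and of how a helper such as the `createTestUser` used in the example review later in this document might be defined. The field names and faker calls are illustrative assumptions.

```typescript
import { faker } from '@faker-js/faker';

interface TestUser {
  email: string;
  password: string;
  role: 'admin' | 'member';
}

// Factory with realistic random defaults plus caller overrides
export function createTestUser(overrides: Partial<TestUser> = {}): TestUser {
  return {
    email: faker.internet.email(),
    password: faker.internet.password({ length: 16 }),
    role: 'member',
    ...overrides,
  };
}

// Usage: const admin = createTestUser({ role: 'admin' });
```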
---
#### 9. Network-First Pattern (if `check_network_first: true`)
- ✅ **PASS**: Route interception set up BEFORE navigation (race condition prevention)
- ⚠️ **WARN**: Some routes intercepted correctly, others after navigation
- ❌ **FAIL**: Route interception after navigation (race condition risk)
**Patterns to check:**
- `page.route()` called before `page.goto()`
- `page.waitForResponse()` used with explicit URL pattern
- No navigation followed immediately by route setup
**Knowledge Fragment**: network-first.md
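A hedged Playwright sketch of the ordering this check enforces; the route pattern, response body, and data-testid are placeholders.

```typescript
import { test, expect } from '@playwright/test';

test('dashboard renders stubbed metrics', async ({ page }) => {
  // Network-first: register the intercept BEFORE navigation so the first request can never race past the stub
  await page.route('**/api/metrics', (route) =>
    route.fulfill({ status: 200, contentType: 'application/json', body: JSON.stringify({ visits: 42 }) }),
  );

  const metricsResponse = page.waitForResponse('**/api/metrics');
  await page.goto('/dashboard');
  await metricsResponse;

  await expect(page.getByTestId('visits-count')).toHaveText('42');
});
```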
---
#### 10. Assertions (if `check_assertions: true`)
- ✅ **PASS**: Explicit assertions present (expect, assert, toHaveText)
- ⚠️ **WARN**: Some tests rely on implicit waits instead of assertions
- ❌ **FAIL**: Missing assertions, tests don't verify behavior
**Patterns to check:**
- Each test has at least one assertion
- Assertions are specific (not just truthy checks)
- Assertions use framework-provided matchers (toHaveText, toBeVisible)
**Knowledge Fragment**: test-quality.md
---
#### 11. Test Length (if `check_test_length: true`)
- ✅ **PASS**: Test file ≤200 lines (ideal), ≤300 lines (acceptable)
- ⚠️ **WARN**: Test file 301-500 lines (consider splitting)
- ❌ **FAIL**: Test file >500 lines (too large, maintainability risk)
**Knowledge Fragment**: test-quality.md
---
#### 12. Test Duration (if `check_test_duration: true`)
- ✅ **PASS**: Individual tests ≤1.5 minutes (target: <30 seconds)
- ⚠️ **WARN**: Some tests 1.5-3 minutes (consider optimization)
- ❌ **FAIL**: Tests >3 minutes (too slow, impacts CI/CD)
**Note:** Duration estimation based on complexity analysis if execution data unavailable
**Knowledge Fragment**: test-quality.md, selective-testing.md
---
#### 13. Flakiness Patterns (if `check_flakiness_patterns: true`)
- ✅ **PASS**: No known flaky patterns detected
- ⚠️ **WARN**: Some potential flaky patterns (e.g., tight timeouts, race conditions)
- ❌ **FAIL**: Multiple flaky patterns detected (high flakiness risk)
**Patterns to detect:**
- Tight timeouts (e.g., `{ timeout: 1000 }`)
- Race conditions (navigation before route interception)
- Timing-dependent assertions (e.g., checking timestamps)
- Retry logic in tests (hides flakiness)
- Environment-dependent assumptions (hardcoded URLs, ports)
**Knowledge Fragment**: test-quality.md, network-first.md, ci-burn-in.md
---
### Step 4: Calculate Quality Score
**Actions:**
1. **Count violations** by severity:
- **Critical (P0)**: Hard waits without justification, no assertions, race conditions, shared state
- **High (P1)**: Missing test IDs, no BDD structure, hardcoded data, missing fixtures
- **Medium (P2)**: Long test files (>300 lines), missing priorities, some conditionals
- **Low (P3)**: Minor style issues, incomplete cleanup, verbose tests
2. **Calculate quality score** (if `quality_score_enabled: true`):
```
Starting Score: 100
Critical Violations: -10 points each
High Violations: -5 points each
Medium Violations: -2 points each
Low Violations: -1 point each
Bonus Points:
+ Excellent BDD structure: +5
+ Comprehensive fixtures: +5
+ Comprehensive data factories: +5
+ Network-first pattern: +5
+ Perfect isolation: +5
+ All test IDs present: +5
Quality Score: max(0, min(100, Starting Score - Violations + Bonus))
```
3. **Quality Grade**:
- **90-100**: Excellent (A+)
- **80-89**: Good (A)
- **70-79**: Acceptable (B)
- **60-69**: Needs Improvement (C)
- **<60**: Critical Issues (F)
**Output:** Quality score calculated with violation breakdown
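The formula above reduces to a small pure function, sketched here under the assumption that violation counts and earned bonuses have already been tallied.

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

// Point deductions per violation, mirroring the formula above
const DEDUCTIONS: Record<Severity, number> = { critical: 10, high: 5, medium: 2, low: 1 };

export function calculateQualityScore(violations: Partial<Record<Severity, number>>, bonus: number): number {
  const deducted = (Object.keys(DEDUCTIONS) as Severity[]).reduce(
    (total, severity) => total + (violations[severity] ?? 0) * DEDUCTIONS[severity],
    0,
  );
  return Math.max(0, Math.min(100, 100 - deducted + bonus));
}

export function gradeFor(score: number): string {
  if (score >= 90) return 'Excellent (A+)';
  if (score >= 80) return 'Good (A)';
  if (score >= 70) return 'Acceptable (B)';
  if (score >= 60) return 'Needs Improvement (C)';
  return 'Critical Issues (F)';
}

// e.g. 1 critical, 2 high, 1 low with a +10 bonus: 100 - 21 + 10 = 89 -> "Good (A)"
```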
---
### Step 5: Generate Review Report
**Actions:**
1. **Create review report** using `test-review-template.md`:
**Header Section:**
- Test file(s) reviewed
- Review date
- Review scope (single/directory/suite)
- Quality score and grade
**Executive Summary:**
- Overall assessment (Excellent/Good/Needs Improvement/Critical)
- Key strengths
- Key weaknesses
- Recommendation (Approve/Approve with comments/Request changes)
**Quality Criteria Assessment:**
- Table with all criteria evaluated
- Status for each (PASS/WARN/FAIL)
- Violation count per criterion
**Critical Issues (Must Fix):**
- Priority P0/P1 violations
- Code location (file:line)
- Explanation of issue
- Recommended fix
- Knowledge base reference
**Recommendations (Should Fix):**
- Priority P2/P3 violations
- Code location (file:line)
- Explanation of issue
- Recommended improvement
- Knowledge base reference
**Best Practices Examples:**
- Highlight good patterns found in tests
- Reference knowledge base fragments
- Provide examples for others to follow
**Knowledge Base References:**
- List all fragments consulted
- Provide links to detailed guidance
2. **Generate inline comments** (if `generate_inline_comments: true`):
- Add TODO comments in test files at violation locations
- Format: `// TODO (TEA Review): [Issue description] - See test-review-{filename}.md`
- Never modify test logic, only add comments
3. **Generate quality badge** (if `generate_quality_badge: true`):
- Create badge with quality score (e.g., "Test Quality: 87/100 (A)")
- Format for inclusion in README or documentation
4. **Append to story file** (if `append_to_story: true` and story file exists):
- Add "Test Quality Review" section to story
- Include quality score and critical issues
- Link to full review report
**Output:** Comprehensive review report with actionable feedback
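The inline-comment option described in item 2 above could be implemented roughly as follows. The `Violation` shape is an assumption, and insertions are applied bottom-up so earlier line numbers stay valid while the test logic is left untouched.

```typescript
import { readFileSync, writeFileSync } from 'node:fs';

interface Violation {
  file: string;
  line: number; // 1-based line of the offending code
  description: string;
}

export function addReviewComments(violations: Violation[], reportName: string): void {
  const byFile = new Map<string, Violation[]>();
  for (const violation of violations) {
    byFile.set(violation.file, [...(byFile.get(violation.file) ?? []), violation]);
  }

  for (const [file, fileViolations] of byFile) {
    const lines = readFileSync(file, 'utf8').split('\n');
    // Insert from the bottom up so recorded line numbers remain correct
    for (const v of [...fileViolations].sort((a, b) => b.line - a.line)) {
      lines.splice(v.line - 1, 0, `// TODO (TEA Review): ${v.description} - See ${reportName}`);
    }
    writeFileSync(file, lines.join('\n'));
  }
}
```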
---
### Step 6: Save Outputs and Notify
**Actions:**
1. **Save review report** to `{output_file}`
2. **Save inline comments** to test files (if enabled)
3. **Save quality badge** to output folder (if enabled)
4. **Update story file** (if enabled)
5. **Generate summary message** for user:
- Quality score and grade
- Critical issue count
- Recommendation
**Output:** All review artifacts saved and user notified
---
## Quality Criteria Decision Matrix
| Criterion | PASS | WARN | FAIL | Knowledge Fragment |
| ------------------ | ------------------------- | -------------- | ------------------- | ----------------------- |
| BDD Format | Given-When-Then present | Some structure | No structure | test-quality.md |
| Test IDs | All tests have IDs | Some missing | No IDs | traceability.md |
| Priority Markers | All classified | Some missing | No classification | test-priorities.md |
| Hard Waits | No hard waits | Some justified | Hard waits present | test-quality.md |
| Determinism | No conditionals/random | Some justified | Conditionals/random | test-quality.md |
| Isolation | Clean up, no shared state | Some gaps | Shared state | test-quality.md |
| Fixture Patterns | Pure fn → Fixture | Some fixtures | No fixtures | fixture-architecture.md |
| Data Factories | Factory functions | Some factories | Hardcoded data | data-factories.md |
| Network-First | Intercept before navigate | Some correct | Race conditions | network-first.md |
| Assertions | Explicit assertions | Some implicit | Missing assertions | test-quality.md |
| Test Length | ≤300 lines | 301-500 lines | >500 lines | test-quality.md |
| Test Duration | ≤1.5 min | 1.5-3 min | >3 min | test-quality.md |
| Flakiness Patterns | No flaky patterns | Some potential | Multiple patterns | ci-burn-in.md |
---
## Example Review Summary
````markdown
# Test Quality Review: auth-login.spec.ts
**Quality Score**: 89/100 (A - Good)
**Review Date**: 2025-10-14
**Recommendation**: Approve with Comments
## Executive Summary
Overall, the test demonstrates good structure and coverage of the login flow. However, there are several areas for improvement to enhance maintainability and prevent flakiness.
**Strengths:**
- Excellent BDD structure with clear Given-When-Then comments
- Good use of test IDs (1.3-E2E-001, 1.3-E2E-002)
- Comprehensive assertions on authentication state
**Weaknesses:**
- Hard wait detected (page.waitForTimeout(2000)) - flakiness risk
- Hardcoded test data (email: 'test@example.com') - use factories instead
- Missing fixture for common login setup - DRY violation
**Recommendation**: Address critical issue (hard wait) before merging. Other improvements can be addressed in follow-up PR.
## Critical Issues (Must Fix)
### 1. Hard Wait Detected (Line 45)
**Severity**: P0 (Critical)
**Issue**: `await page.waitForTimeout(2000)` introduces flakiness
**Fix**: Use explicit wait for element or network request instead
**Knowledge**: See test-quality.md, network-first.md
```typescript
// ❌ Bad (current)
await page.waitForTimeout(2000);
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible();
// ✅ Good (recommended)
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible({ timeout: 10000 });
```
## Recommendations (Should Fix)
### 1. Use Data Factory for Test User (Lines 23, 32, 41)
**Severity**: P1 (High)
**Issue**: Hardcoded email `test@example.com` - maintainability risk
**Fix**: Create factory function for test users
**Knowledge**: See data-factories.md
```typescript
// ✅ Good (recommended)
import { createTestUser } from './factories/user-factory';
const testUser = createTestUser({ role: 'admin' });
await loginPage.login(testUser.email, testUser.password);
```
### 2. Extract Login Setup to Fixture (Lines 18-28)
**Severity**: P1 (High)
**Issue**: Login setup repeated across tests - DRY violation
**Fix**: Create fixture for authenticated state
**Knowledge**: See fixture-architecture.md
```typescript
// ✅ Good (recommended)
const test = base.extend({
authenticatedPage: async ({ page }, use) => {
const user = createTestUser();
await loginPage.login(user.email, user.password);
await use(page);
},
});
test('user can access dashboard', async ({ authenticatedPage }) => {
// Test starts already logged in
});
```
## Quality Score Breakdown
- Starting Score: 100
- Critical Violations (1 × -10): -10
- High Violations (2 × -5): -10
- Medium Violations (0 × -2): 0
- Low Violations (1 × -1): -1
- Bonus (BDD +5, Test IDs +5): +10
- **Final Score**: 89/100 (A)
````
---
## Integration with Other Workflows
### Before Test Review
- **atdd**: Generate acceptance tests (TEA reviews them for quality)
- **automate**: Expand regression suite (TEA reviews new tests)
- **dev story**: Developer writes implementation tests (TEA reviews them)
### After Test Review
- **Developer**: Addresses critical issues, improves based on recommendations
- **gate**: Test quality review feeds into gate decision (high-quality tests increase confidence)
### Coordinates With
- **Story File**: Review links to acceptance criteria context
- **Test Design**: Review validates tests align with prioritization
- **Knowledge Base**: Review references fragments for detailed guidance
---
## Important Notes
1. **Non-Prescriptive**: Review provides guidance, not rigid rules
2. **Context Matters**: Some violations may be justified for specific scenarios
3. **Knowledge-Based**: All feedback grounded in proven patterns from tea-index.csv
4. **Actionable**: Every issue includes recommended fix with code examples
5. **Quality Score**: Use as indicator, not absolute measure
6. **Continuous Improvement**: Review same tests periodically as patterns evolve
---
## Troubleshooting
**Problem: No test files found**
- Verify test_dir path is correct
- Check test file extensions match glob pattern
- Ensure test files exist in expected location
**Problem: Quality score seems too low/high**
- Review violation counts - may need to adjust thresholds
- Consider context - some projects have different standards
- Focus on critical issues first, not just score
**Problem: Inline comments not generated**
- Check generate_inline_comments: true in variables
- Verify write permissions on test files
- Review append_to_file: false (separate report mode)
**Problem: Knowledge fragments not loading**
- Verify tea-index.csv exists in testarch/ directory
- Check fragment file paths are correct
- Ensure auto_load_knowledge: true in variables

View File

@@ -1,390 +0,0 @@
# Test Quality Review: {test_filename}
**Quality Score**: {score}/100 ({grade} - {assessment})
**Review Date**: {YYYY-MM-DD}
**Review Scope**: {single | directory | suite}
**Reviewer**: {user_name or TEA Agent}
---
Note: This review audits existing tests; it does not generate tests.
## Executive Summary
**Overall Assessment**: {Excellent | Good | Acceptable | Needs Improvement | Critical Issues}
**Recommendation**: {Approve | Approve with Comments | Request Changes | Block}
### Key Strengths
✅ {strength_1}
✅ {strength_2}
✅ {strength_3}
### Key Weaknesses
❌ {weakness_1}
❌ {weakness_2}
❌ {weakness_3}
### Summary
{1-2 paragraph summary of overall test quality, highlighting major findings and recommendation rationale}
---
## Quality Criteria Assessment
| Criterion | Status | Violations | Notes |
| ------------------------------------ | ------------------------------- | ---------- | ------------ |
| BDD Format (Given-When-Then) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Test IDs | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Priority Markers (P0/P1/P2/P3) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Hard Waits (sleep, waitForTimeout) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Determinism (no conditionals) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Isolation (cleanup, no shared state) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Fixture Patterns | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Data Factories | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Network-First Pattern | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Explicit Assertions | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Test Length (≤300 lines) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {lines} | {brief_note} |
| Test Duration (≤1.5 min) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {duration} | {brief_note} |
| Flakiness Patterns | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
**Total Violations**: {critical_count} Critical, {high_count} High, {medium_count} Medium, {low_count} Low
---
## Quality Score Breakdown
```
Starting Score: 100
Critical Violations: -{critical_count} × 10 = -{critical_deduction}
High Violations: -{high_count} × 5 = -{high_deduction}
Medium Violations: -{medium_count} × 2 = -{medium_deduction}
Low Violations: -{low_count} × 1 = -{low_deduction}
Bonus Points:
Excellent BDD: +{0|5}
Comprehensive Fixtures: +{0|5}
Data Factories: +{0|5}
Network-First: +{0|5}
Perfect Isolation: +{0|5}
All Test IDs: +{0|5}
--------
Total Bonus: +{bonus_total}
Final Score: {final_score}/100
Grade: {grade}
```
---
## Critical Issues (Must Fix)
{If no critical issues: "No critical issues detected. ✅"}
{For each critical issue:}
### {issue_number}. {Issue Title}
**Severity**: P0 (Critical)
**Location**: `{filename}:{line_number}`
**Criterion**: {criterion_name}
**Knowledge Base**: [{fragment_name}]({fragment_path})
**Issue Description**:
{Detailed explanation of what the problem is and why it's critical}
**Current Code**:
```typescript
// ❌ Bad (current implementation)
{
code_snippet_showing_problem;
}
```
**Recommended Fix**:
```typescript
// ✅ Good (recommended approach)
{
code_snippet_showing_solution;
}
```
**Why This Matters**:
{Explanation of impact - flakiness risk, maintainability, reliability}
**Related Violations**:
{If similar issue appears elsewhere, note line numbers}
---
## Recommendations (Should Fix)
{If no recommendations: "No additional recommendations. Test quality is excellent. ✅"}
{For each recommendation:}
### {rec_number}. {Recommendation Title}
**Severity**: {P1 (High) | P2 (Medium) | P3 (Low)}
**Location**: `{filename}:{line_number}`
**Criterion**: {criterion_name}
**Knowledge Base**: [{fragment_name}]({fragment_path})
**Issue Description**:
{Detailed explanation of what could be improved and why}
**Current Code**:
```typescript
// ⚠️ Could be improved (current implementation)
{
code_snippet_showing_current_approach;
}
```
**Recommended Improvement**:
```typescript
// ✅ Better approach (recommended)
{
code_snippet_showing_improvement;
}
```
**Benefits**:
{Explanation of benefits - maintainability, readability, reusability}
**Priority**:
{Why this is P1/P2/P3 - urgency and impact}
---
## Best Practices Found
{If good patterns found, highlight them}
{For each best practice:}
### {practice_number}. {Best Practice Title}
**Location**: `{filename}:{line_number}`
**Pattern**: {pattern_name}
**Knowledge Base**: [{fragment_name}]({fragment_path})
**Why This Is Good**:
{Explanation of why this pattern is excellent}
**Code Example**:
```typescript
// ✅ Excellent pattern demonstrated in this test
{
code_snippet_showing_best_practice;
}
```
**Use as Reference**:
{Encourage using this pattern in other tests}
---
## Test File Analysis
### File Metadata
- **File Path**: `{relative_path_from_project_root}`
- **File Size**: {line_count} lines, {kb_size} KB
- **Test Framework**: {Playwright | Jest | Cypress | Vitest | Other}
- **Language**: {TypeScript | JavaScript}
### Test Structure
- **Describe Blocks**: {describe_count}
- **Test Cases (it/test)**: {test_count}
- **Average Test Length**: {avg_lines_per_test} lines per test
- **Fixtures Used**: {fixture_count} ({fixture_names})
- **Data Factories Used**: {factory_count} ({factory_names})
### Test Coverage Scope
- **Test IDs**: {test_id_list}
- **Priority Distribution**:
- P0 (Critical): {p0_count} tests
- P1 (High): {p1_count} tests
- P2 (Medium): {p2_count} tests
- P3 (Low): {p3_count} tests
- Unknown: {unknown_count} tests
### Assertions Analysis
- **Total Assertions**: {assertion_count}
- **Assertions per Test**: {avg_assertions_per_test} (avg)
- **Assertion Types**: {assertion_types_used}
---
## Context and Integration
### Related Artifacts
{If story file found:}
- **Story File**: [{story_filename}]({story_path})
- **Acceptance Criteria Mapped**: {ac_mapped}/{ac_total} ({ac_coverage}%)
{If test-design found:}
- **Test Design**: [{test_design_filename}]({test_design_path})
- **Risk Assessment**: {risk_level}
- **Priority Framework**: P0-P3 applied
### Acceptance Criteria Validation
{If story file available, map tests to ACs:}
| Acceptance Criterion | Test ID | Status | Notes |
| -------------------- | --------- | -------------------------- | ------- |
| {AC_1} | {test_id} | {✅ Covered \| ❌ Missing} | {notes} |
| {AC_2} | {test_id} | {✅ Covered \| ❌ Missing} | {notes} |
| {AC_3} | {test_id} | {✅ Covered \| ❌ Missing} | {notes} |
**Coverage**: {covered_count}/{total_count} criteria covered ({coverage_percentage}%)
---
## Knowledge Base References
This review consulted the following knowledge base fragments:
- **[test-quality.md](../../../testarch/knowledge/test-quality.md)** - Definition of Done for tests (no hard waits, <300 lines, <1.5 min, self-cleaning)
- **[fixture-architecture.md](../../../testarch/knowledge/fixture-architecture.md)** - Pure function → Fixture → mergeTests pattern
- **[network-first.md](../../../testarch/knowledge/network-first.md)** - Route intercept before navigate (race condition prevention)
- **[data-factories.md](../../../testarch/knowledge/data-factories.md)** - Factory functions with overrides, API-first setup
- **[test-levels-framework.md](../../../testarch/knowledge/test-levels-framework.md)** - E2E vs API vs Component vs Unit appropriateness
- **[tdd-cycles.md](../../../testarch/knowledge/tdd-cycles.md)** - Red-Green-Refactor patterns
- **[selective-testing.md](../../../testarch/knowledge/selective-testing.md)** - Duplicate coverage detection
- **[ci-burn-in.md](../../../testarch/knowledge/ci-burn-in.md)** - Flakiness detection patterns (10-iteration loop)
- **[test-priorities.md](../../../testarch/knowledge/test-priorities.md)** - P0/P1/P2/P3 classification framework
- **[traceability.md](../../../testarch/knowledge/traceability.md)** - Requirements-to-tests mapping
See [tea-index.csv](../../../testarch/tea-index.csv) for complete knowledge base.
---
## Next Steps
### Immediate Actions (Before Merge)
1. **{action_1}** - {description}
- Priority: {P0 | P1 | P2}
- Owner: {team_or_person}
- Estimated Effort: {time_estimate}
2. **{action_2}** - {description}
- Priority: {P0 | P1 | P2}
- Owner: {team_or_person}
- Estimated Effort: {time_estimate}
### Follow-up Actions (Future PRs)
1. **{action_1}** - {description}
- Priority: {P2 | P3}
- Target: {next_sprint | backlog}
2. **{action_2}** - {description}
- Priority: {P2 | P3}
- Target: {next_sprint | backlog}
### Re-Review Needed?
{✅ No re-review needed - approve as-is}
{⚠ Re-review after critical fixes - request changes, then re-review}
{❌ Major refactor required - block merge, pair programming recommended}
---
## Decision
**Recommendation**: {Approve | Approve with Comments | Request Changes | Block}
**Rationale**:
{1-2 paragraph explanation of recommendation based on findings}
**For Approve**:
> Test quality is excellent/good with {score}/100 score. {Minor issues noted can be addressed in follow-up PRs.} Tests are production-ready and follow best practices.
**For Approve with Comments**:
> Test quality is acceptable with {score}/100 score. {High-priority recommendations should be addressed but don't block merge.} Critical issues resolved, but improvements would enhance maintainability.
**For Request Changes**:
> Test quality needs improvement with {score}/100 score. {Critical issues must be fixed before merge.} {X} critical violations detected that pose flakiness/maintainability risks.
**For Block**:
> Test quality is insufficient with {score}/100 score. {Multiple critical issues make tests unsuitable for production.} Recommend pairing session with QA engineer to apply patterns from knowledge base.
---
## Appendix
### Violation Summary by Location
{Table of all violations sorted by line number:}
| Line | Severity | Criterion | Issue | Fix |
| ------ | ------------- | ----------- | ------------- | ----------- |
| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
### Quality Trends
{If reviewing same file multiple times, show trend:}
| Review Date | Score | Grade | Critical Issues | Trend |
| ------------ | ------------- | --------- | --------------- | ----------- |
| {YYYY-MM-DD} | {score_1}/100 | {grade_1} | {count_1} | Improved |
| {YYYY-MM-DD} | {score_2}/100 | {grade_2} | {count_2} | Declined |
| {YYYY-MM-DD} | {score_3}/100 | {grade_3} | {count_3} | Stable |
### Related Reviews
{If reviewing multiple files in directory/suite:}
| File | Score | Grade | Critical | Status |
| -------- | ----------- | ------- | -------- | ------------------ |
| {file_1} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
| {file_2} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
| {file_3} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
**Suite Average**: {avg_score}/100 ({avg_grade})
---
## Review Metadata
**Generated By**: BMad TEA Agent (Test Architect)
**Workflow**: testarch-test-review v4.0
**Review ID**: test-review-{filename}-{YYYYMMDD}
**Timestamp**: {YYYY-MM-DD HH:MM:SS}
**Version**: 1.0
---
## Feedback on This Review
If you have questions or feedback on this review:
1. Review patterns in knowledge base: `testarch/knowledge/`
2. Consult tea-index.csv for detailed guidance
3. Request clarification on specific violations
4. Pair with QA engineer to apply patterns
This review is guidance, not rigid rules. Context matters - if a pattern is justified, document it with a comment.

View File

@@ -1,46 +0,0 @@
# Test Architect workflow: test-review
name: testarch-test-review
description: "Review test quality using comprehensive knowledge base and best practices validation"
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/test-review"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
template: "{installed_path}/test-review-template.md"
# Variables and inputs
variables:
test_dir: "{project-root}/tests" # Root test directory
review_scope: "single" # single (one file), directory (folder), suite (all tests)
# Output configuration
default_output_file: "{output_folder}/test-review.md"
# Required tools
required_tools:
- read_file # Read test files, story, test-design
- write_file # Create review report
- list_files # Discover test files in directory
- search_repo # Find tests by patterns
- glob # Find test files matching patterns
tags:
- qa
- test-architect
- code-review
- quality
- best-practices
execution_hints:
interactive: false # Minimize prompts
autonomous: true # Proceed without user input unless blocked
iterative: true # Can review multiple files

View File

@@ -1,655 +0,0 @@
# Requirements Traceability & Gate Decision - Validation Checklist
**Workflow:** `testarch-trace`
**Purpose:** Ensure complete traceability matrix with actionable gap analysis AND make deployment readiness decision (PASS/CONCERNS/FAIL/WAIVED)
This checklist covers **two sequential phases**:
- **PHASE 1**: Requirements Traceability (always executed)
- **PHASE 2**: Quality Gate Decision (executed if `enable_gate_decision: true`)
---
# PHASE 1: REQUIREMENTS TRACEABILITY
## Prerequisites Validation
- [ ] Acceptance criteria are available (from story file OR inline)
- [ ] Test suite exists (or gaps are acknowledged and documented)
- [ ] If tests are missing, recommend `*atdd` (trace does not run it automatically)
- [ ] Test directory path is correct (`test_dir` variable)
- [ ] Story file is accessible (if using BMad mode)
- [ ] Knowledge base is loaded (test-priorities, traceability, risk-governance)
---
## Context Loading
- [ ] Story file read successfully (if applicable)
- [ ] Acceptance criteria extracted correctly
- [ ] Story ID identified (e.g., 1.3)
- [ ] `test-design.md` loaded (if available)
- [ ] `tech-spec.md` loaded (if available)
- [ ] `PRD.md` loaded (if available)
- [ ] Relevant knowledge fragments loaded from `tea-index.csv`
---
## Test Discovery and Cataloging
- [ ] Tests auto-discovered using multiple strategies (test IDs, describe blocks, file paths)
- [ ] Tests categorized by level (E2E, API, Component, Unit)
- [ ] Test metadata extracted:
- [ ] Test IDs (e.g., 1.3-E2E-001)
- [ ] Describe/context blocks
- [ ] It blocks (individual test cases)
- [ ] Given-When-Then structure (if BDD)
- [ ] Priority markers (P0/P1/P2/P3)
- [ ] All relevant test files found (no tests missed due to naming conventions)
---
## Criteria-to-Test Mapping
- [ ] Each acceptance criterion mapped to tests (or marked as NONE)
- [ ] Explicit references found (test IDs, describe blocks mentioning criterion)
- [ ] Test level documented (E2E, API, Component, Unit)
- [ ] Given-When-Then narrative verified for alignment
- [ ] Traceability matrix table generated:
- [ ] Criterion ID
- [ ] Description
- [ ] Test ID
- [ ] Test File
- [ ] Test Level
- [ ] Coverage Status
---
## Coverage Classification
- [ ] Coverage status classified for each criterion:
- [ ] **FULL** - All scenarios validated at appropriate level(s)
- [ ] **PARTIAL** - Some coverage but missing edge cases or levels
- [ ] **NONE** - No test coverage at any level
- [ ] **UNIT-ONLY** - Only unit tests (missing integration/E2E validation)
- [ ] **INTEGRATION-ONLY** - Only API/Component tests (missing unit confidence)
- [ ] Classification justifications provided
- [ ] Edge cases considered in FULL vs PARTIAL determination
---
## Duplicate Coverage Detection
- [ ] Duplicate coverage checked across test levels
- [ ] Acceptable overlap identified (defense in depth for critical paths)
- [ ] Unacceptable duplication flagged (same validation at multiple levels)
- [ ] Recommendations provided for consolidation
- [ ] Selective testing principles applied
---
## Gap Analysis
- [ ] Coverage gaps identified:
- [ ] Criteria with NONE status
- [ ] Criteria with PARTIAL status
- [ ] Criteria with UNIT-ONLY status
- [ ] Criteria with INTEGRATION-ONLY status
- [ ] Gaps prioritized by risk level using test-priorities framework:
- [ ] **CRITICAL** - P0 criteria without FULL coverage (BLOCKER)
- [ ] **HIGH** - P1 criteria without FULL coverage (PR blocker)
- [ ] **MEDIUM** - P2 criteria without FULL coverage (nightly gap)
- [ ] **LOW** - P3 criteria without FULL coverage (acceptable)
- [ ] Specific test recommendations provided for each gap:
- [ ] Suggested test level (E2E, API, Component, Unit)
- [ ] Test description (Given-When-Then)
- [ ] Recommended test ID (e.g., 1.3-E2E-004)
- [ ] Explanation of why test is needed
---
## Coverage Metrics
- [ ] Overall coverage percentage calculated (FULL coverage / total criteria)
- [ ] P0 coverage percentage calculated
- [ ] P1 coverage percentage calculated
- [ ] P2 coverage percentage calculated (if applicable)
- [ ] Coverage by level calculated:
- [ ] E2E coverage %
- [ ] API coverage %
- [ ] Component coverage %
- [ ] Unit coverage %
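For example, the priority-scoped percentages above can be derived with a helper like this sketch; the criterion shape is an assumption made for illustration.

```typescript
type Coverage = 'FULL' | 'PARTIAL' | 'NONE' | 'UNIT-ONLY' | 'INTEGRATION-ONLY';
type Priority = 'P0' | 'P1' | 'P2' | 'P3';

interface Criterion {
  id: string;
  priority: Priority;
  coverage: Coverage;
}

// FULL coverage count divided by total criteria, optionally scoped to one priority
export function coveragePercent(criteria: Criterion[], priority?: Priority): number {
  const scoped = priority ? criteria.filter((c) => c.priority === priority) : criteria;
  if (scoped.length === 0) return 100; // nothing to cover at this priority
  const full = scoped.filter((c) => c.coverage === 'FULL').length;
  return Math.round((full / scoped.length) * 100);
}

// e.g. overall: coveragePercent(criteria); P0 only: coveragePercent(criteria, 'P0')
```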
---
## Test Quality Verification
For each mapped test, verify:
- [ ] Explicit assertions are present (not hidden in helpers)
- [ ] Test follows Given-When-Then structure
- [ ] No hard waits or sleeps (deterministic waiting only)
- [ ] Self-cleaning (test cleans up its data)
- [ ] File size < 300 lines
- [ ] Test duration < 90 seconds
Quality issues flagged:
- [ ] **BLOCKER** issues identified (missing assertions, hard waits, flaky patterns)
- [ ] **WARNING** issues identified (large files, slow tests, unclear structure)
- [ ] **INFO** issues identified (style inconsistencies, missing documentation)
Knowledge fragments referenced:
- [ ] `test-quality.md` for Definition of Done
- [ ] `fixture-architecture.md` for self-cleaning patterns
- [ ] `network-first.md` for Playwright best practices
- [ ] `data-factories.md` for test data patterns
---
## Phase 1 Deliverables Generated
### Traceability Matrix Markdown
- [ ] File created at `{output_folder}/traceability-matrix.md`
- [ ] Template from `trace-template.md` used
- [ ] Full mapping table included
- [ ] Coverage status section included
- [ ] Gap analysis section included
- [ ] Quality assessment section included
- [ ] Recommendations section included
### Coverage Badge/Metric (if enabled)
- [ ] Badge markdown generated
- [ ] Metrics exported to JSON for CI/CD integration
### Updated Story File (if enabled)
- [ ] "Traceability" section added to story markdown
- [ ] Link to traceability matrix included
- [ ] Coverage summary included
---
## Phase 1 Quality Assurance
### Accuracy Checks
- [ ] All acceptance criteria accounted for (none skipped)
- [ ] Test IDs correctly formatted (e.g., 1.3-E2E-001)
- [ ] File paths are correct and accessible
- [ ] Coverage percentages calculated correctly
- [ ] No false positives (tests incorrectly mapped to criteria)
- [ ] No false negatives (existing tests missed in mapping)
### Completeness Checks
- [ ] All test levels considered (E2E, API, Component, Unit)
- [ ] All priorities considered (P0, P1, P2, P3)
- [ ] All coverage statuses used appropriately (FULL, PARTIAL, NONE, UNIT-ONLY, INTEGRATION-ONLY)
- [ ] All gaps have recommendations
- [ ] All quality issues have severity and remediation guidance
### Actionability Checks
- [ ] Recommendations are specific (not generic)
- [ ] Test IDs suggested for new tests
- [ ] Given-When-Then provided for recommended tests
- [ ] Impact explained for each gap
- [ ] Priorities clear (CRITICAL, HIGH, MEDIUM, LOW)
---
## Phase 1 Documentation
- [ ] Traceability matrix is readable and well-formatted
- [ ] Tables render correctly in markdown
- [ ] Code blocks have proper syntax highlighting
- [ ] Links are valid and accessible
- [ ] Recommendations are clear and prioritized
---
# PHASE 2: QUALITY GATE DECISION
**Note**: Phase 2 executes only if `enable_gate_decision: true` in workflow.yaml
---
## Prerequisites
### Evidence Gathering
- [ ] Test execution results obtained (CI/CD pipeline, test framework reports)
- [ ] Story/epic/release file identified and read
- [ ] Test design document discovered or explicitly provided (if available)
- [ ] Traceability matrix discovered or explicitly provided (available from Phase 1)
- [ ] NFR assessment discovered or explicitly provided (if available)
- [ ] Code coverage report discovered or explicitly provided (if available)
- [ ] Burn-in results discovered or explicitly provided (if available)
### Evidence Validation
- [ ] Evidence freshness validated (warn if >7 days old, recommend re-running workflows)
- [ ] All required assessments available or user acknowledged gaps
- [ ] Test results are complete (not partial or interrupted runs)
- [ ] Test results match current codebase (not from outdated branch)
### Knowledge Base Loading
- [ ] `risk-governance.md` loaded successfully
- [ ] `probability-impact.md` loaded successfully
- [ ] `test-quality.md` loaded successfully
- [ ] `test-priorities.md` loaded successfully
- [ ] `ci-burn-in.md` loaded (if burn-in results available)
---
## Process Steps
### Step 1: Context Loading
- [ ] Gate type identified (story/epic/release/hotfix)
- [ ] Target ID extracted (story_id, epic_num, or release_version)
- [ ] Decision thresholds loaded from workflow variables
- [ ] Risk tolerance configuration loaded
- [ ] Waiver policy loaded
### Step 2: Evidence Parsing
**Test Results:**
- [ ] Total test count extracted
- [ ] Passed test count extracted
- [ ] Failed test count extracted
- [ ] Skipped test count extracted
- [ ] Test duration extracted
- [ ] P0 test pass rate calculated
- [ ] P1 test pass rate calculated
- [ ] Overall test pass rate calculated
**Quality Assessments:**
- [ ] P0/P1/P2/P3 scenarios extracted from test-design.md (if available)
- [ ] Risk scores extracted from test-design.md (if available)
- [ ] Coverage percentages extracted from traceability-matrix.md (available from Phase 1)
- [ ] Coverage gaps extracted from traceability-matrix.md (available from Phase 1)
- [ ] NFR status extracted from nfr-assessment.md (if available)
- [ ] Security issues count extracted from nfr-assessment.md (if available)
**Code Coverage:**
- [ ] Line coverage percentage extracted (if available)
- [ ] Branch coverage percentage extracted (if available)
- [ ] Function coverage percentage extracted (if available)
- [ ] Critical path coverage validated (if available)
**Burn-in Results:**
- [ ] Burn-in iterations count extracted (if available)
- [ ] Flaky tests count extracted (if available)
- [ ] Stability score calculated (if available)
### Step 3: Decision Rules Application
**P0 Criteria Evaluation:**
- [ ] P0 test pass rate evaluated (must be 100%)
- [ ] P0 acceptance criteria coverage evaluated (must be 100%)
- [ ] Security issues count evaluated (must be 0)
- [ ] Critical NFR failures evaluated (must be 0)
- [ ] Flaky tests evaluated (must be 0 if burn-in enabled)
- [ ] P0 decision recorded: PASS or FAIL
**P1 Criteria Evaluation:**
- [ ] P1 test pass rate evaluated (threshold: min_p1_pass_rate)
- [ ] P1 acceptance criteria coverage evaluated (threshold: 95%)
- [ ] Overall test pass rate evaluated (threshold: min_overall_pass_rate)
- [ ] Code coverage evaluated (threshold: min_coverage)
- [ ] P1 decision recorded: PASS or CONCERNS
**P2/P3 Criteria Evaluation:**
- [ ] P2 failures tracked (informational, don't block if allow_p2_failures: true)
- [ ] P3 failures tracked (informational, don't block if allow_p3_failures: true)
- [ ] Residual risks documented
**Final Decision:**
- [ ] Decision determined: PASS / CONCERNS / FAIL / WAIVED
- [ ] Decision rationale documented
- [ ] Decision is deterministic (follows rules, not arbitrary)
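As a sketch, the P0/P1 rules above collapse into a deterministic function like the one below. The evidence and threshold shapes are assumptions that mirror the criteria names; WAIVED is applied afterwards by a human approver rather than computed.

```typescript
type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL';

interface GateEvidence {
  p0PassRate: number;         // 0-100
  p0CriteriaCoverage: number; // 0-100
  securityIssues: number;
  criticalNfrFailures: number;
  flakyTests: number;
  p1PassRate: number;
  p1CriteriaCoverage: number;
  overallPassRate: number;
  codeCoverage: number;
}

interface GateThresholds {
  minP1PassRate: number;
  minOverallPassRate: number;
  minCoverage: number;
}

export function decideGate(e: GateEvidence, t: GateThresholds): GateDecision {
  // P0 criteria are absolute: any miss fails the gate (unless explicitly waived)
  const p0Pass =
    e.p0PassRate === 100 &&
    e.p0CriteriaCoverage === 100 &&
    e.securityIssues === 0 &&
    e.criticalNfrFailures === 0 &&
    e.flakyTests === 0;
  if (!p0Pass) return 'FAIL';

  // P1 criteria downgrade PASS to CONCERNS rather than blocking outright
  const p1Pass =
    e.p1PassRate >= t.minP1PassRate &&
    e.p1CriteriaCoverage >= 95 &&
    e.overallPassRate >= t.minOverallPassRate &&
    e.codeCoverage >= t.minCoverage;
  return p1Pass ? 'PASS' : 'CONCERNS';
}
```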
### Step 4: Documentation
**Gate Decision Document Created:**
- [ ] Story/epic/release info section complete (ID, title, description, links)
- [ ] Decision clearly stated (PASS / CONCERNS / FAIL / WAIVED)
- [ ] Decision date recorded
- [ ] Evaluator recorded (user or agent name)
**Evidence Summary Documented:**
- [ ] Test results summary complete (total, passed, failed, pass rates)
- [ ] Coverage summary complete (P0/P1 criteria, code coverage)
- [ ] NFR validation summary complete (security, performance, reliability, maintainability)
- [ ] Flakiness summary complete (burn-in iterations, flaky test count)
**Rationale Documented:**
- [ ] Decision rationale clearly explained
- [ ] Key evidence highlighted
- [ ] Assumptions and caveats noted (if any)
**Residual Risks Documented (if CONCERNS or WAIVED):**
- [ ] Unresolved P1/P2 issues listed
- [ ] Probability × impact estimated for each risk
- [ ] Mitigations or workarounds described
**Waivers Documented (if WAIVED):**
- [ ] Waiver reason documented (business justification)
- [ ] Waiver approver documented (name, role)
- [ ] Waiver expiry date documented
- [ ] Remediation plan documented (fix in next release, due date)
- [ ] Monitoring plan documented
**Critical Issues Documented (if FAIL or CONCERNS):**
- [ ] Top 5-10 critical issues listed
- [ ] Priority assigned to each issue (P0/P1/P2)
- [ ] Owner assigned to each issue
- [ ] Due date assigned to each issue
**Recommendations Documented:**
- [ ] Next steps clearly stated for decision type
- [ ] Deployment recommendation provided
- [ ] Monitoring recommendations provided (if applicable)
- [ ] Remediation recommendations provided (if applicable)
### Step 5: Status Updates and Notifications
**Status File Updated:**
- [ ] Gate decision appended to bmm-workflow-status.md (if append_to_history: true)
- [ ] Format correct: `[DATE] Gate Decision: DECISION - Target {ID} - {rationale}`
- [ ] Status file committed or staged for commit
**Gate YAML Created:**
- [ ] Gate YAML snippet generated with decision and criteria
- [ ] Evidence references included in YAML
- [ ] Next steps included in YAML
- [ ] YAML file saved to output folder
**Stakeholder Notification Generated:**
- [ ] Notification subject line created
- [ ] Notification body created with summary
- [ ] Recipients identified (PM, SM, DEV lead, stakeholders)
- [ ] Notification ready for delivery (if notify_stakeholders: true)
**Outputs Saved:**
- [ ] Gate decision document saved to `{output_file}`
- [ ] Gate YAML saved to `{output_folder}/gate-decision-{target}.yaml`
- [ ] All outputs are valid and readable
---
## Phase 2 Output Validation
### Gate Decision Document
**Completeness:**
- [ ] All required sections present (info, decision, evidence, rationale, next steps)
- [ ] No placeholder text or TODOs left in document
- [ ] All evidence references are accurate and complete
- [ ] All links to artifacts are valid
**Accuracy:**
- [ ] Decision matches applied criteria rules
- [ ] Test results match CI/CD pipeline output
- [ ] Coverage percentages match reports
- [ ] NFR status matches assessment document
- [ ] No contradictions or inconsistencies
**Clarity:**
- [ ] Decision rationale is clear and unambiguous
- [ ] Technical jargon is explained or avoided
- [ ] Stakeholders can understand next steps
- [ ] Recommendations are actionable
### Gate YAML
**Format:**
- [ ] YAML is valid (no syntax errors)
- [ ] All required fields present (target, decision, date, evaluator, criteria, evidence)
- [ ] Field values are correct data types (numbers, strings, dates)
**Content:**
- [ ] Criteria values match decision document
- [ ] Evidence references are accurate
- [ ] Next steps align with decision type
---
## Phase 2 Quality Checks
### Decision Integrity
- [ ] Decision is deterministic (follows rules, not arbitrary)
- [ ] P0 failures result in FAIL decision (unless waived)
- [ ] Security issues result in FAIL decision (unless waived - but should never be waived)
- [ ] Waivers have business justification and approver (if WAIVED)
- [ ] Residual risks are documented (if CONCERNS or WAIVED)
### Evidence-Based
- [ ] Decision is based on actual test results (not guesses)
- [ ] All claims are supported by evidence
- [ ] No assumptions without documentation
- [ ] Evidence sources are cited (CI run IDs, report URLs)
### Transparency
- [ ] Decision rationale is transparent and auditable
- [ ] Criteria evaluation is documented step-by-step
- [ ] Any deviations from standard process are explained
- [ ] Waiver justifications are clear (if applicable)
### Consistency
- [ ] Decision aligns with risk-governance knowledge fragment
- [ ] Priority framework (P0/P1/P2/P3) applied consistently
- [ ] Terminology consistent with test-quality knowledge fragment
- [ ] Decision matrix followed correctly
---
## Phase 2 Integration Points
### BMad Workflow Status
- [ ] Gate decision added to `bmm-workflow-status.md`
- [ ] Format matches existing gate history entries
- [ ] Timestamp is accurate
- [ ] Decision summary is concise (<80 chars)
### CI/CD Pipeline
- [ ] Gate YAML is CI/CD-compatible
- [ ] YAML can be parsed by pipeline automation
- [ ] Decision can be used to block/allow deployments
- [ ] Evidence references are accessible to pipeline
### Stakeholders
- [ ] Notification message is clear and actionable
- [ ] Decision is explained in non-technical terms
- [ ] Next steps are specific and time-bound
- [ ] Recipients are appropriate for decision type
---
## Phase 2 Compliance and Audit
### Audit Trail
- [ ] Decision date and time recorded
- [ ] Evaluator identified (user or agent)
- [ ] All evidence sources cited
- [ ] Decision criteria documented
- [ ] Rationale clearly explained
### Traceability
- [ ] Gate decision traceable to story/epic/release
- [ ] Evidence traceable to specific test runs
- [ ] Assessments traceable to workflows that created them
- [ ] Waiver traceable to approver (if applicable)
### Compliance
- [ ] Security requirements validated (no unresolved vulnerabilities)
- [ ] Quality standards met or waived with justification
- [ ] Regulatory requirements addressed (if applicable)
- [ ] Documentation sufficient for external audit
---
## Phase 2 Edge Cases and Exceptions
### Missing Evidence
- [ ] If test-design.md missing, decision still possible with test results + trace
- [ ] If traceability-matrix.md missing, decision still possible with test results (but Phase 1 should provide it)
- [ ] If nfr-assessment.md missing, NFR validation marked as NOT ASSESSED
- [ ] If code coverage missing, coverage criterion marked as NOT ASSESSED
- [ ] User acknowledged gaps in evidence or provided alternative proof
### Stale Evidence
- [ ] Evidence freshness checked (if validate_evidence_freshness: true)
- [ ] Warnings issued for assessments >7 days old
- [ ] User acknowledged stale evidence or re-ran workflows
- [ ] Decision document notes any stale evidence used
### Conflicting Evidence
- [ ] Conflicts between test results and assessments resolved
- [ ] Most recent/authoritative source identified
- [ ] Conflict resolution documented in decision rationale
- [ ] User consulted if conflict cannot be resolved
### Waiver Scenarios
- [ ] Waiver only used for FAIL decision (not PASS or CONCERNS)
- [ ] Waiver has business justification (not technical convenience)
- [ ] Waiver has named approver with authority (VP/CTO/PO)
- [ ] Waiver has expiry date (does NOT apply to future releases)
- [ ] Waiver has remediation plan with concrete due date
- [ ] Security vulnerabilities are NOT waived (enforced)
---
# FINAL VALIDATION (Both Phases)
## Non-Prescriptive Validation
- [ ] Traceability format adapted to team needs (not rigid template)
- [ ] Examples are minimal and focused on patterns
- [ ] Teams can extend with custom classifications
- [ ] Integration with external systems supported (JIRA, Azure DevOps)
- [ ] Compliance requirements considered (if applicable)
---
## Documentation and Communication
- [ ] All documents are readable and well-formatted
- [ ] Tables render correctly in markdown
- [ ] Code blocks have proper syntax highlighting
- [ ] Links are valid and accessible
- [ ] Recommendations are clear and prioritized
- [ ] Gate decision is prominent and unambiguous (Phase 2)
---
## Final Validation
**Phase 1 (Traceability):**
- [ ] All prerequisites met
- [ ] All acceptance criteria mapped or gaps documented
- [ ] P0 coverage is 100% OR documented as BLOCKER
- [ ] Gap analysis is complete and prioritized
- [ ] Test quality issues identified and flagged
- [ ] Deliverables generated and saved
**Phase 2 (Gate Decision):**
- [ ] All quality evidence gathered
- [ ] Decision criteria applied correctly
- [ ] Decision rationale documented
- [ ] Gate YAML ready for CI/CD integration
- [ ] Status file updated (if enabled)
- [ ] Stakeholders notified (if enabled)
**Workflow Complete:**
- [ ] Phase 1 completed successfully
- [ ] Phase 2 completed successfully (if enabled)
- [ ] All outputs validated and saved
- [ ] Ready to proceed based on gate decision
---
## Sign-Off
**Phase 1 - Traceability Status:**
- [ ] ✅ PASS - All quality gates met, no critical gaps
- [ ] ⚠️ WARN - P1 gaps exist, address before PR merge
- [ ] ❌ FAIL - P0 gaps exist, BLOCKER for release
**Phase 2 - Gate Decision Status (if enabled):**
- [ ] ✅ PASS - Deploy to production
- [ ] ⚠️ CONCERNS - Deploy with monitoring
- [ ] ❌ FAIL - Block deployment, fix issues
- [ ] 🔓 WAIVED - Deploy with business approval and remediation plan
**Next Actions:**
- If PASS (both phases): Proceed to deployment
- If WARN/CONCERNS: Address gaps/issues, proceed with monitoring
- If FAIL (either phase): Run `*atdd` for missing tests, fix issues, re-run `*trace`
- If WAIVED: Deploy with approved waiver, schedule remediation
---
## Notes
Record any issues, deviations, or important observations during workflow execution:
- **Phase 1 Issues**: [Note any traceability mapping challenges, missing tests, quality concerns]
- **Phase 2 Issues**: [Note any missing, stale, or conflicting evidence]
- **Decision Rationale**: [Document any nuanced reasoning or edge cases]
- **Waiver Details**: [Document waiver negotiations or approvals]
- **Follow-up Actions**: [List any actions required after gate decision]
---
<!-- Powered by BMAD-CORE™ -->

File diff suppressed because it is too large

View File

@@ -1,675 +0,0 @@
# Traceability Matrix & Gate Decision - Story {STORY_ID}
**Story:** {STORY_TITLE}
**Date:** {DATE}
**Evaluator:** {user_name or TEA Agent}
---
Note: This workflow does not generate tests. If gaps exist, run `*atdd` or `*automate` to create coverage.
## PHASE 1: REQUIREMENTS TRACEABILITY
### Coverage Summary
| Priority | Total Criteria | FULL Coverage | Coverage % | Status |
| --------- | -------------- | ------------- | ---------- | ------------ |
| P0 | {P0_TOTAL} | {P0_FULL} | {P0_PCT}% | {P0_STATUS} |
| P1 | {P1_TOTAL} | {P1_FULL} | {P1_PCT}% | {P1_STATUS} |
| P2 | {P2_TOTAL} | {P2_FULL} | {P2_PCT}% | {P2_STATUS} |
| P3 | {P3_TOTAL} | {P3_FULL} | {P3_PCT}% | {P3_STATUS} |
| **Total** | **{TOTAL}** | **{FULL}** | **{PCT}%** | **{STATUS}** |
**Legend:**
- ✅ PASS - Coverage meets quality gate threshold
- ⚠️ WARN - Coverage below threshold but not critical
- ❌ FAIL - Coverage below minimum threshold (blocker)
---
### Detailed Mapping
#### {CRITERION_ID}: {CRITERION_DESCRIPTION} ({PRIORITY})
- **Coverage:** {COVERAGE_STATUS} {STATUS_ICON}
- **Tests:**
- `{TEST_ID}` - {TEST_FILE}:{LINE}
- **Given:** {GIVEN}
- **When:** {WHEN}
- **Then:** {THEN}
- `{TEST_ID_2}` - {TEST_FILE_2}:{LINE}
- **Given:** {GIVEN_2}
- **When:** {WHEN_2}
- **Then:** {THEN_2}
- **Gaps:** (if PARTIAL or UNIT-ONLY or INTEGRATION-ONLY)
- Missing: {MISSING_SCENARIO_1}
- Missing: {MISSING_SCENARIO_2}
- **Recommendation:** {RECOMMENDATION_TEXT}
---
#### Example: AC-1: User can login with email and password (P0)
- **Coverage:** FULL ✅
- **Tests:**
- `1.3-E2E-001` - tests/e2e/auth.spec.ts:12
- **Given:** User has valid credentials
- **When:** User submits login form
- **Then:** User is redirected to dashboard
- `1.3-UNIT-001` - tests/unit/auth-service.spec.ts:8
- **Given:** Valid email and password hash
- **When:** validateCredentials is called
- **Then:** Returns user object
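**Example (illustrative test sketch):** The `1.3-E2E-001` entry above could map to a spec shaped like the sketch below. This is a minimal illustration assuming Playwright; the file path, labels, route, and credentials are hypothetical. Embedding the test ID in the title and the Given-When-Then phases as comments is what lets the trace workflow locate the test and quote its phases.
```typescript
// tests/e2e/auth.spec.ts (sketch) - test ID in the title enables traceability lookup
import { test, expect } from '@playwright/test';

test('1.3-E2E-001: user can login with email and password', async ({ page }) => {
  // Given: user has valid credentials
  await page.goto('/login');

  // When: user submits the login form
  await page.getByLabel('Email').fill('user@example.com');
  await page.getByLabel('Password').fill('a-valid-password');
  await page.getByRole('button', { name: 'Log in' }).click();

  // Then: user is redirected to the dashboard
  await expect(page).toHaveURL(/\/dashboard/);
});
```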
---
#### Example: AC-3: User can reset password via email (P1)
- **Coverage:** PARTIAL ⚠️
- **Tests:**
- `1.3-E2E-003` - tests/e2e/auth.spec.ts:44
- **Given:** User requests password reset
- **When:** User clicks reset link in email
- **Then:** User can set new password
- **Gaps:**
- Missing: Email delivery validation
- Missing: Expired token handling (error path)
- Missing: Invalid token handling (security test)
- Missing: Unit test for token generation logic
- **Recommendation:** Add `1.3-API-001` for email service integration testing and `1.3-UNIT-003` for token generation logic. Add `1.3-E2E-004` for error path validation (expired/invalid tokens).
---
### Gap Analysis
#### Critical Gaps (BLOCKER) ❌
{CRITICAL_GAP_COUNT} gaps found. **Do not release until resolved.**
1. **{CRITERION_ID}: {CRITERION_DESCRIPTION}** (P0)
- Current Coverage: {COVERAGE_STATUS}
- Missing Tests: {MISSING_TEST_DESCRIPTION}
- Recommend: {RECOMMENDED_TEST_ID} ({RECOMMENDED_TEST_LEVEL})
- Impact: {IMPACT_DESCRIPTION}
---
#### High Priority Gaps (PR BLOCKER) ⚠️
{HIGH_GAP_COUNT} gaps found. **Address before PR merge.**
1. **{CRITERION_ID}: {CRITERION_DESCRIPTION}** (P1)
- Current Coverage: {COVERAGE_STATUS}
- Missing Tests: {MISSING_TEST_DESCRIPTION}
- Recommend: {RECOMMENDED_TEST_ID} ({RECOMMENDED_TEST_LEVEL})
- Impact: {IMPACT_DESCRIPTION}
---
#### Medium Priority Gaps (Nightly) ⚠️
{MEDIUM_GAP_COUNT} gaps found. **Address in nightly test improvements.**
1. **{CRITERION_ID}: {CRITERION_DESCRIPTION}** (P2)
- Current Coverage: {COVERAGE_STATUS}
- Recommend: {RECOMMENDED_TEST_ID} ({RECOMMENDED_TEST_LEVEL})
---
#### Low Priority Gaps (Optional)
{LOW_GAP_COUNT} gaps found. **Optional - add if time permits.**
1. **{CRITERION_ID}: {CRITERION_DESCRIPTION}** (P3)
- Current Coverage: {COVERAGE_STATUS}
---
### Quality Assessment
#### Tests with Issues
**BLOCKER Issues** ❌
- `{TEST_ID}` - {ISSUE_DESCRIPTION} - {REMEDIATION}
**WARNING Issues** ⚠️
- `{TEST_ID}` - {ISSUE_DESCRIPTION} - {REMEDIATION}
**INFO Issues**
- `{TEST_ID}` - {ISSUE_DESCRIPTION} - {REMEDIATION}
---
#### Example Quality Issues
**WARNING Issues** ⚠️
- `1.3-E2E-001` - 145 seconds (exceeds 90s target) - Optimize fixture setup to reduce test duration
- `1.3-UNIT-005` - 320 lines (exceeds 300 line limit) - Split into multiple focused test files
**INFO Issues**
- `1.3-E2E-002` - Missing Given-When-Then structure - Refactor describe block to use BDD format
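**Example (illustrative refactor sketch):** For the `1.3-E2E-002` note above, "BDD format" means making the Given-When-Then phases explicit inside the describe block so the Phase 1 mapping can quote them directly. A minimal sketch, assuming Playwright; the scenario details are hypothetical:
```typescript
// Refactor sketch: annotate the phases instead of leaving a flat, untraceable spec.
import { test, expect } from '@playwright/test';

test.describe('1.3-E2E-002: session persists across page reload', () => {
  test('keeps the user signed in after reload', async ({ page }) => {
    // Given: an authenticated session (setup omitted - hypothetical fixture or login step)
    await page.goto('/dashboard');

    // When: the page is reloaded
    await page.reload();

    // Then: the user remains signed in
    await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
  });
});
```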
---
#### Tests Passing Quality Gates
**{PASSING_TEST_COUNT}/{TOTAL_TEST_COUNT} tests ({PASSING_PCT}%) meet all quality criteria** ✅
---
### Duplicate Coverage Analysis
#### Acceptable Overlap (Defense in Depth)
- {CRITERION_ID}: Tested at unit (business logic) and E2E (user journey) ✅
#### Unacceptable Duplication ⚠️
- {CRITERION_ID}: Same validation at E2E and Component level
- Recommendation: Remove {TEST_ID} or consolidate with {OTHER_TEST_ID}
---
### Coverage by Test Level
| Test Level | Tests | Criteria Covered | Coverage % |
| ---------- | ----------------- | -------------------- | ---------------- |
| E2E | {E2E_COUNT} | {E2E_CRITERIA} | {E2E_PCT}% |
| API | {API_COUNT} | {API_CRITERIA} | {API_PCT}% |
| Component | {COMP_COUNT} | {COMP_CRITERIA} | {COMP_PCT}% |
| Unit | {UNIT_COUNT} | {UNIT_CRITERIA} | {UNIT_PCT}% |
| **Total** | **{TOTAL_TESTS}** | **{TOTAL_CRITERIA}** | **{TOTAL_PCT}%** |
---
### Traceability Recommendations
#### Immediate Actions (Before PR Merge)
1. **{ACTION_1}** - {DESCRIPTION}
2. **{ACTION_2}** - {DESCRIPTION}
#### Short-term Actions (This Sprint)
1. **{ACTION_1}** - {DESCRIPTION}
2. **{ACTION_2}** - {DESCRIPTION}
#### Long-term Actions (Backlog)
1. **{ACTION_1}** - {DESCRIPTION}
---
#### Example Recommendations
**Immediate Actions (Before PR Merge)**
1. **Add P1 Password Reset Tests** - Implement `1.3-API-001` for email service integration and `1.3-E2E-004` for error path validation. P1 coverage currently at 80%, target is 90%.
2. **Optimize Slow E2E Test** - Refactor `1.3-E2E-001` to use faster fixture setup. Currently 145s, target is <90s.
**Short-term Actions (This Sprint)**
1. **Enhance P2 Coverage** - Add E2E validation for session timeout (`1.3-E2E-005`). Currently UNIT-ONLY coverage.
2. **Split Large Test File** - Break `1.3-UNIT-005` (320 lines) into multiple focused test files (<300 lines each).
**Long-term Actions (Backlog)**
1. **Enrich P3 Coverage** - Add tests for edge cases in P3 criteria if time permits.
---
## PHASE 2: QUALITY GATE DECISION
**Gate Type:** {story | epic | release | hotfix}
**Decision Mode:** {deterministic | manual}
---
### Evidence Summary
#### Test Execution Results
- **Total Tests**: {total_count}
- **Passed**: {passed_count} ({pass_percentage}%)
- **Failed**: {failed_count} ({fail_percentage}%)
- **Skipped**: {skipped_count} ({skip_percentage}%)
- **Duration**: {total_duration}
**Priority Breakdown:**
- **P0 Tests**: {p0_passed}/{p0_total} passed ({p0_pass_rate}%) {✅ | ❌}
- **P1 Tests**: {p1_passed}/{p1_total} passed ({p1_pass_rate}%) {✅ | ⚠️ | ❌}
- **P2 Tests**: {p2_passed}/{p2_total} passed ({p2_pass_rate}%) {informational}
- **P3 Tests**: {p3_passed}/{p3_total} passed ({p3_pass_rate}%) {informational}
**Overall Pass Rate**: {overall_pass_rate}% {✅ | ⚠️ | ❌}
**Test Results Source**: {CI_run_id | test_report_url | local_run}
---
#### Coverage Summary (from Phase 1)
**Requirements Coverage:**
- **P0 Acceptance Criteria**: {p0_covered}/{p0_total} covered ({p0_coverage}%) {✅ | ❌}
- **P1 Acceptance Criteria**: {p1_covered}/{p1_total} covered ({p1_coverage}%) {✅ | ⚠️ | ❌}
- **P2 Acceptance Criteria**: {p2_covered}/{p2_total} covered ({p2_coverage}%) {informational}
- **Overall Coverage**: {overall_coverage}%
**Code Coverage** (if available):
- **Line Coverage**: {line_coverage}% {✅ | ⚠️ | ❌}
- **Branch Coverage**: {branch_coverage}% {✅ | ⚠️ | ❌}
- **Function Coverage**: {function_coverage}% {✅ | ⚠️ | ❌}
**Coverage Source**: {coverage_report_url | coverage_file_path}
---
#### Non-Functional Requirements (NFRs)
**Security**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
- Security Issues: {security_issue_count}
- {details_if_issues}
**Performance**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
- {performance_metrics_summary}
**Reliability**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
- {reliability_metrics_summary}
**Maintainability**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}
- {maintainability_metrics_summary}
**NFR Source**: {nfr_assessment_file_path | not_assessed}
---
#### Flakiness Validation
**Burn-in Results** (if available):
- **Burn-in Iterations**: {iteration_count} (e.g., 10)
- **Flaky Tests Detected**: {flaky_test_count} {✅ if 0 | ❌ if >0}
- **Stability Score**: {stability_percentage}%
**Flaky Tests List** (if any):
- {flaky_test_1_name} - {failure_rate}
- {flaky_test_2_name} - {failure_rate}
**Burn-in Source**: {CI_burn_in_run_id | not_available}
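**Example (illustrative burn-in sketch):** When no burn-in run is available, one way to produce this evidence is to repeat the suite before gating. A minimal sketch assuming Playwright; the base config path, iteration count, and tag filter are placeholders:
```typescript
// playwright.burn-in.config.ts (sketch) - repeat the suite to surface flaky tests before gating
import { defineConfig } from '@playwright/test';
import baseConfig from './playwright.config';

export default defineConfig({
  ...baseConfig,
  repeatEach: 10, // corresponds to {iteration_count} above
  retries: 0,     // retries would hide flakiness rather than expose it
});
// Run with: npx playwright test --config=playwright.burn-in.config.ts --grep @p0
```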
---
### Decision Criteria Evaluation
#### P0 Criteria (Must ALL Pass)
| Criterion              | Threshold | Actual                    | Status               |
| ---------------------- | --------- | ------------------------- | -------------------- |
| P0 Coverage            | 100%      | {p0_coverage}%            | {✅ PASS \| ❌ FAIL} |
| P0 Test Pass Rate      | 100%      | {p0_pass_rate}%           | {✅ PASS \| ❌ FAIL} |
| Security Issues        | 0         | {security_issue_count}    | {✅ PASS \| ❌ FAIL} |
| Critical NFR Failures  | 0         | {critical_nfr_fail_count} | {✅ PASS \| ❌ FAIL} |
| Flaky Tests            | 0         | {flaky_test_count}        | {✅ PASS \| ❌ FAIL} |
**P0 Evaluation**: {✅ ALL PASS | ❌ ONE OR MORE FAILED}
---
#### P1 Criteria (Required for PASS, May Accept for CONCERNS)
| Criterion              | Threshold                 | Actual               | Status                               |
| ---------------------- | ------------------------- | -------------------- | ------------------------------------ |
| P1 Coverage            | ≥{min_p1_coverage}%       | {p1_coverage}%       | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| P1 Test Pass Rate      | ≥{min_p1_pass_rate}%      | {p1_pass_rate}%      | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| Overall Test Pass Rate | ≥{min_overall_pass_rate}% | {overall_pass_rate}% | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
| Overall Coverage       | ≥{min_coverage}%          | {overall_coverage}%  | {✅ PASS \| ⚠️ CONCERNS \| ❌ FAIL} |
**P1 Evaluation**: {✅ ALL PASS | ⚠️ SOME CONCERNS | ❌ FAILED}
---
#### P2/P3 Criteria (Informational, Don't Block)
| Criterion | Actual | Notes |
| ----------------- | --------------- | ------------------------------------------------------------ |
| P2 Test Pass Rate | {p2_pass_rate}% | {allow_p2_failures ? "Tracked, doesn't block" : "Evaluated"} |
| P3 Test Pass Rate | {p3_pass_rate}% | {allow_p3_failures ? "Tracked, doesn't block" : "Evaluated"} |
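**Example (illustrative decision rule):** In deterministic mode, the tables above collapse into a simple precedence rule. The sketch below is one reading of that rule, not code shipped with the workflow; the helper names are hypothetical.
```typescript
type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';

interface CriteriaEvaluation {
  p0AllPass: boolean;                     // every row in the P0 table is PASS
  p1Status: 'PASS' | 'CONCERNS' | 'FAIL'; // outcome of the P1 table
  waiverApproved: boolean;                // business-approved waiver (FAIL only)
}

function decideGate(e: CriteriaEvaluation): GateDecision {
  if (!e.p0AllPass || e.p1Status === 'FAIL') {
    // A waiver can only convert a FAIL; it never upgrades PASS or CONCERNS.
    return e.waiverApproved ? 'WAIVED' : 'FAIL';
  }
  // P2/P3 results are informational and do not change the decision.
  return e.p1Status === 'CONCERNS' ? 'CONCERNS' : 'PASS';
}
```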
---
### GATE DECISION: {PASS | CONCERNS | FAIL | WAIVED}
---
### Rationale
{Explain decision based on criteria evaluation}
{Highlight key evidence that drove decision}
{Note any assumptions or caveats}
**Example (PASS):**
> All P0 criteria met with 100% coverage and pass rates across critical tests. All P1 criteria exceeded thresholds with 98% overall pass rate and 92% coverage. No security issues detected. No flaky tests in validation. Feature is ready for production deployment with standard monitoring.
**Example (CONCERNS):**
> All P0 criteria met, ensuring critical user journeys are protected. However, P1 coverage (88%) falls below threshold (90%) due to missing E2E test for AC-5 edge case. Overall pass rate (96%) is excellent. Issues are non-critical and have acceptable workarounds. Risk is low enough to deploy with enhanced monitoring.
**Example (FAIL):**
> CRITICAL BLOCKERS DETECTED:
>
> 1. P0 coverage incomplete (80%) - AC-2 security validation missing
> 2. P0 test failures (75% pass rate) in core search functionality
> 3. Unresolved SQL injection vulnerability in search filter (CRITICAL)
>
> Release MUST BE BLOCKED until P0 issues are resolved. Security vulnerability cannot be waived.
**Example (WAIVED):**
> Original decision was FAIL due to P0 test failure in legacy Excel 2007 export module (affects <1% of users). However, release contains critical GDPR compliance features required by regulatory deadline (Oct 15). Business has approved waiver given:
>
> - Regulatory priority overrides legacy module risk
> - Workaround available (use Excel 2010+)
> - Issue will be fixed in v2.4.1 hotfix (due Oct 20)
> - Enhanced monitoring in place
---
### {Section: Delete if not applicable}
#### Residual Risks (For CONCERNS or WAIVED)
List unresolved P1/P2 issues that don't block release but should be tracked:
1. **{Risk Description}**
- **Priority**: P1 | P2
- **Probability**: Low | Medium | High
- **Impact**: Low | Medium | High
- **Risk Score**: {probability × impact}
- **Mitigation**: {workaround or monitoring plan}
- **Remediation**: {fix in next sprint/release}
**Overall Residual Risk**: {LOW | MEDIUM | HIGH}
---
#### Waiver Details (For WAIVED only)
**Original Decision**: ❌ FAIL
**Reason for Failure**:
- {list_of_blocking_issues}
**Waiver Information**:
- **Waiver Reason**: {business_justification}
- **Waiver Approver**: {name}, {role} (e.g., Jane Doe, VP Engineering)
- **Approval Date**: {YYYY-MM-DD}
- **Waiver Expiry**: {YYYY-MM-DD} (**NOTE**: Does NOT apply to next release)
**Monitoring Plan**:
- {enhanced_monitoring_1}
- {enhanced_monitoring_2}
- {escalation_criteria}
**Remediation Plan**:
- **Fix Target**: {next_release_version} (e.g., v2.4.1 hotfix)
- **Due Date**: {YYYY-MM-DD}
- **Owner**: {team_or_person}
- **Verification**: {how_fix_will_be_verified}
**Business Justification**:
{detailed_explanation_of_why_waiver_is_acceptable}
---
#### Critical Issues (For FAIL or CONCERNS)
Top blockers requiring immediate attention:
| Priority | Issue | Description | Owner | Due Date | Status |
| -------- | ------------- | ------------------- | ------------ | ------------ | ------------------ |
| P0 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
| P0 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
| P1 | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
**Blocking Issues Count**: {p0_blocker_count} P0 blockers, {p1_blocker_count} P1 issues
---
### Gate Recommendations
#### For PASS Decision ✅
1. **Proceed to deployment**
- Deploy to staging environment
- Validate with smoke tests
- Monitor key metrics for 24-48 hours
- Deploy to production with standard monitoring
2. **Post-Deployment Monitoring**
- {metric_1_to_monitor}
- {metric_2_to_monitor}
- {alert_thresholds}
3. **Success Criteria**
- {success_criterion_1}
- {success_criterion_2}
---
#### For CONCERNS Decision ⚠️
1. **Deploy with Enhanced Monitoring**
- Deploy to staging with extended validation period
- Enable enhanced logging/monitoring for known risk areas:
- {risk_area_1}
- {risk_area_2}
- Set aggressive alerts for potential issues
- Deploy to production with caution
2. **Create Remediation Backlog**
- Create story: "{fix_title_1}" (Priority: {priority})
- Create story: "{fix_title_2}" (Priority: {priority})
- Target sprint: {next_sprint}
3. **Post-Deployment Actions**
- Monitor {specific_areas} closely for {time_period}
- Weekly status updates on remediation progress
- Re-assess after fixes deployed
---
#### For FAIL Decision ❌
1. **Block Deployment Immediately**
- Do NOT deploy to any environment
- Notify stakeholders of blocking issues
- Escalate to tech lead and PM
2. **Fix Critical Issues**
- Address P0 blockers listed in Critical Issues section
- Owner assignments confirmed
- Due dates agreed upon
- Daily standup on blocker resolution
3. **Re-Run Gate After Fixes**
- Re-run full test suite after fixes
- Re-run `bmad tea *trace` workflow
- Verify decision is PASS before deploying
---
#### For WAIVED Decision 🔓
1. **Deploy with Business Approval**
- Confirm waiver approver has signed off
- Document waiver in release notes
- Notify all stakeholders of waived risks
2. **Aggressive Monitoring**
- {enhanced_monitoring_plan}
- {escalation_procedures}
- Daily checks on waived risk areas
3. **Mandatory Remediation**
- Fix MUST be completed by {due_date}
- Issue CANNOT be waived in next release
- Track remediation progress weekly
- Verify fix in next gate
---
### Next Steps
**Immediate Actions** (next 24-48 hours):
1. {action_1}
2. {action_2}
3. {action_3}
**Follow-up Actions** (next sprint/release):
1. {action_1}
2. {action_2}
3. {action_3}
**Stakeholder Communication**:
- Notify PM: {decision_summary}
- Notify SM: {decision_summary}
- Notify DEV lead: {decision_summary}
---
## Integrated YAML Snippet (CI/CD)
```yaml
traceability_and_gate:
  # Phase 1: Traceability
  traceability:
    story_id: "{STORY_ID}"
    date: "{DATE}"
    coverage:
      overall: {OVERALL_PCT}%
      p0: {P0_PCT}%
      p1: {P1_PCT}%
      p2: {P2_PCT}%
      p3: {P3_PCT}%
    gaps:
      critical: {CRITICAL_COUNT}
      high: {HIGH_COUNT}
      medium: {MEDIUM_COUNT}
      low: {LOW_COUNT}
    quality:
      passing_tests: {PASSING_COUNT}
      total_tests: {TOTAL_TESTS}
      blocker_issues: {BLOCKER_COUNT}
      warning_issues: {WARNING_COUNT}
    recommendations:
      - "{RECOMMENDATION_1}"
      - "{RECOMMENDATION_2}"
  # Phase 2: Gate Decision
  gate_decision:
    decision: "{PASS | CONCERNS | FAIL | WAIVED}"
    gate_type: "{story | epic | release | hotfix}"
    decision_mode: "{deterministic | manual}"
    criteria:
      p0_coverage: {p0_coverage}%
      p0_pass_rate: {p0_pass_rate}%
      p1_coverage: {p1_coverage}%
      p1_pass_rate: {p1_pass_rate}%
      overall_pass_rate: {overall_pass_rate}%
      overall_coverage: {overall_coverage}%
      security_issues: {security_issue_count}
      critical_nfrs_fail: {critical_nfr_fail_count}
      flaky_tests: {flaky_test_count}
    thresholds:
      min_p0_coverage: 100
      min_p0_pass_rate: 100
      min_p1_coverage: {min_p1_coverage}
      min_p1_pass_rate: {min_p1_pass_rate}
      min_overall_pass_rate: {min_overall_pass_rate}
      min_coverage: {min_coverage}
    evidence:
      test_results: "{CI_run_id | test_report_url}"
      traceability: "{trace_file_path}"
      nfr_assessment: "{nfr_file_path}"
      code_coverage: "{coverage_report_url}"
    next_steps: "{brief_summary_of_recommendations}"
    waiver: # Only if WAIVED
      reason: "{business_justification}"
      approver: "{name}, {role}"
      expiry: "{YYYY-MM-DD}"
      remediation_due: "{YYYY-MM-DD}"
```
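**Example (illustrative CI consumption sketch):** A pipeline step can parse `gate_decision.decision` and block deployment on FAIL while letting WAIVED through with a warning. This is a minimal sketch assuming the snippet above is emitted as `gate-decision.yaml` and the `js-yaml` package is available; the script path and file name are placeholders.
```typescript
// scripts/check-gate.ts (sketch) - fail the pipeline when the gate decision blocks deployment
import { readFileSync } from 'node:fs';
import { load } from 'js-yaml';

type Decision = 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';

interface GateFile {
  traceability_and_gate: {
    gate_decision: { decision: Decision; next_steps?: string };
  };
}

const gate = load(readFileSync('gate-decision.yaml', 'utf8')) as GateFile;
const { decision, next_steps } = gate.traceability_and_gate.gate_decision;

if (decision === 'FAIL') {
  console.error(`Gate decision is FAIL - blocking deployment. ${next_steps ?? ''}`);
  process.exit(1);
}
if (decision === 'CONCERNS' || decision === 'WAIVED') {
  console.warn(`Gate decision is ${decision} - deploy with enhanced monitoring.`);
}
console.log(`Gate decision: ${decision} - proceeding.`);
```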
---
## Related Artifacts
- **Story File:** {STORY_FILE_PATH}
- **Test Design:** {TEST_DESIGN_PATH} (if available)
- **Tech Spec:** {TECH_SPEC_PATH} (if available)
- **Test Results:** {TEST_RESULTS_PATH}
- **NFR Assessment:** {NFR_FILE_PATH} (if available)
- **Test Files:** {TEST_DIR_PATH}
---
## Sign-Off
**Phase 1 - Traceability Assessment:**
- Overall Coverage: {OVERALL_PCT}%
- P0 Coverage: {P0_PCT}% {P0_STATUS}
- P1 Coverage: {P1_PCT}% {P1_STATUS}
- Critical Gaps: {CRITICAL_COUNT}
- High Priority Gaps: {HIGH_COUNT}
**Phase 2 - Gate Decision:**
- **Decision**: {PASS | CONCERNS | FAIL | WAIVED} {STATUS_ICON}
- **P0 Evaluation**: {✅ ALL PASS | ❌ ONE OR MORE FAILED}
- **P1 Evaluation**: {✅ ALL PASS | ⚠️ SOME CONCERNS | ❌ FAILED}
**Overall Status:** {STATUS} {STATUS_ICON}
**Next Steps:**
- If PASS ✅: Proceed to deployment
- If CONCERNS ⚠️: Deploy with monitoring, create remediation backlog
- If FAIL ❌: Block deployment, fix critical issues, re-run workflow
- If WAIVED 🔓: Deploy with business approval and aggressive monitoring
**Generated:** {DATE}
**Workflow:** testarch-trace v4.0 (Enhanced with Gate Decision)
---
<!-- Powered by BMAD-CORE™ -->

View File

@@ -1,55 +0,0 @@
# Test Architect workflow: trace (enhanced with gate decision)
name: testarch-trace
description: "Generate requirements-to-tests traceability matrix, analyze coverage, and make quality gate decision (PASS/CONCERNS/FAIL/WAIVED)"
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/trace"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
template: "{installed_path}/trace-template.md"
# Variables and inputs
variables:
  # Directory paths
  test_dir: "{project-root}/tests" # Root test directory
  source_dir: "{project-root}/src" # Source code directory
  # Workflow behavior
  coverage_levels: "e2e,api,component,unit" # Which test levels to trace
  gate_type: "story" # story | epic | release | hotfix - determines gate scope
  decision_mode: "deterministic" # deterministic (rule-based) | manual (team decision)
# Output configuration
default_output_file: "{output_folder}/traceability-matrix.md"
# Required tools
required_tools:
  - read_file # Read story, test files, BMad artifacts
  - write_file # Create traceability matrix, gate YAML
  - list_files # Discover test files
  - search_repo # Find tests by test ID, describe blocks
  - glob # Find test files matching patterns
tags:
  - qa
  - traceability
  - test-architect
  - coverage
  - requirements
  - gate
  - decision
  - release
execution_hints:
  interactive: false # Minimize prompts
  autonomous: true # Proceed without user input unless blocked
  iterative: true