Ignore and untrack BMad directories
@@ -1,364 +0,0 @@
|
||||
# ATDD Checklist - Epic {epic_num}, Story {story_num}: {story_title}
|
||||
|
||||
**Date:** {date}
|
||||
**Author:** {user_name}
|
||||
**Primary Test Level:** {primary_level}
|
||||
|
||||
---
|
||||
|
||||
## Story Summary
|
||||
|
||||
{Brief 2-3 sentence summary of the user story}
|
||||
|
||||
**As a** {user_role}
|
||||
**I want** {feature_description}
|
||||
**So that** {business_value}
|
||||
|
||||
---
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
{List all testable acceptance criteria from the story}
|
||||
|
||||
1. {Acceptance criterion 1}
|
||||
2. {Acceptance criterion 2}
|
||||
3. {Acceptance criterion 3}
|
||||
|
||||
---
|
||||
|
||||
## Failing Tests Created (RED Phase)
|
||||
|
||||
### E2E Tests ({e2e_test_count} tests)
|
||||
|
||||
**File:** `{e2e_test_file_path}` ({line_count} lines)
|
||||
|
||||
{List each E2E test with its current status and expected failure reason}
|
||||
|
||||
- ✅ **Test:** {test_name}
|
||||
- **Status:** RED - {failure_reason}
|
||||
- **Verifies:** {what_this_test_validates}
|
||||
|
||||
### API Tests ({api_test_count} tests)
|
||||
|
||||
**File:** `{api_test_file_path}` ({line_count} lines)
|
||||
|
||||
{List each API test with its current status and expected failure reason}
|
||||
|
||||
- ✅ **Test:** {test_name}
|
||||
- **Status:** RED - {failure_reason}
|
||||
- **Verifies:** {what_this_test_validates}
|
||||
|
||||
### Component Tests ({component_test_count} tests)
|
||||
|
||||
**File:** `{component_test_file_path}` ({line_count} lines)
|
||||
|
||||
{List each component test with its current status and expected failure reason}
|
||||
|
||||
- ✅ **Test:** {test_name}
|
||||
- **Status:** RED - {failure_reason}
|
||||
- **Verifies:** {what_this_test_validates}
|
||||
|
||||
---
|
||||
|
||||
## Data Factories Created
|
||||
|
||||
{List all data factory files created with their exports}
|
||||
|
||||
### {Entity} Factory
|
||||
|
||||
**File:** `tests/support/factories/{entity}.factory.ts`
|
||||
|
||||
**Exports:**
|
||||
|
||||
- `create{Entity}(overrides?)` - Create single entity with optional overrides
|
||||
- `create{Entity}s(count)` - Create array of entities
|
||||
|
||||
**Example Usage:**
|
||||
|
||||
```typescript
|
||||
const user = createUser({ email: 'specific@example.com' });
|
||||
const users = createUsers(5); // Generate 5 random users
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Fixtures Created
|
||||
|
||||
{List all test fixture files created with their fixture names and descriptions}
|
||||
|
||||
### {Feature} Fixtures
|
||||
|
||||
**File:** `tests/support/fixtures/{feature}.fixture.ts`
|
||||
|
||||
**Fixtures:**
|
||||
|
||||
- `{fixtureName}` - {description_of_what_fixture_provides}
|
||||
- **Setup:** {what_setup_does}
|
||||
- **Provides:** {what_test_receives}
|
||||
- **Cleanup:** {what_cleanup_does}
|
||||
|
||||
**Example Usage:**
|
||||
|
||||
```typescript
|
||||
import { test } from './fixtures/{feature}.fixture';
|
||||
|
||||
test('should do something', async ({ {fixtureName} }) => {
|
||||
// {fixtureName} is ready to use with auto-cleanup
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Mock Requirements
|
||||
|
||||
{Document external services that need mocking and their requirements}
|
||||
|
||||
### {Service Name} Mock
|
||||
|
||||
**Endpoint:** `{HTTP_METHOD} {endpoint_url}`
|
||||
|
||||
**Success Response:**
|
||||
|
||||
```json
|
||||
{
|
||||
{success_response_example}
|
||||
}
|
||||
```
|
||||
|
||||
**Failure Response:**
|
||||
|
||||
```json
|
||||
{
|
||||
{failure_response_example}
|
||||
}
|
||||
```
|
||||
|
||||
**Notes:** {any_special_mock_requirements}
|
||||
|
||||
---
|
||||
|
||||
## Required data-testid Attributes
|
||||
|
||||
{List all data-testid attributes required in UI implementation for test stability}
|
||||
|
||||
### {Page or Component Name}
|
||||
|
||||
- `{data-testid-name}` - {description_of_element}
|
||||
- `{data-testid-name}` - {description_of_element}
|
||||
|
||||
**Implementation Example:**
|
||||
|
||||
```tsx
|
||||
<button data-testid="login-button">Log In</button>
|
||||
<input data-testid="email-input" type="email" />
|
||||
<div data-testid="error-message">{errorText}</div>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
{Map each failing test to concrete implementation tasks that will make it pass}
|
||||
|
||||
### Test: {test_name_1}
|
||||
|
||||
**File:** `{test_file_path}`
|
||||
|
||||
**Tasks to make this test pass:**
|
||||
|
||||
- [ ] {Implementation task 1}
|
||||
- [ ] {Implementation task 2}
|
||||
- [ ] {Implementation task 3}
|
||||
- [ ] Add required data-testid attributes: {list_of_testids}
|
||||
- [ ] Run test: `{test_execution_command}`
|
||||
- [ ] ✅ Test passes (green phase)
|
||||
|
||||
**Estimated Effort:** {effort_estimate} hours
|
||||
|
||||
---
|
||||
|
||||
### Test: {test_name_2}
|
||||
|
||||
**File:** `{test_file_path}`
|
||||
|
||||
**Tasks to make this test pass:**
|
||||
|
||||
- [ ] {Implementation task 1}
|
||||
- [ ] {Implementation task 2}
|
||||
- [ ] {Implementation task 3}
|
||||
- [ ] Add required data-testid attributes: {list_of_testids}
|
||||
- [ ] Run test: `{test_execution_command}`
|
||||
- [ ] ✅ Test passes (green phase)
|
||||
|
||||
**Estimated Effort:** {effort_estimate} hours
|
||||
|
||||
---
|
||||
|
||||
## Running Tests
|
||||
|
||||
```bash
|
||||
# Run all failing tests for this story
|
||||
{test_command_all}
|
||||
|
||||
# Run specific test file
|
||||
{test_command_specific_file}
|
||||
|
||||
# Run tests in headed mode (see browser)
|
||||
{test_command_headed}
|
||||
|
||||
# Debug specific test
|
||||
{test_command_debug}
|
||||
|
||||
# Run tests with coverage
|
||||
{test_command_coverage}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Red-Green-Refactor Workflow
|
||||
|
||||
### RED Phase (Complete) ✅
|
||||
|
||||
**TEA Agent Responsibilities:**
|
||||
|
||||
- ✅ All tests written and failing
|
||||
- ✅ Fixtures and factories created with auto-cleanup
|
||||
- ✅ Mock requirements documented
|
||||
- ✅ data-testid requirements listed
|
||||
- ✅ Implementation checklist created
|
||||
|
||||
**Verification:**
|
||||
|
||||
- All tests run and fail as expected
|
||||
- Failure messages are clear and actionable
|
||||
- Tests fail due to missing implementation, not test bugs
|
||||
|
||||
---
|
||||
|
||||
### GREEN Phase (DEV Team - Next Steps)
|
||||
|
||||
**DEV Agent Responsibilities:**
|
||||
|
||||
1. **Pick one failing test** from implementation checklist (start with highest priority)
|
||||
2. **Read the test** to understand expected behavior
|
||||
3. **Implement minimal code** to make that specific test pass
|
||||
4. **Run the test** to verify it now passes (green)
|
||||
5. **Check off the task** in implementation checklist
|
||||
6. **Move to next test** and repeat
|
||||
|
||||
**Key Principles:**
|
||||
|
||||
- One test at a time (don't try to fix all at once)
|
||||
- Minimal implementation (don't over-engineer)
|
||||
- Run tests frequently (immediate feedback)
|
||||
- Use implementation checklist as roadmap
|
||||
|
||||
**Progress Tracking:**
|
||||
|
||||
- Check off tasks as you complete them
|
||||
- Share progress in daily standup
|
||||
- Mark story as IN PROGRESS in `bmm-workflow-status.md`
|
||||
|
||||
---
|
||||
|
||||
### REFACTOR Phase (DEV Team - After All Tests Pass)
|
||||
|
||||
**DEV Agent Responsibilities:**
|
||||
|
||||
1. **Verify all tests pass** (green phase complete)
|
||||
2. **Review code for quality** (readability, maintainability, performance)
|
||||
3. **Extract duplications** (DRY principle)
|
||||
4. **Optimize performance** (if needed)
|
||||
5. **Ensure tests still pass** after each refactor
|
||||
6. **Update documentation** (if API contracts change)
|
||||
|
||||
**Key Principles:**
|
||||
|
||||
- Tests provide safety net (refactor with confidence)
|
||||
- Make small refactors (easier to debug if tests fail)
|
||||
- Run tests after each change
|
||||
- Don't change test behavior (only implementation)
|
||||
|
||||
**Completion:**
|
||||
|
||||
- All tests pass
|
||||
- Code quality meets team standards
|
||||
- No duplications or code smells
|
||||
- Ready for code review and story approval
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Share this checklist and failing tests** with the dev workflow (manual handoff)
|
||||
2. **Review this checklist** with team in standup or planning
|
||||
3. **Run failing tests** to confirm RED phase: `{test_command_all}`
|
||||
4. **Begin implementation** using implementation checklist as guide
|
||||
5. **Work one test at a time** (red → green for each)
|
||||
6. **Share progress** in daily standup
|
||||
7. **When all tests pass**, refactor code for quality
|
||||
8. **When refactoring complete**, manually update story status to 'done' in sprint-status.yaml
|
||||
|
||||
---
|
||||
|
||||
## Knowledge Base References Applied
|
||||
|
||||
This ATDD workflow consulted the following knowledge fragments:
|
||||
|
||||
- **fixture-architecture.md** - Test fixture patterns with setup/teardown and auto-cleanup using Playwright's `test.extend()`
|
||||
- **data-factories.md** - Factory patterns using `@faker-js/faker` for random test data generation with overrides support
|
||||
- **component-tdd.md** - Component test strategies using Playwright Component Testing
|
||||
- **network-first.md** - Route interception patterns (intercept BEFORE navigation to prevent race conditions)
|
||||
- **test-quality.md** - Test design principles (Given-When-Then, one assertion per test, determinism, isolation)
|
||||
- **test-levels-framework.md** - Test level selection framework (E2E vs API vs Component vs Unit)
|
||||
|
||||
See `tea-index.csv` for complete knowledge fragment mapping.
|
||||
|
||||
---
|
||||
|
||||
## Test Execution Evidence
|
||||
|
||||
### Initial Test Run (RED Phase Verification)
|
||||
|
||||
**Command:** `{test_command_all}`
|
||||
|
||||
**Results:**
|
||||
|
||||
```
|
||||
{paste_test_run_output_showing_all_tests_failing}
|
||||
```
|
||||
|
||||
**Summary:**
|
||||
|
||||
- Total tests: {total_test_count}
|
||||
- Passing: 0 (expected)
|
||||
- Failing: {total_test_count} (expected)
|
||||
- Status: ✅ RED phase verified
|
||||
|
||||
**Expected Failure Messages:**
|
||||
{list_expected_failure_messages_for_each_test}
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
{Any additional notes, context, or special considerations for this story}
|
||||
|
||||
- {Note 1}
|
||||
- {Note 2}
|
||||
- {Note 3}
|
||||
|
||||
---
|
||||
|
||||
## Contact
|
||||
|
||||
**Questions or Issues?**
|
||||
|
||||
- Ask in team standup
|
||||
- Tag @{tea_agent_username} in Slack/Discord
|
||||
- Refer to `./bmm/docs/tea-README.md` for workflow documentation
|
||||
- Consult `./bmm/testarch/knowledge` for testing best practices
|
||||
|
||||
---
|
||||
|
||||
**Generated by BMad TEA Agent** - {date}
|
||||
@@ -1,374 +0,0 @@
|
||||
# ATDD Workflow Validation Checklist
|
||||
|
||||
Use this checklist to validate that the ATDD workflow has been executed correctly and all deliverables meet quality standards.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before starting this workflow, verify:
|
||||
|
||||
- [ ] Story approved with clear acceptance criteria (AC must be testable)
|
||||
- [ ] Development sandbox/environment ready
|
||||
- [ ] Framework scaffolding exists (run `framework` workflow if missing)
|
||||
- [ ] Test framework configuration available (playwright.config.ts or cypress.config.ts)
|
||||
- [ ] Package.json has test dependencies installed (Playwright or Cypress)
|
||||
|
||||
**Halt if missing:** Framework scaffolding or story acceptance criteria
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Story Context and Requirements
|
||||
|
||||
- [ ] Story markdown file loaded and parsed successfully
|
||||
- [ ] All acceptance criteria identified and extracted
|
||||
- [ ] Affected systems and components identified
|
||||
- [ ] Technical constraints documented
|
||||
- [ ] Framework configuration loaded (playwright.config.ts or cypress.config.ts)
|
||||
- [ ] Test directory structure identified from config
|
||||
- [ ] Existing fixture patterns reviewed for consistency
|
||||
- [ ] Similar test patterns searched and found in `{test_dir}`
|
||||
- [ ] Knowledge base fragments loaded:
|
||||
- [ ] `fixture-architecture.md`
|
||||
- [ ] `data-factories.md`
|
||||
- [ ] `component-tdd.md`
|
||||
- [ ] `network-first.md`
|
||||
- [ ] `test-quality.md`
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Test Level Selection and Strategy
|
||||
|
||||
- [ ] Each acceptance criterion analyzed for appropriate test level
|
||||
- [ ] Test level selection framework applied (E2E vs API vs Component vs Unit)
|
||||
- [ ] E2E tests: Critical user journeys and multi-system integration identified
|
||||
- [ ] API tests: Business logic and service contracts identified
|
||||
- [ ] Component tests: UI component behavior and interactions identified
|
||||
- [ ] Unit tests: Pure logic and edge cases identified (if applicable)
|
||||
- [ ] Duplicate coverage avoided (same behavior not tested at multiple levels unnecessarily)
|
||||
- [ ] Tests prioritized using P0-P3 framework (if test-design document exists)
|
||||
- [ ] Primary test level set in `primary_level` variable (typically E2E or API)
|
||||
- [ ] Test levels documented in ATDD checklist
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Failing Tests Generated
|
||||
|
||||
### Test File Structure Created
|
||||
|
||||
- [ ] Test files organized in appropriate directories:
|
||||
- [ ] `tests/e2e/` for end-to-end tests
|
||||
- [ ] `tests/api/` for API tests
|
||||
- [ ] `tests/component/` for component tests
|
||||
- [ ] `tests/support/` for infrastructure (fixtures, factories, helpers)
|
||||
|
||||
### E2E Tests (If Applicable)
|
||||
|
||||
- [ ] E2E test files created in `tests/e2e/`
|
||||
- [ ] All tests follow Given-When-Then format
|
||||
- [ ] Tests use `data-testid` selectors (not CSS classes or fragile selectors)
|
||||
- [ ] One assertion per test (atomic test design)
|
||||
- [ ] No hard waits or sleeps (explicit waits only)
|
||||
- [ ] Network-first pattern applied (route interception BEFORE navigation)
|
||||
- [ ] Tests fail initially (RED phase verified by local test run)
|
||||
- [ ] Failure messages are clear and actionable
|
||||
|
||||
### API Tests (If Applicable)
|
||||
|
||||
- [ ] API test files created in `tests/api/`
|
||||
- [ ] Tests follow Given-When-Then format
|
||||
- [ ] API contracts validated (request/response structure)
|
||||
- [ ] HTTP status codes verified
|
||||
- [ ] Response body validation includes all required fields
|
||||
- [ ] Error cases tested (400, 401, 403, 404, 500)
|
||||
- [ ] Tests fail initially (RED phase verified)
|
||||
|
||||
### Component Tests (If Applicable)
|
||||
|
||||
- [ ] Component test files created in `tests/component/`
|
||||
- [ ] Tests follow Given-When-Then format
|
||||
- [ ] Component mounting works correctly
|
||||
- [ ] Interaction testing covers user actions (click, hover, keyboard)
|
||||
- [ ] State management within component validated
|
||||
- [ ] Props and events tested
|
||||
- [ ] Tests fail initially (RED phase verified)
|
||||
|
||||
### Test Quality Validation
|
||||
|
||||
- [ ] All tests use Given-When-Then structure with clear comments
|
||||
- [ ] All tests have descriptive names explaining what they test
|
||||
- [ ] No duplicate tests (same behavior tested multiple times)
|
||||
- [ ] No flaky patterns (race conditions, timing issues)
|
||||
- [ ] No test interdependencies (tests can run in any order)
|
||||
- [ ] Tests are deterministic (same input always produces same result)
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Data Infrastructure Built
|
||||
|
||||
### Data Factories Created
|
||||
|
||||
- [ ] Factory files created in `tests/support/factories/`
|
||||
- [ ] All factories use `@faker-js/faker` for random data generation (no hardcoded values)
|
||||
- [ ] Factories support overrides for specific test scenarios
|
||||
- [ ] Factories generate complete valid objects matching API contracts
|
||||
- [ ] Helper functions for bulk creation provided (e.g., `createUsers(count)`)
|
||||
- [ ] Factory exports are properly typed (TypeScript)
|
||||
|
||||
### Test Fixtures Created
|
||||
|
||||
- [ ] Fixture files created in `tests/support/fixtures/`
|
||||
- [ ] All fixtures use Playwright's `test.extend()` pattern
|
||||
- [ ] Fixtures have setup phase (arrange test preconditions)
|
||||
- [ ] Fixtures provide data to tests via `await use(data)`
|
||||
- [ ] Fixtures have teardown phase with auto-cleanup (delete created data)
|
||||
- [ ] Fixtures are composable (can use other fixtures if needed)
|
||||
- [ ] Fixtures are isolated (each test gets fresh data)
|
||||
- [ ] Fixtures are type-safe (TypeScript types defined)
|
||||
|
||||
### Mock Requirements Documented
|
||||
|
||||
- [ ] External service mocking requirements identified
|
||||
- [ ] Mock endpoints documented with URLs and methods
|
||||
- [ ] Success response examples provided
|
||||
- [ ] Failure response examples provided
|
||||
- [ ] Mock requirements documented in ATDD checklist for DEV team
|
||||
|
||||
### data-testid Requirements Listed
|
||||
|
||||
- [ ] All required data-testid attributes identified from E2E tests
|
||||
- [ ] data-testid list organized by page or component
|
||||
- [ ] Each data-testid has clear description of element it targets
|
||||
- [ ] data-testid list included in ATDD checklist for DEV team
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Implementation Checklist Created
|
||||
|
||||
- [ ] Implementation checklist created with clear structure
|
||||
- [ ] Each failing test mapped to concrete implementation tasks
|
||||
- [ ] Tasks include:
|
||||
- [ ] Route/component creation
|
||||
- [ ] Business logic implementation
|
||||
- [ ] API integration
|
||||
- [ ] data-testid attribute additions
|
||||
- [ ] Error handling
|
||||
- [ ] Test execution command
|
||||
- [ ] Completion checkbox
|
||||
- [ ] Red-Green-Refactor workflow documented in checklist
|
||||
- [ ] RED phase marked as complete (TEA responsibility)
|
||||
- [ ] GREEN phase tasks listed for DEV team
|
||||
- [ ] REFACTOR phase guidance provided
|
||||
- [ ] Execution commands provided:
|
||||
- [ ] Run all tests: `npm run test:e2e`
|
||||
- [ ] Run specific test file
|
||||
- [ ] Run in headed mode
|
||||
- [ ] Debug specific test
|
||||
- [ ] Estimated effort included (hours or story points)
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Deliverables Generated
|
||||
|
||||
### ATDD Checklist Document Created
|
||||
|
||||
- [ ] Output file created at `{output_folder}/atdd-checklist-{story_id}.md`
|
||||
- [ ] Document follows template structure from `atdd-checklist-template.md`
|
||||
- [ ] Document includes all required sections:
|
||||
- [ ] Story summary
|
||||
- [ ] Acceptance criteria breakdown
|
||||
- [ ] Failing tests created (paths and line counts)
|
||||
- [ ] Data factories created
|
||||
- [ ] Fixtures created
|
||||
- [ ] Mock requirements
|
||||
- [ ] Required data-testid attributes
|
||||
- [ ] Implementation checklist
|
||||
- [ ] Red-green-refactor workflow
|
||||
- [ ] Execution commands
|
||||
- [ ] Next steps for DEV team
|
||||
- [ ] Output shared with DEV workflow (manual handoff; not auto-consumed)
|
||||
|
||||
### All Tests Verified to Fail (RED Phase)
|
||||
|
||||
- [ ] Full test suite run locally before finalizing
|
||||
- [ ] All tests fail as expected (RED phase confirmed)
|
||||
- [ ] No tests passing before implementation (if passing, test is invalid)
|
||||
- [ ] Failure messages documented in ATDD checklist
|
||||
- [ ] Failures are due to missing implementation, not test bugs
|
||||
- [ ] Test run output captured for reference
|
||||
|
||||
### Summary Provided
|
||||
|
||||
- [ ] Summary includes:
|
||||
- [ ] Story ID
|
||||
- [ ] Primary test level
|
||||
- [ ] Test counts (E2E, API, Component)
|
||||
- [ ] Test file paths
|
||||
- [ ] Factory count
|
||||
- [ ] Fixture count
|
||||
- [ ] Mock requirements count
|
||||
- [ ] data-testid count
|
||||
- [ ] Implementation task count
|
||||
- [ ] Estimated effort
|
||||
- [ ] Next steps for DEV team
|
||||
- [ ] Output file path
|
||||
- [ ] Knowledge base references applied
|
||||
|
||||
---
|
||||
|
||||
## Quality Checks
|
||||
|
||||
### Test Design Quality
|
||||
|
||||
- [ ] Tests are readable (clear Given-When-Then structure)
|
||||
- [ ] Tests are maintainable (use factories and fixtures, not hardcoded data)
|
||||
- [ ] Tests are isolated (no shared state between tests)
|
||||
- [ ] Tests are deterministic (no race conditions or flaky patterns)
|
||||
- [ ] Tests are atomic (one assertion per test)
|
||||
- [ ] Tests are fast (no unnecessary waits or delays)
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
- [ ] fixture-architecture.md patterns applied to all fixtures
|
||||
- [ ] data-factories.md patterns applied to all factories
|
||||
- [ ] network-first.md patterns applied to E2E tests with network requests
|
||||
- [ ] component-tdd.md patterns applied to component tests
|
||||
- [ ] test-quality.md principles applied to all test design
|
||||
|
||||
### Code Quality
|
||||
|
||||
- [ ] All TypeScript types are correct and complete
|
||||
- [ ] No linting errors in generated test files
|
||||
- [ ] Consistent naming conventions followed
|
||||
- [ ] Imports are organized and correct
|
||||
- [ ] Code follows project style guide
|
||||
|
||||
---
|
||||
|
||||
## Integration Points
|
||||
|
||||
### With DEV Agent
|
||||
|
||||
- [ ] ATDD checklist provides clear implementation guidance
|
||||
- [ ] Implementation tasks are granular and actionable
|
||||
- [ ] data-testid requirements are complete and clear
|
||||
- [ ] Mock requirements include all necessary details
|
||||
- [ ] Execution commands work correctly
|
||||
|
||||
### With Story Workflow
|
||||
|
||||
- [ ] Story ID correctly referenced in output files
|
||||
- [ ] Acceptance criteria from story accurately reflected in tests
|
||||
- [ ] Technical constraints from story considered in test design
|
||||
|
||||
### With Framework Workflow
|
||||
|
||||
- [ ] Test framework configuration correctly detected and used
|
||||
- [ ] Directory structure matches framework setup
|
||||
- [ ] Fixtures and helpers follow established patterns
|
||||
- [ ] Naming conventions consistent with framework standards
|
||||
|
||||
### With test-design Workflow (If Available)
|
||||
|
||||
- [ ] P0 scenarios from test-design prioritized in ATDD
|
||||
- [ ] Risk assessment from test-design considered in test coverage
|
||||
- [ ] Coverage strategy from test-design aligned with ATDD tests
|
||||
|
||||
---
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
All of the following must be true before marking this workflow as complete:
|
||||
|
||||
- [ ] **Story acceptance criteria analyzed** and mapped to appropriate test levels
|
||||
- [ ] **Failing tests created** at all appropriate levels (E2E, API, Component)
|
||||
- [ ] **Given-When-Then format** used consistently across all tests
|
||||
- [ ] **RED phase verified** by local test run (all tests failing as expected)
|
||||
- [ ] **Network-first pattern** applied to E2E tests with network requests
|
||||
- [ ] **Data factories created** using faker (no hardcoded test data)
|
||||
- [ ] **Fixtures created** with auto-cleanup in teardown
|
||||
- [ ] **Mock requirements documented** for external services
|
||||
- [ ] **data-testid attributes listed** for DEV team
|
||||
- [ ] **Implementation checklist created** mapping tests to code tasks
|
||||
- [ ] **Red-green-refactor workflow documented** in ATDD checklist
|
||||
- [ ] **Execution commands provided** and verified to work
|
||||
- [ ] **ATDD checklist document created** and saved to correct location
|
||||
- [ ] **Output file formatted correctly** using template structure
|
||||
- [ ] **Knowledge base references applied** and documented in summary
|
||||
- [ ] **No test quality issues** (flaky patterns, race conditions, hardcoded data)
|
||||
|
||||
---
|
||||
|
||||
## Common Issues and Resolutions
|
||||
|
||||
### Issue: Tests pass before implementation
|
||||
|
||||
**Problem:** A test passes even though no implementation code exists yet.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Review test to ensure it's testing actual behavior, not mocked/stubbed behavior
|
||||
- Check if test is accidentally using existing functionality
|
||||
- Verify test assertions are correct and meaningful
|
||||
- Rewrite test to fail until implementation is complete
|
||||
|
||||
### Issue: Network-first pattern not applied
|
||||
|
||||
**Problem:** Route interception happens after navigation, causing race conditions.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Move `await page.route()` calls BEFORE `await page.goto()`
|
||||
- Review `network-first.md` knowledge fragment
|
||||
- Update all E2E tests to follow network-first pattern
|
||||
|
||||
### Issue: Hardcoded test data in tests
|
||||
|
||||
**Problem:** Tests use hardcoded strings/numbers instead of factories.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Replace all hardcoded data with factory function calls
|
||||
- Use `faker` for all random data generation
|
||||
- Update data-factories to support all required test scenarios
|
||||
|
||||
### Issue: Fixtures missing auto-cleanup
|
||||
|
||||
**Problem:** Fixtures create data but don't clean it up in teardown.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Add cleanup logic after `await use(data)` in the fixture (see the sketch after this list)
|
||||
- Call deletion/cleanup functions in teardown
|
||||
- Verify cleanup works by checking database/storage after test run
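A minimal sketch of the teardown placement, assuming a Playwright fixture that seeds data through hypothetical `createTodo`/`deleteTodo` API helpers (illustrative names, not part of this workflow):

```typescript
import { test as base } from '@playwright/test';
import { createTodo, deleteTodo } from '../helpers/todo-api'; // hypothetical helpers

export const test = base.extend({
  seededTodo: async ({ request }, use) => {
    const todo = await createTodo(request); // setup: seed test data
    await use(todo);                        // the test runs while use() is pending
    await deleteTodo(request, todo.id);     // teardown: cleanup AFTER use()
  },
});
```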
|
||||
|
||||
### Issue: Tests have multiple assertions
|
||||
|
||||
**Problem:** Tests verify multiple behaviors in single test (not atomic).
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Split into separate tests (one assertion per test)
|
||||
- Each test should verify exactly one behavior
|
||||
- Use descriptive test names to clarify what each test verifies
|
||||
|
||||
### Issue: Tests depend on execution order
|
||||
|
||||
**Problem:** Tests fail when run in isolation or different order.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Remove shared state between tests
|
||||
- Each test should create its own test data
|
||||
- Use fixtures for consistent setup across tests
|
||||
- Verify tests can run with `.only` flag
|
||||
|
||||
---
|
||||
|
||||
## Notes for TEA Agent
|
||||
|
||||
- **Preflight halt is critical:** Do not proceed if story has no acceptance criteria or framework is missing
|
||||
- **RED phase verification is mandatory:** Tests must fail before sharing with DEV team
|
||||
- **Network-first pattern:** Route interception BEFORE navigation prevents race conditions
|
||||
- **One assertion per test:** Atomic tests provide clear failure diagnosis
|
||||
- **Auto-cleanup is non-negotiable:** Every fixture must clean up data in teardown
|
||||
- **Use knowledge base:** Load relevant fragments (fixture-architecture, data-factories, network-first, component-tdd, test-quality) for guidance
|
||||
- **Share with DEV agent:** ATDD checklist provides implementation roadmap from red to green
|
||||
@@ -1,806 +0,0 @@
|
||||
<!-- Powered by BMAD-CORE™ -->
|
||||
|
||||
# Acceptance Test-Driven Development (ATDD)
|
||||
|
||||
**Workflow ID**: `_bmad/bmm/testarch/atdd`
|
||||
**Version**: 4.0 (BMad v6)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Generates failing acceptance tests BEFORE implementation following TDD's red-green-refactor cycle. This workflow creates comprehensive test coverage at appropriate levels (E2E, API, Component) with supporting infrastructure (fixtures, factories, mocks) and provides an implementation checklist to guide development.
|
||||
|
||||
**Core Principle**: Tests fail first (red phase), then guide development to green, then enable confident refactoring.
|
||||
|
||||
---
|
||||
|
||||
## Preflight Requirements
|
||||
|
||||
**Critical:** Verify these requirements before proceeding. If any fail, HALT and notify the user.
|
||||
|
||||
- ✅ Story approved with clear acceptance criteria
|
||||
- ✅ Development sandbox/environment ready
|
||||
- ✅ Framework scaffolding exists (run `framework` workflow if missing)
|
||||
- ✅ Test framework configuration available (playwright.config.ts or cypress.config.ts)
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Load Story Context and Requirements
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Read Story Markdown**
|
||||
- Load story file from `{story_file}` variable
|
||||
- Extract acceptance criteria (all testable requirements)
|
||||
- Identify affected systems and components
|
||||
- Note any technical constraints or dependencies
|
||||
|
||||
2. **Load Framework Configuration**
|
||||
- Read framework config (playwright.config.ts or cypress.config.ts)
|
||||
- Identify test directory structure
|
||||
- Check existing fixture patterns
|
||||
- Note test runner capabilities
|
||||
|
||||
3. **Load Existing Test Patterns**
|
||||
- Search `{test_dir}` for similar tests
|
||||
- Identify reusable fixtures and helpers
|
||||
- Check data factory patterns
|
||||
- Note naming conventions
|
||||
|
||||
4. **Check Playwright Utils Flag**
|
||||
|
||||
Read `{config_source}` and check `config.tea_use_playwright_utils`.
|
||||
|
||||
5. **Load Knowledge Base Fragments**
|
||||
|
||||
**Critical:** Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` to load:
|
||||
|
||||
**Core Patterns (Always load):**
|
||||
- `data-factories.md` - Factory patterns using faker (override patterns, nested factories, API seeding, 498 lines, 5 examples)
|
||||
- `component-tdd.md` - Component test strategies (red-green-refactor, provider isolation, accessibility, visual regression, 480 lines, 4 examples)
|
||||
- `test-quality.md` - Test design principles (deterministic tests, isolated with cleanup, explicit assertions, length limits, execution time optimization, 658 lines, 5 examples)
|
||||
- `test-healing-patterns.md` - Common failure patterns and healing strategies (stale selectors, race conditions, dynamic data, network errors, hard waits, 648 lines, 5 examples)
|
||||
- `selector-resilience.md` - Selector best practices (data-testid > ARIA > text > CSS hierarchy, dynamic patterns, anti-patterns, 541 lines, 4 examples)
|
||||
- `timing-debugging.md` - Race condition prevention and async debugging (network-first, deterministic waiting, anti-patterns, 370 lines, 3 examples)
|
||||
|
||||
**If `config.tea_use_playwright_utils: true` (All Utilities):**
|
||||
- `overview.md` - Playwright utils for ATDD patterns
|
||||
- `api-request.md` - API test examples with schema validation
|
||||
- `network-recorder.md` - HAR record/playback for UI acceptance tests
|
||||
- `auth-session.md` - Auth setup for acceptance tests
|
||||
- `intercept-network-call.md` - Network interception in ATDD scenarios
|
||||
- `recurse.md` - Polling for async acceptance criteria
|
||||
- `log.md` - Logging in ATDD tests
|
||||
- `file-utils.md` - File download validation in acceptance tests
|
||||
- `network-error-monitor.md` - Catch silent failures in ATDD
|
||||
- `fixtures-composition.md` - Composing utilities for ATDD
|
||||
|
||||
**If `config.tea_use_playwright_utils: false`:**
|
||||
- `fixture-architecture.md` - Test fixture patterns with auto-cleanup (pure function → fixture → mergeTests composition, 406 lines, 5 examples)
|
||||
- `network-first.md` - Route interception patterns (intercept before navigate, HAR capture, deterministic waiting, 489 lines, 5 examples)
|
||||
|
||||
**Halt Condition:** If story has no acceptance criteria or framework is missing, HALT with message: "ATDD requires clear acceptance criteria and test framework setup"
|
||||
|
||||
---
|
||||
|
||||
## Step 1.5: Generation Mode Selection
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Detect Generation Mode**
|
||||
|
||||
Determine mode based on scenario complexity:
|
||||
|
||||
**AI Generation Mode (DEFAULT)**:
|
||||
- Clear acceptance criteria with standard patterns
|
||||
- Uses: AI-generated tests from requirements
|
||||
- Appropriate for: CRUD, auth, navigation, API tests
|
||||
- Fastest approach
|
||||
|
||||
**Recording Mode (OPTIONAL - Complex UI)**:
|
||||
- Complex UI interactions (drag-drop, wizards, multi-page flows)
|
||||
- Uses: Interactive test recording with Playwright MCP
|
||||
- Appropriate for: Visual workflows, unclear requirements
|
||||
- Only if config.tea_use_mcp_enhancements is true AND MCP available
|
||||
|
||||
2. **AI Generation Mode (DEFAULT - Continue to Step 2)**
|
||||
|
||||
For standard scenarios:
|
||||
- Continue with existing workflow (Step 2: Select Test Levels and Strategy)
|
||||
- AI generates tests based on acceptance criteria from Step 1
|
||||
- Use knowledge base patterns for test structure
|
||||
|
||||
3. **Recording Mode (OPTIONAL - Complex UI Only)**
|
||||
|
||||
For complex UI scenarios AND config.tea_use_mcp_enhancements is true:
|
||||
|
||||
**A. Check MCP Availability**
|
||||
|
||||
If Playwright MCP tools are available in your IDE:
|
||||
- Use MCP recording mode (Step 3.B)
|
||||
|
||||
If MCP unavailable:
|
||||
- Fall back to AI generation mode (silently and automatically)
|
||||
- Continue to Step 2
|
||||
|
||||
**B. Interactive Test Recording (MCP-Based)**
|
||||
|
||||
Use Playwright MCP test-generator tools:
|
||||
|
||||
**Setup:**
|
||||
|
||||
```
|
||||
1. Use generator_setup_page to initialize recording session
|
||||
2. Navigate to application starting URL (from story context)
|
||||
3. Ready to record user interactions
|
||||
```
|
||||
|
||||
**Recording Process (Per Acceptance Criterion):**
|
||||
|
||||
```
|
||||
4. Read acceptance criterion from story
|
||||
5. Manually execute test scenario using browser_* tools:
|
||||
- browser_navigate: Navigate to pages
|
||||
- browser_click: Click buttons, links, elements
|
||||
- browser_type: Fill form fields
|
||||
- browser_select: Select dropdown options
|
||||
- browser_check: Check/uncheck checkboxes
|
||||
6. Add verification steps using browser_verify_* tools:
|
||||
- browser_verify_text: Verify text content
|
||||
- browser_verify_visible: Verify element visibility
|
||||
- browser_verify_url: Verify URL navigation
|
||||
7. Capture interaction log with generator_read_log
|
||||
8. Generate test file with generator_write_test
|
||||
9. Repeat for next acceptance criterion
|
||||
```
|
||||
|
||||
**Post-Recording Enhancement:**
|
||||
|
||||
```
|
||||
10. Review generated test code
|
||||
11. Enhance with knowledge base patterns:
|
||||
- Add Given-When-Then comments
|
||||
- Replace recorded selectors with data-testid (if needed)
|
||||
- Add network-first interception (from network-first.md)
|
||||
- Add fixtures for auth/data setup (from fixture-architecture.md)
|
||||
- Use factories for test data (from data-factories.md)
|
||||
12. Verify tests fail (missing implementation)
|
||||
13. Continue to Step 4 (Build Data Infrastructure)
|
||||
```
|
||||
|
||||
**When to Use Recording Mode:**
|
||||
- ✅ Complex UI interactions (drag-drop, multi-step forms, wizards)
|
||||
- ✅ Visual workflows (modals, dialogs, animations)
|
||||
- ✅ Unclear requirements (exploratory, discovering expected behavior)
|
||||
- ✅ Multi-page flows (checkout, registration, onboarding)
|
||||
- ❌ NOT for simple CRUD (AI generation faster)
|
||||
- ❌ NOT for API-only tests (no UI to record)
|
||||
|
||||
**When to Use AI Generation (Default):**
|
||||
- ✅ Clear acceptance criteria available
|
||||
- ✅ Standard patterns (login, CRUD, navigation)
|
||||
- ✅ Need many tests quickly
|
||||
- ✅ API/backend tests (no UI interaction)
|
||||
|
||||
4. **Proceed to Test Level Selection**
|
||||
|
||||
After mode selection:
|
||||
- AI Generation: Continue to Step 2 (Select Test Levels and Strategy)
|
||||
- Recording: Skip to Step 4 (Build Data Infrastructure) - tests already generated
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Select Test Levels and Strategy
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Analyze Acceptance Criteria**
|
||||
|
||||
For each acceptance criterion, determine:
|
||||
- Does it require full user journey? → E2E test
|
||||
- Does it test business logic/API contract? → API test
|
||||
- Does it validate UI component behavior? → Component test
|
||||
- Can it be unit tested? → Unit test
|
||||
|
||||
2. **Apply Test Level Selection Framework**
|
||||
|
||||
**Knowledge Base Reference**: `test-levels-framework.md`
|
||||
|
||||
**E2E (End-to-End)**:
|
||||
- Critical user journeys (login, checkout, core workflow)
|
||||
- Multi-system integration
|
||||
- User-facing acceptance criteria
|
||||
- **Characteristics**: High confidence, slow execution, brittle
|
||||
|
||||
**API (Integration)**:
|
||||
- Business logic validation
|
||||
- Service contracts
|
||||
- Data transformations
|
||||
- **Characteristics**: Fast feedback, good balance, stable
|
||||
|
||||
**Component**:
|
||||
- UI component behavior (buttons, forms, modals)
|
||||
- Interaction testing
|
||||
- Visual regression
|
||||
- **Characteristics**: Fast, isolated, granular
|
||||
|
||||
**Unit**:
|
||||
- Pure business logic
|
||||
- Edge cases
|
||||
- Error handling
|
||||
- **Characteristics**: Fastest, most granular
|
||||
|
||||
3. **Avoid Duplicate Coverage**
|
||||
|
||||
Don't test the same behavior at multiple levels unless necessary:
|
||||
- Use E2E for critical happy path only
|
||||
- Use API tests for complex business logic variations
|
||||
- Use component tests for UI interaction edge cases
|
||||
- Use unit tests for pure logic edge cases
|
||||
|
||||
4. **Prioritize Tests**
|
||||
|
||||
If test-design document exists, align with priority levels:
|
||||
- P0 scenarios → Must cover in failing tests
|
||||
- P1 scenarios → Should cover if time permits
|
||||
- P2/P3 scenarios → Optional for this iteration
|
||||
|
||||
**Decision Point:** Set `primary_level` variable to main test level for this story (typically E2E or API)
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Generate Failing Tests
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Create Test File Structure**
|
||||
|
||||
```
|
||||
tests/
|
||||
├── e2e/
|
||||
│ └── {feature-name}.spec.ts # E2E acceptance tests
|
||||
├── api/
|
||||
│ └── {feature-name}.api.spec.ts # API contract tests
|
||||
├── component/
|
||||
│ └── {ComponentName}.test.tsx # Component tests
|
||||
└── support/
|
||||
├── fixtures/ # Test fixtures
|
||||
├── factories/ # Data factories
|
||||
└── helpers/ # Utility functions
|
||||
```
|
||||
|
||||
2. **Write Failing E2E Tests (If Applicable)**
|
||||
|
||||
**Use Given-When-Then format:**
|
||||
|
||||
```typescript
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('User Login', () => {
|
||||
test('should display error for invalid credentials', async ({ page }) => {
|
||||
// GIVEN: User is on login page
|
||||
await page.goto('/login');
|
||||
|
||||
// WHEN: User submits invalid credentials
|
||||
await page.fill('[data-testid="email-input"]', 'invalid@example.com');
|
||||
await page.fill('[data-testid="password-input"]', 'wrongpassword');
|
||||
await page.click('[data-testid="login-button"]');
|
||||
|
||||
// THEN: Error message is displayed
|
||||
await expect(page.locator('[data-testid="error-message"]')).toHaveText('Invalid email or password');
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Critical patterns:**
|
||||
- One assertion per test (atomic tests)
|
||||
- Explicit waits (no hard waits/sleeps)
|
||||
- Network-first approach (route interception before navigation)
|
||||
- data-testid selectors for stability
|
||||
- Clear Given-When-Then structure
|
||||
|
||||
3. **Apply Network-First Pattern**
|
||||
|
||||
**Knowledge Base Reference**: `network-first.md`
|
||||
|
||||
```typescript
|
||||
test('should load user dashboard after login', async ({ page }) => {
|
||||
// CRITICAL: Intercept routes BEFORE navigation
|
||||
await page.route('**/api/user', (route) =>
|
||||
route.fulfill({
|
||||
status: 200,
|
||||
body: JSON.stringify({ id: 1, name: 'Test User' }),
|
||||
}),
|
||||
);
|
||||
|
||||
// NOW navigate
|
||||
await page.goto('/dashboard');
|
||||
|
||||
await expect(page.locator('[data-testid="user-name"]')).toHaveText('Test User');
|
||||
});
|
||||
```
|
||||
|
||||
4. **Write Failing API Tests (If Applicable)**
|
||||
|
||||
```typescript
|
||||
import { test, expect } from '@playwright/test';
|
||||
|
||||
test.describe('User API', () => {
|
||||
test('POST /api/users - should create new user', async ({ request }) => {
|
||||
// GIVEN: Valid user data
|
||||
const userData = {
|
||||
email: 'newuser@example.com',
|
||||
name: 'New User',
|
||||
};
|
||||
|
||||
// WHEN: Creating user via API
|
||||
const response = await request.post('/api/users', {
|
||||
data: userData,
|
||||
});
|
||||
|
||||
// THEN: User is created successfully
|
||||
expect(response.status()).toBe(201);
|
||||
const body = await response.json();
|
||||
expect(body).toMatchObject({
|
||||
email: userData.email,
|
||||
name: userData.name,
|
||||
id: expect.any(Number),
|
||||
});
|
||||
});
|
||||
});
|
||||
```
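A complementary error-case sketch for the same endpoint, assuming the API returns `400` when the email field is missing (adjust to match the real contract):

```typescript
import { test, expect } from '@playwright/test';

test('POST /api/users - should reject a payload without email', async ({ request }) => {
  // GIVEN: Payload missing the required email field
  const userData = { name: 'New User' };

  // WHEN: Creating user via API
  const response = await request.post('/api/users', { data: userData });

  // THEN: Request is rejected with a validation error
  expect(response.status()).toBe(400);
});
```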
|
||||
|
||||
5. **Write Failing Component Tests (If Applicable)**
|
||||
|
||||
**Knowledge Base Reference**: `component-tdd.md`
|
||||
|
||||
```typescript
|
||||
import { test, expect } from '@playwright/experimental-ct-react';
|
||||
import { LoginForm } from './LoginForm';
|
||||
|
||||
test.describe('LoginForm Component', () => {
|
||||
test('should disable submit button when fields are empty', async ({ mount }) => {
|
||||
// GIVEN: LoginForm is mounted
|
||||
const component = await mount(<LoginForm />);
|
||||
|
||||
// WHEN: Form is initially rendered
|
||||
const submitButton = component.locator('button[type="submit"]');
|
||||
|
||||
// THEN: Submit button is disabled
|
||||
await expect(submitButton).toBeDisabled();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
6. **Verify Tests Fail Initially**
|
||||
|
||||
**Critical verification:**
|
||||
- Run tests locally to confirm they fail
|
||||
- Failure should be due to missing implementation, not test errors
|
||||
- Failure messages should be clear and actionable
|
||||
- All tests must be in RED phase before sharing with DEV
|
||||
|
||||
**Important:** Tests MUST fail initially. If a test passes before implementation, it's not a valid acceptance test.
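A minimal sketch of the verification run, assuming the `npm run test:e2e` script used elsewhere in this workflow:

```bash
# Run only the new spec and confirm it fails for the right reason
npm run test:e2e -- login.spec.ts

# Expected: assertion failures such as a missing [data-testid="error-message"] locator.
# Import errors, type errors, or selector typos indicate a test bug, not a valid RED state.
```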
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Build Data Infrastructure
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Create Data Factories**
|
||||
|
||||
**Knowledge Base Reference**: `data-factories.md`
|
||||
|
||||
```typescript
|
||||
// tests/support/factories/user.factory.ts
|
||||
import { faker } from '@faker-js/faker';
|
||||
|
||||
export const createUser = (overrides = {}) => ({
|
||||
id: faker.number.int(),
|
||||
email: faker.internet.email(),
|
||||
name: faker.person.fullName(),
|
||||
createdAt: faker.date.recent().toISOString(),
|
||||
...overrides,
|
||||
});
|
||||
|
||||
export const createUsers = (count: number) => Array.from({ length: count }, () => createUser());
|
||||
```
|
||||
|
||||
**Factory principles:**
|
||||
- Use faker for random data (no hardcoded values)
|
||||
- Support overrides for specific scenarios
|
||||
- Generate complete valid objects
|
||||
- Include helper functions for bulk creation
|
||||
|
||||
2. **Create Test Fixtures**
|
||||
|
||||
**Knowledge Base Reference**: `fixture-architecture.md`
|
||||
|
||||
```typescript
// tests/support/fixtures/auth.fixture.ts
import { test as base } from '@playwright/test';
import { createUser } from '../factories/user.factory';
// deleteUser is assumed to be a cleanup helper (e.g. an API call); define it alongside your factories.

export const test = base.extend({
  authenticatedUser: async ({ page }, use) => {
    // Setup: Create and authenticate user
    const user = await createUser();
    await page.goto('/login');
    await page.fill('[data-testid="email"]', user.email);
    await page.fill('[data-testid="password"]', 'password123');
    await page.click('[data-testid="login-button"]');
    await page.waitForURL('/dashboard');

    // Provide to test
    await use(user);

    // Cleanup: Delete user
    await deleteUser(user.id);
  },
});
```
|
||||
|
||||
**Fixture principles:**
|
||||
- Auto-cleanup (always delete created data)
|
||||
- Composable (fixtures can use other fixtures; see the `mergeTests` sketch below)
|
||||
- Isolated (each test gets fresh data)
|
||||
- Type-safe
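A minimal sketch of composing fixtures with Playwright's `mergeTests`, assuming a second fixture file exists alongside the auth fixture above (illustrative file names):

```typescript
// tests/support/fixtures/index.ts
import { mergeTests } from '@playwright/test';
import { test as authTest } from './auth.fixture';
import { test as dataTest } from './data.fixture'; // hypothetical second fixture file

// Tests that import this `test` receive the fixtures from both files.
export const test = mergeTests(authTest, dataTest);
export { expect } from '@playwright/test';
```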
|
||||
|
||||
3. **Document Mock Requirements**
|
||||
|
||||
If external services need mocking, document requirements:
|
||||
|
||||
```markdown
|
||||
### Mock Requirements for DEV Team
|
||||
|
||||
**Payment Gateway Mock**:
|
||||
|
||||
- Endpoint: `POST /api/payments`
|
||||
- Success response: `{ status: 'success', transactionId: '123' }`
|
||||
- Failure response: `{ status: 'failed', error: 'Insufficient funds' }`
|
||||
|
||||
**Email Service Mock**:
|
||||
|
||||
- Should not send real emails in test environment
|
||||
- Log email contents for verification
|
||||
```
|
||||
|
||||
4. **List Required data-testid Attributes**
|
||||
|
||||
```markdown
|
||||
### Required data-testid Attributes
|
||||
|
||||
**Login Page**:
|
||||
|
||||
- `email-input` - Email input field
|
||||
- `password-input` - Password input field
|
||||
- `login-button` - Submit button
|
||||
- `error-message` - Error message container
|
||||
|
||||
**Dashboard Page**:
|
||||
|
||||
- `user-name` - User name display
|
||||
- `logout-button` - Logout button
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Create Implementation Checklist
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Map Tests to Implementation Tasks**
|
||||
|
||||
For each failing test, create corresponding implementation task:
|
||||
|
||||
```markdown
|
||||
## Implementation Checklist
|
||||
|
||||
### Epic X - User Authentication
|
||||
|
||||
#### Test: User Login with Valid Credentials
|
||||
|
||||
- [ ] Create `/login` route
|
||||
- [ ] Implement login form component
|
||||
- [ ] Add email/password validation
|
||||
- [ ] Integrate authentication API
|
||||
- [ ] Add `data-testid` attributes: `email-input`, `password-input`, `login-button`
|
||||
- [ ] Implement error handling
|
||||
- [ ] Run test: `npm run test:e2e -- login.spec.ts`
|
||||
- [ ] ✅ Test passes (green phase)
|
||||
|
||||
#### Test: Display Error for Invalid Credentials
|
||||
|
||||
- [ ] Add error state management
|
||||
- [ ] Display error message UI
|
||||
- [ ] Add `data-testid="error-message"`
|
||||
- [ ] Run test: `npm run test:e2e -- login.spec.ts`
|
||||
- [ ] ✅ Test passes (green phase)
|
||||
```
|
||||
|
||||
2. **Include Red-Green-Refactor Guidance**
|
||||
|
||||
```markdown
|
||||
## Red-Green-Refactor Workflow
|
||||
|
||||
**RED Phase** (Complete):
|
||||
|
||||
- ✅ All tests written and failing
|
||||
- ✅ Fixtures and factories created
|
||||
- ✅ Mock requirements documented
|
||||
|
||||
**GREEN Phase** (DEV Team):
|
||||
|
||||
1. Pick one failing test
|
||||
2. Implement minimal code to make it pass
|
||||
3. Run test to verify green
|
||||
4. Move to next test
|
||||
5. Repeat until all tests pass
|
||||
|
||||
**REFACTOR Phase** (DEV Team):
|
||||
|
||||
1. All tests passing (green)
|
||||
2. Improve code quality
|
||||
3. Extract duplications
|
||||
4. Optimize performance
|
||||
5. Ensure tests still pass
|
||||
```
|
||||
|
||||
3. **Add Execution Commands**
|
||||
|
||||
````markdown
|
||||
## Running Tests
|
||||
|
||||
```bash
|
||||
# Run all failing tests
|
||||
npm run test:e2e
|
||||
|
||||
# Run specific test file
|
||||
npm run test:e2e -- login.spec.ts
|
||||
|
||||
# Run tests in headed mode (see browser)
|
||||
npm run test:e2e -- --headed
|
||||
|
||||
# Debug specific test
|
||||
npm run test:e2e -- login.spec.ts --debug
|
||||
```
|
||||
````
|
||||
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Generate Deliverables
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Create ATDD Checklist Document**
|
||||
|
||||
Use template structure at `{installed_path}/atdd-checklist-template.md`:
|
||||
- Story summary
|
||||
- Acceptance criteria breakdown
|
||||
- Test files created (with paths)
|
||||
- Data factories created
|
||||
- Fixtures created
|
||||
- Mock requirements
|
||||
- Required data-testid attributes
|
||||
- Implementation checklist
|
||||
- Red-green-refactor workflow
|
||||
- Execution commands
|
||||
|
||||
2. **Verify All Tests Fail**
|
||||
|
||||
Before finalizing:
|
||||
- Run full test suite locally
|
||||
- Confirm all tests in RED phase
|
||||
- Document expected failure messages
|
||||
- Ensure failures are due to missing implementation, not test bugs
|
||||
|
||||
3. **Write to Output File**
|
||||
|
||||
Save to `{output_folder}/atdd-checklist-{story_id}.md`
|
||||
|
||||
---
|
||||
|
||||
## Important Notes
|
||||
|
||||
### Red-Green-Refactor Cycle
|
||||
|
||||
**RED Phase** (TEA responsibility):
|
||||
|
||||
- Write failing tests first
|
||||
- Tests define expected behavior
|
||||
- Tests must fail for right reason (missing implementation)
|
||||
|
||||
**GREEN Phase** (DEV responsibility):
|
||||
|
||||
- Implement minimal code to pass tests
|
||||
- One test at a time
|
||||
- Don't over-engineer
|
||||
|
||||
**REFACTOR Phase** (DEV responsibility):
|
||||
|
||||
- Improve code quality with confidence
|
||||
- Tests provide safety net
|
||||
- Extract duplications, optimize
|
||||
|
||||
### Given-When-Then Structure
|
||||
|
||||
**GIVEN** (Setup):
|
||||
|
||||
- Arrange test preconditions
|
||||
- Create necessary data
|
||||
- Navigate to starting point
|
||||
|
||||
**WHEN** (Action):
|
||||
|
||||
- Execute the behavior being tested
|
||||
- Single action per test
|
||||
|
||||
**THEN** (Assertion):
|
||||
|
||||
- Verify expected outcome
|
||||
- One assertion per test (atomic)
|
||||
|
||||
### Network-First Testing
|
||||
|
||||
**Critical pattern:**
|
||||
|
||||
```typescript
|
||||
// ✅ CORRECT: Intercept BEFORE navigation
|
||||
await page.route('**/api/data', handler);
|
||||
await page.goto('/page');
|
||||
|
||||
// ❌ WRONG: Navigate then intercept (race condition)
|
||||
await page.goto('/page');
|
||||
await page.route('**/api/data', handler); // Too late!
|
||||
```
|
||||
|
||||
### Data Factory Best Practices
|
||||
|
||||
**Use faker for all test data:**
|
||||
|
||||
```typescript
|
||||
// ✅ CORRECT: Random data
|
||||
email: faker.internet.email();
|
||||
|
||||
// ❌ WRONG: Hardcoded data (collisions, maintenance burden)
|
||||
email: 'test@example.com';
|
||||
```
|
||||
|
||||
**Auto-cleanup principle:**
|
||||
|
||||
- Every factory that creates data must provide cleanup
|
||||
- Fixtures automatically cleanup in teardown
|
||||
- No manual cleanup in test code
|
||||
|
||||
### One Assertion Per Test
|
||||
|
||||
**Atomic test design:**
|
||||
|
||||
```typescript
|
||||
// ✅ CORRECT: One assertion
|
||||
test('should display user name', async ({ page }) => {
|
||||
await expect(page.locator('[data-testid="user-name"]')).toHaveText('John');
|
||||
});
|
||||
|
||||
// ❌ WRONG: Multiple assertions (not atomic)
|
||||
test('should display user info', async ({ page }) => {
|
||||
await expect(page.locator('[data-testid="user-name"]')).toHaveText('John');
|
||||
await expect(page.locator('[data-testid="user-email"]')).toHaveText('john@example.com');
|
||||
});
|
||||
```
|
||||
|
||||
**Why?** With multiple assertions, the first failure stops the test, so you never learn whether the remaining checks would have passed, and the failure report covers only part of the behavior.
|
||||
|
||||
### Component Test Strategy
|
||||
|
||||
**When to use component tests:**
|
||||
|
||||
- Complex UI interactions (drag-drop, keyboard nav)
|
||||
- Form validation logic
|
||||
- State management within component
|
||||
- Visual edge cases
|
||||
|
||||
**When NOT to use:**
|
||||
|
||||
- Simple rendering (snapshot tests are sufficient)
|
||||
- Integration with backend (use E2E or API tests)
|
||||
- Full user journeys (use E2E tests)
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
**Core Fragments (Auto-loaded in Step 1):**
|
||||
|
||||
- `fixture-architecture.md` - Pure function → fixture → mergeTests patterns (406 lines, 5 examples)
|
||||
- `data-factories.md` - Factory patterns with faker, overrides, API seeding (498 lines, 5 examples)
|
||||
- `component-tdd.md` - Red-green-refactor, provider isolation, accessibility, visual regression (480 lines, 4 examples)
|
||||
- `network-first.md` - Intercept before navigate, HAR capture, deterministic waiting (489 lines, 5 examples)
|
||||
- `test-quality.md` - Deterministic tests, cleanup, explicit assertions, length/time limits (658 lines, 5 examples)
|
||||
- `test-healing-patterns.md` - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
|
||||
- `selector-resilience.md` - Selector hierarchy (data-testid > ARIA > text > CSS), dynamic patterns, anti-patterns (541 lines, 4 examples)
|
||||
- `timing-debugging.md` - Race condition prevention, deterministic waiting, async debugging (370 lines, 3 examples)
|
||||
|
||||
**Reference for Test Level Selection:**
|
||||
|
||||
- `test-levels-framework.md` - E2E vs API vs Component vs Unit decision framework (467 lines, 4 examples)
|
||||
|
||||
**Manual Reference (Optional):**
|
||||
|
||||
- Use `tea-index.csv` to find additional specialized fragments as needed
|
||||
|
||||
---
|
||||
|
||||
## Output Summary
|
||||
|
||||
After completing this workflow, provide a summary:
|
||||
|
||||
```markdown
|
||||
## ATDD Complete - Tests in RED Phase
|
||||
|
||||
**Story**: {story_id}
|
||||
**Primary Test Level**: {primary_level}
|
||||
|
||||
**Failing Tests Created**:
|
||||
|
||||
- E2E tests: {e2e_count} tests in {e2e_files}
|
||||
- API tests: {api_count} tests in {api_files}
|
||||
- Component tests: {component_count} tests in {component_files}
|
||||
|
||||
**Supporting Infrastructure**:
|
||||
|
||||
- Data factories: {factory_count} factories created
|
||||
- Fixtures: {fixture_count} fixtures with auto-cleanup
|
||||
- Mock requirements: {mock_count} services documented
|
||||
|
||||
**Implementation Checklist**:
|
||||
|
||||
- Total tasks: {task_count}
|
||||
- Estimated effort: {effort_estimate} hours
|
||||
|
||||
**Required data-testid Attributes**: {data_testid_count} attributes documented
|
||||
|
||||
**Next Steps for DEV Team**:
|
||||
|
||||
1. Run failing tests: `npm run test:e2e`
|
||||
2. Review implementation checklist
|
||||
3. Implement one test at a time (RED → GREEN)
|
||||
4. Refactor with confidence (tests provide safety net)
|
||||
5. Share progress in daily standup
|
||||
|
||||
**Output File**: {output_file}
|
||||
**Manual Handoff**: Share `{output_file}` and failing tests with the dev workflow (not auto-consumed).
|
||||
|
||||
**Knowledge Base References Applied**:
|
||||
|
||||
- Fixture architecture patterns
|
||||
- Data factory patterns with faker
|
||||
- Network-first route interception
|
||||
- Component TDD strategies
|
||||
- Test quality principles
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Validation
|
||||
|
||||
After completing all steps, verify:
|
||||
|
||||
- [ ] Story acceptance criteria analyzed and mapped to tests
|
||||
- [ ] Appropriate test levels selected (E2E, API, Component)
|
||||
- [ ] All tests written in Given-When-Then format
|
||||
- [ ] All tests fail initially (RED phase verified)
|
||||
- [ ] Network-first pattern applied (route interception before navigation)
|
||||
- [ ] Data factories created with faker
|
||||
- [ ] Fixtures created with auto-cleanup
|
||||
- [ ] Mock requirements documented for DEV team
|
||||
- [ ] Required data-testid attributes listed
|
||||
- [ ] Implementation checklist created with clear tasks
|
||||
- [ ] Red-green-refactor workflow documented
|
||||
- [ ] Execution commands provided
|
||||
- [ ] Output file created and formatted correctly
|
||||
|
||||
Refer to `checklist.md` for comprehensive validation criteria.
|
||||
@@ -1,45 +0,0 @@
|
||||
# Test Architect workflow: atdd
|
||||
name: testarch-atdd
|
||||
description: "Generate failing acceptance tests before implementation using TDD red-green-refactor cycle"
|
||||
author: "BMad"
|
||||
|
||||
# Critical variables from config
|
||||
config_source: "{project-root}/_bmad/bmm/config.yaml"
|
||||
output_folder: "{config_source}:output_folder"
|
||||
user_name: "{config_source}:user_name"
|
||||
communication_language: "{config_source}:communication_language"
|
||||
document_output_language: "{config_source}:document_output_language"
|
||||
date: system-generated
|
||||
|
||||
# Workflow components
|
||||
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/atdd"
|
||||
instructions: "{installed_path}/instructions.md"
|
||||
validation: "{installed_path}/checklist.md"
|
||||
template: "{installed_path}/atdd-checklist-template.md"
|
||||
|
||||
# Variables and inputs
|
||||
variables:
|
||||
test_dir: "{project-root}/tests" # Root test directory
|
||||
|
||||
# Output configuration
|
||||
default_output_file: "{output_folder}/atdd-checklist-{story_id}.md"
|
||||
|
||||
# Required tools
|
||||
required_tools:
|
||||
- read_file # Read story markdown, framework config
|
||||
- write_file # Create test files, checklist, factory stubs
|
||||
- create_directory # Create test directories
|
||||
- list_files # Find existing fixtures and helpers
|
||||
- search_repo # Search for similar test patterns
|
||||
|
||||
tags:
|
||||
- qa
|
||||
- atdd
|
||||
- test-architect
|
||||
- tdd
|
||||
- red-green-refactor
|
||||
|
||||
execution_hints:
|
||||
interactive: false # Minimize prompts
|
||||
autonomous: true # Proceed without user input unless blocked
|
||||
iterative: true
|
||||
@@ -1,582 +0,0 @@
|
||||
# Automate Workflow Validation Checklist
|
||||
|
||||
Use this checklist to validate that the automate workflow has been executed correctly and all deliverables meet quality standards.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before starting this workflow, verify:
|
||||
|
||||
- [ ] Framework scaffolding configured (playwright.config.ts or cypress.config.ts exists)
|
||||
- [ ] Test directory structure exists (tests/ folder with subdirectories)
|
||||
- [ ] Package.json has test framework dependencies installed
|
||||
|
||||
**Halt only if:** Framework scaffolding is completely missing (run `framework` workflow first)
|
||||
|
||||
**Note:** BMad artifacts (story, tech-spec, PRD) are OPTIONAL - workflow can run without them
|
||||
**Note:** `automate` generates tests; it does not run `*atdd` or `*test-review`. If ATDD outputs exist, use them as input and avoid duplicate coverage.
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Execution Mode Determination and Context Loading
|
||||
|
||||
### Mode Detection
|
||||
|
||||
- [ ] Execution mode correctly determined:
|
||||
- [ ] BMad-Integrated Mode (story_file variable set) OR
|
||||
- [ ] Standalone Mode (target_feature or target_files set) OR
|
||||
- [ ] Auto-discover Mode (no targets specified)
|
||||
|
||||
### BMad Artifacts (If Available - OPTIONAL)
|
||||
|
||||
- [ ] Story markdown loaded (if `{story_file}` provided)
|
||||
- [ ] Acceptance criteria extracted from story (if available)
|
||||
- [ ] Tech-spec.md loaded (if `{use_tech_spec}` true and file exists)
|
||||
- [ ] Test-design.md loaded (if `{use_test_design}` true and file exists)
|
||||
- [ ] PRD.md loaded (if `{use_prd}` true and file exists)
|
||||
- [ ] **Note**: Absence of BMad artifacts does NOT halt workflow
|
||||
|
||||
### Framework Configuration
|
||||
|
||||
- [ ] Test framework config loaded (playwright.config.ts or cypress.config.ts)
|
||||
- [ ] Test directory structure identified from `{test_dir}`
|
||||
- [ ] Existing test patterns reviewed
|
||||
- [ ] Test runner capabilities noted (parallel execution, fixtures, etc.)
|
||||
|
||||
### Coverage Analysis
|
||||
|
||||
- [ ] Existing test files searched in `{test_dir}` (if `{analyze_coverage}` true)
|
||||
- [ ] Tested features vs untested features identified
|
||||
- [ ] Coverage gaps mapped (tests to source files)
|
||||
- [ ] Existing fixture and factory patterns checked
|
||||
|
||||
### Knowledge Base Fragments Loaded
|
||||
|
||||
- [ ] `test-levels-framework.md` - Test level selection
|
||||
- [ ] `test-priorities.md` - Priority classification (P0-P3)
|
||||
- [ ] `fixture-architecture.md` - Fixture patterns with auto-cleanup
|
||||
- [ ] `data-factories.md` - Factory patterns using faker
|
||||
- [ ] `selective-testing.md` - Targeted test execution strategies
|
||||
- [ ] `ci-burn-in.md` - Flaky test detection patterns
|
||||
- [ ] `test-quality.md` - Test design principles
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Automation Targets Identification
|
||||
|
||||
### Target Determination
|
||||
|
||||
**BMad-Integrated Mode (if story available):**
|
||||
|
||||
- [ ] Acceptance criteria mapped to test scenarios
|
||||
- [ ] Features implemented in story identified
|
||||
- [ ] Existing ATDD tests checked (if any)
|
||||
- [ ] Expansion beyond ATDD planned (edge cases, negative paths)
|
||||
|
||||
**Standalone Mode (if no story):**
|
||||
|
||||
- [ ] Specific feature analyzed (if `{target_feature}` specified)
|
||||
- [ ] Specific files analyzed (if `{target_files}` specified)
|
||||
- [ ] Features auto-discovered (if `{auto_discover_features}` true)
|
||||
- [ ] Features prioritized by:
|
||||
- [ ] No test coverage (highest priority)
|
||||
- [ ] Complex business logic
|
||||
- [ ] External integrations (API, database, auth)
|
||||
- [ ] Critical user paths (login, checkout, etc.)
|
||||
|
||||
### Test Level Selection
|
||||
|
||||
- [ ] Test level selection framework applied (from `test-levels-framework.md`)
|
||||
- [ ] E2E tests identified: Critical user journeys, multi-system integration
|
||||
- [ ] API tests identified: Business logic, service contracts, data transformations
|
||||
- [ ] Component tests identified: UI behavior, interactions, state management
|
||||
- [ ] Unit tests identified: Pure logic, edge cases, error handling
|
||||
|
||||
### Duplicate Coverage Avoidance
|
||||
|
||||
- [ ] Same behavior NOT tested at multiple levels unnecessarily
|
||||
- [ ] E2E used for critical happy path only
|
||||
- [ ] API tests used for business logic variations
|
||||
- [ ] Component tests used for UI interaction edge cases
|
||||
- [ ] Unit tests used for pure logic edge cases
|
||||
|
||||
### Priority Assignment
|
||||
|
||||
- [ ] Test priorities assigned using `test-priorities.md` framework
|
||||
- [ ] P0 tests: Critical paths, security-critical, data integrity
|
||||
- [ ] P1 tests: Important features, integration points, error handling
|
||||
- [ ] P2 tests: Edge cases, less-critical variations, performance
|
||||
- [ ] P3 tests: Nice-to-have, rarely-used features, exploratory
|
||||
- [ ] Priority variables respected:
|
||||
- [ ] `{include_p0}` = true (always include)
|
||||
- [ ] `{include_p1}` = true (high priority)
|
||||
- [ ] `{include_p2}` = true (medium priority)
|
||||
- [ ] `{include_p3}` = false (low priority, skip by default)
|
||||
|
||||
### Coverage Plan Created
|
||||
|
||||
- [ ] Test coverage plan documented
|
||||
- [ ] What will be tested at each level listed
|
||||
- [ ] Priorities assigned to each test
|
||||
- [ ] Coverage strategy clear (critical-paths, comprehensive, or selective)
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Test Infrastructure Generated
|
||||
|
||||
### Fixture Architecture
|
||||
|
||||
- [ ] Existing fixtures checked in `tests/support/fixtures/`
|
||||
- [ ] Fixture architecture created/enhanced (if `{generate_fixtures}` true)
|
||||
- [ ] All fixtures use Playwright's `test.extend()` pattern
|
||||
- [ ] All fixtures have auto-cleanup in teardown (see the sketch below)
|
||||
- [ ] Common fixtures created/enhanced:
|
||||
- [ ] authenticatedUser (with auto-delete)
|
||||
- [ ] apiRequest (authenticated client)
|
||||
- [ ] mockNetwork (external service mocking)
|
||||
- [ ] testDatabase (with auto-cleanup)
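A minimal sketch of the fixture pattern described above, assuming Playwright's `test.extend()`; the factory and cleanup helpers are hypothetical stand-ins for project-specific code:

```typescript
import { test as base } from '@playwright/test';
import { createUser, deleteUser, type User } from '../factories/user.factory'; // hypothetical helpers

// Fixture provides a ready-to-use user and deletes it in teardown (auto-cleanup)
export const test = base.extend<{ authenticatedUser: User }>({
  authenticatedUser: async ({ request }, use) => {
    const user = createUser();
    await request.post('/api/users', { data: user }); // setup: register the user via a hypothetical API
    await use(user);                                   // test body runs here
    await deleteUser(request, user);                   // teardown: remove the test data
  },
});
```

Cleanup after `await use(...)` runs even when the test fails, which is what keeps suites self-cleaning.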
|
||||
|
||||
### Data Factories
|
||||
|
||||
- [ ] Existing factories checked in `tests/support/factories/`
|
||||
- [ ] Factory architecture created/enhanced (if `{generate_factories}` true)
|
||||
- [ ] All factories use `@faker-js/faker` for random data (no hardcoded values; see the sketch below)
|
||||
- [ ] All factories support overrides for specific scenarios
|
||||
- [ ] Common factories created/enhanced:
|
||||
- [ ] User factory (email, password, name, role)
|
||||
- [ ] Product factory (name, price, SKU)
|
||||
- [ ] Order factory (items, total, status)
|
||||
- [ ] Cleanup helpers provided (e.g., deleteUser(), deleteProduct())
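A minimal factory sketch for the items above, assuming `@faker-js/faker`; the `User` shape is illustrative:

```typescript
import { faker } from '@faker-js/faker';

export interface User {
  email: string;
  password: string;
  name: string;
  role: 'admin' | 'member';
}

// Random data by default, with overrides for scenario-specific values
export function createUser(overrides: Partial<User> = {}): User {
  return {
    email: faker.internet.email(),
    password: faker.internet.password({ length: 16 }),
    name: faker.person.fullName(),
    role: 'member',
    ...overrides,
  };
}

export function createUsers(count: number): User[] {
  return Array.from({ length: count }, () => createUser());
}
```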
|
||||
|
||||
### Helper Utilities
|
||||
|
||||
- [ ] Existing helpers checked in `tests/support/helpers/` (if `{update_helpers}` true)
|
||||
- [ ] Common utilities created/enhanced:
|
||||
- [ ] waitFor (polling for complex conditions; see the sketch below)
|
||||
- [ ] retry (retry helper for flaky operations)
|
||||
- [ ] testData (test data generation)
|
||||
- [ ] assertions (custom assertion helpers)
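A minimal sketch of the `waitFor` helper mentioned above; the default timeout and polling interval are illustrative choices:

```typescript
// Poll an async condition until it returns true or the timeout elapses
export async function waitFor(
  condition: () => Promise<boolean>,
  { timeoutMs = 10_000, intervalMs = 250 } = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await condition()) return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`waitFor: condition not met within ${timeoutMs}ms`);
}
```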
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Test Files Generated
|
||||
|
||||
### Test File Structure
|
||||
|
||||
- [ ] Test files organized correctly:
|
||||
- [ ] `tests/e2e/` for E2E tests
|
||||
- [ ] `tests/api/` for API tests
|
||||
- [ ] `tests/component/` for component tests
|
||||
- [ ] `tests/unit/` for unit tests
|
||||
- [ ] `tests/support/` for fixtures/factories/helpers
|
||||
|
||||
### E2E Tests (If Applicable)
|
||||
|
||||
- [ ] E2E test files created in `tests/e2e/`
|
||||
- [ ] All tests follow Given-When-Then format
|
||||
- [ ] All tests have priority tags ([P0], [P1], [P2], [P3]) in test name
|
||||
- [ ] All tests use data-testid selectors (not CSS classes)
|
||||
- [ ] One assertion per test (atomic design)
|
||||
- [ ] No hard waits or sleeps (explicit waits only)
|
||||
- [ ] Network-first pattern applied (route interception BEFORE navigation)
|
||||
- [ ] Clear Given-When-Then comments in test code
|
||||
|
||||
### API Tests (If Applicable)
|
||||
|
||||
- [ ] API test files created in `tests/api/`
|
||||
- [ ] All tests follow Given-When-Then format
|
||||
- [ ] All tests have priority tags in test name
|
||||
- [ ] API contracts validated (request/response structure)
|
||||
- [ ] HTTP status codes verified
|
||||
- [ ] Response body validation includes required fields
|
||||
- [ ] Error cases tested (400, 401, 403, 404, 500; see the sketch below)
|
||||
- [ ] JWT token format validated (if auth tests)
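A minimal sketch of an API error-case test, assuming Playwright's built-in `request` fixture; the endpoint and payload are illustrative:

```typescript
import { test, expect } from '@playwright/test';

test('[P1] login rejects unknown credentials with 401', async ({ request }) => {
  // Given: credentials that do not exist
  const payload = { email: 'nobody@example.com', password: 'wrong-password' };

  // When: the login endpoint is called
  const response = await request.post('/api/auth/login', { data: payload });

  // Then: the API returns 401 with an error body
  expect(response.status()).toBe(401);
  expect(await response.json()).toHaveProperty('error');
});
```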
|
||||
|
||||
### Component Tests (If Applicable)
|
||||
|
||||
- [ ] Component test files created in `tests/component/`
|
||||
- [ ] All tests follow Given-When-Then format
|
||||
- [ ] All tests have priority tags in test name
|
||||
- [ ] Component mounting works correctly
|
||||
- [ ] Interaction testing covers user actions (click, hover, keyboard)
|
||||
- [ ] State management validated
|
||||
- [ ] Props and events tested
|
||||
|
||||
### Unit Tests (If Applicable)
|
||||
|
||||
- [ ] Unit test files created in `tests/unit/`
|
||||
- [ ] All tests follow Given-When-Then format
|
||||
- [ ] All tests have priority tags in test name
|
||||
- [ ] Pure logic tested (no dependencies)
|
||||
- [ ] Edge cases covered
|
||||
- [ ] Error handling tested
|
||||
|
||||
### Quality Standards Enforced
|
||||
|
||||
- [ ] All tests use Given-When-Then format with clear comments
|
||||
- [ ] All tests have descriptive names with priority tags
|
||||
- [ ] No duplicate tests (same behavior tested multiple times)
|
||||
- [ ] No flaky patterns (race conditions, timing issues)
|
||||
- [ ] No test interdependencies (tests can run in any order)
|
||||
- [ ] Tests are deterministic (same input always produces same result)
|
||||
- [ ] All tests use data-testid selectors (E2E tests)
|
||||
- [ ] No hard waits: `await page.waitForTimeout()` (forbidden)
|
||||
- [ ] No conditional flow: `if (await element.isVisible())` (forbidden)
|
||||
- [ ] No try-catch for test logic (only for cleanup)
|
||||
- [ ] No hardcoded test data (use factories with faker)
|
||||
- [ ] No page object classes (tests are direct and simple)
|
||||
- [ ] No shared state between tests
|
||||
|
||||
### Network-First Pattern Applied
|
||||
|
||||
- [ ] Route interception set up BEFORE navigation (E2E tests with network requests)
|
||||
- [ ] `page.route()` called before `page.goto()` to prevent race conditions (see the sketch below)
|
||||
- [ ] Network-first pattern verified in all E2E tests that make API calls
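A minimal sketch of the network-first pattern, assuming Playwright; the route pattern and mocked body are illustrative:

```typescript
import { test, expect } from '@playwright/test';

test('[P0] dashboard renders orders from the API', async ({ page }) => {
  // Network-first: register the route BEFORE navigating so the page's first request is intercepted
  await page.route('**/api/orders', (route) =>
    route.fulfill({ status: 200, json: [{ id: 1, total: 42 }] }),
  );

  await page.goto('/dashboard');

  await expect(page.getByTestId('order-row')).toHaveCount(1);
});
```

Registering the route first guarantees the mock is in place before the initial page load fires its API calls.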
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Test Validation and Healing (NEW - Phase 2.5)
|
||||
|
||||
### Healing Configuration
|
||||
|
||||
- [ ] Healing configuration checked:
|
||||
- [ ] `{auto_validate}` setting noted (default: true)
|
||||
- [ ] `{auto_heal_failures}` setting noted (default: false)
|
||||
- [ ] `{max_healing_iterations}` setting noted (default: 3)
|
||||
- [ ] `{use_mcp_healing}` setting noted (default: true)
|
||||
|
||||
### Healing Knowledge Fragments Loaded (If Healing Enabled)
|
||||
|
||||
- [ ] `test-healing-patterns.md` loaded (common failure patterns and fixes)
|
||||
- [ ] `selector-resilience.md` loaded (selector refactoring guide)
|
||||
- [ ] `timing-debugging.md` loaded (race condition fixes)
|
||||
|
||||
### Test Execution and Validation
|
||||
|
||||
- [ ] Generated tests executed (if `{auto_validate}` true)
|
||||
- [ ] Test results captured:
|
||||
- [ ] Total tests run
|
||||
- [ ] Passing tests count
|
||||
- [ ] Failing tests count
|
||||
- [ ] Error messages and stack traces captured
|
||||
|
||||
### Healing Loop (If Enabled and Tests Failed)
|
||||
|
||||
- [ ] Healing loop entered (if `{auto_heal_failures}` true AND tests failed)
|
||||
- [ ] For each failing test:
|
||||
- [ ] Failure pattern identified (selector, timing, data, network, hard wait)
|
||||
- [ ] Appropriate healing strategy applied:
|
||||
- [ ] Stale selector → Replaced with data-testid or ARIA role
|
||||
- [ ] Race condition → Added network-first interception or state waits
|
||||
- [ ] Dynamic data → Replaced hardcoded values with regex/dynamic generation
|
||||
- [ ] Network error → Added route mocking
|
||||
- [ ] Hard wait → Replaced with event-based wait
|
||||
- [ ] Healed test re-run to validate fix
|
||||
- [ ] Iteration count tracked (max 3 attempts)
|
||||
|
||||
### Unfixable Tests Handling
|
||||
|
||||
- [ ] Tests that couldn't be healed after 3 iterations marked with `test.fixme()` (if `{mark_unhealable_as_fixme}` true; see the sketch below)
|
||||
- [ ] Detailed comment added to test.fixme() tests:
|
||||
- [ ] What failure occurred
|
||||
- [ ] What healing was attempted (3 iterations)
|
||||
- [ ] Why healing failed
|
||||
- [ ] Manual investigation steps needed
|
||||
- [ ] Original test logic preserved in comments
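A minimal sketch of how an unhealable test might be marked, following the comment structure above; the failure details are illustrative:

```typescript
import { test } from '@playwright/test';

// FIXME: fails with "strict mode violation: getByTestId('save') resolved to 2 elements"
// Healing attempted (3 iterations): scoped the selector, added a role filter, waited for network idle
// Why healing failed: duplicate data-testid rendered by both the legacy and the new toolbar
// Manual investigation: deduplicate the data-testid or scope the test to a single toolbar
test.fixme('[P1] user can save a draft', async ({ page }) => {
  // Original test logic preserved for reference
  await page.goto('/editor');
  await page.getByTestId('save').click();
});
```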
|
||||
|
||||
### Healing Report Generated
|
||||
|
||||
- [ ] Healing report generated (if healing attempted)
|
||||
- [ ] Report includes:
|
||||
- [ ] Auto-heal enabled status
|
||||
- [ ] Healing mode (MCP-assisted or Pattern-based)
|
||||
- [ ] Iterations allowed (max_healing_iterations)
|
||||
- [ ] Validation results (total, passing, failing)
|
||||
- [ ] Successfully healed tests (count, file:line, fix applied)
|
||||
- [ ] Unable to heal tests (count, file:line, reason)
|
||||
- [ ] Healing patterns applied (selector fixes, timing fixes, data fixes)
|
||||
- [ ] Knowledge base references used
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Documentation and Scripts Updated
|
||||
|
||||
### Test README Updated
|
||||
|
||||
- [ ] `tests/README.md` created or updated (if `{update_readme}` true)
|
||||
- [ ] Test suite structure overview included
|
||||
- [ ] Test execution instructions provided (all, specific files, by priority)
|
||||
- [ ] Fixture usage examples provided
|
||||
- [ ] Factory usage examples provided
|
||||
- [ ] Priority tagging convention explained ([P0], [P1], [P2], [P3])
|
||||
- [ ] How to write new tests documented
|
||||
- [ ] Common patterns documented
|
||||
- [ ] Anti-patterns documented (what to avoid)
|
||||
|
||||
### package.json Scripts Updated
|
||||
|
||||
- [ ] package.json scripts added/updated (if `{update_package_scripts}` true)
|
||||
- [ ] `test:e2e` script for all E2E tests
|
||||
- [ ] `test:e2e:p0` script for P0 tests only
|
||||
- [ ] `test:e2e:p1` script for P0 + P1 tests
|
||||
- [ ] `test:api` script for API tests
|
||||
- [ ] `test:component` script for component tests
|
||||
- [ ] `test:unit` script for unit tests (if applicable)
|
||||
|
||||
### Test Suite Executed
|
||||
|
||||
- [ ] Test suite run locally (if `{run_tests_after_generation}` true)
|
||||
- [ ] Test results captured (passing/failing counts)
|
||||
- [ ] No flaky patterns detected (tests are deterministic)
|
||||
- [ ] Setup requirements documented (if any)
|
||||
- [ ] Known issues documented (if any)
|
||||
|
||||
---
|
||||
|
||||
## Step 7: Automation Summary Generated
|
||||
|
||||
### Automation Summary Document
|
||||
|
||||
- [ ] Output file created at `{output_summary}`
|
||||
- [ ] Document includes execution mode (BMad-Integrated, Standalone, Auto-discover)
|
||||
- [ ] Feature analysis included (source files, coverage gaps) - Standalone mode
|
||||
- [ ] Tests created listed (E2E, API, Component, Unit) with counts and paths
|
||||
- [ ] Infrastructure created listed (fixtures, factories, helpers)
|
||||
- [ ] Test execution instructions provided
|
||||
- [ ] Coverage analysis included:
|
||||
- [ ] Total test count
|
||||
- [ ] Priority breakdown (P0, P1, P2, P3 counts)
|
||||
- [ ] Test level breakdown (E2E, API, Component, Unit counts)
|
||||
- [ ] Coverage percentage (if calculated)
|
||||
- [ ] Coverage status (acceptance criteria covered, gaps identified)
|
||||
- [ ] Definition of Done checklist included
|
||||
- [ ] Next steps provided
|
||||
- [ ] Recommendations included (if Standalone mode)
|
||||
|
||||
### Summary Provided to User
|
||||
|
||||
- [ ] Concise summary output provided
|
||||
- [ ] Total tests created across test levels
|
||||
- [ ] Priority breakdown (P0, P1, P2, P3 counts)
|
||||
- [ ] Infrastructure counts (fixtures, factories, helpers)
|
||||
- [ ] Test execution command provided
|
||||
- [ ] Output file path provided
|
||||
- [ ] Next steps listed
|
||||
|
||||
---
|
||||
|
||||
## Quality Checks
|
||||
|
||||
### Test Design Quality
|
||||
|
||||
- [ ] Tests are readable (clear Given-When-Then structure)
|
||||
- [ ] Tests are maintainable (use factories/fixtures, not hardcoded data)
|
||||
- [ ] Tests are isolated (no shared state between tests)
|
||||
- [ ] Tests are deterministic (no race conditions or flaky patterns)
|
||||
- [ ] Tests are atomic (one assertion per test)
|
||||
- [ ] Tests are fast (no unnecessary waits or delays)
|
||||
- [ ] Tests are lean (files under {max_file_lines} lines)
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
- [ ] Test level selection framework applied (from `test-levels-framework.md`)
|
||||
- [ ] Priority classification applied (from `test-priorities.md`)
|
||||
- [ ] Fixture architecture patterns applied (from `fixture-architecture.md`)
|
||||
- [ ] Data factory patterns applied (from `data-factories.md`)
|
||||
- [ ] Selective testing strategies considered (from `selective-testing.md`)
|
||||
- [ ] Flaky test detection patterns considered (from `ci-burn-in.md`)
|
||||
- [ ] Test quality principles applied (from `test-quality.md`)
|
||||
|
||||
### Code Quality
|
||||
|
||||
- [ ] All TypeScript types are correct and complete
|
||||
- [ ] No linting errors in generated test files
|
||||
- [ ] Consistent naming conventions followed
|
||||
- [ ] Imports are organized and correct
|
||||
- [ ] Code follows project style guide
|
||||
- [ ] No console.log or debug statements in test code
|
||||
|
||||
---
|
||||
|
||||
## Integration Points
|
||||
|
||||
### With Framework Workflow
|
||||
|
||||
- [ ] Test framework configuration detected and used
|
||||
- [ ] Directory structure matches framework setup
|
||||
- [ ] Fixtures and helpers follow established patterns
|
||||
- [ ] Naming conventions consistent with framework standards
|
||||
|
||||
### With BMad Workflows (If Available - OPTIONAL)
|
||||
|
||||
**With Story Workflow:**
|
||||
|
||||
- [ ] Story ID correctly referenced in output (if story available)
|
||||
- [ ] Acceptance criteria from story reflected in tests (if story available)
|
||||
- [ ] Technical constraints from story considered (if story available)
|
||||
|
||||
**With test-design Workflow:**
|
||||
|
||||
- [ ] P0 scenarios from test-design prioritized (if test-design available)
|
||||
- [ ] Risk assessment from test-design considered (if test-design available)
|
||||
- [ ] Coverage strategy aligned with test-design (if test-design available)
|
||||
|
||||
**With atdd Workflow:**
|
||||
|
||||
- [ ] ATDD artifacts provided or located (manual handoff; `atdd` not auto-run)
|
||||
- [ ] Existing ATDD tests checked (if story had ATDD workflow run)
|
||||
- [ ] Expansion beyond ATDD planned (edge cases, negative paths)
|
||||
- [ ] No duplicate coverage with ATDD tests
|
||||
|
||||
### With CI Pipeline
|
||||
|
||||
- [ ] Tests can run in CI environment
|
||||
- [ ] Tests are parallelizable (no shared state)
|
||||
- [ ] Tests have appropriate timeouts
|
||||
- [ ] Tests clean up their data (no CI environment pollution)
|
||||
|
||||
---
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
All of the following must be true before marking this workflow as complete:
|
||||
|
||||
- [ ] **Execution mode determined** (BMad-Integrated, Standalone, or Auto-discover)
|
||||
- [ ] **Framework configuration loaded** and validated
|
||||
- [ ] **Coverage analysis completed** (gaps identified if analyze_coverage true)
|
||||
- [ ] **Automation targets identified** (what needs testing)
|
||||
- [ ] **Test levels selected** appropriately (E2E, API, Component, Unit)
|
||||
- [ ] **Duplicate coverage avoided** (same behavior not tested at multiple levels)
|
||||
- [ ] **Test priorities assigned** (P0, P1, P2, P3)
|
||||
- [ ] **Fixture architecture created/enhanced** with auto-cleanup
|
||||
- [ ] **Data factories created/enhanced** using faker (no hardcoded data)
|
||||
- [ ] **Helper utilities created/enhanced** (if needed)
|
||||
- [ ] **Test files generated** at appropriate levels (E2E, API, Component, Unit)
|
||||
- [ ] **Given-When-Then format used** consistently across all tests
|
||||
- [ ] **Priority tags added** to all test names ([P0], [P1], [P2], [P3])
|
||||
- [ ] **data-testid selectors used** in E2E tests (not CSS classes)
|
||||
- [ ] **Network-first pattern applied** (route interception before navigation)
|
||||
- [ ] **Quality standards enforced** (no hard waits, no flaky patterns, self-cleaning, deterministic)
|
||||
- [ ] **Test README updated** with execution instructions and patterns
|
||||
- [ ] **package.json scripts updated** with test execution commands
|
||||
- [ ] **Test suite run locally** (if run_tests_after_generation true)
|
||||
- [ ] **Tests validated** (if auto_validate enabled)
|
||||
- [ ] **Failures healed** (if auto_heal_failures enabled and tests failed)
|
||||
- [ ] **Healing report generated** (if healing attempted)
|
||||
- [ ] **Unfixable tests marked** with test.fixme() and detailed comments (if any)
|
||||
- [ ] **Automation summary created** and saved to correct location
|
||||
- [ ] **Output file formatted correctly**
|
||||
- [ ] **Knowledge base references applied** and documented (including healing fragments if used)
|
||||
- [ ] **No test quality issues** (flaky patterns, race conditions, hardcoded data, page objects)
|
||||
|
||||
---
|
||||
|
||||
## Common Issues and Resolutions
|
||||
|
||||
### Issue: BMad artifacts not found
|
||||
|
||||
**Problem:** Story, tech-spec, or PRD files not found when variables are set.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- **automate does NOT require BMad artifacts** - they are OPTIONAL enhancements
|
||||
- If files not found, switch to Standalone Mode automatically
|
||||
- Analyze source code directly without BMad context
|
||||
- Continue workflow without halting
|
||||
|
||||
### Issue: Framework configuration not found
|
||||
|
||||
**Problem:** No playwright.config.ts or cypress.config.ts found.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- **HALT workflow** - framework is required
|
||||
- Message: "Framework scaffolding required. Run `bmad tea *framework` first."
|
||||
- User must run framework workflow before automate
|
||||
|
||||
### Issue: No automation targets identified
|
||||
|
||||
**Problem:** Neither story, target_feature, nor target_files specified, and auto-discover finds nothing.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Check if source_dir variable is correct
|
||||
- Verify source code exists in project
|
||||
- Ask user to specify target_feature or target_files explicitly
|
||||
- Provide examples: `target_feature: "src/auth/"` or `target_files: "src/auth/login.ts,src/auth/session.ts"`
|
||||
|
||||
### Issue: Duplicate coverage detected
|
||||
|
||||
**Problem:** Same behavior tested at multiple levels (E2E + API + Component).
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Review test level selection framework (test-levels-framework.md)
|
||||
- Use E2E for critical happy path ONLY
|
||||
- Use API for business logic variations
|
||||
- Use Component for UI edge cases
|
||||
- Remove redundant tests that duplicate coverage
|
||||
|
||||
### Issue: Tests have hardcoded data
|
||||
|
||||
**Problem:** Tests use hardcoded email addresses, passwords, or other data.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Replace all hardcoded data with factory function calls
|
||||
- Use faker for all random data generation
|
||||
- Update data-factories to support all required test scenarios
|
||||
- Example: `createUser({ email: faker.internet.email() })`
|
||||
|
||||
### Issue: Tests are flaky
|
||||
|
||||
**Problem:** Tests fail intermittently, pass on retry.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Remove all hard waits (`page.waitForTimeout()`)
|
||||
- Use explicit waits (`page.waitForSelector()`) or web-first assertions (see the sketch below)
|
||||
- Apply network-first pattern (route interception before navigation)
|
||||
- Remove conditional flow (`if (await element.isVisible())`)
|
||||
- Ensure tests are deterministic (no race conditions)
|
||||
- Run burn-in loop (10 iterations) to detect flakiness
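A minimal before/after sketch of removing a hard wait, assuming Playwright; the selector is illustrative:

```typescript
import { expect, type Page } from '@playwright/test';

// Before (flaky): a fixed sleep races against slow responses
// await page.waitForTimeout(3000);
// await page.click('[data-testid="results"]');

// After (deterministic): wait for the actual condition, then act
export async function openFirstResult(page: Page): Promise<void> {
  await expect(page.getByTestId('results')).toBeVisible(); // auto-retries until the element appears
  await page.getByTestId('results').click();
}
```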
|
||||
|
||||
### Issue: Fixtures don't clean up data
|
||||
|
||||
**Problem:** Test data persists after test run, causing test pollution.
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Ensure all fixtures have cleanup in teardown phase
|
||||
- Cleanup happens AFTER `await use(data)`
|
||||
- Call deletion/cleanup functions (deleteUser, deleteProduct, etc.)
|
||||
- Verify cleanup works by checking database/storage after test run
|
||||
|
||||
### Issue: Tests too slow
|
||||
|
||||
**Problem:** Tests take longer than 90 seconds (max_test_duration).
|
||||
|
||||
**Resolution:**
|
||||
|
||||
- Remove unnecessary waits and delays
|
||||
- Use parallel execution where possible
|
||||
- Mock external services (don't make real API calls)
|
||||
- Use API tests instead of E2E for business logic
|
||||
- Optimize test data creation (use in-memory database, etc.)
|
||||
|
||||
---
|
||||
|
||||
## Notes for TEA Agent
|
||||
|
||||
- **automate is flexible:** Can work with or without BMad artifacts (story, tech-spec, PRD are OPTIONAL)
|
||||
- **Standalone mode is powerful:** Analyze any codebase and generate tests independently
|
||||
- **Auto-discover mode:** Scan codebase for features needing tests when no targets specified
|
||||
- **Framework is the ONLY hard requirement:** HALT if framework config missing, otherwise proceed
|
||||
- **Avoid duplicate coverage:** E2E for critical paths only, API/Component for variations
|
||||
- **Priority tagging enables selective execution:** P0 tests run on every commit, P1 on PR, P2 nightly
|
||||
- **Network-first pattern prevents race conditions:** Route interception BEFORE navigation
|
||||
- **No page objects:** Keep tests simple, direct, and maintainable
|
||||
- **Use knowledge base:** Load relevant fragments (test-levels, test-priorities, fixture-architecture, data-factories, healing patterns) for guidance
|
||||
- **Deterministic tests only:** No hard waits, no conditional flow, no flaky patterns allowed
|
||||
- **Optional healing:** auto_heal_failures disabled by default (opt-in for automatic test healing)
|
||||
- **Graceful degradation:** Healing works without Playwright MCP (pattern-based fallback)
|
||||
- **Unfixable tests handled:** Mark with test.fixme() and detailed comments (not silently broken)
|
||||
File diff suppressed because it is too large
@@ -1,52 +0,0 @@
|
||||
# Test Architect workflow: automate
|
||||
name: testarch-automate
|
||||
description: "Expand test automation coverage after implementation or analyze existing codebase to generate comprehensive test suite"
|
||||
author: "BMad"
|
||||
|
||||
# Critical variables from config
|
||||
config_source: "{project-root}/_bmad/bmm/config.yaml"
|
||||
output_folder: "{config_source}:output_folder"
|
||||
user_name: "{config_source}:user_name"
|
||||
communication_language: "{config_source}:communication_language"
|
||||
document_output_language: "{config_source}:document_output_language"
|
||||
date: system-generated
|
||||
|
||||
# Workflow components
|
||||
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/automate"
|
||||
instructions: "{installed_path}/instructions.md"
|
||||
validation: "{installed_path}/checklist.md"
|
||||
template: false
|
||||
|
||||
# Variables and inputs
|
||||
variables:
|
||||
# Execution mode and targeting
|
||||
standalone_mode: true # Can work without BMad artifacts (true) or integrate with BMad (false)
|
||||
coverage_target: "critical-paths" # critical-paths, comprehensive, selective
|
||||
|
||||
# Directory paths
|
||||
test_dir: "{project-root}/tests" # Root test directory
|
||||
source_dir: "{project-root}/src" # Source code directory
|
||||
|
||||
# Output configuration
|
||||
default_output_file: "{output_folder}/automation-summary.md"
|
||||
|
||||
# Required tools
|
||||
required_tools:
|
||||
- read_file # Read source code, existing tests, BMad artifacts
|
||||
- write_file # Create test files, fixtures, factories, summaries
|
||||
- create_directory # Create test directories
|
||||
- list_files # Discover features and existing tests
|
||||
- search_repo # Find coverage gaps and patterns
|
||||
- glob # Find test files and source files
|
||||
|
||||
tags:
|
||||
- qa
|
||||
- automation
|
||||
- test-architect
|
||||
- regression
|
||||
- coverage
|
||||
|
||||
execution_hints:
|
||||
interactive: false # Minimize prompts
|
||||
autonomous: true # Proceed without user input unless blocked
|
||||
iterative: true
|
||||
@@ -1,248 +0,0 @@
|
||||
# CI/CD Pipeline Setup - Validation Checklist
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- [ ] Git repository initialized (`.git/` exists)
|
||||
- [ ] Git remote configured (`git remote -v` shows origin)
|
||||
- [ ] Test framework configured (`playwright.config.*` or `cypress.config.*`)
|
||||
- [ ] Local tests pass (`npm run test:e2e` succeeds)
|
||||
- [ ] Team agrees on CI platform
|
||||
- [ ] Access to CI platform settings (if updating)
|
||||
|
||||
Note: CI setup is typically a one-time task per repo and can be run any time after the test framework is configured.
|
||||
|
||||
## Process Steps
|
||||
|
||||
### Step 1: Preflight Checks
|
||||
|
||||
- [ ] Git repository validated
|
||||
- [ ] Framework configuration detected
|
||||
- [ ] Local test execution successful
|
||||
- [ ] CI platform detected or selected
|
||||
- [ ] Node version identified (.nvmrc or default)
|
||||
- [ ] No blocking issues found
|
||||
|
||||
### Step 2: CI Pipeline Configuration
|
||||
|
||||
- [ ] CI configuration file created (`.github/workflows/test.yml` or `.gitlab-ci.yml`)
|
||||
- [ ] File is syntactically valid (no YAML errors)
|
||||
- [ ] Correct framework commands configured
|
||||
- [ ] Node version matches project
|
||||
- [ ] Test directory paths correct
|
||||
|
||||
### Step 3: Parallel Sharding
|
||||
|
||||
- [ ] Matrix strategy configured (4 shards default)
|
||||
- [ ] Shard syntax correct for framework
|
||||
- [ ] fail-fast set to false
|
||||
- [ ] Shard count appropriate for test suite size
|
||||
|
||||
### Step 4: Burn-In Loop
|
||||
|
||||
- [ ] Burn-in job created
|
||||
- [ ] 10 iterations configured
|
||||
- [ ] Proper exit on failure (`|| exit 1`)
|
||||
- [ ] Runs on appropriate triggers (PR, cron)
|
||||
- [ ] Failure artifacts uploaded
|
||||
|
||||
### Step 5: Caching Configuration
|
||||
|
||||
- [ ] Dependency cache configured (npm/yarn)
|
||||
- [ ] Cache key uses lockfile hash
|
||||
- [ ] Browser cache configured (Playwright/Cypress)
|
||||
- [ ] Restore-keys defined for fallback
|
||||
- [ ] Cache paths correct for platform
|
||||
|
||||
### Step 6: Artifact Collection
|
||||
|
||||
- [ ] Artifacts upload on failure only
|
||||
- [ ] Correct artifact paths (test-results/, traces/, etc.)
|
||||
- [ ] Retention days set (30 default)
|
||||
- [ ] Artifact names unique per shard
|
||||
- [ ] No sensitive data in artifacts
|
||||
|
||||
### Step 7: Retry Logic
|
||||
|
||||
- [ ] Retry action/strategy configured
|
||||
- [ ] Max attempts: 2-3
|
||||
- [ ] Timeout appropriate (30 min)
|
||||
- [ ] Retry only on transient errors
|
||||
|
||||
### Step 8: Helper Scripts
|
||||
|
||||
- [ ] `scripts/test-changed.sh` created
|
||||
- [ ] `scripts/ci-local.sh` created
|
||||
- [ ] `scripts/burn-in.sh` created (optional)
|
||||
- [ ] Scripts are executable (`chmod +x`)
|
||||
- [ ] Scripts use correct test commands
|
||||
- [ ] Shebang present (`#!/bin/bash`)
|
||||
|
||||
### Step 9: Documentation
|
||||
|
||||
- [ ] `docs/ci.md` created with pipeline guide
|
||||
- [ ] `docs/ci-secrets-checklist.md` created
|
||||
- [ ] Required secrets documented
|
||||
- [ ] Setup instructions clear
|
||||
- [ ] Troubleshooting section included
|
||||
- [ ] Badge URLs provided (optional)
|
||||
|
||||
## Output Validation
|
||||
|
||||
### Configuration Validation
|
||||
|
||||
- [ ] CI file loads without errors
|
||||
- [ ] All paths resolve correctly
|
||||
- [ ] No hardcoded values (use env vars)
|
||||
- [ ] Triggers configured (push, pull_request, schedule)
|
||||
- [ ] Platform-specific syntax correct
|
||||
|
||||
### Execution Validation
|
||||
|
||||
- [ ] First CI run triggered (push to remote)
|
||||
- [ ] Pipeline starts without errors
|
||||
- [ ] All jobs appear in CI dashboard
|
||||
- [ ] Caching works (check logs for cache hit)
|
||||
- [ ] Tests execute in parallel
|
||||
- [ ] Artifacts collected on failure
|
||||
|
||||
### Performance Validation
|
||||
|
||||
- [ ] Lint stage: <2 minutes
|
||||
- [ ] Test stage (per shard): <10 minutes
|
||||
- [ ] Burn-in stage: <30 minutes
|
||||
- [ ] Total pipeline: <45 minutes
|
||||
- [ ] Cache reduces install time by 2-5 minutes
|
||||
|
||||
## Quality Checks
|
||||
|
||||
### Best Practices Compliance
|
||||
|
||||
- [ ] Burn-in loop follows production patterns
|
||||
- [ ] Parallel sharding configured optimally
|
||||
- [ ] Failure-only artifact collection
|
||||
- [ ] Selective testing enabled (optional)
|
||||
- [ ] Retry logic handles transient failures only
|
||||
- [ ] No secrets in configuration files
|
||||
|
||||
### Knowledge Base Alignment
|
||||
|
||||
- [ ] Burn-in pattern matches `ci-burn-in.md`
|
||||
- [ ] Selective testing matches `selective-testing.md`
|
||||
- [ ] Artifact collection matches `visual-debugging.md`
|
||||
- [ ] Test quality matches `test-quality.md`
|
||||
|
||||
### Security Checks
|
||||
|
||||
- [ ] No credentials in CI configuration
|
||||
- [ ] Secrets use platform secret management
|
||||
- [ ] Environment variables for sensitive data
|
||||
- [ ] Artifact retention appropriate (not too long)
|
||||
- [ ] No debug output exposing secrets
|
||||
|
||||
## Integration Points
|
||||
|
||||
### Status File Integration
|
||||
|
||||
- [ ] `bmm-workflow-status.md` exists
|
||||
- [ ] CI setup logged in Quality & Testing Progress section
|
||||
- [ ] Status updated with completion timestamp
|
||||
- [ ] Platform and configuration noted
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
- [ ] Relevant knowledge fragments loaded
|
||||
- [ ] Patterns applied from knowledge base
|
||||
- [ ] Documentation references knowledge base
|
||||
- [ ] Knowledge base references in README
|
||||
|
||||
### Workflow Dependencies
|
||||
|
||||
- [ ] `framework` workflow completed first
|
||||
- [ ] Can proceed to `atdd` workflow after CI setup
|
||||
- [ ] Can proceed to `automate` workflow
|
||||
- [ ] CI integrates with `gate` workflow
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
**All must be true:**
|
||||
|
||||
- [ ] All prerequisites met
|
||||
- [ ] All process steps completed
|
||||
- [ ] All output validations passed
|
||||
- [ ] All quality checks passed
|
||||
- [ ] All integration points verified
|
||||
- [ ] First CI run successful
|
||||
- [ ] Performance targets met
|
||||
- [ ] Documentation complete
|
||||
|
||||
## Post-Workflow Actions
|
||||
|
||||
**User must complete:**
|
||||
|
||||
1. [ ] Commit CI configuration
|
||||
2. [ ] Push to remote repository
|
||||
3. [ ] Configure required secrets in CI platform
|
||||
4. [ ] Open PR to trigger first CI run
|
||||
5. [ ] Monitor and verify pipeline execution
|
||||
6. [ ] Adjust parallelism if needed (based on actual run times)
|
||||
7. [ ] Set up notifications (optional)
|
||||
|
||||
**Recommended next workflows:**
|
||||
|
||||
1. [ ] Run `atdd` workflow for test generation
|
||||
2. [ ] Run `automate` workflow for coverage expansion
|
||||
3. [ ] Run `gate` workflow for quality gates
|
||||
|
||||
## Rollback Procedure
|
||||
|
||||
If workflow fails:
|
||||
|
||||
1. [ ] Delete CI configuration file
|
||||
2. [ ] Remove helper scripts directory
|
||||
3. [ ] Remove documentation (docs/ci.md, etc.)
|
||||
4. [ ] Clear CI platform secrets (if added)
|
||||
5. [ ] Review error logs
|
||||
6. [ ] Fix issues and retry workflow
|
||||
|
||||
## Notes
|
||||
|
||||
### Common Issues
|
||||
|
||||
**Issue**: CI file syntax errors
|
||||
|
||||
- **Solution**: Validate YAML syntax online or with linter
|
||||
|
||||
**Issue**: Tests fail in CI but pass locally
|
||||
|
||||
- **Solution**: Use `scripts/ci-local.sh` to mirror CI environment
|
||||
|
||||
**Issue**: Caching not working
|
||||
|
||||
- **Solution**: Check cache key formula, verify paths
|
||||
|
||||
**Issue**: Burn-in too slow
|
||||
|
||||
- **Solution**: Reduce iterations or run on cron only
|
||||
|
||||
### Platform-Specific
|
||||
|
||||
**GitHub Actions:**
|
||||
|
||||
- Secrets: Repository Settings → Secrets and variables → Actions
|
||||
- Runners: Ubuntu latest recommended
|
||||
- Concurrency limits: 20 jobs for free tier
|
||||
|
||||
**GitLab CI:**
|
||||
|
||||
- Variables: Project Settings → CI/CD → Variables
|
||||
- Runners: Shared or project-specific
|
||||
- Pipeline quota: 400 minutes/month free tier
|
||||
|
||||
---
|
||||
|
||||
**Checklist Complete**: Sign off when all items validated.
|
||||
|
||||
**Completed by:** {name}
|
||||
**Date:** {date}
|
||||
**Platform:** {GitHub Actions, GitLab CI, Other}
|
||||
**Notes:** {notes}
|
||||
@@ -1,198 +0,0 @@
|
||||
# GitHub Actions CI/CD Pipeline for Test Execution
|
||||
# Generated by BMad TEA Agent - Test Architect Module
|
||||
# Optimized for: Playwright/Cypress, Parallel Sharding, Burn-In Loop
|
||||
|
||||
name: Test Pipeline
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [main, develop]
|
||||
pull_request:
|
||||
branches: [main, develop]
|
||||
schedule:
|
||||
# Weekly burn-in on Sundays at 2 AM UTC
|
||||
- cron: "0 2 * * 0"
|
||||
|
||||
concurrency:
|
||||
group: ${{ github.workflow }}-${{ github.ref }}
|
||||
cancel-in-progress: true
|
||||
|
||||
jobs:
|
||||
# Lint stage - Code quality checks
|
||||
lint:
|
||||
name: Lint
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 5
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Determine Node version
|
||||
id: node-version
|
||||
run: |
|
||||
if [ -f .nvmrc ]; then
|
||||
echo "value=$(cat .nvmrc)" >> "$GITHUB_OUTPUT"
|
||||
echo "Using Node from .nvmrc"
|
||||
else
|
||||
echo "value=24" >> "$GITHUB_OUTPUT"
|
||||
echo "Using default Node 24 (current LTS)"
|
||||
fi
|
||||
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version: ${{ steps.node-version.outputs.value }}
|
||||
cache: "npm"
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Run linter
|
||||
run: npm run lint
|
||||
|
||||
# Test stage - Parallel execution with sharding
|
||||
test:
|
||||
name: Test (Shard ${{ matrix.shard }})
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 30
|
||||
needs: lint
|
||||
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
shard: [1, 2, 3, 4]
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Determine Node version
|
||||
id: node-version
|
||||
run: |
|
||||
if [ -f .nvmrc ]; then
|
||||
echo "value=$(cat .nvmrc)" >> "$GITHUB_OUTPUT"
|
||||
echo "Using Node from .nvmrc"
|
||||
else
|
||||
echo "value=22" >> "$GITHUB_OUTPUT"
|
||||
echo "Using default Node 22 (current LTS)"
|
||||
fi
|
||||
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version: ${{ steps.node-version.outputs.value }}
|
||||
cache: "npm"
|
||||
|
||||
- name: Cache Playwright browsers
|
||||
uses: actions/cache@v4
|
||||
with:
|
||||
path: ~/.cache/ms-playwright
|
||||
key: ${{ runner.os }}-playwright-${{ hashFiles('**/package-lock.json') }}
|
||||
restore-keys: |
|
||||
${{ runner.os }}-playwright-
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Install Playwright browsers
|
||||
run: npx playwright install --with-deps chromium
|
||||
|
||||
- name: Run tests (shard ${{ matrix.shard }}/4)
|
||||
run: npm run test:e2e -- --shard=${{ matrix.shard }}/4
|
||||
|
||||
- name: Upload test results
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: test-results-${{ matrix.shard }}
|
||||
path: |
|
||||
test-results/
|
||||
playwright-report/
|
||||
retention-days: 30
|
||||
|
||||
# Burn-in stage - Flaky test detection
|
||||
burn-in:
|
||||
name: Burn-In (Flaky Detection)
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 60
|
||||
needs: test
|
||||
# Only run burn-in on PRs to main/develop or on schedule
|
||||
if: github.event_name == 'pull_request' || github.event_name == 'schedule'
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Determine Node version
|
||||
id: node-version
|
||||
run: |
|
||||
if [ -f .nvmrc ]; then
|
||||
echo "value=$(cat .nvmrc)" >> "$GITHUB_OUTPUT"
|
||||
echo "Using Node from .nvmrc"
|
||||
else
|
||||
echo "value=22" >> "$GITHUB_OUTPUT"
|
||||
echo "Using default Node 22 (current LTS)"
|
||||
fi
|
||||
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version: ${{ steps.node-version.outputs.value }}
|
||||
cache: "npm"
|
||||
|
||||
- name: Cache Playwright browsers
|
||||
uses: actions/cache@v4
|
||||
with:
|
||||
path: ~/.cache/ms-playwright
|
||||
key: ${{ runner.os }}-playwright-${{ hashFiles('**/package-lock.json') }}
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Install Playwright browsers
|
||||
run: npx playwright install --with-deps chromium
|
||||
|
||||
- name: Run burn-in loop (10 iterations)
|
||||
run: |
|
||||
echo "🔥 Starting burn-in loop - detecting flaky tests"
|
||||
for i in {1..10}; do
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "🔥 Burn-in iteration $i/10"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
npm run test:e2e || exit 1
|
||||
done
|
||||
echo "✅ Burn-in complete - no flaky tests detected"
|
||||
|
||||
- name: Upload burn-in failure artifacts
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: burn-in-failures
|
||||
path: |
|
||||
test-results/
|
||||
playwright-report/
|
||||
retention-days: 30
|
||||
|
||||
# Report stage - Aggregate and publish results
|
||||
report:
|
||||
name: Test Report
|
||||
runs-on: ubuntu-latest
|
||||
needs: [test, burn-in]
|
||||
if: always()
|
||||
|
||||
steps:
|
||||
- name: Download all artifacts
|
||||
uses: actions/download-artifact@v4
|
||||
with:
|
||||
path: artifacts
|
||||
|
||||
- name: Generate summary
|
||||
run: |
|
||||
echo "## Test Execution Summary" >> $GITHUB_STEP_SUMMARY
|
||||
echo "" >> $GITHUB_STEP_SUMMARY
|
||||
echo "- **Status**: ${{ needs.test.result }}" >> $GITHUB_STEP_SUMMARY
|
||||
echo "- **Burn-in**: ${{ needs.burn-in.result }}" >> $GITHUB_STEP_SUMMARY
|
||||
echo "- **Shards**: 4" >> $GITHUB_STEP_SUMMARY
|
||||
echo "" >> $GITHUB_STEP_SUMMARY
|
||||
|
||||
if [ "${{ needs.burn-in.result }}" == "failure" ]; then
|
||||
echo "⚠️ **Flaky tests detected** - Review burn-in artifacts" >> $GITHUB_STEP_SUMMARY
|
||||
fi
|
||||
@@ -1,149 +0,0 @@
|
||||
# GitLab CI/CD Pipeline for Test Execution
|
||||
# Generated by BMad TEA Agent - Test Architect Module
|
||||
# Optimized for: Playwright/Cypress, Parallel Sharding, Burn-In Loop
|
||||
|
||||
stages:
|
||||
- lint
|
||||
- test
|
||||
- burn-in
|
||||
- report
|
||||
|
||||
variables:
|
||||
# Disable git depth for accurate change detection
|
||||
GIT_DEPTH: 0
|
||||
# Use npm ci for faster, deterministic installs
|
||||
npm_config_cache: "$CI_PROJECT_DIR/.npm"
|
||||
# Playwright browser cache
|
||||
PLAYWRIGHT_BROWSERS_PATH: "$CI_PROJECT_DIR/.cache/ms-playwright"
|
||||
# Default Node version when .nvmrc is missing
|
||||
DEFAULT_NODE_VERSION: "24"
|
||||
|
||||
# Caching configuration
|
||||
cache:
|
||||
key:
|
||||
files:
|
||||
- package-lock.json
|
||||
paths:
|
||||
- .npm/
|
||||
- .cache/ms-playwright/
|
||||
- node_modules/
|
||||
|
||||
# Lint stage - Code quality checks
|
||||
lint:
|
||||
stage: lint
|
||||
image: node:$DEFAULT_NODE_VERSION
|
||||
before_script:
|
||||
- |
|
||||
NODE_VERSION=$(cat .nvmrc 2>/dev/null || echo "$DEFAULT_NODE_VERSION")
|
||||
echo "Using Node $NODE_VERSION"
|
||||
npm install -g n
|
||||
n "$NODE_VERSION"
|
||||
node -v
|
||||
- npm ci
|
||||
script:
|
||||
- npm run lint
|
||||
timeout: 5 minutes
|
||||
|
||||
# Test stage - Parallel execution with sharding
|
||||
.test-template: &test-template
|
||||
stage: test
|
||||
image: node:$DEFAULT_NODE_VERSION
|
||||
needs:
|
||||
- lint
|
||||
before_script:
|
||||
- |
|
||||
NODE_VERSION=$(cat .nvmrc 2>/dev/null || echo "$DEFAULT_NODE_VERSION")
|
||||
echo "Using Node $NODE_VERSION"
|
||||
npm install -g n
|
||||
n "$NODE_VERSION"
|
||||
node -v
|
||||
- npm ci
|
||||
- npx playwright install --with-deps chromium
|
||||
artifacts:
|
||||
when: on_failure
|
||||
paths:
|
||||
- test-results/
|
||||
- playwright-report/
|
||||
expire_in: 30 days
|
||||
timeout: 30 minutes
|
||||
|
||||
test:shard-1:
|
||||
<<: *test-template
|
||||
script:
|
||||
- npm run test:e2e -- --shard=1/4
|
||||
|
||||
test:shard-2:
|
||||
<<: *test-template
|
||||
script:
|
||||
- npm run test:e2e -- --shard=2/4
|
||||
|
||||
test:shard-3:
|
||||
<<: *test-template
|
||||
script:
|
||||
- npm run test:e2e -- --shard=3/4
|
||||
|
||||
test:shard-4:
|
||||
<<: *test-template
|
||||
script:
|
||||
- npm run test:e2e -- --shard=4/4
|
||||
|
||||
# Burn-in stage - Flaky test detection
|
||||
burn-in:
|
||||
stage: burn-in
|
||||
image: node:$DEFAULT_NODE_VERSION
|
||||
needs:
|
||||
- test:shard-1
|
||||
- test:shard-2
|
||||
- test:shard-3
|
||||
- test:shard-4
|
||||
# Only run burn-in on merge requests to main/develop or on schedule
|
||||
rules:
|
||||
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
|
||||
- if: '$CI_PIPELINE_SOURCE == "schedule"'
|
||||
before_script:
|
||||
- |
|
||||
NODE_VERSION=$(cat .nvmrc 2>/dev/null || echo "$DEFAULT_NODE_VERSION")
|
||||
echo "Using Node $NODE_VERSION"
|
||||
npm install -g n
|
||||
n "$NODE_VERSION"
|
||||
node -v
|
||||
- npm ci
|
||||
- npx playwright install --with-deps chromium
|
||||
script:
|
||||
- |
|
||||
echo "🔥 Starting burn-in loop - detecting flaky tests"
|
||||
for i in {1..10}; do
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
echo "🔥 Burn-in iteration $i/10"
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
|
||||
npm run test:e2e || exit 1
|
||||
done
|
||||
echo "✅ Burn-in complete - no flaky tests detected"
|
||||
artifacts:
|
||||
when: on_failure
|
||||
paths:
|
||||
- test-results/
|
||||
- playwright-report/
|
||||
expire_in: 30 days
|
||||
timeout: 60 minutes
|
||||
|
||||
# Report stage - Aggregate results
|
||||
report:
|
||||
stage: report
|
||||
image: alpine:latest
|
||||
needs:
|
||||
- test:shard-1
|
||||
- test:shard-2
|
||||
- test:shard-3
|
||||
- test:shard-4
|
||||
- burn-in
|
||||
when: always
|
||||
script:
|
||||
- |
|
||||
echo "## Test Execution Summary"
|
||||
echo ""
|
||||
echo "- Pipeline: $CI_PIPELINE_ID"
|
||||
echo "- Shards: 4"
|
||||
echo "- Branch: $CI_COMMIT_REF_NAME"
|
||||
echo ""
|
||||
echo "View detailed results in job artifacts"
|
||||
@@ -1,536 +0,0 @@
|
||||
<!-- Powered by BMAD-CORE™ -->
|
||||
|
||||
# CI/CD Pipeline Setup
|
||||
|
||||
**Workflow ID**: `_bmad/bmm/testarch/ci`
|
||||
**Version**: 4.0 (BMad v6)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Scaffolds a production-ready CI/CD quality pipeline with test execution, burn-in loops for flaky test detection, parallel sharding, artifact collection, and notification configuration. This workflow creates platform-specific CI configuration optimized for fast feedback and reliable test execution.
|
||||
|
||||
Note: This is typically a one-time setup per repo; run it any time after the test framework exists, ideally before feature work starts.
|
||||
|
||||
---
|
||||
|
||||
## Preflight Requirements
|
||||
|
||||
**Critical:** Verify these requirements before proceeding. If any fail, HALT and notify the user.
|
||||
|
||||
- ✅ Git repository is initialized (`.git/` directory exists)
|
||||
- ✅ Local test suite passes (`npm run test:e2e` succeeds)
|
||||
- ✅ Test framework is configured (from `framework` workflow)
|
||||
- ✅ Team agrees on target CI platform (GitHub Actions, GitLab CI, Circle CI, etc.)
|
||||
- ✅ Access to CI platform settings/secrets available (if updating existing pipeline)
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Run Preflight Checks
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Verify Git Repository**
|
||||
- Check for `.git/` directory
|
||||
- Confirm remote repository configured (`git remote -v`)
|
||||
- If not initialized, HALT with message: "Git repository required for CI/CD setup"
|
||||
|
||||
2. **Validate Test Framework**
|
||||
- Look for `playwright.config.*` or `cypress.config.*`
|
||||
- Read framework configuration to extract:
|
||||
- Test directory location
|
||||
- Test command
|
||||
- Reporter configuration
|
||||
- Timeout settings
|
||||
- If not found, HALT with message: "Run `framework` workflow first to set up test infrastructure"
|
||||
|
||||
3. **Run Local Tests**
|
||||
- Execute `npm run test:e2e` (or equivalent from package.json)
|
||||
- Ensure tests pass before CI setup
|
||||
- If tests fail, HALT with message: "Fix failing tests before setting up CI/CD"
|
||||
|
||||
4. **Detect CI Platform**
|
||||
- Check for existing CI configuration:
|
||||
- `.github/workflows/*.yml` (GitHub Actions)
|
||||
- `.gitlab-ci.yml` (GitLab CI)
|
||||
- `.circleci/config.yml` (Circle CI)
|
||||
- `Jenkinsfile` (Jenkins)
|
||||
- If found, ask user: "Update existing CI configuration or create new?"
|
||||
- If not found, detect platform from git remote:
|
||||
- `github.com` → GitHub Actions (default)
|
||||
- `gitlab.com` → GitLab CI
|
||||
- Ask user if unable to auto-detect
|
||||
|
||||
5. **Read Environment Configuration**
|
||||
- Use `.nvmrc` for Node version if present
|
||||
- If missing, default to a current LTS (Node 24) or newer instead of a fixed old version
|
||||
- Read `package.json` to identify dependencies (affects caching strategy)
|
||||
|
||||
**Halt Condition:** If preflight checks fail, stop immediately and report which requirement failed.
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Scaffold CI Pipeline
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Select CI Platform Template**
|
||||
|
||||
Based on detection or user preference, use the appropriate template:
|
||||
|
||||
**GitHub Actions** (`.github/workflows/test.yml`):
|
||||
- Most common platform
|
||||
- Excellent caching and matrix support
|
||||
- Free for public repos, generous free tier for private
|
||||
|
||||
**GitLab CI** (`.gitlab-ci.yml`):
|
||||
- Integrated with GitLab
|
||||
- Built-in registry and runners
|
||||
- Powerful pipeline features
|
||||
|
||||
**Circle CI** (`.circleci/config.yml`):
|
||||
- Fast execution with parallelism
|
||||
- Docker-first approach
|
||||
- Enterprise features
|
||||
|
||||
**Jenkins** (`Jenkinsfile`):
|
||||
- Self-hosted option
|
||||
- Maximum customization
|
||||
- Requires infrastructure management
|
||||
|
||||
2. **Generate Pipeline Configuration**
|
||||
|
||||
Use templates from `{installed_path}/` directory:
|
||||
- `github-actions-template.yml`
|
||||
- `gitlab-ci-template.yml`
|
||||
|
||||
**Key pipeline stages:**
|
||||
|
||||
```yaml
|
||||
stages:
|
||||
- lint # Code quality checks
|
||||
- test # Test execution (parallel shards)
|
||||
- burn-in # Flaky test detection
|
||||
- report # Aggregate results and publish
|
||||
```
|
||||
|
||||
3. **Configure Test Execution**
|
||||
|
||||
**Parallel Sharding:**
|
||||
|
||||
```yaml
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
shard: [1, 2, 3, 4]
|
||||
|
||||
steps:
|
||||
- name: Run tests
|
||||
run: npm run test:e2e -- --shard=${{ matrix.shard }}/${{ strategy.job-total }}
|
||||
```
|
||||
|
||||
**Purpose:** Splits tests into N parallel jobs for faster execution (target: <10 min per shard)
|
||||
|
||||
4. **Add Burn-In Loop**
|
||||
|
||||
**Critical pattern from production systems:**
|
||||
|
||||
```yaml
|
||||
burn-in:
|
||||
name: Flaky Test Detection
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Setup Node
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version-file: '.nvmrc'
|
||||
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
|
||||
- name: Run burn-in loop (10 iterations)
|
||||
run: |
|
||||
for i in {1..10}; do
|
||||
echo "🔥 Burn-in iteration $i/10"
|
||||
npm run test:e2e || exit 1
|
||||
done
|
||||
|
||||
- name: Upload failure artifacts
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: burn-in-failures
|
||||
path: test-results/
|
||||
retention-days: 30
|
||||
```
|
||||
|
||||
**Purpose:** Runs tests multiple times to catch non-deterministic failures before they reach main branch.
|
||||
|
||||
**When to run:**
|
||||
- On pull requests to main/develop
|
||||
- Weekly on cron schedule
|
||||
- After significant test infrastructure changes
|
||||
|
||||
5. **Configure Caching**
|
||||
|
||||
**Node modules cache:**
|
||||
|
||||
```yaml
|
||||
- name: Cache dependencies
|
||||
uses: actions/cache@v4
|
||||
with:
|
||||
path: ~/.npm
|
||||
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
|
||||
restore-keys: |
|
||||
${{ runner.os }}-node-
|
||||
```
|
||||
|
||||
**Browser binaries cache (Playwright):**
|
||||
|
||||
```yaml
|
||||
- name: Cache Playwright browsers
|
||||
uses: actions/cache@v4
|
||||
with:
|
||||
path: ~/.cache/ms-playwright
|
||||
key: ${{ runner.os }}-playwright-${{ hashFiles('**/package-lock.json') }}
|
||||
```
|
||||
|
||||
**Purpose:** Reduces CI execution time by 2-5 minutes per run.
|
||||
|
||||
6. **Configure Artifact Collection**
|
||||
|
||||
**Failure artifacts only:**
|
||||
|
||||
```yaml
|
||||
- name: Upload test results
|
||||
if: failure()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: test-results-${{ matrix.shard }}
|
||||
path: |
|
||||
test-results/
|
||||
playwright-report/
|
||||
retention-days: 30
|
||||
```
|
||||
|
||||
**Artifacts to collect:**
|
||||
- Traces (Playwright) - full debugging context
|
||||
- Screenshots - visual evidence of failures
|
||||
- Videos - interaction playback
|
||||
- HTML reports - detailed test results
|
||||
- Console logs - error messages and warnings
|
||||
|
||||
7. **Add Retry Logic**
|
||||
|
||||
```yaml
|
||||
- name: Run tests with retries
|
||||
uses: nick-invision/retry@v2
|
||||
with:
|
||||
timeout_minutes: 30
|
||||
max_attempts: 3
|
||||
retry_on: error
|
||||
command: npm run test:e2e
|
||||
```
|
||||
|
||||
**Purpose:** Handles transient failures (network issues, race conditions)
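Retries can also (or instead) be configured at the framework level; a minimal `playwright.config.ts` sketch, assuming Playwright, with values that are illustrative defaults:

```typescript
import { defineConfig } from '@playwright/test';

// Retry only on CI so flakiness stays visible locally; keep traces for failed runs as CI artifacts
export default defineConfig({
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 4 : undefined,
  use: {
    trace: 'retain-on-failure',
  },
});
```

Framework-level retries are reported per test in the HTML report, while the CI-level action above re-runs the entire command.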
|
||||
|
||||
8. **Configure Notifications** (Optional)
|
||||
|
||||
If `notify_on_failure` is enabled:
|
||||
|
||||
```yaml
|
||||
- name: Notify on failure
|
||||
if: failure()
|
||||
uses: 8398a7/action-slack@v3
|
||||
with:
|
||||
status: ${{ job.status }}
|
||||
text: 'Test failures detected in PR #${{ github.event.pull_request.number }}'
|
||||
webhook_url: ${{ secrets.SLACK_WEBHOOK }}
|
||||
```
|
||||
|
||||
9. **Generate Helper Scripts**
|
||||
|
||||
**Selective testing script** (`scripts/test-changed.sh`):
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Run only tests for changed files
|
||||
|
||||
CHANGED_FILES=$(git diff --name-only HEAD~1)
|
||||
|
||||
if echo "$CHANGED_FILES" | grep -q "src/.*\.ts$"; then
|
||||
echo "Running affected tests..."
|
||||
npm run test:e2e -- --grep="$(echo $CHANGED_FILES | sed 's/src\///g' | sed 's/\.ts//g')"
|
||||
else
|
||||
echo "No test-affecting changes detected"
|
||||
fi
|
||||
```
|
||||
|
||||
**Local mirror script** (`scripts/ci-local.sh`):
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# Mirror CI execution locally for debugging
|
||||
|
||||
echo "🔍 Running CI pipeline locally..."
|
||||
|
||||
# Lint
|
||||
npm run lint || exit 1
|
||||
|
||||
# Tests
|
||||
npm run test:e2e || exit 1
|
||||
|
||||
# Burn-in (reduced iterations)
|
||||
for i in {1..3}; do
|
||||
echo "🔥 Burn-in $i/3"
|
||||
npm run test:e2e || exit 1
|
||||
done
|
||||
|
||||
echo "✅ Local CI pipeline passed"
|
||||
```
|
||||
|
||||
10. **Generate Documentation**
|
||||
|
||||
**CI README** (`docs/ci.md`):
|
||||
- Pipeline stages and purpose
|
||||
- How to run locally
|
||||
- Debugging failed CI runs
|
||||
- Secrets and environment variables needed
|
||||
- Notification setup
|
||||
- Badge URLs for README
|
||||
|
||||
**Secrets checklist** (`docs/ci-secrets-checklist.md`):
|
||||
- Required secrets list (SLACK_WEBHOOK, etc.)
|
||||
- Where to configure in CI platform
|
||||
- Security best practices
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Deliverables
|
||||
|
||||
### Primary Artifacts Created
|
||||
|
||||
1. **CI Configuration File**
|
||||
- `.github/workflows/test.yml` (GitHub Actions)
|
||||
- `.gitlab-ci.yml` (GitLab CI)
|
||||
- `.circleci/config.yml` (Circle CI)
|
||||
|
||||
2. **Pipeline Stages**
|
||||
- **Lint**: Code quality checks (ESLint, Prettier)
|
||||
- **Test**: Parallel test execution (4 shards)
|
||||
- **Burn-in**: Flaky test detection (10 iterations)
|
||||
- **Report**: Result aggregation and publishing
|
||||
|
||||
3. **Helper Scripts**
|
||||
- `scripts/test-changed.sh` - Selective testing
|
||||
- `scripts/ci-local.sh` - Local CI mirror
|
||||
- `scripts/burn-in.sh` - Standalone burn-in execution
|
||||
|
||||
4. **Documentation**
|
||||
- `docs/ci.md` - CI pipeline guide
|
||||
- `docs/ci-secrets-checklist.md` - Required secrets
|
||||
- Inline comments in CI configuration
|
||||
|
||||
5. **Optimization Features**
|
||||
- Dependency caching (npm, browser binaries)
|
||||
- Parallel sharding (4 jobs default)
|
||||
- Retry logic (2 retries on failure)
|
||||
- Failure-only artifact upload
|
||||
|
||||
### Performance Targets
|
||||
|
||||
- **Lint stage**: <2 minutes
|
||||
- **Test stage** (per shard): <10 minutes
|
||||
- **Burn-in stage**: <30 minutes (10 iterations)
|
||||
- **Total pipeline**: <45 minutes
|
||||
|
||||
**Speedup:** 20× faster than sequential execution through parallelism and caching.
|
||||
|
||||
---
|
||||
|
||||
## Important Notes
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
**Critical:** Check configuration and load appropriate fragments.
|
||||
|
||||
Read `{config_source}` and check `config.tea_use_playwright_utils`.
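For orientation, the flag might appear like this in the config file (a sketch; its exact location within `config.yaml` is an assumption):

```yaml
# {project-root}/_bmad/bmm/config.yaml (illustrative placement)
tea_use_playwright_utils: true
```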
|
||||
|
||||
**Core CI Patterns (Always load):**
|
||||
|
||||
- `ci-burn-in.md` - Burn-in loop patterns: 10-iteration detection, GitHub Actions workflow, shard orchestration, selective execution (678 lines, 4 examples)
|
||||
- `selective-testing.md` - Changed test detection strategies: tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
|
||||
- `visual-debugging.md` - Artifact collection best practices: trace viewer, HAR recording, custom artifacts, accessibility integration (522 lines, 5 examples)
|
||||
- `test-quality.md` - CI-specific test quality criteria: deterministic tests, isolated with cleanup, explicit assertions, length/time optimization (658 lines, 5 examples)
|
||||
- `playwright-config.md` - CI-optimized configuration: parallelization, artifact output, project dependencies, sharding (722 lines, 5 examples)
|
||||
|
||||
**If `config.tea_use_playwright_utils: true`:**
|
||||
|
||||
Load playwright-utils CI-relevant fragments:
|
||||
|
||||
- `burn-in.md` - Smart test selection with git diff analysis (very important for CI optimization)
|
||||
- `network-error-monitor.md` - Automatic HTTP 4xx/5xx detection (recommend in CI pipelines)
|
||||
|
||||
Recommend:
|
||||
|
||||
- Add burn-in script for pull request validation
|
||||
- Enable network-error-monitor in merged fixtures for catching silent failures
|
||||
- Reference full docs in `*framework` and `*automate` workflows
|
||||
|
||||
### CI Platform-Specific Guidance
|
||||
|
||||
**GitHub Actions:**
|
||||
|
||||
- Use `actions/cache` for caching
|
||||
- Matrix strategy for parallelism (see the sketch after this list)
|
||||
- Secrets in repository settings
|
||||
- Free 2000 minutes/month for private repos
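The matrix strategy called out above might be declared like this (a sketch; the job name and shard count are assumptions, and it pairs with the `--shard=${{ matrix.shard }}/${{ strategy.job-total }}` run step shown earlier):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    # steps: checkout, setup-node, npm ci, then the sharded test run
```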
|
||||
|
||||
**GitLab CI:**
|
||||
|
||||
- Use `.gitlab-ci.yml` in root
|
||||
- `cache:` directive for caching
|
||||
- Parallel execution with `parallel: 4` (see the sketch after this list)
|
||||
- Variables in project CI/CD settings
|
||||
|
||||
**Circle CI:**
|
||||
|
||||
- Use `.circleci/config.yml`
|
||||
- Docker executors recommended
|
||||
- Parallelism with `parallelism: 4` (see the sketch after this list)
|
||||
- Context for shared secrets
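A comparable `.circleci/config.yml` fragment (executor image is an assumption; `CIRCLE_NODE_INDEX` is zero-based, hence the +1):

```yaml
version: 2.1
jobs:
  e2e:
    docker:
      - image: cimg/node:20.11
    parallelism: 4
    steps:
      - checkout
      - run: npm ci
      - run: npm run test:e2e -- --shard=$((CIRCLE_NODE_INDEX + 1))/$CIRCLE_NODE_TOTAL
```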
|
||||
|
||||
### Burn-In Loop Strategy
|
||||
|
||||
**When to run:**
|
||||
|
||||
- ✅ On PRs to main/develop branches
|
||||
- ✅ Weekly on schedule (cron)
|
||||
- ✅ After test infrastructure changes
|
||||
- ❌ Not on every commit (too slow)
|
||||
|
||||
**Iterations:**
|
||||
|
||||
- **10 iterations** for thorough detection
|
||||
- **3 iterations** for quick feedback
|
||||
- **100 iterations** for high-confidence stability
|
||||
|
||||
**Failure threshold:**
|
||||
|
||||
- Even ONE failure in burn-in → tests are flaky
|
||||
- Must fix before merging
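The standalone `scripts/burn-in.sh` listed in the deliverables can parameterize the iteration count so the same script serves quick feedback (3) and thorough detection (10). A minimal sketch:

```bash
#!/bin/bash
# Standalone burn-in runner: ./scripts/burn-in.sh [iterations]
set -euo pipefail

ITERATIONS="${1:-10}"

for i in $(seq 1 "$ITERATIONS"); do
  echo "🔥 Burn-in iteration $i/$ITERATIONS"
  npm run test:e2e
done

echo "✅ Burn-in passed after $ITERATIONS iterations"
```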
|
||||
|
||||
### Artifact Retention
|
||||
|
||||
**Failure artifacts only:**
|
||||
|
||||
- Saves storage costs
|
||||
- Maintains debugging capability
|
||||
- 30-day retention default
|
||||
|
||||
**Artifact types:**
|
||||
|
||||
- Traces (Playwright) - 5-10 MB per test
|
||||
- Screenshots - 100-500 KB per screenshot
|
||||
- Videos - 2-5 MB per test
|
||||
- HTML reports - 1-2 MB per run
|
||||
|
||||
### Selective Testing
|
||||
|
||||
**Detect changed files:**
|
||||
|
||||
```bash
|
||||
git diff --name-only HEAD~1
|
||||
```
|
||||
|
||||
**Run affected tests only:**
|
||||
|
||||
- Faster feedback for small changes
|
||||
- Full suite still runs on main branch
|
||||
- Reduces CI time by 50-80% for focused PRs
|
||||
|
||||
**Trade-off:**
|
||||
|
||||
- May miss integration issues
|
||||
- Run full suite at least on merge
|
||||
|
||||
### Local CI Mirror
|
||||
|
||||
**Purpose:** Debug CI failures locally
|
||||
|
||||
**Usage:**
|
||||
|
||||
```bash
|
||||
./scripts/ci-local.sh
|
||||
```
|
||||
|
||||
**Mirrors CI environment:**
|
||||
|
||||
- Same Node version
|
||||
- Same test command
|
||||
- Same stages (lint → test → burn-in)
|
||||
- Reduced burn-in iterations (3 vs 10)
|
||||
|
||||
---
|
||||
|
||||
## Output Summary
|
||||
|
||||
After completing this workflow, provide a summary:
|
||||
|
||||
```markdown
|
||||
## CI/CD Pipeline Complete
|
||||
|
||||
**Platform**: GitHub Actions (or GitLab CI, etc.)
|
||||
|
||||
**Artifacts Created**:
|
||||
|
||||
- ✅ Pipeline configuration: .github/workflows/test.yml
|
||||
- ✅ Burn-in loop: 10 iterations for flaky detection
|
||||
- ✅ Parallel sharding: 4 jobs for fast execution
|
||||
- ✅ Caching: Dependencies + browser binaries
|
||||
- ✅ Artifact collection: Failure-only traces/screenshots/videos
|
||||
- ✅ Helper scripts: test-changed.sh, ci-local.sh, burn-in.sh
|
||||
- ✅ Documentation: docs/ci.md, docs/ci-secrets-checklist.md
|
||||
|
||||
**Performance:**
|
||||
|
||||
- Lint: <2 min
|
||||
- Test (per shard): <10 min
|
||||
- Burn-in: <30 min
|
||||
- Total: <45 min (20× speedup vs sequential)
|
||||
|
||||
**Next Steps**:
|
||||
|
||||
1. Commit CI configuration: `git add .github/workflows/test.yml && git commit -m "ci: add test pipeline"`
|
||||
2. Push to remote: `git push`
|
||||
3. Configure required secrets in CI platform settings (see docs/ci-secrets-checklist.md)
|
||||
4. Open a PR to trigger first CI run
|
||||
5. Monitor pipeline execution and adjust parallelism if needed
|
||||
|
||||
**Knowledge Base References Applied**:
|
||||
|
||||
- Burn-in loop pattern (ci-burn-in.md)
|
||||
- Selective testing strategy (selective-testing.md)
|
||||
- Artifact collection (visual-debugging.md)
|
||||
- Test quality criteria (test-quality.md)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Validation
|
||||
|
||||
After completing all steps, verify:
|
||||
|
||||
- [ ] CI configuration file created and syntactically valid
|
||||
- [ ] Burn-in loop configured (10 iterations)
|
||||
- [ ] Parallel sharding enabled (4 jobs)
|
||||
- [ ] Caching configured (dependencies + browsers)
|
||||
- [ ] Artifact collection on failure only
|
||||
- [ ] Helper scripts created and executable (`chmod +x`)
|
||||
- [ ] Documentation complete (ci.md, secrets checklist)
|
||||
- [ ] No errors or warnings during scaffold
|
||||
|
||||
Refer to `checklist.md` for comprehensive validation criteria.
|
||||
@@ -1,45 +0,0 @@
|
||||
# Test Architect workflow: ci
|
||||
name: testarch-ci
|
||||
description: "Scaffold CI/CD quality pipeline with test execution, burn-in loops, and artifact collection"
|
||||
author: "BMad"
|
||||
|
||||
# Critical variables from config
|
||||
config_source: "{project-root}/_bmad/bmm/config.yaml"
|
||||
output_folder: "{config_source}:output_folder"
|
||||
user_name: "{config_source}:user_name"
|
||||
communication_language: "{config_source}:communication_language"
|
||||
document_output_language: "{config_source}:document_output_language"
|
||||
date: system-generated
|
||||
|
||||
# Workflow components
|
||||
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/ci"
|
||||
instructions: "{installed_path}/instructions.md"
|
||||
validation: "{installed_path}/checklist.md"
|
||||
|
||||
# Variables and inputs
|
||||
variables:
|
||||
ci_platform: "auto" # auto, github-actions, gitlab-ci, circle-ci, jenkins - user can override
|
||||
test_dir: "{project-root}/tests" # Root test directory
|
||||
|
||||
# Output configuration
|
||||
default_output_file: "{project-root}/.github/workflows/test.yml" # GitHub Actions default
|
||||
|
||||
# Required tools
|
||||
required_tools:
|
||||
- read_file # Read .nvmrc, package.json, framework config
|
||||
- write_file # Create CI config, scripts, documentation
|
||||
- create_directory # Create .github/workflows/ or .gitlab-ci/ directories
|
||||
- list_files # Detect existing CI configuration
|
||||
- search_repo # Find test files for selective testing
|
||||
|
||||
tags:
|
||||
- qa
|
||||
- ci-cd
|
||||
- test-architect
|
||||
- pipeline
|
||||
- automation
|
||||
|
||||
execution_hints:
|
||||
interactive: false # Minimize prompts, auto-detect when possible
|
||||
autonomous: true # Proceed without user input unless blocked
|
||||
iterative: true
|
||||
@@ -1,321 +0,0 @@
|
||||
# Test Framework Setup - Validation Checklist
|
||||
|
||||
This checklist ensures the framework workflow completes successfully and all deliverables meet quality standards.
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before starting the workflow:
|
||||
|
||||
- [ ] Project root contains valid `package.json`
|
||||
- [ ] No existing modern E2E framework detected (`playwright.config.*`, `cypress.config.*`)
|
||||
- [ ] Project type identifiable (React, Vue, Angular, Next.js, Node, etc.)
|
||||
- [ ] Bundler identifiable (Vite, Webpack, Rollup, esbuild) or not applicable
|
||||
- [ ] User has write permissions to create directories and files
|
||||
|
||||
---
|
||||
|
||||
## Process Steps
|
||||
|
||||
### Step 1: Preflight Checks
|
||||
|
||||
- [ ] package.json successfully read and parsed
|
||||
- [ ] Project type extracted correctly
|
||||
- [ ] Bundler identified (or marked as N/A for backend projects)
|
||||
- [ ] No framework conflicts detected
|
||||
- [ ] Architecture documents located (if available)
|
||||
|
||||
### Step 2: Framework Selection
|
||||
|
||||
- [ ] Framework auto-detection logic executed
|
||||
- [ ] Framework choice justified (Playwright vs Cypress)
|
||||
- [ ] Framework preference respected (if explicitly set)
|
||||
- [ ] User notified of framework selection and rationale
|
||||
|
||||
### Step 3: Directory Structure
|
||||
|
||||
- [ ] `tests/` root directory created
|
||||
- [ ] `tests/e2e/` directory created (or user's preferred structure)
|
||||
- [ ] `tests/support/` directory created (critical pattern)
|
||||
- [ ] `tests/support/fixtures/` directory created
|
||||
- [ ] `tests/support/fixtures/factories/` directory created
|
||||
- [ ] `tests/support/helpers/` directory created
|
||||
- [ ] `tests/support/page-objects/` directory created (if applicable)
|
||||
- [ ] All directories have correct permissions
|
||||
|
||||
**Note**: Test organization is flexible (e2e/, api/, integration/). The **support/** folder is the key pattern.
|
||||
|
||||
### Step 4: Configuration Files
|
||||
|
||||
- [ ] Framework config file created (`playwright.config.ts` or `cypress.config.ts`)
|
||||
- [ ] Config file uses TypeScript (if `use_typescript: true`)
|
||||
- [ ] Timeouts configured correctly (action: 15s, navigation: 30s, test: 60s)
|
||||
- [ ] Base URL configured with environment variable fallback
|
||||
- [ ] Trace/screenshot/video set to retain-on-failure
|
||||
- [ ] Multiple reporters configured (HTML + JUnit + console)
|
||||
- [ ] Parallel execution enabled
|
||||
- [ ] CI-specific settings configured (retries, workers)
|
||||
- [ ] Config file is syntactically valid (no compilation errors)
|
||||
|
||||
### Step 5: Environment Configuration
|
||||
|
||||
- [ ] `.env.example` created in project root
|
||||
- [ ] `TEST_ENV` variable defined
|
||||
- [ ] `BASE_URL` variable defined with default
|
||||
- [ ] `API_URL` variable defined (if applicable)
|
||||
- [ ] Authentication variables defined (if applicable)
|
||||
- [ ] Feature flag variables defined (if applicable)
|
||||
- [ ] `.nvmrc` created with appropriate Node version
|
||||
|
||||
### Step 6: Fixture Architecture
|
||||
|
||||
- [ ] `tests/support/fixtures/index.ts` created
|
||||
- [ ] Base fixture extended from Playwright/Cypress
|
||||
- [ ] Type definitions for fixtures created
|
||||
- [ ] mergeTests pattern implemented (if multiple fixtures)
|
||||
- [ ] Auto-cleanup logic included in fixtures
|
||||
- [ ] Fixture architecture follows knowledge base patterns
|
||||
|
||||
### Step 7: Data Factories
|
||||
|
||||
- [ ] At least one factory created (e.g., UserFactory)
|
||||
- [ ] Factories use @faker-js/faker for realistic data
|
||||
- [ ] Factories track created entities (for cleanup)
|
||||
- [ ] Factories implement `cleanup()` method
|
||||
- [ ] Factories integrate with fixtures
|
||||
- [ ] Factories follow knowledge base patterns
|
||||
|
||||
### Step 8: Sample Tests
|
||||
|
||||
- [ ] Example test file created (`tests/e2e/example.spec.ts`)
|
||||
- [ ] Test uses fixture architecture
|
||||
- [ ] Test demonstrates data factory usage
|
||||
- [ ] Test uses proper selector strategy (data-testid)
|
||||
- [ ] Test follows Given-When-Then structure
|
||||
- [ ] Test includes proper assertions
|
||||
- [ ] Network interception demonstrated (if applicable)
|
||||
|
||||
### Step 9: Helper Utilities
|
||||
|
||||
- [ ] API helper created (if API testing needed)
|
||||
- [ ] Network helper created (if network mocking needed)
|
||||
- [ ] Auth helper created (if authentication needed)
|
||||
- [ ] Helpers follow functional patterns
|
||||
- [ ] Helpers have proper error handling
|
||||
|
||||
### Step 10: Documentation
|
||||
|
||||
- [ ] `tests/README.md` created
|
||||
- [ ] Setup instructions included
|
||||
- [ ] Running tests section included
|
||||
- [ ] Architecture overview section included
|
||||
- [ ] Best practices section included
|
||||
- [ ] CI integration section included
|
||||
- [ ] Knowledge base references included
|
||||
- [ ] Troubleshooting section included
|
||||
|
||||
### Step 11: Package.json Updates
|
||||
|
||||
- [ ] Minimal test script added to package.json: `test:e2e`
|
||||
- [ ] Test framework dependency added (if not already present)
|
||||
- [ ] Type definitions added (if TypeScript)
|
||||
- [ ] Users can extend with additional scripts as needed
|
||||
|
||||
---
|
||||
|
||||
## Output Validation
|
||||
|
||||
### Configuration Validation
|
||||
|
||||
- [ ] Config file loads without errors
|
||||
- [ ] Config file passes linting (if linter configured)
|
||||
- [ ] Config file uses correct syntax for chosen framework
|
||||
- [ ] All paths in config resolve correctly
|
||||
- [ ] Reporter output directories exist or are created on test run
|
||||
|
||||
### Test Execution Validation
|
||||
|
||||
- [ ] Sample test runs successfully
|
||||
- [ ] Test execution produces expected output (pass/fail)
|
||||
- [ ] Test artifacts generated correctly (traces, screenshots, videos)
|
||||
- [ ] Test report generated successfully
|
||||
- [ ] No console errors or warnings during test run
|
||||
|
||||
### Directory Structure Validation
|
||||
|
||||
- [ ] All required directories exist
|
||||
- [ ] Directory structure matches framework conventions
|
||||
- [ ] No duplicate or conflicting directories
|
||||
- [ ] Directories accessible with correct permissions
|
||||
|
||||
### File Integrity Validation
|
||||
|
||||
- [ ] All generated files are syntactically correct
|
||||
- [ ] No placeholder text left in files (e.g., "TODO", "FIXME")
|
||||
- [ ] All imports resolve correctly
|
||||
- [ ] No hardcoded credentials or secrets in files
|
||||
- [ ] All file paths use correct separators for OS
|
||||
|
||||
---
|
||||
|
||||
## Quality Checks
|
||||
|
||||
### Code Quality
|
||||
|
||||
- [ ] Generated code follows project coding standards
|
||||
- [ ] TypeScript types are complete and accurate (no `any` unless necessary)
|
||||
- [ ] No unused imports or variables
|
||||
- [ ] Consistent code formatting (matches project style)
|
||||
- [ ] No linting errors in generated files
|
||||
|
||||
### Best Practices Compliance
|
||||
|
||||
- [ ] Fixture architecture follows pure function → fixture → mergeTests pattern
|
||||
- [ ] Data factories implement auto-cleanup
|
||||
- [ ] Network interception occurs before navigation
|
||||
- [ ] Selectors use data-testid strategy
|
||||
- [ ] Artifacts only captured on failure
|
||||
- [ ] Tests follow Given-When-Then structure
|
||||
- [ ] No hard-coded waits or sleeps
|
||||
|
||||
### Knowledge Base Alignment
|
||||
|
||||
- [ ] Fixture pattern matches `fixture-architecture.md`
|
||||
- [ ] Data factories match `data-factories.md`
|
||||
- [ ] Network handling matches `network-first.md`
|
||||
- [ ] Config follows `playwright-config.md` or `test-config.md`
|
||||
- [ ] Test quality matches `test-quality.md`
|
||||
|
||||
### Security Checks
|
||||
|
||||
- [ ] No credentials in configuration files
|
||||
- [ ] .env.example contains placeholders, not real values
|
||||
- [ ] Sensitive test data handled securely
|
||||
- [ ] API keys and tokens use environment variables
|
||||
- [ ] No secrets committed to version control
|
||||
|
||||
---
|
||||
|
||||
## Integration Points
|
||||
|
||||
### Status File Integration
|
||||
|
||||
- [ ] `bmm-workflow-status.md` exists
|
||||
- [ ] Framework initialization logged in Quality & Testing Progress section
|
||||
- [ ] Status file updated with completion timestamp
|
||||
- [ ] Status file shows framework: Playwright or Cypress
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
- [ ] Relevant knowledge fragments identified from tea-index.csv
|
||||
- [ ] Knowledge fragments successfully loaded
|
||||
- [ ] Patterns from knowledge base applied correctly
|
||||
- [ ] Knowledge base references included in documentation
|
||||
|
||||
### Workflow Dependencies
|
||||
|
||||
- [ ] Can proceed to `ci` workflow after completion
|
||||
- [ ] Can proceed to `test-design` workflow after completion
|
||||
- [ ] Can proceed to `atdd` workflow after completion
|
||||
- [ ] Framework setup compatible with downstream workflows
|
||||
|
||||
---
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
**All of the following must be true:**
|
||||
|
||||
- [ ] All prerequisite checks passed
|
||||
- [ ] All process steps completed without errors
|
||||
- [ ] All output validations passed
|
||||
- [ ] All quality checks passed
|
||||
- [ ] All integration points verified
|
||||
- [ ] Sample test executes successfully
|
||||
- [ ] User can run `npm run test:e2e` without errors
|
||||
- [ ] Documentation is complete and accurate
|
||||
- [ ] No critical issues or blockers identified
|
||||
|
||||
---
|
||||
|
||||
## Post-Workflow Actions
|
||||
|
||||
**User must complete:**
|
||||
|
||||
1. [ ] Copy `.env.example` to `.env`
|
||||
2. [ ] Fill in environment-specific values in `.env`
|
||||
3. [ ] Run `npm install` to install test dependencies
|
||||
4. [ ] Run `npm run test:e2e` to verify setup
|
||||
5. [ ] Review `tests/README.md` for project-specific guidance
|
||||
|
||||
**Recommended next workflows:**
|
||||
|
||||
1. [ ] Run `ci` workflow to set up CI/CD pipeline
|
||||
2. [ ] Run `test-design` workflow to plan test coverage
|
||||
3. [ ] Run `atdd` workflow when ready to develop stories
|
||||
|
||||
---
|
||||
|
||||
## Rollback Procedure
|
||||
|
||||
If workflow fails and needs to be rolled back:
|
||||
|
||||
1. [ ] Delete `tests/` directory
|
||||
2. [ ] Remove test scripts from package.json
|
||||
3. [ ] Delete `.env.example` (if created)
|
||||
4. [ ] Delete `.nvmrc` (if created)
|
||||
5. [ ] Delete framework config file
|
||||
6. [ ] Remove test dependencies from package.json (if added)
|
||||
7. [ ] Run `npm install` to clean up node_modules
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
### Common Issues
|
||||
|
||||
**Issue**: Config file has TypeScript errors
|
||||
|
||||
- **Solution**: Ensure `@playwright/test` or `cypress` types are installed
|
||||
|
||||
**Issue**: Sample test fails to run
|
||||
|
||||
- **Solution**: Check BASE_URL in .env, ensure app is running
|
||||
|
||||
**Issue**: Fixture cleanup not working
|
||||
|
||||
- **Solution**: Verify cleanup() is called in fixture teardown
|
||||
|
||||
**Issue**: Network interception not working
|
||||
|
||||
- **Solution**: Ensure route setup occurs before page.goto()
|
||||
|
||||
### Framework-Specific Considerations
|
||||
|
||||
**Playwright:**
|
||||
|
||||
- Requires Node.js 18+
|
||||
- Browser binaries auto-installed on first run
|
||||
- Trace viewer requires running `npx playwright show-trace`
|
||||
|
||||
**Cypress:**
|
||||
|
||||
- Requires Node.js 18+
|
||||
- Cypress app opens on first run
|
||||
- Component testing requires additional setup
|
||||
|
||||
### Version Compatibility
|
||||
|
||||
- [ ] Node.js version matches .nvmrc
|
||||
- [ ] Framework version compatible with Node.js version
|
||||
- [ ] TypeScript version compatible with framework
|
||||
- [ ] All peer dependencies satisfied
|
||||
|
||||
---
|
||||
|
||||
**Checklist Complete**: Sign off when all items checked and validated.
|
||||
|
||||
**Completed by:** {name}
|
||||
**Date:** {date}
|
||||
**Framework:** {Playwright / Cypress / other}
|
||||
**Notes:** {notes}
|
||||
@@ -1,481 +0,0 @@
|
||||
<!-- Powered by BMAD-CORE™ -->
|
||||
|
||||
# Test Framework Setup
|
||||
|
||||
**Workflow ID**: `_bmad/bmm/testarch/framework`
|
||||
**Version**: 4.0 (BMad v6)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Initialize a production-ready test framework architecture (Playwright or Cypress) with fixtures, helpers, configuration, and best practices. This workflow scaffolds the complete testing infrastructure for modern web applications.
|
||||
|
||||
---
|
||||
|
||||
## Preflight Requirements
|
||||
|
||||
**Critical:** Verify these requirements before proceeding. If any fail, HALT and notify the user.
|
||||
|
||||
- ✅ `package.json` exists in project root
|
||||
- ✅ No modern E2E test harness is already configured (check for existing `playwright.config.*` or `cypress.config.*`)
|
||||
- ✅ Architectural/stack context available (project type, bundler, dependencies)
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Run Preflight Checks
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Validate package.json**
|
||||
- Read `{project-root}/package.json`
|
||||
- Extract project type (React, Vue, Angular, Next.js, Node, etc.)
|
||||
- Identify bundler (Vite, Webpack, Rollup, esbuild)
|
||||
- Note existing test dependencies
|
||||
|
||||
2. **Check for Existing Framework**
|
||||
- Search for `playwright.config.*`, `cypress.config.*`, `cypress.json`
|
||||
- Check `package.json` for `@playwright/test` or `cypress` dependencies
|
||||
- If found, HALT with message: "Existing test framework detected. Use workflow `upgrade-framework` instead."
|
||||
|
||||
3. **Gather Context**
|
||||
- Look for architecture documents (`architecture.md`, `tech-spec*.md`)
|
||||
- Check for API documentation or endpoint lists
|
||||
- Identify authentication requirements
|
||||
|
||||
**Halt Condition:** If preflight checks fail, stop immediately and report which requirement failed.
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Scaffold Framework
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Framework Selection**
|
||||
|
||||
**Default Logic:**
|
||||
- **Playwright** (recommended for):
|
||||
- Large repositories (100+ files)
|
||||
- Performance-critical applications
|
||||
- Multi-browser support needed
|
||||
- Complex user flows requiring video/trace debugging
|
||||
- Projects requiring worker parallelism
|
||||
|
||||
- **Cypress** (recommended for):
|
||||
- Small teams prioritizing developer experience
|
||||
- Component testing focus
|
||||
- Real-time reloading during test development
|
||||
- Simpler setup requirements
|
||||
|
||||
**Detection Strategy:**
|
||||
- Check `package.json` for existing preference
|
||||
- Consider `project_size` variable from workflow config
|
||||
- Use `framework_preference` variable if set
|
||||
- Default to **Playwright** if uncertain
|
||||
|
||||
2. **Create Directory Structure**
|
||||
|
||||
```
|
||||
{project-root}/
|
||||
├── tests/ # Root test directory
|
||||
│ ├── e2e/ # Test files (users organize as needed)
|
||||
│ ├── support/ # Framework infrastructure (key pattern)
|
||||
│ │ ├── fixtures/ # Test fixtures (data, mocks)
|
||||
│ │ ├── helpers/ # Utility functions
|
||||
│ │ └── page-objects/ # Page object models (optional)
|
||||
│ └── README.md # Test suite documentation
|
||||
```
|
||||
|
||||
**Note**: Users organize test files (e2e/, api/, integration/, component/) as needed. The **support/** folder is the critical pattern for fixtures and helpers used across tests.
|
||||
|
||||
3. **Generate Configuration File**
|
||||
|
||||
**For Playwright** (`playwright.config.ts` or `playwright.config.js`):
|
||||
|
||||
```typescript
|
||||
import { defineConfig, devices } from '@playwright/test';
|
||||
|
||||
export default defineConfig({
|
||||
testDir: './tests/e2e',
|
||||
fullyParallel: true,
|
||||
forbidOnly: !!process.env.CI,
|
||||
retries: process.env.CI ? 2 : 0,
|
||||
workers: process.env.CI ? 1 : undefined,
|
||||
|
||||
timeout: 60 * 1000, // Test timeout: 60s
|
||||
expect: {
|
||||
timeout: 15 * 1000, // Assertion timeout: 15s
|
||||
},
|
||||
|
||||
use: {
|
||||
baseURL: process.env.BASE_URL || 'http://localhost:3000',
|
||||
trace: 'retain-on-failure',
|
||||
screenshot: 'only-on-failure',
|
||||
video: 'retain-on-failure',
|
||||
actionTimeout: 15 * 1000, // Action timeout: 15s
|
||||
navigationTimeout: 30 * 1000, // Navigation timeout: 30s
|
||||
},
|
||||
|
||||
reporter: [['html', { outputFolder: 'test-results/html' }], ['junit', { outputFile: 'test-results/junit.xml' }], ['list']],
|
||||
|
||||
projects: [
|
||||
{ name: 'chromium', use: { ...devices['Desktop Chrome'] } },
|
||||
{ name: 'firefox', use: { ...devices['Desktop Firefox'] } },
|
||||
{ name: 'webkit', use: { ...devices['Desktop Safari'] } },
|
||||
],
|
||||
});
|
||||
```
|
||||
|
||||
**For Cypress** (`cypress.config.ts` or `cypress.config.js`):
|
||||
|
||||
```typescript
|
||||
import { defineConfig } from 'cypress';
|
||||
|
||||
export default defineConfig({
|
||||
e2e: {
|
||||
baseUrl: process.env.BASE_URL || 'http://localhost:3000',
|
||||
specPattern: 'tests/e2e/**/*.cy.{js,jsx,ts,tsx}',
|
||||
supportFile: 'tests/support/e2e.ts',
|
||||
video: false,
|
||||
screenshotOnRunFailure: true,
|
||||
|
||||
setupNodeEvents(on, config) {
|
||||
// implement node event listeners here
|
||||
},
|
||||
},
|
||||
|
||||
retries: {
|
||||
runMode: 2,
|
||||
openMode: 0,
|
||||
},
|
||||
|
||||
defaultCommandTimeout: 15000,
|
||||
requestTimeout: 30000,
|
||||
responseTimeout: 30000,
|
||||
pageLoadTimeout: 60000,
|
||||
});
|
||||
```
|
||||
|
||||
4. **Generate Environment Configuration**
|
||||
|
||||
Create `.env.example`:
|
||||
|
||||
```bash
|
||||
# Test Environment Configuration
|
||||
TEST_ENV=local
|
||||
BASE_URL=http://localhost:3000
|
||||
API_URL=http://localhost:3001/api
|
||||
|
||||
# Authentication (if applicable)
|
||||
TEST_USER_EMAIL=test@example.com
|
||||
TEST_USER_PASSWORD=
|
||||
|
||||
# Feature Flags (if applicable)
|
||||
FEATURE_FLAG_NEW_UI=true
|
||||
|
||||
# API Keys (if applicable)
|
||||
TEST_API_KEY=
|
||||
```
|
||||
|
||||
5. **Generate Node Version File**
|
||||
|
||||
Create `.nvmrc`:
|
||||
|
||||
```
|
||||
20.11.0
|
||||
```
|
||||
|
||||
(Use Node version from existing `.nvmrc` or default to current LTS)
|
||||
|
||||
6. **Implement Fixture Architecture**
|
||||
|
||||
**Knowledge Base Reference**: `testarch/knowledge/fixture-architecture.md`
|
||||
|
||||
Create `tests/support/fixtures/index.ts`:
|
||||
|
||||
```typescript
|
||||
import { test as base } from '@playwright/test';
|
||||
import { UserFactory } from './factories/user-factory';
|
||||
|
||||
type TestFixtures = {
|
||||
userFactory: UserFactory;
|
||||
};
|
||||
|
||||
export const test = base.extend<TestFixtures>({
|
||||
userFactory: async ({}, use) => {
|
||||
const factory = new UserFactory();
|
||||
await use(factory);
|
||||
await factory.cleanup(); // Auto-cleanup
|
||||
},
|
||||
});
|
||||
|
||||
export { expect } from '@playwright/test';
|
||||
```
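When additional fixture modules are introduced later, they can be composed with `mergeTests` rather than one ever-growing `extend` call. A sketch, assuming a hypothetical split into `user-fixture.ts` and `api-fixture.ts`:

```typescript
import { mergeTests } from '@playwright/test';
import { test as userTest } from './user-fixture'; // hypothetical split of the fixture above
import { test as apiTest } from './api-fixture'; // hypothetical API fixture module

// Compose independent fixture modules into a single test object
export const test = mergeTests(userTest, apiTest);
export { expect } from '@playwright/test';
```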
|
||||
|
||||
7. **Implement Data Factories**
|
||||
|
||||
**Knowledge Base Reference**: `testarch/knowledge/data-factories.md`
|
||||
|
||||
Create `tests/support/fixtures/factories/user-factory.ts`:
|
||||
|
||||
```typescript
|
||||
import { faker } from '@faker-js/faker';
|
||||
|
||||
export class UserFactory {
|
||||
private createdUsers: string[] = [];
|
||||
|
||||
async createUser(overrides = {}) {
|
||||
const user = {
|
||||
email: faker.internet.email(),
|
||||
name: faker.person.fullName(),
|
||||
password: faker.internet.password({ length: 12 }),
|
||||
...overrides,
|
||||
};
|
||||
|
||||
// API call to create user
|
||||
const response = await fetch(`${process.env.API_URL}/users`, {
|
||||
method: 'POST',
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: JSON.stringify(user),
|
||||
});
|
||||
|
||||
const created = await response.json();
|
||||
this.createdUsers.push(created.id);
|
||||
return created;
|
||||
}
|
||||
|
||||
async cleanup() {
|
||||
// Delete all created users
|
||||
for (const userId of this.createdUsers) {
|
||||
await fetch(`${process.env.API_URL}/users/${userId}`, {
|
||||
method: 'DELETE',
|
||||
});
|
||||
}
|
||||
this.createdUsers = [];
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
8. **Generate Sample Tests**
|
||||
|
||||
Create `tests/e2e/example.spec.ts`:
|
||||
|
||||
```typescript
|
||||
import { test, expect } from '../support/fixtures';
|
||||
|
||||
test.describe('Example Test Suite', () => {
|
||||
test('should load homepage', async ({ page }) => {
|
||||
await page.goto('/');
|
||||
await expect(page).toHaveTitle(/Home/i);
|
||||
});
|
||||
|
||||
test('should create user and login', async ({ page, userFactory }) => {
|
||||
// Create test user
|
||||
const user = await userFactory.createUser();
|
||||
|
||||
// Login
|
||||
await page.goto('/login');
|
||||
await page.fill('[data-testid="email-input"]', user.email);
|
||||
await page.fill('[data-testid="password-input"]', user.password);
|
||||
await page.click('[data-testid="login-button"]');
|
||||
|
||||
// Assert login success
|
||||
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible();
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
9. **Update package.json Scripts**
|
||||
|
||||
Add minimal test script to `package.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"scripts": {
|
||||
"test:e2e": "playwright test"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Note**: Users can add additional scripts as needed (e.g., `--ui`, `--headed`, `--debug`, `show-report`).
|
||||
|
||||
10. **Generate Documentation**
|
||||
|
||||
Create `tests/README.md` with setup instructions (see Step 3 deliverables).
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Deliverables
|
||||
|
||||
### Primary Artifacts Created
|
||||
|
||||
1. **Configuration File**
|
||||
- `playwright.config.ts` or `cypress.config.ts`
|
||||
- Timeouts: action 15s, navigation 30s, test 60s
|
||||
- Reporters: HTML + JUnit XML
|
||||
|
||||
2. **Directory Structure**
|
||||
- `tests/` with `e2e/`, `api/`, `support/` subdirectories
|
||||
- `support/fixtures/` for test fixtures
|
||||
- `support/helpers/` for utility functions
|
||||
|
||||
3. **Environment Configuration**
|
||||
- `.env.example` with `TEST_ENV`, `BASE_URL`, `API_URL`
|
||||
- `.nvmrc` with Node version
|
||||
|
||||
4. **Test Infrastructure**
|
||||
- Fixture architecture (`mergeTests` pattern)
|
||||
- Data factories (faker-based, with auto-cleanup)
|
||||
- Sample tests demonstrating patterns
|
||||
|
||||
5. **Documentation**
|
||||
- `tests/README.md` with setup instructions
|
||||
- Comments in config files explaining options
|
||||
|
||||
### README Contents
|
||||
|
||||
The generated `tests/README.md` should include:
|
||||
|
||||
- **Setup Instructions**: How to install dependencies, configure environment
|
||||
- **Running Tests**: Commands for local execution, headed mode, debug mode
|
||||
- **Architecture Overview**: Fixture pattern, data factories, page objects
|
||||
- **Best Practices**: Selector strategy (data-testid), test isolation, cleanup
|
||||
- **CI Integration**: How tests run in CI/CD pipeline
|
||||
- **Knowledge Base References**: Links to relevant TEA knowledge fragments
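A skeletal outline consistent with those sections (headings only; actual content is project-specific):

```markdown
# Test Suite

## Setup
<!-- install dependencies, copy .env.example to .env -->

## Running Tests
<!-- npm run test:e2e, headed and debug modes -->

## Architecture Overview
<!-- fixtures, data factories, page objects -->

## Best Practices
<!-- data-testid selectors, test isolation, auto-cleanup -->

## CI Integration
<!-- how the suite runs in the pipeline -->

## Knowledge Base References
<!-- links to relevant TEA knowledge fragments -->
```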
|
||||
|
||||
---
|
||||
|
||||
## Important Notes
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
**Critical:** Check configuration and load appropriate fragments.
|
||||
|
||||
Read `{config_source}` and check `config.tea_use_playwright_utils`.
|
||||
|
||||
**If `config.tea_use_playwright_utils: true` (Playwright Utils Integration):**
|
||||
|
||||
Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` and load:
|
||||
|
||||
- `overview.md` - Playwright utils installation and design principles
|
||||
- `fixtures-composition.md` - mergeTests composition with playwright-utils
|
||||
- `auth-session.md` - Token persistence setup (if auth needed)
|
||||
- `api-request.md` - API testing utilities (if API tests planned)
|
||||
- `burn-in.md` - Smart test selection for CI (recommend during framework setup)
|
||||
- `network-error-monitor.md` - Automatic HTTP error detection (recommend in merged fixtures)
|
||||
- `data-factories.md` - Factory patterns with faker (498 lines, 5 examples)
|
||||
|
||||
Recommend installing playwright-utils:
|
||||
|
||||
```bash
|
||||
npm install -D @seontechnologies/playwright-utils
|
||||
```
|
||||
|
||||
Recommend adding burn-in and network-error-monitor to merged fixtures for enhanced reliability.
|
||||
|
||||
**If `config.tea_use_playwright_utils: false` (Traditional Patterns):**
|
||||
|
||||
Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` and load:
|
||||
|
||||
- `fixture-architecture.md` - Pure function → fixture → `mergeTests` composition with auto-cleanup (406 lines, 5 examples)
|
||||
- `data-factories.md` - Faker-based factories with overrides, nested factories, API seeding, auto-cleanup (498 lines, 5 examples)
|
||||
- `network-first.md` - Network-first testing safeguards: intercept before navigate, HAR capture, deterministic waiting (489 lines, 5 examples)
|
||||
- `playwright-config.md` - Playwright-specific configuration: environment-based, timeout standards, artifact output, parallelization, project config (722 lines, 5 examples)
|
||||
- `test-quality.md` - Test design principles: deterministic, isolated with cleanup, explicit assertions, length/time limits (658 lines, 5 examples)
|
||||
|
||||
### Framework-Specific Guidance
|
||||
|
||||
**Playwright Advantages:**
|
||||
|
||||
- Worker parallelism (significantly faster for large suites)
|
||||
- Trace viewer (powerful debugging with screenshots, network, console)
|
||||
- Multi-language support (TypeScript, JavaScript, Python, C#, Java)
|
||||
- Built-in API testing capabilities
|
||||
- Better handling of multiple browser contexts
|
||||
|
||||
**Cypress Advantages:**
|
||||
|
||||
- Superior developer experience (real-time reloading)
|
||||
- Excellent for component testing (Cypress Component Testing, or Vitest as an alternative)
|
||||
- Simpler setup for small teams
|
||||
- Better suited for watch mode during development
|
||||
|
||||
**Avoid Cypress when:**
|
||||
|
||||
- API chains are heavy and complex
|
||||
- Multi-tab/window scenarios are common
|
||||
- Worker parallelism is critical for CI performance
|
||||
|
||||
### Selector Strategy
|
||||
|
||||
**Always recommend**:
|
||||
|
||||
- `data-testid` attributes for UI elements
|
||||
- `data-cy` attributes if Cypress is chosen
|
||||
- Avoid brittle CSS selectors or XPath
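In Playwright terms this usually means preferring `getByTestId` over structural CSS; a small contrast (selectors are illustrative):

```typescript
// Preferred: stable data-testid hook, resilient to markup and styling changes
await page.getByTestId('login-button').click();

// Avoid: brittle structural CSS that breaks when layout or classes change
await page.locator('form > div:nth-child(3) > button.btn-primary').click();
```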
|
||||
|
||||
### Contract Testing
|
||||
|
||||
For microservices architectures, **recommend Pact** for consumer-driven contract testing alongside E2E tests.
|
||||
|
||||
### Failure Artifacts
|
||||
|
||||
Configure **failure-only** capture:
|
||||
|
||||
- Screenshots: only on failure
|
||||
- Videos: retain on failure (delete on success)
|
||||
- Traces: retain on failure (Playwright)
|
||||
|
||||
This reduces storage overhead while maintaining debugging capability.
|
||||
|
||||
---
|
||||
|
||||
## Output Summary
|
||||
|
||||
After completing this workflow, provide a summary:
|
||||
|
||||
```markdown
|
||||
## Framework Scaffold Complete
|
||||
|
||||
**Framework Selected**: Playwright (or Cypress)
|
||||
|
||||
**Artifacts Created**:
|
||||
|
||||
- ✅ Configuration file: `playwright.config.ts`
|
||||
- ✅ Directory structure: `tests/e2e/`, `tests/support/`
|
||||
- ✅ Environment config: `.env.example`
|
||||
- ✅ Node version: `.nvmrc`
|
||||
- ✅ Fixture architecture: `tests/support/fixtures/`
|
||||
- ✅ Data factories: `tests/support/fixtures/factories/`
|
||||
- ✅ Sample tests: `tests/e2e/example.spec.ts`
|
||||
- ✅ Documentation: `tests/README.md`
|
||||
|
||||
**Next Steps**:
|
||||
|
||||
1. Copy `.env.example` to `.env` and fill in environment variables
|
||||
2. Run `npm install` to install test dependencies
|
||||
3. Run `npm run test:e2e` to execute sample tests
|
||||
4. Review `tests/README.md` for detailed setup instructions
|
||||
|
||||
**Knowledge Base References Applied**:
|
||||
|
||||
- Fixture architecture pattern (pure functions + mergeTests)
|
||||
- Data factories with auto-cleanup (faker-based)
|
||||
- Network-first testing safeguards
|
||||
- Failure-only artifact capture
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Validation
|
||||
|
||||
After completing all steps, verify:
|
||||
|
||||
- [ ] Configuration file created and valid
|
||||
- [ ] Directory structure exists
|
||||
- [ ] Environment configuration generated
|
||||
- [ ] Sample tests run successfully
|
||||
- [ ] Documentation complete and accurate
|
||||
- [ ] No errors or warnings during scaffold
|
||||
|
||||
Refer to `checklist.md` for comprehensive validation criteria.
|
||||
@@ -1,47 +0,0 @@
|
||||
# Test Architect workflow: framework
|
||||
name: testarch-framework
|
||||
description: "Initialize production-ready test framework architecture (Playwright or Cypress) with fixtures, helpers, and configuration"
|
||||
author: "BMad"
|
||||
|
||||
# Critical variables from config
|
||||
config_source: "{project-root}/_bmad/bmm/config.yaml"
|
||||
output_folder: "{config_source}:output_folder"
|
||||
user_name: "{config_source}:user_name"
|
||||
communication_language: "{config_source}:communication_language"
|
||||
document_output_language: "{config_source}:document_output_language"
|
||||
date: system-generated
|
||||
|
||||
# Workflow components
|
||||
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/framework"
|
||||
instructions: "{installed_path}/instructions.md"
|
||||
validation: "{installed_path}/checklist.md"
|
||||
|
||||
# Variables and inputs
|
||||
variables:
|
||||
test_dir: "{project-root}/tests" # Root test directory
|
||||
use_typescript: true # Prefer TypeScript configuration
|
||||
framework_preference: "auto" # auto, playwright, cypress - user can override auto-detection
|
||||
project_size: "auto" # auto, small, large - influences framework recommendation
|
||||
|
||||
# Output configuration
|
||||
default_output_file: "{test_dir}/README.md" # Main deliverable is test setup README
|
||||
|
||||
# Required tools
|
||||
required_tools:
|
||||
- read_file # Read package.json, existing configs
|
||||
- write_file # Create config files, helpers, fixtures, tests
|
||||
- create_directory # Create test directory structure
|
||||
- list_files # Check for existing framework
|
||||
- search_repo # Find architecture docs
|
||||
|
||||
tags:
|
||||
- qa
|
||||
- setup
|
||||
- test-architect
|
||||
- framework
|
||||
- initialization
|
||||
|
||||
execution_hints:
|
||||
interactive: false # Minimize prompts; auto-detect when possible
|
||||
autonomous: true # Proceed without user input unless blocked
|
||||
iterative: true
|
||||
@@ -1,407 +0,0 @@
|
||||
# Non-Functional Requirements Assessment - Validation Checklist
|
||||
|
||||
**Workflow:** `testarch-nfr`
|
||||
**Purpose:** Ensure comprehensive and evidence-based NFR assessment with actionable recommendations
|
||||
|
||||
---
|
||||
|
||||
Note: `nfr-assess` evaluates existing evidence; it does not run tests or CI workflows.
|
||||
|
||||
## Prerequisites Validation
|
||||
|
||||
- [ ] Implementation is deployed and accessible for evaluation
|
||||
- [ ] Evidence sources are available (test results, metrics, logs, CI results)
|
||||
- [ ] NFR categories are determined (performance, security, reliability, maintainability, custom)
|
||||
- [ ] Evidence directories exist and are accessible (`test_results_dir`, `metrics_dir`, `logs_dir`)
|
||||
- [ ] Knowledge base is loaded (nfr-criteria, ci-burn-in, test-quality)
|
||||
|
||||
---
|
||||
|
||||
## Context Loading
|
||||
|
||||
- [ ] Tech-spec.md loaded successfully (if available)
|
||||
- [ ] PRD.md loaded (if available)
|
||||
- [ ] Story file loaded (if applicable)
|
||||
- [ ] Relevant knowledge fragments loaded from `tea-index.csv`:
|
||||
- [ ] `nfr-criteria.md`
|
||||
- [ ] `ci-burn-in.md`
|
||||
- [ ] `test-quality.md`
|
||||
- [ ] `playwright-config.md` (if using Playwright)
|
||||
|
||||
---
|
||||
|
||||
## NFR Categories and Thresholds
|
||||
|
||||
### Performance
|
||||
|
||||
- [ ] Response time threshold defined or marked as UNKNOWN
|
||||
- [ ] Throughput threshold defined or marked as UNKNOWN
|
||||
- [ ] Resource usage thresholds defined or marked as UNKNOWN
|
||||
- [ ] Scalability requirements defined or marked as UNKNOWN
|
||||
|
||||
### Security
|
||||
|
||||
- [ ] Authentication requirements defined or marked as UNKNOWN
|
||||
- [ ] Authorization requirements defined or marked as UNKNOWN
|
||||
- [ ] Data protection requirements defined or marked as UNKNOWN
|
||||
- [ ] Vulnerability management thresholds defined or marked as UNKNOWN
|
||||
- [ ] Compliance requirements identified (GDPR, HIPAA, PCI-DSS, etc.)
|
||||
|
||||
### Reliability
|
||||
|
||||
- [ ] Availability (uptime) threshold defined or marked as UNKNOWN
|
||||
- [ ] Error rate threshold defined or marked as UNKNOWN
|
||||
- [ ] MTTR (Mean Time To Recovery) threshold defined or marked as UNKNOWN
|
||||
- [ ] Fault tolerance requirements defined or marked as UNKNOWN
|
||||
- [ ] Disaster recovery requirements defined (RTO, RPO) or marked as UNKNOWN
|
||||
|
||||
### Maintainability
|
||||
|
||||
- [ ] Test coverage threshold defined or marked as UNKNOWN
|
||||
- [ ] Code quality threshold defined or marked as UNKNOWN
|
||||
- [ ] Technical debt threshold defined or marked as UNKNOWN
|
||||
- [ ] Documentation completeness threshold defined or marked as UNKNOWN
|
||||
|
||||
### Custom NFR Categories (if applicable)
|
||||
|
||||
- [ ] Custom NFR category 1: Thresholds defined or marked as UNKNOWN
|
||||
- [ ] Custom NFR category 2: Thresholds defined or marked as UNKNOWN
|
||||
- [ ] Custom NFR category 3: Thresholds defined or marked as UNKNOWN
|
||||
|
||||
---
|
||||
|
||||
## Evidence Gathering
|
||||
|
||||
### Performance Evidence
|
||||
|
||||
- [ ] Load test results collected (JMeter, k6, Gatling, etc.)
|
||||
- [ ] Application metrics collected (response times, throughput, resource usage)
|
||||
- [ ] APM data collected (New Relic, Datadog, Dynatrace, etc.)
|
||||
- [ ] Lighthouse reports collected (if web app)
|
||||
- [ ] Playwright performance traces collected (if applicable)
|
||||
|
||||
### Security Evidence
|
||||
|
||||
- [ ] SAST results collected (SonarQube, Checkmarx, Veracode, etc.)
|
||||
- [ ] DAST results collected (OWASP ZAP, Burp Suite, etc.)
|
||||
- [ ] Dependency scanning results collected (Snyk, Dependabot, npm audit)
|
||||
- [ ] Penetration test reports collected (if available)
|
||||
- [ ] Security audit logs collected
|
||||
- [ ] Compliance audit results collected (if applicable)
|
||||
|
||||
### Reliability Evidence
|
||||
|
||||
- [ ] Uptime monitoring data collected (Pingdom, UptimeRobot, StatusCake)
|
||||
- [ ] Error logs collected
|
||||
- [ ] Error rate metrics collected
|
||||
- [ ] CI burn-in results collected (stability over time)
|
||||
- [ ] Chaos engineering test results collected (if available)
|
||||
- [ ] Failover/recovery test results collected (if available)
|
||||
- [ ] Incident reports and postmortems collected (if applicable)
|
||||
|
||||
### Maintainability Evidence
|
||||
|
||||
- [ ] Code coverage reports collected (Istanbul, NYC, c8, JaCoCo)
|
||||
- [ ] Static analysis results collected (ESLint, SonarQube, CodeClimate)
|
||||
- [ ] Technical debt metrics collected
|
||||
- [ ] Documentation audit results collected
|
||||
- [ ] Test review report collected (from test-review workflow, if available)
|
||||
- [ ] Git metrics collected (code churn, commit frequency, etc.)
|
||||
|
||||
---
|
||||
|
||||
## NFR Assessment with Deterministic Rules
|
||||
|
||||
### Performance Assessment
|
||||
|
||||
- [ ] Response time assessed against threshold
|
||||
- [ ] Throughput assessed against threshold
|
||||
- [ ] Resource usage assessed against threshold
|
||||
- [ ] Scalability assessed against requirements
|
||||
- [ ] Status classified (PASS/CONCERNS/FAIL) with justification
|
||||
- [ ] Evidence source documented (file path, metric name)
|
||||
|
||||
### Security Assessment
|
||||
|
||||
- [ ] Authentication strength assessed against requirements
|
||||
- [ ] Authorization controls assessed against requirements
|
||||
- [ ] Data protection assessed against requirements
|
||||
- [ ] Vulnerability management assessed against thresholds
|
||||
- [ ] Compliance assessed against requirements
|
||||
- [ ] Status classified (PASS/CONCERNS/FAIL) with justification
|
||||
- [ ] Evidence source documented (file path, scan result)
|
||||
|
||||
### Reliability Assessment
|
||||
|
||||
- [ ] Availability (uptime) assessed against threshold
|
||||
- [ ] Error rate assessed against threshold
|
||||
- [ ] MTTR assessed against threshold
|
||||
- [ ] Fault tolerance assessed against requirements
|
||||
- [ ] Disaster recovery assessed against requirements (RTO, RPO)
|
||||
- [ ] CI burn-in assessed (stability over time)
|
||||
- [ ] Status classified (PASS/CONCERNS/FAIL) with justification
|
||||
- [ ] Evidence source documented (file path, monitoring data)
|
||||
|
||||
### Maintainability Assessment
|
||||
|
||||
- [ ] Test coverage assessed against threshold
|
||||
- [ ] Code quality assessed against threshold
|
||||
- [ ] Technical debt assessed against threshold
|
||||
- [ ] Documentation completeness assessed against threshold
|
||||
- [ ] Test quality assessed (from test-review, if available)
|
||||
- [ ] Status classified (PASS/CONCERNS/FAIL) with justification
|
||||
- [ ] Evidence source documented (file path, coverage report)
|
||||
|
||||
### Custom NFR Assessment (if applicable)
|
||||
|
||||
- [ ] Custom NFR 1 assessed against threshold with justification
|
||||
- [ ] Custom NFR 2 assessed against threshold with justification
|
||||
- [ ] Custom NFR 3 assessed against threshold with justification
|
||||
|
||||
---
|
||||
|
||||
## Status Classification Validation
|
||||
|
||||
### PASS Criteria Verified
|
||||
|
||||
- [ ] Evidence exists for PASS status
|
||||
- [ ] Evidence meets or exceeds threshold
|
||||
- [ ] No concerns flagged in evidence
|
||||
- [ ] Quality is acceptable
|
||||
|
||||
### CONCERNS Criteria Verified
|
||||
|
||||
- [ ] Threshold is UNKNOWN (documented) OR
|
||||
- [ ] Evidence is MISSING or INCOMPLETE (documented) OR
|
||||
- [ ] Evidence is close to threshold (within 10%, documented) OR
|
||||
- [ ] Evidence shows intermittent issues (documented)
|
||||
|
||||
### FAIL Criteria Verified
|
||||
|
||||
- [ ] Evidence exists BUT does not meet threshold (documented) OR
|
||||
- [ ] Critical evidence is MISSING (documented) OR
|
||||
- [ ] Evidence shows consistent failures (documented) OR
|
||||
- [ ] Quality is unacceptable (documented)
|
||||
|
||||
### No Threshold Guessing
|
||||
|
||||
- [ ] All thresholds are either defined or marked as UNKNOWN
|
||||
- [ ] No thresholds were guessed or inferred
|
||||
- [ ] All UNKNOWN thresholds result in CONCERNS status
|
||||
|
||||
---
|
||||
|
||||
## Quick Wins and Recommended Actions
|
||||
|
||||
### Quick Wins Identified
|
||||
|
||||
- [ ] Low-effort, high-impact improvements identified for CONCERNS/FAIL
|
||||
- [ ] Configuration changes (no code changes) identified
|
||||
- [ ] Optimization opportunities identified (caching, indexing, compression)
|
||||
- [ ] Monitoring additions identified (detect issues before failures)
|
||||
|
||||
### Recommended Actions
|
||||
|
||||
- [ ] Specific remediation steps provided (not generic advice)
|
||||
- [ ] Priority assigned (CRITICAL, HIGH, MEDIUM, LOW)
|
||||
- [ ] Estimated effort provided (hours, days)
|
||||
- [ ] Owner suggestions provided (dev, ops, security)
|
||||
|
||||
### Monitoring Hooks
|
||||
|
||||
- [ ] Performance monitoring suggested (APM, synthetic monitoring)
|
||||
- [ ] Error tracking suggested (Sentry, Rollbar, error logs)
|
||||
- [ ] Security monitoring suggested (intrusion detection, audit logs)
|
||||
- [ ] Alerting thresholds suggested (notify before breach)
|
||||
|
||||
### Fail-Fast Mechanisms
|
||||
|
||||
- [ ] Circuit breakers suggested for reliability
|
||||
- [ ] Rate limiting suggested for performance
|
||||
- [ ] Validation gates suggested for security
|
||||
- [ ] Smoke tests suggested for maintainability
|
||||
|
||||
---
|
||||
|
||||
## Deliverables Generated
|
||||
|
||||
### NFR Assessment Report
|
||||
|
||||
- [ ] File created at `{output_folder}/nfr-assessment.md`
|
||||
- [ ] Template from `nfr-report-template.md` used
|
||||
- [ ] Executive summary included (overall status, critical issues)
|
||||
- [ ] Assessment by category included (performance, security, reliability, maintainability)
|
||||
- [ ] Evidence for each NFR documented
|
||||
- [ ] Status classifications documented (PASS/CONCERNS/FAIL)
|
||||
- [ ] Findings summary included (PASS count, CONCERNS count, FAIL count)
|
||||
- [ ] Quick wins section included
|
||||
- [ ] Recommended actions section included
|
||||
- [ ] Evidence gaps checklist included
|
||||
|
||||
### Gate YAML Snippet (if enabled)
|
||||
|
||||
- [ ] YAML snippet generated
|
||||
- [ ] Date included
|
||||
- [ ] Categories status included (performance, security, reliability, maintainability)
|
||||
- [ ] Overall status included (PASS/CONCERNS/FAIL)
|
||||
- [ ] Issue counts included (critical, high, medium, concerns)
|
||||
- [ ] Blockers flag included (true/false)
|
||||
- [ ] Recommendations included
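An illustrative shape for that snippet (field names are assumptions; align them with the gate schema in use):

```yaml
nfr_assessment:
  date: '{date}'
  categories:
    performance: PASS
    security: CONCERNS
    reliability: PASS
    maintainability: PASS
  overall_status: CONCERNS
  issues:
    critical: 0
    high: 1
    medium: 2
    concerns: 3
  blockers: false
  recommendations:
    - 'Collect DAST results before the next release'
```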
|
||||
|
||||
### Evidence Checklist (if enabled)
|
||||
|
||||
- [ ] All NFRs with MISSING or INCOMPLETE evidence listed
|
||||
- [ ] Owners assigned for evidence collection
|
||||
- [ ] Suggested evidence sources provided
|
||||
- [ ] Deadlines set for evidence collection
|
||||
|
||||
### Updated Story File (if enabled and requested)
|
||||
|
||||
- [ ] "NFR Assessment" section added to story markdown
|
||||
- [ ] Link to NFR assessment report included
|
||||
- [ ] Overall status and critical issues included
|
||||
- [ ] Gate status included
|
||||
|
||||
---
|
||||
|
||||
## Quality Assurance
|
||||
|
||||
### Accuracy Checks
|
||||
|
||||
- [ ] All NFR categories assessed (none skipped)
|
||||
- [ ] All thresholds documented (defined or UNKNOWN)
|
||||
- [ ] All evidence sources documented (file paths, metric names)
|
||||
- [ ] Status classifications are deterministic and consistent
|
||||
- [ ] No false positives (status correctly assigned)
|
||||
- [ ] No false negatives (all issues identified)
|
||||
|
||||
### Completeness Checks
|
||||
|
||||
- [ ] All NFR categories covered (performance, security, reliability, maintainability, custom)
|
||||
- [ ] All evidence sources checked (test results, metrics, logs, CI results)
|
||||
- [ ] All status types used appropriately (PASS, CONCERNS, FAIL)
|
||||
- [ ] All NFRs with CONCERNS/FAIL have recommendations
|
||||
- [ ] All evidence gaps have owners and deadlines
|
||||
|
||||
### Actionability Checks
|
||||
|
||||
- [ ] Recommendations are specific (not generic)
|
||||
- [ ] Remediation steps are clear and actionable
|
||||
- [ ] Priorities are assigned (CRITICAL, HIGH, MEDIUM, LOW)
|
||||
- [ ] Effort estimates are provided (hours, days)
|
||||
- [ ] Owners are suggested (dev, ops, security)
|
||||
|
||||
---
|
||||
|
||||
## Integration with BMad Artifacts
|
||||
|
||||
### With tech-spec.md
|
||||
|
||||
- [ ] Tech spec loaded for NFR requirements and thresholds
|
||||
- [ ] Performance targets extracted
|
||||
- [ ] Security requirements extracted
|
||||
- [ ] Reliability SLAs extracted
|
||||
- [ ] Architectural decisions considered
|
||||
|
||||
### With test-design.md
|
||||
|
||||
- [ ] Test design loaded for NFR test plan
|
||||
- [ ] Test priorities referenced (P0/P1/P2/P3)
|
||||
- [ ] Assessment aligned with planned NFR validation
|
||||
|
||||
### With PRD.md
|
||||
|
||||
- [ ] PRD loaded for product-level NFR context
|
||||
- [ ] User experience goals considered
|
||||
- [ ] Unstated requirements checked
|
||||
- [ ] Product-level SLAs referenced
|
||||
|
||||
---
|
||||
|
||||
## Quality Gates Validation
|
||||
|
||||
### Release Blocker (FAIL)
|
||||
|
||||
- [ ] Critical NFR status checked (security, reliability)
|
||||
- [ ] Performance failures assessed for user impact
|
||||
- [ ] Release blocker flagged if critical NFR has FAIL status
|
||||
|
||||
### PR Blocker (HIGH CONCERNS)
|
||||
|
||||
- [ ] High-priority NFR status checked
|
||||
- [ ] Multiple CONCERNS assessed
|
||||
- [ ] PR blocker flagged if HIGH priority issues exist
|
||||
|
||||
### Warning (CONCERNS)
|
||||
|
||||
- [ ] Any NFR with CONCERNS status flagged
|
||||
- [ ] Missing or incomplete evidence documented
|
||||
- [ ] Warning issued to address before next release
|
||||
|
||||
### Pass (PASS)
|
||||
|
||||
- [ ] All NFRs have PASS status
|
||||
- [ ] No blockers or concerns exist
|
||||
- [ ] Ready for release confirmed
|
||||
|
||||
---
|
||||
|
||||
## Non-Prescriptive Validation
|
||||
|
||||
- [ ] NFR categories adapted to team needs
|
||||
- [ ] Thresholds appropriate for project context
|
||||
- [ ] Assessment criteria customized as needed
|
||||
- [ ] Teams can extend with custom NFR categories
|
||||
- [ ] Integration with external tools supported (New Relic, Datadog, SonarQube, JIRA)
|
||||
|
||||
---
|
||||
|
||||
## Documentation and Communication
|
||||
|
||||
- [ ] NFR assessment report is readable and well-formatted
|
||||
- [ ] Tables render correctly in markdown
|
||||
- [ ] Code blocks have proper syntax highlighting
|
||||
- [ ] Links are valid and accessible
|
||||
- [ ] Recommendations are clear and prioritized
|
||||
- [ ] Overall status is prominent and unambiguous
|
||||
- [ ] Executive summary provides quick understanding
|
||||
|
||||
---
|
||||
|
||||
## Final Validation
|
||||
|
||||
- [ ] All prerequisites met
|
||||
- [ ] All NFR categories assessed with evidence (or gaps documented)
|
||||
- [ ] No thresholds were guessed (all defined or UNKNOWN)
|
||||
- [ ] Status classifications are deterministic and justified
|
||||
- [ ] Quick wins identified for all CONCERNS/FAIL
|
||||
- [ ] Recommended actions are specific and actionable
|
||||
- [ ] Evidence gaps documented with owners and deadlines
|
||||
- [ ] NFR assessment report generated and saved
|
||||
- [ ] Gate YAML snippet generated (if enabled)
|
||||
- [ ] Evidence checklist generated (if enabled)
|
||||
- [ ] Workflow completed successfully
|
||||
|
||||
---
|
||||
|
||||
## Sign-Off
|
||||
|
||||
**NFR Assessment Status:**
|
||||
|
||||
- [ ] ✅ PASS - All NFRs meet requirements, ready for release
|
||||
- [ ] ⚠️ CONCERNS - Some NFRs have concerns, address before next release
|
||||
- [ ] ❌ FAIL - Critical NFRs not met, BLOCKER for release
|
||||
|
||||
**Next Actions:**
|
||||
|
||||
- If PASS ✅: Proceed to `*gate` workflow or release
|
||||
- If CONCERNS ⚠️: Address HIGH/CRITICAL issues, re-run `*nfr-assess`
|
||||
- If FAIL ❌: Resolve FAIL status NFRs, re-run `*nfr-assess`
|
||||
|
||||
**Critical Issues:** {COUNT}
|
||||
**High Priority Issues:** {COUNT}
|
||||
**Concerns:** {COUNT}
|
||||
|
||||
---
|
||||
|
||||
<!-- Powered by BMAD-CORE™ -->
|
||||
@@ -1,722 +0,0 @@
|
||||
# Non-Functional Requirements Assessment - Instructions v4.0
|
||||
|
||||
**Workflow:** `testarch-nfr`
|
||||
**Purpose:** Assess non-functional requirements (performance, security, reliability, maintainability) before release with evidence-based validation
|
||||
**Agent:** Test Architect (TEA)
|
||||
**Format:** Pure Markdown v4.0 (no XML blocks)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This workflow performs a comprehensive assessment of non-functional requirements (NFRs) to validate that the implementation meets performance, security, reliability, and maintainability standards before release. It uses evidence-based validation with deterministic PASS/CONCERNS/FAIL rules and provides actionable recommendations for remediation.
|
||||
|
||||
**Key Capabilities:**
|
||||
|
||||
- Assess multiple NFR categories (performance, security, reliability, maintainability, custom)
|
||||
- Validate NFRs against defined thresholds from tech specs, PRD, or defaults
|
||||
- Classify status deterministically (PASS/CONCERNS/FAIL) based on evidence
|
||||
- Never guess thresholds - mark as CONCERNS if unknown
|
||||
- Generate gate-ready YAML snippets for CI/CD integration
|
||||
- Provide quick wins and recommended actions for remediation
|
||||
- Create evidence checklists for gaps
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
**Required:**
|
||||
|
||||
- Implementation deployed locally or accessible for evaluation
|
||||
- Evidence sources available (test results, metrics, logs, CI results)
|
||||
|
||||
**Recommended:**
|
||||
|
||||
- NFR requirements defined in tech-spec.md, PRD.md, or story
|
||||
- Test results from performance, security, reliability tests
|
||||
- Application metrics (response times, error rates, throughput)
|
||||
- CI/CD pipeline results for burn-in validation
|
||||
|
||||
**Halt Conditions:**
|
||||
|
||||
- If NFR targets are undefined and cannot be obtained, halt and request definition
|
||||
- If implementation is not accessible for evaluation, halt and request deployment
|
||||
|
||||
---
|
||||
|
||||
## Workflow Steps
|
||||
|
||||
### Step 1: Load Context and Knowledge Base
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. Load relevant knowledge fragments from `{project-root}/_bmad/bmm/testarch/tea-index.csv`:
|
||||
- `nfr-criteria.md` - Non-functional requirements criteria and thresholds (security, performance, reliability, maintainability with code examples, 658 lines, 4 examples)
|
||||
- `ci-burn-in.md` - CI/CD burn-in patterns for reliability validation (10-iteration detection, sharding, selective execution, 678 lines, 4 examples)
|
||||
- `test-quality.md` - Test quality expectations for maintainability (deterministic, isolated, explicit assertions, length/time limits, 658 lines, 5 examples)
|
||||
- `playwright-config.md` - Performance configuration patterns: parallelization, timeout standards, artifact output (722 lines, 5 examples)
|
||||
- `error-handling.md` - Reliability validation patterns: scoped exceptions, retry validation, telemetry logging, graceful degradation (736 lines, 4 examples)
|
||||
|
||||
2. Read story file (if provided):
|
||||
- Extract NFR requirements
|
||||
- Identify specific thresholds or SLAs
|
||||
- Note any custom NFR categories
|
||||
|
||||
3. Read related BMad artifacts (if available):
|
||||
- `tech-spec.md` - Technical NFR requirements and targets
|
||||
- `PRD.md` - Product-level NFR context (user expectations)
|
||||
- `test-design.md` - NFR test plan and priorities
|
||||
|
||||
**Output:** Complete understanding of NFR targets, evidence sources, and validation criteria
|
||||
|
||||
---
|
||||
|
||||
### Step 2: Identify NFR Categories and Thresholds
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. Determine which NFR categories to assess (default: performance, security, reliability, maintainability):
|
||||
- **Performance**: Response time, throughput, resource usage
|
||||
- **Security**: Authentication, authorization, data protection, vulnerability scanning
|
||||
- **Reliability**: Error handling, recovery, availability, fault tolerance
|
||||
- **Maintainability**: Code quality, test coverage, documentation, technical debt
|
||||
|
||||
2. Add custom NFR categories if specified (e.g., accessibility, internationalization, compliance)
|
||||
|
||||
3. Gather thresholds for each NFR:
|
||||
- From tech-spec.md (primary source)
|
||||
- From PRD.md (product-level SLAs)
|
||||
- From story file (feature-specific requirements)
|
||||
- From workflow variables (default thresholds)
|
||||
- Mark thresholds as UNKNOWN if not defined
|
||||
|
||||
4. Never guess thresholds - if a threshold is unknown, mark the NFR as CONCERNS
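As a minimal sketch of that fallback order — assuming the thresholds extracted from each artifact are held in plain lookup objects (`techSpec`, `prd`, `story`, and `defaults` are illustrative names, not part of the workflow) — resolution might look like:

```typescript
type ThresholdSource = Record<string, number | string | undefined>;

// Resolve one NFR threshold using the documented precedence:
// tech-spec → PRD → story → workflow defaults → UNKNOWN.
function resolveThreshold(
  nfr: string,
  techSpec: ThresholdSource,
  prd: ThresholdSource,
  story: ThresholdSource,
  defaults: ThresholdSource,
): number | string {
  for (const source of [techSpec, prd, story, defaults]) {
    const value = source[nfr];
    if (value !== undefined) return value;
  }
  // Never guess: an undefined threshold is reported as UNKNOWN,
  // which forces a CONCERNS classification later in the assessment.
  return 'UNKNOWN';
}

resolveThreshold('response_time_p95_ms', { response_time_p95_ms: 500 }, {}, {}, {}); // → 500
```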
|
||||
|
||||
**Output:** Complete list of NFRs to assess with defined (or UNKNOWN) thresholds
|
||||
|
||||
---
|
||||
|
||||
### Step 3: Gather Evidence
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. For each NFR category, discover evidence sources:
|
||||
|
||||
**Performance Evidence:**
|
||||
- Load test results (JMeter, k6, Lighthouse)
|
||||
- Application metrics (response times, throughput, resource usage)
|
||||
- Performance monitoring data (New Relic, Datadog, APM)
|
||||
- Playwright performance traces (if applicable)
|
||||
|
||||
**Security Evidence:**
|
||||
- Security scan results (SAST, DAST, dependency scanning)
|
||||
- Authentication/authorization test results
|
||||
- Penetration test reports
|
||||
- Vulnerability assessment reports
|
||||
- Compliance audit results
|
||||
|
||||
**Reliability Evidence:**
|
||||
- Error logs and error rates
|
||||
- Uptime monitoring data
|
||||
- Chaos engineering test results
|
||||
- Failover/recovery test results
|
||||
- CI burn-in results (stability over time)
|
||||
|
||||
**Maintainability Evidence:**
|
||||
- Code coverage reports (Istanbul, NYC, c8)
|
||||
- Static analysis results (ESLint, SonarQube)
|
||||
- Technical debt metrics
|
||||
- Documentation completeness
|
||||
- Test quality assessment (from test-review workflow)
|
||||
|
||||
2. Read relevant files from evidence directories:
|
||||
- `{test_results_dir}` for test execution results
|
||||
- `{metrics_dir}` for application metrics
|
||||
- `{logs_dir}` for application logs
|
||||
- CI/CD pipeline results (if `include_ci_results` is true)
|
||||
|
||||
3. Mark NFRs without evidence as "NO EVIDENCE" - never infer or assume
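A sketch of that discovery step, assuming evidence lives in the configured directories and that the `fast-glob` package (an assumption, not a workflow requirement) is available for pattern matching:

```typescript
import fg from 'fast-glob'; // assumed dependency for file discovery

interface EvidenceInventory {
  nfr: string;
  files: string[];
  status: 'FOUND' | 'NO EVIDENCE';
}

// Collect evidence files for one NFR category; never infer evidence that is not on disk.
async function gatherEvidence(nfr: string, patterns: string[]): Promise<EvidenceInventory> {
  const files = await fg(patterns, { onlyFiles: true });
  return { nfr, files, status: files.length > 0 ? 'FOUND' : 'NO EVIDENCE' };
}

// Illustrative patterns for performance evidence; adjust to the project's directories.
// await gatherEvidence('performance', ['test-results/load-*.json', 'metrics/apm-*.json']);
```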
|
||||
|
||||
**Output:** Comprehensive evidence inventory for each NFR
|
||||
|
||||
---
|
||||
|
||||
### Step 4: Assess NFRs with Deterministic Rules
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. For each NFR, apply deterministic PASS/CONCERNS/FAIL rules:
|
||||
|
||||
**PASS Criteria:**
|
||||
- Evidence exists AND meets defined threshold
|
||||
- No concerns flagged in evidence
|
||||
- Example: Response time is 350ms (threshold: 500ms) → PASS
|
||||
|
||||
**CONCERNS Criteria:**
|
||||
- Threshold is UNKNOWN (not defined)
|
||||
- Evidence is MISSING or INCOMPLETE
|
||||
- Evidence is close to threshold (within 10%)
|
||||
- Evidence shows intermittent issues
|
||||
- Example: Response time is 480ms (threshold: 500ms, 96% of threshold) → CONCERNS
|
||||
|
||||
**FAIL Criteria:**
|
||||
- Evidence exists BUT does not meet threshold
|
||||
- Critical evidence is MISSING
|
||||
- Evidence shows consistent failures
|
||||
- Example: Response time is 750ms (threshold: 500ms) → FAIL
|
||||
|
||||
2. Document findings for each NFR:
|
||||
- Status (PASS/CONCERNS/FAIL)
|
||||
- Evidence source (file path, test name, metric name)
|
||||
- Actual value vs threshold
|
||||
- Justification for status classification
|
||||
|
||||
3. Classify severity based on category:
|
||||
- **CRITICAL**: Security failures, reliability failures (affect users immediately)
|
||||
- **HIGH**: Performance failures, maintainability failures (affect users soon)
|
||||
- **MEDIUM**: Concerns without failures (may affect users eventually)
|
||||
- **LOW**: Missing evidence for non-critical NFRs
|
||||
|
||||
**Output:** Complete NFR assessment with deterministic status classifications
|
||||
|
||||
---
|
||||
|
||||
### Step 5: Identify Quick Wins and Recommended Actions
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. For each NFR with CONCERNS or FAIL status, identify quick wins:
|
||||
- Low-effort, high-impact improvements
|
||||
- Configuration changes (no code changes needed)
|
||||
- Optimization opportunities (caching, indexing, compression)
|
||||
- Monitoring additions (detect issues before they become failures)
|
||||
|
||||
2. Provide recommended actions for each issue:
|
||||
- Specific steps to remediate (not generic advice)
|
||||
- Priority (CRITICAL, HIGH, MEDIUM, LOW)
|
||||
- Estimated effort (hours, days)
|
||||
- Owner suggestion (dev, ops, security)
|
||||
|
||||
3. Suggest monitoring hooks for gaps:
|
||||
- Add performance monitoring (APM, synthetic monitoring)
|
||||
- Add error tracking (Sentry, Rollbar, error logs)
|
||||
- Add security monitoring (intrusion detection, audit logs)
|
||||
- Add alerting thresholds (notify before thresholds are breached)
|
||||
|
||||
4. Suggest fail-fast mechanisms:
|
||||
- Add circuit breakers for reliability
|
||||
- Add rate limiting for performance
|
||||
- Add validation gates for security
|
||||
- Add smoke tests for maintainability
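To make the fail-fast idea concrete, here is a minimal circuit-breaker sketch; it is an illustrative pattern only and does not stand in for any particular resilience library:

```typescript
// Minimal circuit breaker: stop calling a failing dependency after repeated errors,
// then allow calls again once a cool-down period has elapsed.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(private readonly maxFailures = 5, private readonly coolDownMs = 30_000) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    const open = this.failures >= this.maxFailures && Date.now() - this.openedAt < this.coolDownMs;
    if (open) throw new Error('Circuit open: failing fast instead of calling the dependency');
    try {
      const result = await fn();
      this.failures = 0; // a success closes the circuit again
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.maxFailures) this.openedAt = Date.now();
      throw err;
    }
  }
}
```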
|
||||
|
||||
**Output:** Actionable remediation plan with prioritized recommendations
|
||||
|
||||
---
|
||||
|
||||
### Step 6: Generate Deliverables
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. Create NFR assessment markdown file:
|
||||
- Use template from `nfr-report-template.md`
|
||||
- Include executive summary (overall status, critical issues)
|
||||
- Add NFR-by-NFR assessment (status, evidence, thresholds)
|
||||
- Add findings summary (PASS count, CONCERNS count, FAIL count)
|
||||
- Add quick wins section
|
||||
- Add recommended actions section
|
||||
- Add evidence gaps checklist
|
||||
- Save to `{output_folder}/nfr-assessment.md`
|
||||
|
||||
2. Generate gate YAML snippet (if enabled):
|
||||
|
||||
```yaml
|
||||
nfr_assessment:
|
||||
date: '2025-10-14'
|
||||
categories:
|
||||
performance: 'PASS'
|
||||
security: 'CONCERNS'
|
||||
reliability: 'PASS'
|
||||
maintainability: 'PASS'
|
||||
overall_status: 'CONCERNS'
|
||||
critical_issues: 0
|
||||
high_priority_issues: 1
|
||||
concerns: 2
|
||||
blockers: false
|
||||
```
|
||||
|
||||
3. Generate evidence checklist (if enabled):
|
||||
- List all NFRs with MISSING or INCOMPLETE evidence
|
||||
- Assign owners for evidence collection
|
||||
- Suggest evidence sources (tests, metrics, logs)
|
||||
- Set deadlines for evidence collection
|
||||
|
||||
4. Update story file (if enabled and requested):
|
||||
- Add "NFR Assessment" section to story markdown
|
||||
- Link to NFR assessment report
|
||||
- Include overall status and critical issues
|
||||
- Add gate status
|
||||
|
||||
**Output:** Complete NFR assessment documentation ready for review and CI/CD integration
|
||||
|
||||
---
|
||||
|
||||
## Non-Prescriptive Approach
|
||||
|
||||
**Minimal Examples:** This workflow provides principles and patterns, not rigid templates. Teams should adapt NFR categories, thresholds, and assessment criteria to their needs.
|
||||
|
||||
**Key Patterns to Follow:**
|
||||
|
||||
- Use evidence-based validation (no guessing or inference)
|
||||
- Apply deterministic rules (consistent PASS/CONCERNS/FAIL classification)
|
||||
- Never guess thresholds (mark as CONCERNS if unknown)
|
||||
- Provide actionable recommendations (specific steps, not generic advice)
|
||||
- Generate gate-ready artifacts (YAML snippets for CI/CD)
|
||||
|
||||
**Extend as Needed:**
|
||||
|
||||
- Add custom NFR categories (accessibility, internationalization, compliance)
|
||||
- Integrate with external tools (New Relic, Datadog, SonarQube, JIRA)
|
||||
- Add custom thresholds and rules
|
||||
- Link to external assessment systems
|
||||
|
||||
---
|
||||
|
||||
## NFR Categories and Criteria
|
||||
|
||||
### Performance
|
||||
|
||||
**Criteria:**
|
||||
|
||||
- Response time (p50, p95, p99 percentiles)
|
||||
- Throughput (requests per second, transactions per second)
|
||||
- Resource usage (CPU, memory, disk, network)
|
||||
- Scalability (horizontal, vertical)
|
||||
|
||||
**Thresholds (Default):**
|
||||
|
||||
- Response time p95: 500ms
|
||||
- Throughput: 100 RPS
|
||||
- CPU usage: < 70% average
|
||||
- Memory usage: < 80% max
|
||||
|
||||
**Evidence Sources:**
|
||||
|
||||
- Load test results (JMeter, k6, Gatling)
|
||||
- APM data (New Relic, Datadog, Dynatrace)
|
||||
- Lighthouse reports (for web apps)
|
||||
- Playwright performance traces
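As a sketch of how a percentile threshold such as "p95 ≤ 500ms" can be checked against raw samples — using the nearest-rank method and assuming response times are already collected in milliseconds:

```typescript
// Nearest-rank percentile: sort the samples and take the value at ceil(p/100 * n) - 1.
function percentile(samplesMs: number[], p: number): number {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

const responseTimesMs = [120, 180, 210, 320, 350, 480, 490]; // illustrative samples
const p95 = percentile(responseTimesMs, 95);
console.log(p95 <= 500 ? 'PASS' : 'FAIL', `p95=${p95}ms (threshold 500ms)`);
```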
|
||||
|
||||
---
|
||||
|
||||
### Security
|
||||
|
||||
**Criteria:**
|
||||
|
||||
- Authentication (login security, session management)
|
||||
- Authorization (access control, permissions)
|
||||
- Data protection (encryption, PII handling)
|
||||
- Vulnerability management (SAST, DAST, dependency scanning)
|
||||
- Compliance (GDPR, HIPAA, PCI-DSS)
|
||||
|
||||
**Thresholds (Default):**
|
||||
|
||||
- Security score: >= 85/100
|
||||
- Critical vulnerabilities: 0
|
||||
- High vulnerabilities: < 3
|
||||
- Authentication strength: MFA enabled
|
||||
|
||||
**Evidence Sources:**
|
||||
|
||||
- SAST results (SonarQube, Checkmarx, Veracode)
|
||||
- DAST results (OWASP ZAP, Burp Suite)
|
||||
- Dependency scanning (Snyk, Dependabot, npm audit)
|
||||
- Penetration test reports
|
||||
- Security audit logs
|
||||
|
||||
---
|
||||
|
||||
### Reliability
|
||||
|
||||
**Criteria:**
|
||||
|
||||
- Availability (uptime percentage)
|
||||
- Error handling (graceful degradation, error recovery)
|
||||
- Fault tolerance (redundancy, failover)
|
||||
- Disaster recovery (backup, restore, RTO/RPO)
|
||||
- Stability (CI burn-in, chaos engineering)
|
||||
|
||||
**Thresholds (Default):**
|
||||
|
||||
- Uptime: >= 99.9% (three nines)
|
||||
- Error rate: < 0.1% (1 in 1000 requests)
|
||||
- MTTR (Mean Time To Recovery): < 15 minutes
|
||||
- CI burn-in: 100 consecutive successful runs
|
||||
|
||||
**Evidence Sources:**
|
||||
|
||||
- Uptime monitoring (Pingdom, UptimeRobot, StatusCake)
|
||||
- Error logs and error rates
|
||||
- CI burn-in results (see `ci-burn-in.md`)
|
||||
- Chaos engineering test results (Chaos Monkey, Gremlin)
|
||||
- Incident reports and postmortems
|
||||
|
||||
---
|
||||
|
||||
### Maintainability
|
||||
|
||||
**Criteria:**
|
||||
|
||||
- Code quality (complexity, duplication, code smells)
|
||||
- Test coverage (unit, integration, E2E)
|
||||
- Documentation (code comments, README, architecture docs)
|
||||
- Technical debt (debt ratio, code churn)
|
||||
- Test quality (from test-review workflow)
|
||||
|
||||
**Thresholds (Default):**
|
||||
|
||||
- Test coverage: >= 80%
|
||||
- Code quality score: >= 85/100
|
||||
- Technical debt ratio: < 5%
|
||||
- Documentation completeness: >= 90%
|
||||
|
||||
**Evidence Sources:**
|
||||
|
||||
- Coverage reports (Istanbul, NYC, c8, JaCoCo)
|
||||
- Static analysis (ESLint, SonarQube, CodeClimate)
|
||||
- Documentation audit (manual or automated)
|
||||
- Test review report (from test-review workflow)
|
||||
- Git metrics (code churn, commit frequency)
|
||||
|
||||
---
|
||||
|
||||
## Deterministic Assessment Rules
|
||||
|
||||
### PASS Rules
|
||||
|
||||
- Evidence exists
|
||||
- Evidence meets or exceeds threshold
|
||||
- No concerns flagged
|
||||
- Quality is acceptable
|
||||
|
||||
**Example:**
|
||||
|
||||
```markdown
|
||||
NFR: Response Time p95
|
||||
Threshold: 500ms
|
||||
Evidence: Load test result shows 350ms p95
|
||||
Status: PASS ✅
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CONCERNS Rules
|
||||
|
||||
- Threshold is UNKNOWN
|
||||
- Evidence is MISSING or INCOMPLETE
|
||||
- Evidence is close to threshold (within 10%)
|
||||
- Evidence shows intermittent issues
|
||||
- Quality is marginal
|
||||
|
||||
**Example:**
|
||||
|
||||
```markdown
|
||||
NFR: Response Time p95
|
||||
Threshold: 500ms
|
||||
Evidence: Load test result shows 480ms p95 (96% of threshold)
|
||||
Status: CONCERNS ⚠️
|
||||
Recommendation: Optimize before production - very close to threshold
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### FAIL Rules
|
||||
|
||||
- Evidence exists BUT does not meet threshold
|
||||
- Critical evidence is MISSING
|
||||
- Evidence shows consistent failures
|
||||
- Quality is unacceptable
|
||||
|
||||
**Example:**
|
||||
|
||||
```markdown
|
||||
NFR: Response Time p95
|
||||
Threshold: 500ms
|
||||
Evidence: Load test result shows 750ms p95 (150% of threshold)
|
||||
Status: FAIL ❌
|
||||
Recommendation: BLOCKER - optimize performance before release
|
||||
```
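The three rule sets above reduce to one deterministic function for "lower is better" metrics such as response time; the 10% proximity band and the input shape below are assumptions for illustration, not a prescribed implementation:

```typescript
type NfrStatus = 'PASS' | 'CONCERNS' | 'FAIL';

// Classify a "lower is better" metric (e.g. response time) against its threshold.
function classify(actual: number | undefined, threshold: number | 'UNKNOWN'): NfrStatus {
  if (threshold === 'UNKNOWN') return 'CONCERNS'; // never guess thresholds
  if (actual === undefined) return 'CONCERNS'; // missing or incomplete evidence
  if (actual > threshold) return 'FAIL'; // evidence does not meet the threshold
  if (actual >= threshold * 0.9) return 'CONCERNS'; // within 10% of the threshold
  return 'PASS';
}

classify(350, 500); // 'PASS'
classify(480, 500); // 'CONCERNS' (96% of threshold)
classify(750, 500); // 'FAIL'
classify(350, 'UNKNOWN'); // 'CONCERNS'
```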
|
||||
|
||||
---
|
||||
|
||||
## Integration with BMad Artifacts
|
||||
|
||||
### With tech-spec.md
|
||||
|
||||
- Primary source for NFR requirements and thresholds
|
||||
- Load performance targets, security requirements, reliability SLAs
|
||||
- Use architectural decisions to understand NFR trade-offs
|
||||
|
||||
### With test-design.md
|
||||
|
||||
- Understand NFR test plan and priorities
|
||||
- Reference test priorities (P0/P1/P2/P3) for severity classification
|
||||
- Align assessment with planned NFR validation
|
||||
|
||||
### With PRD.md
|
||||
|
||||
- Understand product-level NFR expectations
|
||||
- Verify NFRs align with user experience goals
|
||||
- Check for unstated NFR requirements (implied by product goals)
|
||||
|
||||
---
|
||||
|
||||
## Quality Gates
|
||||
|
||||
### Release Blocker (FAIL)
|
||||
|
||||
- Critical NFR has FAIL status (security, reliability)
|
||||
- Performance failure affects user experience severely
|
||||
- Do not release until FAIL is resolved
|
||||
|
||||
### PR Blocker (HIGH CONCERNS)
|
||||
|
||||
- High-priority NFR has FAIL status
|
||||
- Multiple CONCERNS exist
|
||||
- Block PR merge until addressed
|
||||
|
||||
### Warning (CONCERNS)
|
||||
|
||||
- Any NFR has CONCERNS status
|
||||
- Evidence is missing or incomplete
|
||||
- Address before next release
|
||||
|
||||
### Pass (PASS)
|
||||
|
||||
- All NFRs have PASS status
|
||||
- No blockers or concerns
|
||||
- Ready for release
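Read together, the four gates form a simple decision ladder. The sketch below encodes one possible version of it, assuming each NFR result carries a status and a criticality flag and that "multiple CONCERNS" means more than one; the exact cut-offs remain a team decision:

```typescript
type NfrStatus = 'PASS' | 'CONCERNS' | 'FAIL';
interface NfrResult { name: string; status: NfrStatus; critical: boolean }
type GateDecision = 'RELEASE_BLOCKER' | 'PR_BLOCKER' | 'WARNING' | 'PASS';

function decideGate(results: NfrResult[]): GateDecision {
  const concerns = results.filter((r) => r.status === 'CONCERNS').length;
  if (results.some((r) => r.critical && r.status === 'FAIL')) return 'RELEASE_BLOCKER';
  if (results.some((r) => r.status === 'FAIL') || concerns > 1) return 'PR_BLOCKER';
  if (concerns > 0) return 'WARNING';
  return 'PASS';
}
```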
|
||||
|
||||
---
|
||||
|
||||
## Example NFR Assessment
|
||||
|
||||
````markdown
|
||||
# NFR Assessment - Story 1.3
|
||||
|
||||
**Feature:** User Authentication
|
||||
**Date:** 2025-10-14
|
||||
**Overall Status:** CONCERNS ⚠️ (1 HIGH issue)
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Assessment:** 3 PASS, 1 CONCERNS, 0 FAIL
|
||||
**Blockers:** None
|
||||
**High Priority Issues:** 1 (Security - MFA not enforced)
|
||||
**Recommendation:** Address security concern before release
|
||||
|
||||
## Performance Assessment
|
||||
|
||||
### Response Time (p95)
|
||||
|
||||
- **Status:** PASS ✅
|
||||
- **Threshold:** 500ms
|
||||
- **Actual:** 320ms (64% of threshold)
|
||||
- **Evidence:** Load test results (test-results/load-2025-10-14.json)
|
||||
- **Findings:** Response time well below threshold across all percentiles
|
||||
|
||||
### Throughput
|
||||
|
||||
- **Status:** PASS ✅
|
||||
- **Threshold:** 100 RPS
|
||||
- **Actual:** 250 RPS (250% of threshold)
|
||||
- **Evidence:** Load test results (test-results/load-2025-10-14.json)
|
||||
- **Findings:** System handles 2.5x target load without degradation
|
||||
|
||||
## Security Assessment
|
||||
|
||||
### Authentication Strength
|
||||
|
||||
- **Status:** CONCERNS ⚠️
|
||||
- **Threshold:** MFA enabled for all users
|
||||
- **Actual:** MFA optional (not enforced)
|
||||
- **Evidence:** Security audit (security-audit-2025-10-14.md)
|
||||
- **Findings:** MFA is implemented but not enforced by default
|
||||
- **Recommendation:** HIGH - Enforce MFA for all new accounts, provide migration path for existing users
|
||||
|
||||
### Data Protection
|
||||
|
||||
- **Status:** PASS ✅
|
||||
- **Threshold:** PII encrypted at rest and in transit
|
||||
- **Actual:** AES-256 at rest, TLS 1.3 in transit
|
||||
- **Evidence:** Security scan (security-scan-2025-10-14.json)
|
||||
- **Findings:** All PII properly encrypted
|
||||
|
||||
## Reliability Assessment
|
||||
|
||||
### Uptime
|
||||
|
||||
- **Status:** PASS ✅
|
||||
- **Threshold:** 99.9% (three nines)
|
||||
- **Actual:** 99.95% over 30 days
|
||||
- **Evidence:** Uptime monitoring (uptime-report-2025-10-14.csv)
|
||||
- **Findings:** Exceeds target with margin
|
||||
|
||||
### Error Rate
|
||||
|
||||
- **Status:** PASS ✅
|
||||
- **Threshold:** < 0.1% (1 in 1000)
|
||||
- **Actual:** 0.05% (1 in 2000)
|
||||
- **Evidence:** Error logs (logs/errors-2025-10.log)
|
||||
- **Findings:** Error rate well below threshold
|
||||
|
||||
## Maintainability Assessment
|
||||
|
||||
### Test Coverage
|
||||
|
||||
- **Status:** PASS ✅
|
||||
- **Threshold:** >= 80%
|
||||
- **Actual:** 87%
|
||||
- **Evidence:** Coverage report (coverage/lcov-report/index.html)
|
||||
- **Findings:** Coverage exceeds threshold with good distribution
|
||||
|
||||
### Code Quality
|
||||
|
||||
- **Status:** PASS ✅
|
||||
- **Threshold:** >= 85/100
|
||||
- **Actual:** 92/100
|
||||
- **Evidence:** SonarQube analysis (sonarqube-report-2025-10-14.pdf)
|
||||
- **Findings:** High code quality score with low technical debt
|
||||
|
||||
## Quick Wins
|
||||
|
||||
1. **Enforce MFA (Security)** - HIGH - 4 hours
|
||||
- Add configuration flag to enforce MFA for new accounts
|
||||
- No code changes needed, only config adjustment
|
||||
|
||||
## Recommended Actions
|
||||
|
||||
### Immediate (Before Release)
|
||||
|
||||
1. **Enforce MFA for all new accounts** - HIGH - 4 hours - Security Team
|
||||
- Add `ENFORCE_MFA=true` to production config
|
||||
- Update user onboarding flow to require MFA setup
|
||||
- Test MFA enforcement in staging environment
|
||||
|
||||
### Short-term (Next Sprint)
|
||||
|
||||
1. **Migrate existing users to MFA** - MEDIUM - 3 days - Product + Engineering
|
||||
- Design migration UX (prompt, incentives, deadline)
|
||||
- Implement migration flow with grace period
|
||||
- Communicate migration to existing users
|
||||
|
||||
## Evidence Gaps
|
||||
|
||||
- [ ] Chaos engineering test results (reliability)
|
||||
- Owner: DevOps Team
|
||||
- Deadline: 2025-10-21
|
||||
- Suggested evidence: Run chaos monkey tests in staging
|
||||
|
||||
- [ ] Penetration test report (security)
|
||||
- Owner: Security Team
|
||||
- Deadline: 2025-10-28
|
||||
- Suggested evidence: Schedule third-party pentest
|
||||
|
||||
## Gate YAML Snippet
|
||||
|
||||
```yaml
|
||||
nfr_assessment:
|
||||
date: '2025-10-14'
|
||||
story_id: '1.3'
|
||||
categories:
|
||||
performance: 'PASS'
|
||||
security: 'CONCERNS'
|
||||
reliability: 'PASS'
|
||||
maintainability: 'PASS'
|
||||
overall_status: 'CONCERNS'
|
||||
critical_issues: 0
|
||||
high_priority_issues: 1
|
||||
medium_priority_issues: 0
|
||||
concerns: 1
|
||||
blockers: false
|
||||
recommendations:
|
||||
- 'Enforce MFA for all new accounts (HIGH - 4 hours)'
|
||||
evidence_gaps: 2
|
||||
```
|
||||
|
||||
|
||||
## Recommendations Summary
|
||||
|
||||
- **Release Blocker:** None ✅
|
||||
- **High Priority:** 1 (Enforce MFA before release)
|
||||
- **Medium Priority:** 1 (Migrate existing users to MFA)
|
||||
- **Next Steps:** Address HIGH priority item, then proceed to gate workflow
|
||||
|
||||
````
|
||||
|
||||
---
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
Before completing this workflow, verify:
|
||||
|
||||
- ✅ All NFR categories assessed (performance, security, reliability, maintainability, custom)
|
||||
- ✅ Thresholds defined or marked as UNKNOWN
|
||||
- ✅ Evidence gathered for each NFR (or marked as MISSING)
|
||||
- ✅ Status classified deterministically (PASS/CONCERNS/FAIL)
|
||||
- ✅ No thresholds were guessed (marked as CONCERNS if unknown)
|
||||
- ✅ Quick wins identified for CONCERNS/FAIL
|
||||
- ✅ Recommended actions are specific and actionable
|
||||
- ✅ Evidence gaps documented with owners and deadlines
|
||||
- ✅ NFR assessment report generated and saved
|
||||
- ✅ Gate YAML snippet generated (if enabled)
|
||||
- ✅ Evidence checklist generated (if enabled)
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- **Never Guess Thresholds:** If a threshold is unknown, mark as CONCERNS and recommend defining it
|
||||
- **Evidence-Based:** Every assessment must be backed by evidence (tests, metrics, logs, CI results)
|
||||
- **Deterministic Rules:** Use consistent PASS/CONCERNS/FAIL classification based on evidence
|
||||
- **Actionable Recommendations:** Provide specific steps, not generic advice
|
||||
- **Gate Integration:** Generate YAML snippets that can be consumed by CI/CD pipelines
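As a sketch of that CI/CD integration, a pipeline step could parse the generated snippet and fail the job when blockers are reported; the file name and the `js-yaml` dependency here are assumptions:

```typescript
import { readFileSync } from 'node:fs';
import { load } from 'js-yaml'; // assumed YAML parser

interface GateSnippet {
  nfr_assessment: { overall_status: string; blockers: boolean };
}

// Fail the pipeline when the NFR assessment reports blockers.
const snippet = load(readFileSync('nfr-assessment-gate.yaml', 'utf8')) as GateSnippet;
if (snippet.nfr_assessment.blockers) {
  console.error(`NFR gate failed: overall status ${snippet.nfr_assessment.overall_status}`);
  process.exit(1);
}
console.log(`NFR gate passed: overall status ${snippet.nfr_assessment.overall_status}`);
```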
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "NFR thresholds not defined"
|
||||
- Check tech-spec.md for NFR requirements
|
||||
- Check PRD.md for product-level SLAs
|
||||
- Check story file for feature-specific requirements
|
||||
- If thresholds truly unknown, mark as CONCERNS and recommend defining them
|
||||
|
||||
### "No evidence found"
|
||||
- Check evidence directories (test-results, metrics, logs)
|
||||
- Check CI/CD pipeline for test results
|
||||
- If evidence truly missing, mark NFR as "NO EVIDENCE" and recommend generating it
|
||||
|
||||
### "CONCERNS status but no threshold exceeded"
|
||||
- CONCERNS is correct when threshold is UNKNOWN or evidence is MISSING/INCOMPLETE
|
||||
- CONCERNS is also correct when evidence is close to threshold (within 10%)
|
||||
- Document why CONCERNS was assigned
|
||||
|
||||
### "FAIL status blocks release"
|
||||
- This is intentional - FAIL means critical NFR not met
|
||||
- Recommend remediation actions with specific steps
|
||||
- Re-run assessment after remediation
|
||||
|
||||
---
|
||||
|
||||
## Related Workflows
|
||||
|
||||
- **testarch-test-design** - Define NFR requirements and test plan
|
||||
- **testarch-framework** - Set up performance/security testing frameworks
|
||||
- **testarch-ci** - Configure CI/CD for NFR validation
|
||||
- **testarch-gate** - Use NFR assessment as input for quality gate decisions
|
||||
- **testarch-test-review** - Review test quality (maintainability NFR)
|
||||
|
||||
---
|
||||
|
||||
<!-- Powered by BMAD-CORE™ -->
|
||||
|
||||
@@ -1,445 +0,0 @@
|
||||
# NFR Assessment - {FEATURE_NAME}
|
||||
|
||||
**Date:** {DATE}
|
||||
**Story:** {STORY_ID} (if applicable)
|
||||
**Overall Status:** {OVERALL_STATUS} {STATUS_ICON}
|
||||
|
||||
---
|
||||
|
||||
Note: This assessment summarizes existing evidence; it does not run tests or CI workflows.
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Assessment:** {PASS_COUNT} PASS, {CONCERNS_COUNT} CONCERNS, {FAIL_COUNT} FAIL
|
||||
|
||||
**Blockers:** {BLOCKER_COUNT} {BLOCKER_DESCRIPTION}
|
||||
|
||||
**High Priority Issues:** {HIGH_PRIORITY_COUNT} {HIGH_PRIORITY_DESCRIPTION}
|
||||
|
||||
**Recommendation:** {OVERALL_RECOMMENDATION}
|
||||
|
||||
---
|
||||
|
||||
## Performance Assessment
|
||||
|
||||
### Response Time (p95)
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE}
|
||||
- **Actual:** {ACTUAL_VALUE}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Throughput
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE}
|
||||
- **Actual:** {ACTUAL_VALUE}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Resource Usage
|
||||
|
||||
- **CPU Usage**
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE}
|
||||
- **Actual:** {ACTUAL_VALUE}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
|
||||
- **Memory Usage**
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE}
|
||||
- **Actual:** {ACTUAL_VALUE}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
|
||||
### Scalability
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION}
|
||||
- **Actual:** {ACTUAL_DESCRIPTION}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
## Security Assessment
|
||||
|
||||
### Authentication Strength
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION}
|
||||
- **Actual:** {ACTUAL_DESCRIPTION}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
- **Recommendation:** {RECOMMENDATION} (if CONCERNS or FAIL)
|
||||
|
||||
### Authorization Controls
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION}
|
||||
- **Actual:** {ACTUAL_DESCRIPTION}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Data Protection
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION}
|
||||
- **Actual:** {ACTUAL_DESCRIPTION}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Vulnerability Management
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION} (e.g., "0 critical, <3 high vulnerabilities")
|
||||
- **Actual:** {ACTUAL_DESCRIPTION} (e.g., "0 critical, 1 high, 5 medium vulnerabilities")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Snyk scan results - scan-2025-10-14.json")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Compliance (if applicable)
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Standards:** {COMPLIANCE_STANDARDS} (e.g., "GDPR, HIPAA, PCI-DSS")
|
||||
- **Actual:** {ACTUAL_COMPLIANCE_STATUS}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
## Reliability Assessment
|
||||
|
||||
### Availability (Uptime)
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE} (e.g., "99.9%")
|
||||
- **Actual:** {ACTUAL_VALUE} (e.g., "99.95%")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Uptime monitoring - uptime-report-2025-10-14.csv")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Error Rate
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE} (e.g., "<0.1%")
|
||||
- **Actual:** {ACTUAL_VALUE} (e.g., "0.05%")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Error logs - logs/errors-2025-10.log")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### MTTR (Mean Time To Recovery)
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE} (e.g., "<15 minutes")
|
||||
- **Actual:** {ACTUAL_VALUE} (e.g., "12 minutes")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Incident reports - incidents/")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Fault Tolerance
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION}
|
||||
- **Actual:** {ACTUAL_DESCRIPTION}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### CI Burn-In (Stability)
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE} (e.g., "100 consecutive successful runs")
|
||||
- **Actual:** {ACTUAL_VALUE} (e.g., "150 consecutive successful runs")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "CI burn-in results - ci-burn-in-2025-10-14.log")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Disaster Recovery (if applicable)
|
||||
|
||||
- **RTO (Recovery Time Objective)**
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE}
|
||||
- **Actual:** {ACTUAL_VALUE}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
|
||||
- **RPO (Recovery Point Objective)**
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE}
|
||||
- **Actual:** {ACTUAL_VALUE}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
|
||||
---
|
||||
|
||||
## Maintainability Assessment
|
||||
|
||||
### Test Coverage
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE} (e.g., ">=80%")
|
||||
- **Actual:** {ACTUAL_VALUE} (e.g., "87%")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Coverage report - coverage/lcov-report/index.html")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Code Quality
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE} (e.g., ">=85/100")
|
||||
- **Actual:** {ACTUAL_VALUE} (e.g., "92/100")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "SonarQube analysis - sonarqube-report-2025-10-14.pdf")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Technical Debt
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE} (e.g., "<5% debt ratio")
|
||||
- **Actual:** {ACTUAL_VALUE} (e.g., "3.2% debt ratio")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "CodeClimate analysis - codeclimate-2025-10-14.json")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Documentation Completeness
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_VALUE} (e.g., ">=90%")
|
||||
- **Actual:** {ACTUAL_VALUE} (e.g., "95%")
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Documentation audit - docs-audit-2025-10-14.md")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### Test Quality (from test-review, if available)
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION}
|
||||
- **Actual:** {ACTUAL_DESCRIPTION}
|
||||
- **Evidence:** {EVIDENCE_SOURCE} (e.g., "Test review report - test-review-2025-10-14.md")
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
## Custom NFR Assessments (if applicable)
|
||||
|
||||
### {CUSTOM_NFR_NAME_1}
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION}
|
||||
- **Actual:** {ACTUAL_DESCRIPTION}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
### {CUSTOM_NFR_NAME_2}
|
||||
|
||||
- **Status:** {STATUS} {STATUS_ICON}
|
||||
- **Threshold:** {THRESHOLD_DESCRIPTION}
|
||||
- **Actual:** {ACTUAL_DESCRIPTION}
|
||||
- **Evidence:** {EVIDENCE_SOURCE}
|
||||
- **Findings:** {FINDINGS_DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
## Quick Wins
|
||||
|
||||
{QUICK_WIN_COUNT} quick wins identified for immediate implementation:
|
||||
|
||||
1. **{QUICK_WIN_TITLE_1}** ({NFR_CATEGORY}) - {PRIORITY} - {ESTIMATED_EFFORT}
|
||||
- {QUICK_WIN_DESCRIPTION}
|
||||
- No code changes needed / Minimal code changes
|
||||
|
||||
2. **{QUICK_WIN_TITLE_2}** ({NFR_CATEGORY}) - {PRIORITY} - {ESTIMATED_EFFORT}
|
||||
- {QUICK_WIN_DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
## Recommended Actions
|
||||
|
||||
### Immediate (Before Release) - CRITICAL/HIGH Priority
|
||||
|
||||
1. **{ACTION_TITLE_1}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
|
||||
- {ACTION_DESCRIPTION}
|
||||
- {SPECIFIC_STEPS}
|
||||
- {VALIDATION_CRITERIA}
|
||||
|
||||
2. **{ACTION_TITLE_2}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
|
||||
- {ACTION_DESCRIPTION}
|
||||
- {SPECIFIC_STEPS}
|
||||
- {VALIDATION_CRITERIA}
|
||||
|
||||
### Short-term (Next Sprint) - MEDIUM Priority
|
||||
|
||||
1. **{ACTION_TITLE_3}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
|
||||
- {ACTION_DESCRIPTION}
|
||||
|
||||
2. **{ACTION_TITLE_4}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
|
||||
- {ACTION_DESCRIPTION}
|
||||
|
||||
### Long-term (Backlog) - LOW Priority
|
||||
|
||||
1. **{ACTION_TITLE_5}** - {PRIORITY} - {ESTIMATED_EFFORT} - {OWNER}
|
||||
- {ACTION_DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
## Monitoring Hooks
|
||||
|
||||
{MONITORING_HOOK_COUNT} monitoring hooks recommended to detect issues before failures:
|
||||
|
||||
### Performance Monitoring
|
||||
|
||||
- [ ] {MONITORING_TOOL_1} - {MONITORING_DESCRIPTION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Deadline:** {DEADLINE}
|
||||
|
||||
- [ ] {MONITORING_TOOL_2} - {MONITORING_DESCRIPTION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Deadline:** {DEADLINE}
|
||||
|
||||
### Security Monitoring
|
||||
|
||||
- [ ] {MONITORING_TOOL_3} - {MONITORING_DESCRIPTION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Deadline:** {DEADLINE}
|
||||
|
||||
### Reliability Monitoring
|
||||
|
||||
- [ ] {MONITORING_TOOL_4} - {MONITORING_DESCRIPTION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Deadline:** {DEADLINE}
|
||||
|
||||
### Alerting Thresholds
|
||||
|
||||
- [ ] {ALERT_DESCRIPTION} - Notify when {THRESHOLD_CONDITION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Deadline:** {DEADLINE}
|
||||
|
||||
---
|
||||
|
||||
## Fail-Fast Mechanisms
|
||||
|
||||
{FAIL_FAST_COUNT} fail-fast mechanisms recommended to prevent failures:
|
||||
|
||||
### Circuit Breakers (Reliability)
|
||||
|
||||
- [ ] {CIRCUIT_BREAKER_DESCRIPTION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Estimated Effort:** {EFFORT}
|
||||
|
||||
### Rate Limiting (Performance)
|
||||
|
||||
- [ ] {RATE_LIMITING_DESCRIPTION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Estimated Effort:** {EFFORT}
|
||||
|
||||
### Validation Gates (Security)
|
||||
|
||||
- [ ] {VALIDATION_GATE_DESCRIPTION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Estimated Effort:** {EFFORT}
|
||||
|
||||
### Smoke Tests (Maintainability)
|
||||
|
||||
- [ ] {SMOKE_TEST_DESCRIPTION}
|
||||
- **Owner:** {OWNER}
|
||||
- **Estimated Effort:** {EFFORT}
|
||||
|
||||
---
|
||||
|
||||
## Evidence Gaps
|
||||
|
||||
{EVIDENCE_GAP_COUNT} evidence gaps identified - action required:
|
||||
|
||||
- [ ] **{NFR_NAME_1}** ({NFR_CATEGORY})
|
||||
- **Owner:** {OWNER}
|
||||
- **Deadline:** {DEADLINE}
|
||||
- **Suggested Evidence:** {SUGGESTED_EVIDENCE_SOURCE}
|
||||
- **Impact:** {IMPACT_DESCRIPTION}
|
||||
|
||||
- [ ] **{NFR_NAME_2}** ({NFR_CATEGORY})
|
||||
- **Owner:** {OWNER}
|
||||
- **Deadline:** {DEADLINE}
|
||||
- **Suggested Evidence:** {SUGGESTED_EVIDENCE_SOURCE}
|
||||
- **Impact:** {IMPACT_DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
## Findings Summary
|
||||
|
||||
| Category | PASS | CONCERNS | FAIL | Overall Status |
|
||||
| --------------- | ---------------- | -------------------- | ---------------- | ----------------------------------- |
|
||||
| Performance | {P_PASS_COUNT} | {P_CONCERNS_COUNT} | {P_FAIL_COUNT} | {P_STATUS} {P_ICON} |
|
||||
| Security | {S_PASS_COUNT} | {S_CONCERNS_COUNT} | {S_FAIL_COUNT} | {S_STATUS} {S_ICON} |
|
||||
| Reliability | {R_PASS_COUNT} | {R_CONCERNS_COUNT} | {R_FAIL_COUNT} | {R_STATUS} {R_ICON} |
|
||||
| Maintainability | {M_PASS_COUNT} | {M_CONCERNS_COUNT} | {M_FAIL_COUNT} | {M_STATUS} {M_ICON} |
|
||||
| **Total** | **{TOTAL_PASS}** | **{TOTAL_CONCERNS}** | **{TOTAL_FAIL}** | **{OVERALL_STATUS} {OVERALL_ICON}** |
|
||||
|
||||
---
|
||||
|
||||
## Gate YAML Snippet
|
||||
|
||||
```yaml
|
||||
nfr_assessment:
|
||||
date: '{DATE}'
|
||||
story_id: '{STORY_ID}'
|
||||
feature_name: '{FEATURE_NAME}'
|
||||
categories:
|
||||
performance: '{PERFORMANCE_STATUS}'
|
||||
security: '{SECURITY_STATUS}'
|
||||
reliability: '{RELIABILITY_STATUS}'
|
||||
maintainability: '{MAINTAINABILITY_STATUS}'
|
||||
overall_status: '{OVERALL_STATUS}'
|
||||
critical_issues: { CRITICAL_COUNT }
|
||||
high_priority_issues: { HIGH_COUNT }
|
||||
medium_priority_issues: { MEDIUM_COUNT }
|
||||
concerns: { CONCERNS_COUNT }
|
||||
blockers: { BLOCKER_BOOLEAN } # true/false
|
||||
quick_wins: { QUICK_WIN_COUNT }
|
||||
evidence_gaps: { EVIDENCE_GAP_COUNT }
|
||||
recommendations:
|
||||
- '{RECOMMENDATION_1}'
|
||||
- '{RECOMMENDATION_2}'
|
||||
- '{RECOMMENDATION_3}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Related Artifacts
|
||||
|
||||
- **Story File:** {STORY_FILE_PATH} (if applicable)
|
||||
- **Tech Spec:** {TECH_SPEC_PATH} (if available)
|
||||
- **PRD:** {PRD_PATH} (if available)
|
||||
- **Test Design:** {TEST_DESIGN_PATH} (if available)
|
||||
- **Evidence Sources:**
|
||||
- Test Results: {TEST_RESULTS_DIR}
|
||||
- Metrics: {METRICS_DIR}
|
||||
- Logs: {LOGS_DIR}
|
||||
- CI Results: {CI_RESULTS_PATH}
|
||||
|
||||
---
|
||||
|
||||
## Recommendations Summary
|
||||
|
||||
**Release Blocker:** {RELEASE_BLOCKER_SUMMARY}
|
||||
|
||||
**High Priority:** {HIGH_PRIORITY_SUMMARY}
|
||||
|
||||
**Medium Priority:** {MEDIUM_PRIORITY_SUMMARY}
|
||||
|
||||
**Next Steps:** {NEXT_STEPS_DESCRIPTION}
|
||||
|
||||
---
|
||||
|
||||
## Sign-Off
|
||||
|
||||
**NFR Assessment:**
|
||||
|
||||
- Overall Status: {OVERALL_STATUS} {OVERALL_ICON}
|
||||
- Critical Issues: {CRITICAL_COUNT}
|
||||
- High Priority Issues: {HIGH_COUNT}
|
||||
- Concerns: {CONCERNS_COUNT}
|
||||
- Evidence Gaps: {EVIDENCE_GAP_COUNT}
|
||||
|
||||
**Gate Status:** {GATE_STATUS} {GATE_ICON}
|
||||
|
||||
**Next Actions:**
|
||||
|
||||
- If PASS ✅: Proceed to `*gate` workflow or release
|
||||
- If CONCERNS ⚠️: Address HIGH/CRITICAL issues, re-run `*nfr-assess`
|
||||
- If FAIL ❌: Resolve FAIL status NFRs, re-run `*nfr-assess`
|
||||
|
||||
**Generated:** {DATE}
|
||||
**Workflow:** testarch-nfr v4.0
|
||||
|
||||
---
|
||||
|
||||
<!-- Powered by BMAD-CORE™ -->
|
||||
@@ -1,47 +0,0 @@
|
||||
# Test Architect workflow: nfr-assess
|
||||
name: testarch-nfr
|
||||
description: "Assess non-functional requirements (performance, security, reliability, maintainability) before release with evidence-based validation"
|
||||
author: "BMad"
|
||||
|
||||
# Critical variables from config
|
||||
config_source: "{project-root}/_bmad/bmm/config.yaml"
|
||||
output_folder: "{config_source}:output_folder"
|
||||
user_name: "{config_source}:user_name"
|
||||
communication_language: "{config_source}:communication_language"
|
||||
document_output_language: "{config_source}:document_output_language"
|
||||
date: system-generated
|
||||
|
||||
# Workflow components
|
||||
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/nfr-assess"
|
||||
instructions: "{installed_path}/instructions.md"
|
||||
validation: "{installed_path}/checklist.md"
|
||||
template: "{installed_path}/nfr-report-template.md"
|
||||
|
||||
# Variables and inputs
|
||||
variables:
|
||||
# NFR category assessment (defaults to all categories)
|
||||
custom_nfr_categories: "" # Optional additional categories beyond standard (security, performance, reliability, maintainability)
|
||||
|
||||
# Output configuration
|
||||
default_output_file: "{output_folder}/nfr-assessment.md"
|
||||
|
||||
# Required tools
|
||||
required_tools:
|
||||
- read_file # Read story, test results, metrics, logs, BMad artifacts
|
||||
- write_file # Create NFR assessment, gate YAML, evidence checklist
|
||||
- list_files # Discover test results, metrics, logs
|
||||
- search_repo # Find NFR-related tests and evidence
|
||||
- glob # Find result files matching patterns
|
||||
|
||||
tags:
|
||||
- qa
|
||||
- nfr
|
||||
- test-architect
|
||||
- performance
|
||||
- security
|
||||
- reliability
|
||||
|
||||
execution_hints:
|
||||
interactive: false # Minimize prompts
|
||||
autonomous: true # Proceed without user input unless blocked
|
||||
iterative: true
|
||||
@@ -1,235 +0,0 @@
|
||||
# Test Design and Risk Assessment - Validation Checklist
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- [ ] Story markdown with clear acceptance criteria exists
|
||||
- [ ] PRD or epic documentation available
|
||||
- [ ] Architecture documents available (optional)
|
||||
- [ ] Requirements are testable and unambiguous
|
||||
|
||||
## Process Steps
|
||||
|
||||
### Step 1: Context Loading
|
||||
|
||||
- [ ] PRD.md read and requirements extracted
|
||||
- [ ] Epics.md or specific epic documentation loaded
|
||||
- [ ] Story markdown with acceptance criteria analyzed
|
||||
- [ ] Architecture documents reviewed (if available)
|
||||
- [ ] Existing test coverage analyzed
|
||||
- [ ] Knowledge base fragments loaded (risk-governance, probability-impact, test-levels, test-priorities)
|
||||
|
||||
### Step 2: Risk Assessment
|
||||
|
||||
- [ ] Genuine risks identified (not just features)
|
||||
- [ ] Risks classified by category (TECH/SEC/PERF/DATA/BUS/OPS)
|
||||
- [ ] Probability scored (1-3 for each risk)
|
||||
- [ ] Impact scored (1-3 for each risk)
|
||||
- [ ] Risk scores calculated (probability × impact)
|
||||
- [ ] High-priority risks (score ≥6) flagged
|
||||
- [ ] Mitigation plans defined for high-priority risks
|
||||
- [ ] Owners assigned for each mitigation
|
||||
- [ ] Timelines set for mitigations
|
||||
- [ ] Residual risk documented
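The scoring verified above is plain arithmetic; a minimal sketch, assuming probability and impact are already recorded as integers from 1 to 3:

```typescript
interface Risk { id: string; probability: 1 | 2 | 3; impact: 1 | 2 | 3 }

// Risk score = probability × impact; scores of 6 or higher require mitigation plans and owners.
const score = (risk: Risk): number => risk.probability * risk.impact;
const highPriority = (risks: Risk[]): Risk[] => risks.filter((r) => score(r) >= 6);

highPriority([
  { id: 'R-001', probability: 3, impact: 3 }, // score 9 → high priority
  { id: 'R-002', probability: 1, impact: 2 }, // score 2 → not flagged
]);
```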
|
||||
|
||||
### Step 3: Coverage Design
|
||||
|
||||
- [ ] Acceptance criteria broken into atomic scenarios
|
||||
- [ ] Test levels selected (E2E/API/Component/Unit)
|
||||
- [ ] No duplicate coverage across levels
|
||||
- [ ] Priority levels assigned (P0/P1/P2/P3)
|
||||
- [ ] P0 scenarios meet strict criteria (blocks core + high risk + no workaround)
|
||||
- [ ] Data prerequisites identified
|
||||
- [ ] Tooling requirements documented
|
||||
- [ ] Execution order defined (smoke → P0 → P1 → P2/P3)
|
||||
|
||||
### Step 4: Deliverables Generation
|
||||
|
||||
- [ ] Risk assessment matrix created
|
||||
- [ ] Coverage matrix created
|
||||
- [ ] Execution order documented
|
||||
- [ ] Resource estimates calculated
|
||||
- [ ] Quality gate criteria defined
|
||||
- [ ] Output file written to correct location
|
||||
- [ ] Output file uses template structure
|
||||
|
||||
## Output Validation
|
||||
|
||||
### Risk Assessment Matrix
|
||||
|
||||
- [ ] All risks have unique IDs (R-001, R-002, etc.)
|
||||
- [ ] Each risk has category assigned
|
||||
- [ ] Probability values are 1, 2, or 3
|
||||
- [ ] Impact values are 1, 2, or 3
|
||||
- [ ] Scores calculated correctly (P × I)
|
||||
- [ ] High-priority risks (≥6) clearly marked
|
||||
- [ ] Mitigation strategies specific and actionable
|
||||
|
||||
### Coverage Matrix
|
||||
|
||||
- [ ] All requirements mapped to test levels
|
||||
- [ ] Priorities assigned to all scenarios
|
||||
- [ ] Risk linkage documented
|
||||
- [ ] Test counts realistic
|
||||
- [ ] Owners assigned where applicable
|
||||
- [ ] No duplicate coverage (same behavior at multiple levels)
|
||||
|
||||
### Execution Order
|
||||
|
||||
- [ ] Smoke tests defined (<5 min target)
|
||||
- [ ] P0 tests listed (<10 min target)
|
||||
- [ ] P1 tests listed (<30 min target)
|
||||
- [ ] P2/P3 tests listed (<60 min target)
|
||||
- [ ] Order optimizes for fast feedback
|
||||
|
||||
### Resource Estimates
|
||||
|
||||
- [ ] P0 hours calculated (count × 2 hours)
|
||||
- [ ] P1 hours calculated (count × 1 hour)
|
||||
- [ ] P2 hours calculated (count × 0.5 hours)
|
||||
- [ ] P3 hours calculated (count × 0.25 hours)
|
||||
- [ ] Total hours summed
|
||||
- [ ] Days estimate provided (hours / 8)
|
||||
- [ ] Estimates include setup time
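A sketch of the estimate arithmetic checked above, with the per-priority rates taken from the checklist items; setup time would still be added on top, as the last item notes:

```typescript
// Per-scenario effort rates implied by the checklist (hours).
const HOURS = { P0: 2, P1: 1, P2: 0.5, P3: 0.25 } as const;

function estimateEffort(counts: { P0: number; P1: number; P2: number; P3: number }) {
  const hours =
    counts.P0 * HOURS.P0 + counts.P1 * HOURS.P1 + counts.P2 * HOURS.P2 + counts.P3 * HOURS.P3;
  return { hours, days: hours / 8 };
}

estimateEffort({ P0: 6, P1: 12, P2: 10, P3: 8 }); // → { hours: 31, days: 3.875 }
```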
|
||||
|
||||
### Quality Gate Criteria
|
||||
|
||||
- [ ] P0 pass rate threshold defined (should be 100%)
|
||||
- [ ] P1 pass rate threshold defined (typically ≥95%)
|
||||
- [ ] High-risk mitigation completion required
|
||||
- [ ] Coverage targets specified (≥80% recommended)
|
||||
|
||||
## Quality Checks
|
||||
|
||||
### Evidence-Based Assessment
|
||||
|
||||
- [ ] Risk assessment based on documented evidence
|
||||
- [ ] No speculation on business impact
|
||||
- [ ] Assumptions clearly documented
|
||||
- [ ] Clarifications requested where needed
|
||||
- [ ] Historical data referenced where available
|
||||
|
||||
### Risk Classification Accuracy
|
||||
|
||||
- [ ] TECH risks are architecture/integration issues
|
||||
- [ ] SEC risks are security vulnerabilities
|
||||
- [ ] PERF risks are performance/scalability concerns
|
||||
- [ ] DATA risks are data integrity issues
|
||||
- [ ] BUS risks are business/revenue impacts
|
||||
- [ ] OPS risks are deployment/operational issues
|
||||
|
||||
### Priority Assignment Accuracy
|
||||
|
||||
- [ ] P0: Truly blocks core functionality
|
||||
- [ ] P0: High-risk (score ≥6)
|
||||
- [ ] P0: No workaround exists
|
||||
- [ ] P1: Important but not blocking
|
||||
- [ ] P2/P3: Nice-to-have or edge cases
|
||||
|
||||
### Test Level Selection
|
||||
|
||||
- [ ] E2E used only for critical paths
|
||||
- [ ] API tests cover complex business logic
|
||||
- [ ] Component tests for UI interactions
|
||||
- [ ] Unit tests for edge cases and algorithms
|
||||
- [ ] No redundant coverage
|
||||
|
||||
## Integration Points
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
- [ ] risk-governance.md consulted
|
||||
- [ ] probability-impact.md applied
|
||||
- [ ] test-levels-framework.md referenced
|
||||
- [ ] test-priorities-matrix.md used
|
||||
- [ ] Additional fragments loaded as needed
|
||||
|
||||
### Status File Integration
|
||||
|
||||
- [ ] bmm-workflow-status.md exists
|
||||
- [ ] Test design logged in Quality & Testing Progress
|
||||
- [ ] Epic number and scope documented
|
||||
- [ ] Completion timestamp recorded
|
||||
|
||||
### Workflow Dependencies
|
||||
|
||||
- [ ] Can proceed to `*atdd` workflow with P0 scenarios
|
||||
- [ ] `*atdd` is a separate workflow and must be run explicitly (not auto-run)
|
||||
- [ ] Can proceed to `automate` workflow with full coverage plan
|
||||
- [ ] Risk assessment informs `gate` workflow criteria
|
||||
- [ ] Integrates with `ci` workflow execution order
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
**All must be true:**
|
||||
|
||||
- [ ] All prerequisites met
|
||||
- [ ] All process steps completed
|
||||
- [ ] All output validations passed
|
||||
- [ ] All quality checks passed
|
||||
- [ ] All integration points verified
|
||||
- [ ] Output file complete and well-formatted
|
||||
- [ ] Team review scheduled (if required)
|
||||
|
||||
## Post-Workflow Actions
|
||||
|
||||
**User must complete:**
|
||||
|
||||
1. [ ] Review risk assessment with team
|
||||
2. [ ] Prioritize mitigation for high-priority risks (score ≥6)
|
||||
3. [ ] Allocate resources per estimates
|
||||
4. [ ] Run `*atdd` workflow to generate P0 tests (separate workflow; not auto-run)
|
||||
5. [ ] Set up test data factories and fixtures
|
||||
6. [ ] Schedule team review of test design document
|
||||
|
||||
**Recommended next workflows:**
|
||||
|
||||
1. [ ] Run `atdd` workflow for P0 test generation
|
||||
2. [ ] Run `framework` workflow if not already done
|
||||
3. [ ] Run `ci` workflow to configure pipeline stages
|
||||
|
||||
## Rollback Procedure
|
||||
|
||||
If workflow fails:
|
||||
|
||||
1. [ ] Delete output file
|
||||
2. [ ] Review error logs
|
||||
3. [ ] Fix missing context (PRD, architecture docs)
|
||||
4. [ ] Clarify ambiguous requirements
|
||||
5. [ ] Retry workflow
|
||||
|
||||
## Notes
|
||||
|
||||
### Common Issues
|
||||
|
||||
**Issue**: Too many P0 tests
|
||||
|
||||
- **Solution**: Apply strict P0 criteria - must block core AND high risk AND no workaround
|
||||
|
||||
**Issue**: Risk scores all high
|
||||
|
||||
- **Solution**: Differentiate between critical impact (score 3) and degraded-but-recoverable impact (score 2) instead of defaulting everything to 3
|
||||
|
||||
**Issue**: Duplicate coverage across levels
|
||||
|
||||
- **Solution**: Use test pyramid - E2E for critical paths only
|
||||
|
||||
**Issue**: Resource estimates too high
|
||||
|
||||
- **Solution**: Invest in fixtures/factories to reduce per-test setup time
|
||||
|
||||
### Best Practices
|
||||
|
||||
- Base risk assessment on evidence, not assumptions
|
||||
- High-priority risks (≥6) require immediate mitigation
|
||||
- P0 tests should cover <10% of total scenarios
|
||||
- Avoid testing same behavior at multiple levels
|
||||
- Include smoke tests (P0 subset) for fast feedback
|
||||
|
||||
---
|
||||
|
||||
**Checklist Complete**: Sign off when all items validated.
|
||||
|
||||
**Completed by:** {name}
|
||||
**Date:** {date}
|
||||
**Epic:** {epic title}
|
||||
**Notes:** {additional notes}
|
||||
@@ -1,788 +0,0 @@
|
||||
<!-- Powered by BMAD-CORE™ -->
|
||||
|
||||
# Test Design and Risk Assessment
|
||||
|
||||
**Workflow ID**: `_bmad/bmm/testarch/test-design`
|
||||
**Version**: 4.0 (BMad v6)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Plans comprehensive test coverage strategy with risk assessment, priority classification, and execution ordering. This workflow operates in **two modes**:
|
||||
|
||||
- **System-Level Mode (Phase 3)**: Testability review of architecture before solutioning gate check
|
||||
- **Epic-Level Mode (Phase 4)**: Per-epic test planning with risk assessment (current behavior)
|
||||
|
||||
The workflow auto-detects which mode to use based on project phase.
|
||||
|
||||
---
|
||||
|
||||
## Preflight: Detect Mode and Load Context
|
||||
|
||||
**Critical:** Determine mode before proceeding.
|
||||
|
||||
### Mode Detection
|
||||
|
||||
1. **Check for sprint-status.yaml**
|
||||
- If `{implementation_artifacts}/sprint-status.yaml` exists → **Epic-Level Mode** (Phase 4)
|
||||
- If NOT exists → Check workflow status
|
||||
|
||||
2. **Check workflow-status.yaml**
|
||||
- Read `{planning_artifacts}/bmm-workflow-status.yaml`
|
||||
- If `implementation-readiness: required` or `implementation-readiness: recommended` → **System-Level Mode** (Phase 3)
|
||||
- Otherwise → **Epic-Level Mode** (Phase 4 without sprint status yet)
|
||||
|
||||
3. **Mode-Specific Requirements**
|
||||
|
||||
**System-Level Mode (Phase 3 - Testability Review):**
|
||||
- ✅ Architecture document exists (architecture.md or tech-spec)
|
||||
- ✅ PRD exists with functional and non-functional requirements
|
||||
- ✅ Epics documented (epics.md)
|
||||
- ⚠️ Output: `{output_folder}/test-design-system.md`
|
||||
|
||||
**Epic-Level Mode (Phase 4 - Per-Epic Planning):**
|
||||
- ✅ Story markdown with acceptance criteria available
|
||||
- ✅ PRD or epic documentation exists for context
|
||||
- ✅ Architecture documents available (optional but recommended)
|
||||
- ✅ Requirements are clear and testable
|
||||
- ⚠️ Output: `{output_folder}/test-design-epic-{epic_num}.md`
|
||||
|
||||
**Halt Condition:** If the mode cannot be determined or required files are missing, HALT and notify the user of the missing prerequisites.
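A sketch of that detection order; the field access on `bmm-workflow-status.yaml` and the `js-yaml` dependency are assumptions about how the status files would be read:

```typescript
import { existsSync, readFileSync } from 'node:fs';
import { load } from 'js-yaml'; // assumed YAML parser

type Mode = 'epic-level' | 'system-level';

function detectMode(sprintStatusPath: string, workflowStatusPath: string): Mode {
  // 1. sprint-status.yaml present → Epic-Level Mode (Phase 4).
  if (existsSync(sprintStatusPath)) return 'epic-level';
  // 2. Otherwise consult bmm-workflow-status.yaml.
  if (existsSync(workflowStatusPath)) {
    const status = load(readFileSync(workflowStatusPath, 'utf8')) as Record<string, unknown>;
    const readiness = status['implementation-readiness'];
    if (readiness === 'required' || readiness === 'recommended') return 'system-level';
  }
  // 3. Default: Epic-Level Mode (Phase 4 without sprint status yet).
  return 'epic-level';
}
```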
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Load Context (Mode-Aware)
|
||||
|
||||
**Mode-Specific Loading:**
|
||||
|
||||
### System-Level Mode (Phase 3)
|
||||
|
||||
1. **Read Architecture Documentation**
|
||||
- Load architecture.md or tech-spec (REQUIRED)
|
||||
- Load PRD.md for functional and non-functional requirements
|
||||
- Load epics.md for feature scope
|
||||
- Identify technology stack decisions (frameworks, databases, deployment targets)
|
||||
- Note integration points and external system dependencies
|
||||
- Extract NFR requirements (performance SLOs, security requirements, etc.)
|
||||
|
||||
2. **Check Playwright Utils Flag**
|
||||
|
||||
Read `{config_source}` and check `config.tea_use_playwright_utils`.
|
||||
|
||||
If true, note that `@seontechnologies/playwright-utils` provides utilities for test implementation. Reference in test design where relevant.
|
||||
|
||||
3. **Load Knowledge Base Fragments (System-Level)**
|
||||
|
||||
**Critical:** Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` to load:
|
||||
- `nfr-criteria.md` - NFR validation approach (security, performance, reliability, maintainability)
|
||||
- `test-levels-framework.md` - Test levels strategy guidance
|
||||
- `risk-governance.md` - Testability risk identification
|
||||
- `test-quality.md` - Quality standards and Definition of Done
|
||||
|
||||
4. **Analyze Existing Test Setup (if brownfield)**
|
||||
- Search for existing test directories
|
||||
- Identify current test framework (if any)
|
||||
- Note testability concerns in existing codebase
|
||||
|
||||
### Epic-Level Mode (Phase 4)
|
||||
|
||||
1. **Read Requirements Documentation**
|
||||
- Load PRD.md for high-level product requirements
|
||||
- Read epics.md or specific epic for feature scope
|
||||
- Read story markdown for detailed acceptance criteria
|
||||
- Identify all testable requirements
|
||||
|
||||
2. **Load Architecture Context**
|
||||
- Read architecture.md for system design
|
||||
- Read tech-spec for implementation details
|
||||
- Read test-design-system.md (if exists from Phase 3)
|
||||
- Identify technical constraints and dependencies
|
||||
- Note integration points and external systems
|
||||
|
||||
3. **Analyze Existing Test Coverage**
|
||||
- Search for existing test files in `{test_dir}`
|
||||
- Identify coverage gaps
|
||||
- Note areas with insufficient testing
|
||||
- Check for flaky or outdated tests
|
||||
|
||||
4. **Load Knowledge Base Fragments (Epic-Level)**
|
||||
|
||||
**Critical:** Consult `{project-root}/_bmad/bmm/testarch/tea-index.csv` to load:
|
||||
- `risk-governance.md` - Risk classification framework (6 categories: TECH, SEC, PERF, DATA, BUS, OPS), automated scoring, gate decision engine, owner tracking (625 lines, 4 examples)
|
||||
- `probability-impact.md` - Risk scoring methodology (probability × impact matrix, automated classification, dynamic re-assessment, gate integration, 604 lines, 4 examples)
|
||||
- `test-levels-framework.md` - Test level selection guidance (E2E vs API vs Component vs Unit with decision matrix, characteristics, when to use each, 467 lines, 4 examples)
|
||||
- `test-priorities-matrix.md` - P0-P3 prioritization criteria (automated priority calculation, risk-based mapping, tagging strategy, time budgets, 389 lines, 2 examples)
|
||||
|
||||
**Halt Condition (Epic-Level only):** If story data or acceptance criteria are missing, check if brownfield exploration is needed. If neither requirements nor exploration is possible, HALT with the message: "Epic-level test design requires clear requirements, acceptance criteria, or brownfield app URL for exploration"
|
||||
|
||||
---
|
||||
|
||||
## Step 1.5: System-Level Testability Review (Phase 3 Only)
|
||||
|
||||
**Skip this step if Epic-Level Mode.** This step only executes in System-Level Mode.
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Review Architecture for Testability**
|
||||
|
||||
Evaluate architecture against these criteria:
|
||||
|
||||
**Controllability:**
|
||||
- Can we control system state for testing? (API seeding, factories, database reset)
|
||||
- Are external dependencies mockable? (interfaces, dependency injection)
|
||||
- Can we trigger error conditions? (chaos engineering, fault injection)
|
||||
|
||||
**Observability:**
|
||||
- Can we inspect system state? (logging, metrics, traces)
|
||||
- Are test results deterministic? (no race conditions, clear success/failure)
|
||||
- Can we validate NFRs? (performance metrics, security audit logs)
|
||||
|
||||
**Reliability:**
|
||||
- Are tests isolated? (parallel-safe, stateless, cleanup discipline)
|
||||
- Can we reproduce failures? (deterministic waits, HAR capture, seed data)
|
||||
- Are components loosely coupled? (mockable, testable boundaries)
|
||||
|
||||
2. **Identify Architecturally Significant Requirements (ASRs)**
|
||||
|
||||
From PRD NFRs and architecture decisions, identify quality requirements that:
|
||||
- Drive architecture decisions (e.g., "Must handle 10K concurrent users" → caching architecture)
|
||||
- Pose testability challenges (e.g., "Sub-second response time" → performance test infrastructure)
|
||||
- Require special test environments (e.g., "Multi-region deployment" → regional test instances)
|
||||
|
||||
Score each ASR using risk matrix (probability × impact).
|
||||
|
||||
3. **Define Test Levels Strategy**
|
||||
|
||||
Based on architecture (mobile, web, API, microservices, monolith):
|
||||
- Recommend unit/integration/E2E split (e.g., 70/20/10 for API-heavy, 40/30/30 for UI-heavy)
|
||||
- Identify test environment needs (local, staging, ephemeral, production-like)
|
||||
- Define testing approach per technology (Playwright for web, Maestro for mobile, k6 for performance)
|
||||
|
||||
4. **Assess NFR Testing Approach**
|
||||
|
||||
For each NFR category:
|
||||
- **Security**: Auth/authz tests, OWASP validation, secret handling (Playwright E2E + security tools)
|
||||
- **Performance**: Load/stress/spike testing with k6, SLO/SLA thresholds
|
||||
- **Reliability**: Error handling, retries, circuit breakers, health checks (Playwright + API tests)
|
||||
- **Maintainability**: Coverage targets, code quality gates, observability validation
|
||||
|
||||
5. **Flag Testability Concerns**
|
||||
|
||||
Identify architecture decisions that harm testability:
|
||||
- ❌ Tight coupling (no interfaces, hard dependencies)
|
||||
- ❌ No dependency injection (can't mock external services)
|
||||
- ❌ Hardcoded configurations (can't test different envs)
|
||||
- ❌ Missing observability (can't validate NFRs)
|
||||
- ❌ Stateful designs (can't parallelize tests)
|
||||
|
||||
**Critical:** If testability concerns are blockers (e.g., "Architecture makes performance testing impossible"), document as CONCERNS or FAIL recommendation for gate check.
|
||||
|
||||
6. **Output System-Level Test Design**
|
||||
|
||||
Write to `{output_folder}/test-design-system.md` containing:
|
||||
|
||||
```markdown
|
||||
# System-Level Test Design
|
||||
|
||||
## Testability Assessment
|
||||
|
||||
- Controllability: [PASS/CONCERNS/FAIL with details]
|
||||
- Observability: [PASS/CONCERNS/FAIL with details]
|
||||
- Reliability: [PASS/CONCERNS/FAIL with details]
|
||||
|
||||
## Architecturally Significant Requirements (ASRs)
|
||||
|
||||
[Risk-scored quality requirements]
|
||||
|
||||
## Test Levels Strategy
|
||||
|
||||
- Unit: [X%] - [Rationale]
|
||||
- Integration: [Y%] - [Rationale]
|
||||
- E2E: [Z%] - [Rationale]
|
||||
|
||||
## NFR Testing Approach
|
||||
|
||||
- Security: [Approach with tools]
|
||||
- Performance: [Approach with tools]
|
||||
- Reliability: [Approach with tools]
|
||||
- Maintainability: [Approach with tools]
|
||||
|
||||
## Test Environment Requirements
|
||||
|
||||
[Infrastructure needs based on deployment architecture]
|
||||
|
||||
## Testability Concerns (if any)
|
||||
|
||||
[Blockers or concerns that should inform solutioning gate check]
|
||||
|
||||
## Recommendations for Sprint 0
|
||||
|
||||
[Specific actions for *framework and *ci workflows]
|
||||
```
|
||||
|
||||
**After System-Level Mode:** Skip to Step 4 (Generate Deliverables) - Steps 2-3 are epic-level only.
|
||||
|
||||
---
|
||||
|
||||
## Step 1.6: Exploratory Mode Selection (Epic-Level Only)
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Detect Planning Mode**
|
||||
|
||||
Determine mode based on context:
|
||||
|
||||
**Requirements-Based Mode (DEFAULT)**:
|
||||
- Have clear story/PRD with acceptance criteria
|
||||
- Uses: Existing workflow (Steps 2-4)
|
||||
- Appropriate for: Documented features, greenfield projects
|
||||
|
||||
**Exploratory Mode (OPTIONAL - Brownfield)**:
|
||||
- Missing/incomplete requirements AND brownfield application exists
|
||||
- Uses: UI exploration to discover functionality
|
||||
- Appropriate for: Undocumented brownfield apps, legacy systems
|
||||
|
||||
2. **Requirements-Based Mode (DEFAULT - Skip to Step 2)**
|
||||
|
||||
If requirements are clear:
|
||||
- Continue with existing workflow (Step 2: Assess and Classify Risks)
|
||||
- Use loaded requirements from Step 1
|
||||
- Proceed with risk assessment based on documented requirements
|
||||
|
||||
3. **Exploratory Mode (OPTIONAL - Brownfield Apps)**
|
||||
|
||||
If exploring brownfield application:
|
||||
|
||||
**A. Check MCP Availability**
|
||||
|
||||
If config.tea_use_mcp_enhancements is true AND Playwright MCP tools available:
|
||||
- Use MCP-assisted exploration (Step 3.B)
|
||||
|
||||
If MCP unavailable OR config.tea_use_mcp_enhancements is false:
|
||||
- Use manual exploration fallback (Step 3.C)
|
||||
|
||||
**B. MCP-Assisted Exploration (If MCP Tools Available)**
|
||||
|
||||
Use Playwright MCP browser tools to explore UI:
|
||||
|
||||
**Setup:**
|
||||
|
||||
```
|
||||
1. Use planner_setup_page to initialize browser
|
||||
2. Navigate to {exploration_url}
|
||||
3. Capture initial state with browser_snapshot
|
||||
```
|
||||
|
||||
**Exploration Process:**
|
||||
|
||||
```
|
||||
4. Use browser_navigate to explore different pages
|
||||
5. Use browser_click to interact with buttons, links, forms
|
||||
6. Use browser_hover to reveal hidden menus/tooltips
|
||||
7. Capture browser_snapshot at each significant state
|
||||
8. Take browser_screenshot for documentation
|
||||
9. Monitor browser_console_messages for JavaScript errors
|
||||
10. Track browser_network_requests to identify API calls
|
||||
11. Map user flows and interactive elements
|
||||
12. Document discovered functionality
|
||||
```
|
||||
|
||||
**Discovery Documentation:**
|
||||
- Create list of discovered features (pages, workflows, forms)
|
||||
- Identify user journeys (navigation paths)
|
||||
- Map API endpoints (from network requests)
|
||||
- Note error states (from console messages)
|
||||
- Capture screenshots for visual reference
|
||||
|
||||
**Convert to Test Scenarios:**
|
||||
- Transform discoveries into testable requirements
|
||||
- Prioritize based on user flow criticality
|
||||
- Identify risks from discovered functionality
|
||||
- Continue with Step 2 (Assess and Classify Risks) using discovered requirements
|
||||
|
||||
**C. Manual Exploration Fallback (If MCP Unavailable)**
|
||||
|
||||
If Playwright MCP is not available:
|
||||
|
||||
**Notify User:**
|
||||
|
||||
```markdown
|
||||
Exploratory mode enabled but Playwright MCP unavailable.
|
||||
|
||||
**Manual exploration required:**
|
||||
|
||||
1. Open application at: {exploration_url}
|
||||
2. Explore all pages, workflows, and features
|
||||
3. Document findings in markdown:
|
||||
- List of pages/features discovered
|
||||
- User journeys identified
|
||||
- API endpoints observed (DevTools Network tab)
|
||||
- JavaScript errors noted (DevTools Console)
|
||||
- Critical workflows mapped
|
||||
|
||||
4. Provide exploration findings to continue workflow
|
||||
|
||||
**Alternative:** Disable exploratory_mode and provide requirements documentation
|
||||
```
|
||||
|
||||
Wait for user to provide exploration findings, then:
|
||||
- Parse user-provided discovery documentation
|
||||
- Convert to testable requirements
|
||||
- Continue with Step 2 (risk assessment)
|
||||
|
||||
4. **Proceed to Risk Assessment**
|
||||
|
||||
After mode selection (Requirements-Based OR Exploratory):
|
||||
- Continue to Step 2: Assess and Classify Risks
|
||||
- Use requirements from documentation (Requirements-Based) OR discoveries (Exploratory)
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Assess and Classify Risks
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Identify Genuine Risks**
|
||||
|
||||
Filter requirements to isolate actual risks (not just features):
|
||||
- Unresolved technical gaps
|
||||
- Security vulnerabilities
|
||||
- Performance bottlenecks
|
||||
- Data loss or corruption potential
|
||||
- Business impact failures
|
||||
- Operational deployment issues
|
||||
|
||||
2. **Classify Risks by Category**
|
||||
|
||||
Use these standard risk categories:
|
||||
|
||||
**TECH** (Technical/Architecture):
|
||||
- Architecture flaws
|
||||
- Integration failures
|
||||
- Scalability issues
|
||||
- Technical debt
|
||||
|
||||
**SEC** (Security):
|
||||
- Missing access controls
|
||||
- Authentication bypass
|
||||
- Data exposure
|
||||
- Injection vulnerabilities
|
||||
|
||||
**PERF** (Performance):
|
||||
- SLA violations
|
||||
- Response time degradation
|
||||
- Resource exhaustion
|
||||
- Scalability limits
|
||||
|
||||
**DATA** (Data Integrity):
|
||||
- Data loss
|
||||
- Data corruption
|
||||
- Inconsistent state
|
||||
- Migration failures
|
||||
|
||||
**BUS** (Business Impact):
|
||||
- User experience degradation
|
||||
- Business logic errors
|
||||
- Revenue impact
|
||||
- Compliance violations
|
||||
|
||||
**OPS** (Operations):
|
||||
- Deployment failures
|
||||
- Configuration errors
|
||||
- Monitoring gaps
|
||||
- Rollback issues
|
||||
|
||||
3. **Score Risk Probability**
|
||||
|
||||
Rate likelihood (1-3):
|
||||
- **1 (Unlikely)**: <10% chance, edge case
|
||||
- **2 (Possible)**: 10-50% chance, known scenario
|
||||
- **3 (Likely)**: >50% chance, common occurrence
|
||||
|
||||
4. **Score Risk Impact**
|
||||
|
||||
Rate severity (1-3):
|
||||
- **1 (Minor)**: Cosmetic, workaround exists, limited users
|
||||
- **2 (Degraded)**: Feature impaired, workaround difficult, affects many users
|
||||
- **3 (Critical)**: System failure, data loss, no workaround, blocks usage
|
||||
|
||||
5. **Calculate Risk Score**
|
||||
|
||||
```
|
||||
Risk Score = Probability × Impact
|
||||
|
||||
Scores:
|
||||
1-2: Low risk (monitor)
|
||||
3-4: Medium risk (plan mitigation)
|
||||
6-9: High risk (immediate mitigation required)
|
||||
```
|
||||
|
||||
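A minimal TypeScript sketch of this scoring and classification; the categories and thresholds come from this workflow, while the `Risk` shape itself is illustrative:

```typescript
type RiskCategory = 'TECH' | 'SEC' | 'PERF' | 'DATA' | 'BUS' | 'OPS';
type Rating = 1 | 2 | 3; // probability and impact are both rated 1-3

interface Risk {
  id: string; // e.g. R-001
  category: RiskCategory;
  description: string;
  probability: Rating;
  impact: Rating;
}

function riskScore(risk: Risk): number {
  return risk.probability * risk.impact; // possible values: 1, 2, 3, 4, 6, 9
}

function riskLevel(score: number): 'low' | 'medium' | 'high' {
  if (score >= 6) return 'high'; // immediate mitigation required
  if (score >= 3) return 'medium'; // plan mitigation
  return 'low'; // monitor
}
```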
6. **Highlight High-Priority Risks**
|
||||
|
||||
Flag all risks with score ≥6 for immediate attention.
|
||||
|
||||
7. **Request Clarification**
|
||||
|
||||
If evidence is missing or assumptions are required:
|
||||
- Document assumptions clearly
|
||||
- Request user clarification
|
||||
- Do NOT speculate on business impact
|
||||
|
||||
8. **Plan Mitigations**
|
||||
|
||||
For each high-priority risk:
|
||||
- Define mitigation strategy
|
||||
- Assign owner (dev, QA, ops)
|
||||
- Set timeline
|
||||
- Update residual risk expectation
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Design Test Coverage
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Break Down Acceptance Criteria**
|
||||
|
||||
Convert each acceptance criterion into atomic test scenarios:
|
||||
- One scenario per testable behavior
|
||||
- Scenarios are independent
|
||||
- Scenarios are repeatable
|
||||
- Scenarios tie back to risk mitigations
|
||||
|
||||
2. **Select Appropriate Test Levels**
|
||||
|
||||
**Knowledge Base Reference**: `test-levels-framework.md`
|
||||
|
||||
Map requirements to optimal test levels (avoid duplication):
|
||||
|
||||
**E2E (End-to-End)**:
|
||||
- Critical user journeys
|
||||
- Multi-system integration
|
||||
- Production-like environment
|
||||
- Highest confidence, slowest execution
|
||||
|
||||
**API (Integration)**:
|
||||
- Service contracts
|
||||
- Business logic validation
|
||||
- Fast feedback
|
||||
- Good for complex scenarios
|
||||
|
||||
**Component**:
|
||||
- UI component behavior
|
||||
- Interaction testing
|
||||
- Visual regression
|
||||
- Fast, isolated
|
||||
|
||||
**Unit**:
|
||||
- Business logic
|
||||
- Edge cases
|
||||
- Error handling
|
||||
- Fastest, most granular
|
||||
|
||||
**Avoid duplicate coverage**: Don't test the same behavior at multiple levels unless necessary.
|
||||
|
||||
3. **Assign Priority Levels**
|
||||
|
||||
**Knowledge Base Reference**: `test-priorities-matrix.md`
|
||||
|
||||
**P0 (Critical)**:
|
||||
- Blocks core user journey
|
||||
- High-risk areas (score ≥6)
|
||||
- Revenue-impacting
|
||||
- Security-critical
|
||||
- **Run on every commit**
|
||||
|
||||
**P1 (High)**:
|
||||
- Important user features
|
||||
- Medium-risk areas (score 3-4)
|
||||
- Common workflows
|
||||
- **Run on PR to main**
|
||||
|
||||
**P2 (Medium)**:
|
||||
- Secondary features
|
||||
- Low-risk areas (score 1-2)
|
||||
- Edge cases
|
||||
- **Run nightly or weekly**
|
||||
|
||||
**P3 (Low)**:
|
||||
- Nice-to-have
|
||||
- Exploratory
|
||||
- Performance benchmarks
|
||||
- **Run on-demand**
|
||||
|
||||
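As an illustration of these priority levels, a hypothetical Playwright spec that carries a test ID and priority tag in its title; the `1.3-E2E-001` ID format and the `@p0` tag convention are assumptions, not mandated names:

```typescript
import { test, expect } from '@playwright/test';

// P0: blocks the core journey and maps to a high-risk item (e.g. R-001)
test('1.3-E2E-001 @p0 user can log in and reach the dashboard', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill('user@example.com');
  await page.getByLabel('Password').fill('correct-horse');
  await page.getByRole('button', { name: 'Log in' }).click();
  await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
});
```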
4. **Outline Data and Tooling Prerequisites**
|
||||
|
||||
For each test scenario, identify:
|
||||
- Test data requirements (factories, fixtures)
|
||||
- External services (mocks, stubs)
|
||||
- Environment setup
|
||||
- Tools and dependencies
|
||||
|
||||
5. **Define Execution Order**
|
||||
|
||||
Recommend test execution sequence:
|
||||
1. **Smoke tests** (P0 subset, <5 min)
|
||||
2. **P0 tests** (critical paths, <10 min)
|
||||
3. **P1 tests** (important features, <30 min)
|
||||
4. **P2/P3 tests** (full regression, <60 min)
|
||||
|
||||
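One possible way to encode this execution order, assuming the tag convention sketched above, is grep-filtered Playwright projects (a sketch, not a required configuration):

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  projects: [
    { name: 'smoke', grep: /@smoke/ }, // <5 min: fast feedback on every push
    { name: 'p0', grep: /@p0/ }, // <10 min: critical paths on every commit
    { name: 'p1', grep: /@p1/ }, // <30 min: important features on PR to main
    { name: 'p2-p3', grepInvert: /@smoke|@p0|@p1/ }, // <60 min: remaining regression, nightly/weekly
  ],
});
```

Individual stages can then be invoked in the order above, e.g. `npx playwright test --project=p0`.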
---
|
||||
|
||||
## Step 4: Generate Deliverables
|
||||
|
||||
### Actions
|
||||
|
||||
1. **Create Risk Assessment Matrix**
|
||||
|
||||
Use template structure:
|
||||
|
||||
```markdown
|
||||
| Risk ID | Category | Description | Probability | Impact | Score | Mitigation |
|
||||
| ------- | -------- | ----------- | ----------- | ------ | ----- | --------------- |
|
||||
| R-001 | SEC | Auth bypass | 2 | 3 | 6 | Add authz check |
|
||||
```
|
||||
|
||||
2. **Create Coverage Matrix**
|
||||
|
||||
```markdown
|
||||
| Requirement | Test Level | Priority | Risk Link | Test Count | Owner |
|
||||
| ----------- | ---------- | -------- | --------- | ---------- | ----- |
|
||||
| Login flow | E2E | P0 | R-001 | 3 | QA |
|
||||
```
|
||||
|
||||
3. **Document Execution Order**
|
||||
|
||||
```markdown
|
||||
### Smoke Tests (<5 min)
|
||||
|
||||
- Login successful
|
||||
- Dashboard loads
|
||||
|
||||
### P0 Tests (<10 min)
|
||||
|
||||
- [Full P0 list]
|
||||
|
||||
### P1 Tests (<30 min)
|
||||
|
||||
- [Full P1 list]
|
||||
```
|
||||
|
||||
4. **Include Resource Estimates**
|
||||
|
||||
```markdown
|
||||
### Test Effort Estimates
|
||||
|
||||
- P0 scenarios: 15 tests × 2 hours = 30 hours
|
||||
- P1 scenarios: 25 tests × 1 hour = 25 hours
|
||||
- P2 scenarios: 40 tests × 0.5 hour = 20 hours
|
||||
- **Total:** 75 hours (~10 days)
|
||||
```
|
||||
|
||||
5. **Add Gate Criteria**
|
||||
|
||||
```markdown
|
||||
### Quality Gate Criteria
|
||||
|
||||
- All P0 tests pass (100%)
|
||||
- P1 tests pass rate ≥95%
|
||||
- No high-risk (score ≥6) items unmitigated
|
||||
- Test coverage ≥80% for critical paths
|
||||
```
|
||||
|
||||
6. **Write to Output File**
|
||||
|
||||
Save to `{output_folder}/test-design-epic-{epic_num}.md` using template structure.
|
||||
|
||||
---
|
||||
|
||||
## Important Notes
|
||||
|
||||
### Risk Category Definitions
|
||||
|
||||
**TECH** (Technical/Architecture):
|
||||
|
||||
- Architecture flaws or technical debt
|
||||
- Integration complexity
|
||||
- Scalability concerns
|
||||
|
||||
**SEC** (Security):
|
||||
|
||||
- Missing security controls
|
||||
- Authentication/authorization gaps
|
||||
- Data exposure risks
|
||||
|
||||
**PERF** (Performance):
|
||||
|
||||
- SLA risk or performance degradation
|
||||
- Resource constraints
|
||||
- Scalability bottlenecks
|
||||
|
||||
**DATA** (Data Integrity):
|
||||
|
||||
- Data loss or corruption potential
|
||||
- State consistency issues
|
||||
- Migration risks
|
||||
|
||||
**BUS** (Business Impact):
|
||||
|
||||
- User experience harm
|
||||
- Business logic errors
|
||||
- Revenue or compliance impact
|
||||
|
||||
**OPS** (Operations):
|
||||
|
||||
- Deployment or runtime failures
|
||||
- Configuration issues
|
||||
- Monitoring/observability gaps
|
||||
|
||||
### Risk Scoring Methodology
|
||||
|
||||
**Probability × Impact = Risk Score**
|
||||
|
||||
Examples:
|
||||
|
||||
- High likelihood (3) × Critical impact (3) = **Score 9** (highest priority)
|
||||
- Possible (2) × Critical (3) = **Score 6** (high priority threshold)
|
||||
- Unlikely (1) × Minor (1) = **Score 1** (low priority)
|
||||
|
||||
**Threshold**: Scores ≥6 require immediate mitigation.
|
||||
|
||||
### Test Level Selection Strategy
|
||||
|
||||
**Avoid duplication:**
|
||||
|
||||
- Don't test the same behavior at both the E2E and API levels
|
||||
- Use E2E for critical paths only
|
||||
- Use API tests for complex business logic
|
||||
- Use unit tests for edge cases
|
||||
|
||||
**Tradeoffs:**
|
||||
|
||||
- E2E: High confidence, slow execution, brittle
|
||||
- API: Good balance, fast, stable
|
||||
- Unit: Fastest feedback, narrow scope
|
||||
|
||||
### Priority Assignment Guidelines
|
||||
|
||||
**P0 criteria** (all must be true):
|
||||
|
||||
- Blocks core functionality
|
||||
- High-risk (score ≥6)
|
||||
- No workaround exists
|
||||
- Affects majority of users
|
||||
|
||||
**P1 criteria**:
|
||||
|
||||
- Important feature
|
||||
- Medium risk (score 3-4)
|
||||
- Workaround exists but difficult
|
||||
|
||||
**P2/P3**: Everything else, prioritized by value
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
**Core Fragments (Auto-loaded in Step 1):**
|
||||
|
||||
- `risk-governance.md` - Risk classification (6 categories), automated scoring, gate decision engine, coverage traceability, owner tracking (625 lines, 4 examples)
|
||||
- `probability-impact.md` - Probability × impact matrix, automated classification thresholds, dynamic re-assessment, gate integration (604 lines, 4 examples)
|
||||
- `test-levels-framework.md` - E2E vs API vs Component vs Unit decision framework with characteristics matrix (467 lines, 4 examples)
|
||||
- `test-priorities-matrix.md` - P0-P3 automated priority calculation, risk-based mapping, tagging strategy, time budgets (389 lines, 2 examples)
|
||||
|
||||
**Reference for Test Planning:**
|
||||
|
||||
- `selective-testing.md` - Execution strategy: tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
|
||||
- `fixture-architecture.md` - Data setup patterns: pure function → fixture → mergeTests, auto-cleanup (406 lines, 5 examples)
|
||||
|
||||
**Manual Reference (Optional):**
|
||||
|
||||
- Use `tea-index.csv` to find additional specialized fragments as needed
|
||||
|
||||
### Evidence-Based Assessment
|
||||
|
||||
**Critical principle:** Base risk assessment on evidence, not speculation.
|
||||
|
||||
**Evidence sources:**
|
||||
|
||||
- PRD and user research
|
||||
- Architecture documentation
|
||||
- Historical bug data
|
||||
- User feedback
|
||||
- Security audit results
|
||||
|
||||
**Avoid:**
|
||||
|
||||
- Guessing business impact
|
||||
- Assuming user behavior
|
||||
- Inventing requirements
|
||||
|
||||
**When uncertain:** Document assumptions and request clarification from user.
|
||||
|
||||
---
|
||||
|
||||
## Output Summary
|
||||
|
||||
After completing this workflow, provide a summary:
|
||||
|
||||
```markdown
|
||||
## Test Design Complete
|
||||
|
||||
**Epic**: {epic_num}
|
||||
**Scope**: {design_level}
|
||||
|
||||
**Risk Assessment**:
|
||||
|
||||
- Total risks identified: {count}
|
||||
- High-priority risks (≥6): {high_count}
|
||||
- Categories: {categories}
|
||||
|
||||
**Coverage Plan**:
|
||||
|
||||
- P0 scenarios: {p0_count} ({p0_hours} hours)
|
||||
- P1 scenarios: {p1_count} ({p1_hours} hours)
|
||||
- P2/P3 scenarios: {p2p3_count} ({p2p3_hours} hours)
|
||||
- **Total effort**: {total_hours} hours (~{total_days} days)
|
||||
|
||||
**Test Levels**:
|
||||
|
||||
- E2E: {e2e_count}
|
||||
- API: {api_count}
|
||||
- Component: {component_count}
|
||||
- Unit: {unit_count}
|
||||
|
||||
**Quality Gate Criteria**:
|
||||
|
||||
- P0 pass rate: 100%
|
||||
- P1 pass rate: ≥95%
|
||||
- High-risk mitigations: 100%
|
||||
- Coverage: ≥80%
|
||||
|
||||
**Output File**: {output_file}
|
||||
|
||||
**Next Steps**:
|
||||
|
||||
1. Review risk assessment with team
|
||||
2. Prioritize mitigation for high-risk items (score ≥6)
|
||||
3. Run `*atdd` to generate failing tests for P0 scenarios (separate workflow; not auto-run by `*test-design`)
|
||||
4. Allocate resources per effort estimates
|
||||
5. Set up test data factories and fixtures
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Validation
|
||||
|
||||
After completing all steps, verify:
|
||||
|
||||
- [ ] Risk assessment complete with all categories
|
||||
- [ ] All risks scored (probability × impact)
|
||||
- [ ] High-priority risks (≥6) flagged
|
||||
- [ ] Coverage matrix maps requirements to test levels
|
||||
- [ ] Priority levels assigned (P0-P3)
|
||||
- [ ] Execution order defined
|
||||
- [ ] Resource estimates provided
|
||||
- [ ] Quality gate criteria defined
|
||||
- [ ] Output file created and formatted correctly
|
||||
|
||||
Refer to `checklist.md` for comprehensive validation criteria.
|
||||
@@ -1,294 +0,0 @@
|
||||
# Test Design: Epic {epic_num} - {epic_title}
|
||||
|
||||
**Date:** {date}
|
||||
**Author:** {user_name}
|
||||
**Status:** Draft / Approved
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Scope:** {design_level} test design for Epic {epic_num}
|
||||
|
||||
**Risk Summary:**
|
||||
|
||||
- Total risks identified: {total_risks}
|
||||
- High-priority risks (≥6): {high_priority_count}
|
||||
- Critical categories: {top_categories}
|
||||
|
||||
**Coverage Summary:**
|
||||
|
||||
- P0 scenarios: {p0_count} ({p0_hours} hours)
|
||||
- P1 scenarios: {p1_count} ({p1_hours} hours)
|
||||
- P2/P3 scenarios: {p2p3_count} ({p2p3_hours} hours)
|
||||
- **Total effort**: {total_hours} hours (~{total_days} days)
|
||||
|
||||
---
|
||||
|
||||
## Risk Assessment
|
||||
|
||||
### High-Priority Risks (Score ≥6)
|
||||
|
||||
| Risk ID | Category | Description | Probability | Impact | Score | Mitigation | Owner | Timeline |
|
||||
| ------- | -------- | ------------- | ----------- | ------ | ----- | ------------ | ------- | -------- |
|
||||
| R-001 | SEC | {description} | 2 | 3 | 6 | {mitigation} | {owner} | {date} |
|
||||
| R-002 | PERF | {description} | 3 | 2 | 6 | {mitigation} | {owner} | {date} |
|
||||
|
||||
### Medium-Priority Risks (Score 3-4)
|
||||
|
||||
| Risk ID | Category | Description | Probability | Impact | Score | Mitigation | Owner |
|
||||
| ------- | -------- | ------------- | ----------- | ------ | ----- | ------------ | ------- |
|
||||
| R-003 | TECH | {description} | 2 | 2 | 4 | {mitigation} | {owner} |
|
||||
| R-004 | DATA | {description} | 1 | 3 | 3 | {mitigation} | {owner} |
|
||||
|
||||
### Low-Priority Risks (Score 1-2)
|
||||
|
||||
| Risk ID | Category | Description | Probability | Impact | Score | Action |
|
||||
| ------- | -------- | ------------- | ----------- | ------ | ----- | ------- |
|
||||
| R-005 | OPS | {description} | 1 | 2 | 2 | Monitor |
|
||||
| R-006 | BUS | {description} | 1 | 1 | 1 | Monitor |
|
||||
|
||||
### Risk Category Legend
|
||||
|
||||
- **TECH**: Technical/Architecture (flaws, integration, scalability)
|
||||
- **SEC**: Security (access controls, auth, data exposure)
|
||||
- **PERF**: Performance (SLA violations, degradation, resource limits)
|
||||
- **DATA**: Data Integrity (loss, corruption, inconsistency)
|
||||
- **BUS**: Business Impact (UX harm, logic errors, revenue)
|
||||
- **OPS**: Operations (deployment, config, monitoring)
|
||||
|
||||
---
|
||||
|
||||
## Test Coverage Plan
|
||||
|
||||
### P0 (Critical) - Run on every commit
|
||||
|
||||
**Criteria**: Blocks core journey + High risk (≥6) + No workaround
|
||||
|
||||
| Requirement | Test Level | Risk Link | Test Count | Owner | Notes |
|
||||
| ------------- | ---------- | --------- | ---------- | ----- | ------- |
|
||||
| {requirement} | E2E | R-001 | 3 | QA | {notes} |
|
||||
| {requirement} | API | R-002 | 5 | QA | {notes} |
|
||||
|
||||
**Total P0**: {p0_count} tests, {p0_hours} hours
|
||||
|
||||
### P1 (High) - Run on PR to main
|
||||
|
||||
**Criteria**: Important features + Medium risk (3-4) + Common workflows
|
||||
|
||||
| Requirement | Test Level | Risk Link | Test Count | Owner | Notes |
|
||||
| ------------- | ---------- | --------- | ---------- | ----- | ------- |
|
||||
| {requirement} | API | R-003 | 4 | QA | {notes} |
|
||||
| {requirement} | Component | - | 6 | DEV | {notes} |
|
||||
|
||||
**Total P1**: {p1_count} tests, {p1_hours} hours
|
||||
|
||||
### P2 (Medium) - Run nightly/weekly
|
||||
|
||||
**Criteria**: Secondary features + Low risk (1-2) + Edge cases
|
||||
|
||||
| Requirement | Test Level | Risk Link | Test Count | Owner | Notes |
|
||||
| ------------- | ---------- | --------- | ---------- | ----- | ------- |
|
||||
| {requirement} | API | R-004 | 8 | QA | {notes} |
|
||||
| {requirement} | Unit | - | 15 | DEV | {notes} |
|
||||
|
||||
**Total P2**: {p2_count} tests, {p2_hours} hours
|
||||
|
||||
### P3 (Low) - Run on-demand
|
||||
|
||||
**Criteria**: Nice-to-have + Exploratory + Performance benchmarks
|
||||
|
||||
| Requirement | Test Level | Test Count | Owner | Notes |
|
||||
| ------------- | ---------- | ---------- | ----- | ------- |
|
||||
| {requirement} | E2E | 2 | QA | {notes} |
|
||||
| {requirement} | Unit | 8 | DEV | {notes} |
|
||||
|
||||
**Total P3**: {p3_count} tests, {p3_hours} hours
|
||||
|
||||
---
|
||||
|
||||
## Execution Order
|
||||
|
||||
### Smoke Tests (<5 min)
|
||||
|
||||
**Purpose**: Fast feedback, catch build-breaking issues
|
||||
|
||||
- [ ] {scenario} (30s)
|
||||
- [ ] {scenario} (45s)
|
||||
- [ ] {scenario} (1min)
|
||||
|
||||
**Total**: {smoke_count} scenarios
|
||||
|
||||
### P0 Tests (<10 min)
|
||||
|
||||
**Purpose**: Critical path validation
|
||||
|
||||
- [ ] {scenario} (E2E)
|
||||
- [ ] {scenario} (API)
|
||||
- [ ] {scenario} (API)
|
||||
|
||||
**Total**: {p0_count} scenarios
|
||||
|
||||
### P1 Tests (<30 min)
|
||||
|
||||
**Purpose**: Important feature coverage
|
||||
|
||||
- [ ] {scenario} (API)
|
||||
- [ ] {scenario} (Component)
|
||||
|
||||
**Total**: {p1_count} scenarios
|
||||
|
||||
### P2/P3 Tests (<60 min)
|
||||
|
||||
**Purpose**: Full regression coverage
|
||||
|
||||
- [ ] {scenario} (Unit)
|
||||
- [ ] {scenario} (API)
|
||||
|
||||
**Total**: {p2p3_count} scenarios
|
||||
|
||||
---
|
||||
|
||||
## Resource Estimates
|
||||
|
||||
### Test Development Effort
|
||||
|
||||
| Priority | Count | Hours/Test | Total Hours | Notes |
|
||||
| --------- | ----------------- | ---------- | ----------------- | ----------------------- |
|
||||
| P0 | {p0_count} | 2.0 | {p0_hours} | Complex setup, security |
|
||||
| P1 | {p1_count} | 1.0 | {p1_hours} | Standard coverage |
|
||||
| P2 | {p2_count} | 0.5 | {p2_hours} | Simple scenarios |
|
||||
| P3 | {p3_count} | 0.25 | {p3_hours} | Exploratory |
|
||||
| **Total** | **{total_count}** | **-** | **{total_hours}** | **~{total_days} days** |
|
||||
|
||||
### Prerequisites
|
||||
|
||||
**Test Data:**
|
||||
|
||||
- {factory_name} factory (faker-based, auto-cleanup)
|
||||
- {fixture_name} fixture (setup/teardown)
|
||||
|
||||
**Tooling:**
|
||||
|
||||
- {tool} for {purpose}
|
||||
- {tool} for {purpose}
|
||||
|
||||
**Environment:**
|
||||
|
||||
- {env_requirement}
|
||||
- {env_requirement}
|
||||
|
||||
---
|
||||
|
||||
## Quality Gate Criteria
|
||||
|
||||
### Pass/Fail Thresholds
|
||||
|
||||
- **P0 pass rate**: 100% (no exceptions)
|
||||
- **P1 pass rate**: ≥95% (waivers required for failures)
|
||||
- **P2/P3 pass rate**: ≥90% (informational)
|
||||
- **High-risk mitigations**: 100% complete or approved waivers
|
||||
|
||||
### Coverage Targets
|
||||
|
||||
- **Critical paths**: ≥80%
|
||||
- **Security scenarios**: 100%
|
||||
- **Business logic**: ≥70%
|
||||
- **Edge cases**: ≥50%
|
||||
|
||||
### Non-Negotiable Requirements
|
||||
|
||||
- [ ] All P0 tests pass
|
||||
- [ ] No high-risk (≥6) items unmitigated
|
||||
- [ ] Security tests (SEC category) pass 100%
|
||||
- [ ] Performance targets met (PERF category)
|
||||
|
||||
---
|
||||
|
||||
## Mitigation Plans
|
||||
|
||||
### R-001: {Risk Description} (Score: 6)
|
||||
|
||||
**Mitigation Strategy:** {detailed_mitigation}
|
||||
**Owner:** {owner}
|
||||
**Timeline:** {date}
|
||||
**Status:** Planned / In Progress / Complete
|
||||
**Verification:** {how_to_verify}
|
||||
|
||||
### R-002: {Risk Description} (Score: 6)
|
||||
|
||||
**Mitigation Strategy:** {detailed_mitigation}
|
||||
**Owner:** {owner}
|
||||
**Timeline:** {date}
|
||||
**Status:** Planned / In Progress / Complete
|
||||
**Verification:** {how_to_verify}
|
||||
|
||||
---
|
||||
|
||||
## Assumptions and Dependencies
|
||||
|
||||
### Assumptions
|
||||
|
||||
1. {assumption}
|
||||
2. {assumption}
|
||||
3. {assumption}
|
||||
|
||||
### Dependencies
|
||||
|
||||
1. {dependency} - Required by {date}
|
||||
2. {dependency} - Required by {date}
|
||||
|
||||
### Risks to Plan
|
||||
|
||||
- **Risk**: {risk_to_plan}
|
||||
- **Impact**: {impact}
|
||||
- **Contingency**: {contingency}
|
||||
|
||||
---
|
||||
|
||||
## Follow-on Workflows (Manual)
|
||||
|
||||
- Run `*atdd` to generate failing P0 tests (separate workflow; not auto-run).
|
||||
- Run `*automate` for broader coverage once implementation exists.
|
||||
|
||||
---
|
||||
|
||||
## Approval
|
||||
|
||||
**Test Design Approved By:**
|
||||
|
||||
- [ ] Product Manager: {name} Date: {date}
|
||||
- [ ] Tech Lead: {name} Date: {date}
|
||||
- [ ] QA Lead: {name} Date: {date}
|
||||
|
||||
**Comments:**
|
||||
|
||||
---
|
||||
|
||||
## Appendix
|
||||
|
||||
### Knowledge Base References
|
||||
|
||||
- `risk-governance.md` - Risk classification framework
|
||||
- `probability-impact.md` - Risk scoring methodology
|
||||
- `test-levels-framework.md` - Test level selection
|
||||
- `test-priorities-matrix.md` - P0-P3 prioritization
|
||||
|
||||
### Related Documents
|
||||
|
||||
- PRD: {prd_link}
|
||||
- Epic: {epic_link}
|
||||
- Architecture: {arch_link}
|
||||
- Tech Spec: {tech_spec_link}
|
||||
|
||||
---
|
||||
|
||||
**Generated by**: BMad TEA Agent - Test Architect Module
|
||||
**Workflow**: `_bmad/bmm/testarch/test-design`
|
||||
**Version**: 4.0 (BMad v6)
|
||||
@@ -1,54 +0,0 @@
|
||||
# Test Architect workflow: test-design
|
||||
name: testarch-test-design
|
||||
description: "Dual-mode workflow: (1) System-level testability review in Solutioning phase, or (2) Epic-level test planning in Implementation phase. Auto-detects mode based on project phase."
|
||||
author: "BMad"
|
||||
|
||||
# Critical variables from config
|
||||
config_source: "{project-root}/_bmad/bmm/config.yaml"
|
||||
output_folder: "{config_source}:output_folder"
|
||||
user_name: "{config_source}:user_name"
|
||||
communication_language: "{config_source}:communication_language"
|
||||
document_output_language: "{config_source}:document_output_language"
|
||||
date: system-generated
|
||||
|
||||
# Workflow components
|
||||
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/test-design"
|
||||
instructions: "{installed_path}/instructions.md"
|
||||
validation: "{installed_path}/checklist.md"
|
||||
template: "{installed_path}/test-design-template.md"
|
||||
|
||||
# Variables and inputs
|
||||
variables:
|
||||
design_level: "full" # full, targeted, minimal - scope of design effort
|
||||
mode: "auto-detect" # auto-detect (default), system-level, epic-level
|
||||
|
||||
# Output configuration
|
||||
# Note: Actual output file determined dynamically based on mode detection
|
||||
# Declared outputs for new workflow format
|
||||
outputs:
|
||||
- id: system-level
|
||||
description: "System-level testability review (Phase 3)"
|
||||
path: "{output_folder}/test-design-system.md"
|
||||
- id: epic-level
|
||||
description: "Epic-level test plan (Phase 4)"
|
||||
path: "{output_folder}/test-design-epic-{epic_num}.md"
|
||||
default_output_file: "{output_folder}/test-design-epic-{epic_num}.md"
|
||||
|
||||
# Required tools
|
||||
required_tools:
|
||||
- read_file # Read PRD, epics, stories, architecture docs
|
||||
- write_file # Create test design document
|
||||
- list_files # Find related documentation
|
||||
- search_repo # Search for existing tests and patterns
|
||||
|
||||
tags:
|
||||
- qa
|
||||
- planning
|
||||
- test-architect
|
||||
- risk-assessment
|
||||
- coverage
|
||||
|
||||
execution_hints:
|
||||
interactive: false # Minimize prompts
|
||||
autonomous: true # Proceed without user input unless blocked
|
||||
iterative: true
|
||||
@@ -1,472 +0,0 @@
|
||||
# Test Quality Review - Validation Checklist
|
||||
|
||||
Use this checklist to validate that the test quality review workflow completed successfully and all quality criteria were properly evaluated.
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Note: `test-review` is optional and only audits existing tests; it does not generate tests.
|
||||
|
||||
### Test File Discovery
|
||||
|
||||
- [ ] Test file(s) identified for review (single/directory/suite scope)
|
||||
- [ ] Test files exist and are readable
|
||||
- [ ] Test framework detected (Playwright, Jest, Cypress, Vitest, etc.)
|
||||
- [ ] Test framework configuration found (playwright.config.ts, jest.config.js, etc.)
|
||||
|
||||
### Knowledge Base Loading
|
||||
|
||||
- [ ] tea-index.csv loaded successfully
|
||||
- [ ] `test-quality.md` loaded (Definition of Done)
|
||||
- [ ] `fixture-architecture.md` loaded (Pure function → Fixture patterns)
|
||||
- [ ] `network-first.md` loaded (Route intercept before navigate)
|
||||
- [ ] `data-factories.md` loaded (Factory patterns)
|
||||
- [ ] `test-levels-framework.md` loaded (E2E vs API vs Component vs Unit)
|
||||
- [ ] All other enabled fragments loaded successfully
|
||||
|
||||
### Context Gathering
|
||||
|
||||
- [ ] Story file discovered or explicitly provided (if available)
|
||||
- [ ] Test design document discovered or explicitly provided (if available)
|
||||
- [ ] Acceptance criteria extracted from story (if available)
|
||||
- [ ] Priority context (P0/P1/P2/P3) extracted from test-design (if available)
|
||||
|
||||
---
|
||||
|
||||
## Process Steps
|
||||
|
||||
### Step 1: Context Loading
|
||||
|
||||
- [ ] Review scope determined (single/directory/suite)
|
||||
- [ ] Test file paths collected
|
||||
- [ ] Related artifacts discovered (story, test-design)
|
||||
- [ ] Knowledge base fragments loaded successfully
|
||||
- [ ] Quality criteria flags read from workflow variables
|
||||
|
||||
### Step 2: Test File Parsing
|
||||
|
||||
**For Each Test File:**
|
||||
|
||||
- [ ] File read successfully
|
||||
- [ ] File size measured (lines, KB)
|
||||
- [ ] File structure parsed (describe blocks, it blocks)
|
||||
- [ ] Test IDs extracted (if present)
|
||||
- [ ] Priority markers extracted (if present)
|
||||
- [ ] Imports analyzed
|
||||
- [ ] Dependencies identified
|
||||
|
||||
**Test Structure Analysis:**
|
||||
|
||||
- [ ] Describe block count calculated
|
||||
- [ ] It/test block count calculated
|
||||
- [ ] BDD structure identified (Given-When-Then)
|
||||
- [ ] Fixture usage detected
|
||||
- [ ] Data factory usage detected
|
||||
- [ ] Network interception patterns identified
|
||||
- [ ] Assertions counted
|
||||
- [ ] Waits and timeouts cataloged
|
||||
- [ ] Conditionals (if/else) detected
|
||||
- [ ] Try/catch blocks detected
|
||||
- [ ] Shared state or globals detected
|
||||
|
||||
### Step 3: Quality Criteria Validation
|
||||
|
||||
**For Each Enabled Criterion:**
|
||||
|
||||
#### BDD Format (if `check_given_when_then: true`)
|
||||
|
||||
- [ ] Given-When-Then structure evaluated
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with line numbers
|
||||
- [ ] Examples of good/bad patterns noted
|
||||
|
||||
#### Test IDs (if `check_test_ids: true`)
|
||||
|
||||
- [ ] Test ID presence validated
|
||||
- [ ] Test ID format checked (e.g., 1.3-E2E-001)
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Missing IDs cataloged
|
||||
|
||||
#### Priority Markers (if `check_priority_markers: true`)
|
||||
|
||||
- [ ] P0/P1/P2/P3 classification validated
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Missing priorities cataloged
|
||||
|
||||
#### Hard Waits (if `check_hard_waits: true`)
|
||||
|
||||
- [ ] sleep(), waitForTimeout(), hardcoded delays detected
|
||||
- [ ] Justification comments checked
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with line numbers and recommended fixes
|
||||
|
||||
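For example, a typical fix replaces a hard wait with a web-first assertion (selectors are hypothetical):

```typescript
import { test, expect } from '@playwright/test';

test('submits without a hard wait', async ({ page }) => {
  await page.goto('/form');

  // Flagged pattern: await page.waitForTimeout(3000) hides the real condition

  // Recommended: wait on an observable condition instead
  await expect(page.locator('#submit')).toBeEnabled();
  await page.locator('#submit').click();
});
```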
#### Determinism (if `check_determinism: true`)
|
||||
|
||||
- [ ] Conditionals (if/else/switch) detected
|
||||
- [ ] Try/catch abuse detected
|
||||
- [ ] Random values (Math.random, Date.now) detected
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with recommended fixes
|
||||
|
||||
#### Isolation (if `check_isolation: true`)
|
||||
|
||||
- [ ] Cleanup hooks (afterEach/afterAll) validated
|
||||
- [ ] Shared state detected
|
||||
- [ ] Global variable mutations detected
|
||||
- [ ] Resource cleanup verified
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with recommended fixes
|
||||
|
||||
#### Fixture Patterns (if `check_fixture_patterns: true`)
|
||||
|
||||
- [ ] Fixtures detected (test.extend)
|
||||
- [ ] Pure functions validated
|
||||
- [ ] mergeTests usage checked
|
||||
- [ ] beforeEach complexity analyzed
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with recommended fixes
|
||||
|
||||
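For reference, a small `test.extend` fixture with automatic cleanup - the kind of pattern this check rewards; the `apiCreateUser`/`apiDeleteUser` helpers are hypothetical pure functions:

```typescript
import { test as base } from '@playwright/test';
// Hypothetical helpers that call the application's API
import { apiCreateUser, apiDeleteUser, type User } from './helpers/users';

export const test = base.extend<{ seededUser: User }>({
  seededUser: async ({ request }, use) => {
    const user = await apiCreateUser(request); // setup
    await use(user); // hand the value to the test
    await apiDeleteUser(request, user.id); // auto-cleanup, even if the test fails
  },
});
```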
#### Data Factories (if `check_data_factories: true`)
|
||||
|
||||
- [ ] Factory functions detected
|
||||
- [ ] Hardcoded data (magic strings/numbers) detected
|
||||
- [ ] Faker.js or similar usage validated
|
||||
- [ ] API-first setup pattern checked
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with recommended fixes
|
||||
|
||||
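As an example of what this check looks for, a small faker-based factory with overrides; the entity and field names are hypothetical:

```typescript
import { faker } from '@faker-js/faker';

interface Order {
  id: string;
  customerEmail: string;
  total: number;
  status: 'pending' | 'paid';
}

export const createOrder = (overrides: Partial<Order> = {}): Order => ({
  id: faker.string.uuid(),
  customerEmail: faker.internet.email(),
  total: faker.number.int({ min: 1, max: 500 }),
  status: 'pending',
  ...overrides, // e.g. createOrder({ status: 'paid' })
});
```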
#### Network-First (if `check_network_first: true`)
|
||||
|
||||
- [ ] page.route() before page.goto() validated
|
||||
- [ ] Race conditions detected (route after navigate)
|
||||
- [ ] waitForResponse patterns checked
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with recommended fixes
|
||||
|
||||
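A minimal sketch of the pattern this check looks for - the route is registered before navigation so the page's first request is already intercepted (endpoint and payload are hypothetical):

```typescript
import { test, expect } from '@playwright/test';

test('users list renders mocked data', async ({ page }) => {
  // Network-first: intercept BEFORE navigating to avoid racing the initial request
  await page.route('**/api/users', (route) =>
    route.fulfill({ json: [{ id: 1, name: 'Ada' }] }),
  );

  await page.goto('/users');
  await expect(page.getByText('Ada')).toBeVisible();
});
```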
#### Assertions (if `check_assertions: true`)
|
||||
|
||||
- [ ] Explicit assertions counted
|
||||
- [ ] Implicit waits without assertions detected
|
||||
- [ ] Assertion specificity validated
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with recommended fixes
|
||||
|
||||
#### Test Length (if `check_test_length: true`)
|
||||
|
||||
- [ ] File line count calculated
|
||||
- [ ] Threshold comparison (≤300 lines ideal)
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Splitting recommendations generated (if >300 lines)
|
||||
|
||||
#### Test Duration (if `check_test_duration: true`)
|
||||
|
||||
- [ ] Test complexity analyzed (as proxy for duration if no execution data)
|
||||
- [ ] Threshold comparison (≤1.5 min target)
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Optimization recommendations generated
|
||||
|
||||
#### Flakiness Patterns (if `check_flakiness_patterns: true`)
|
||||
|
||||
- [ ] Tight timeouts detected (e.g., { timeout: 1000 })
|
||||
- [ ] Race conditions detected
|
||||
- [ ] Timing-dependent assertions detected
|
||||
- [ ] Retry logic detected
|
||||
- [ ] Environment-dependent assumptions detected
|
||||
- [ ] Status assigned (PASS/WARN/FAIL)
|
||||
- [ ] Violations recorded with recommended fixes
|
||||
|
||||
---
|
||||
|
||||
### Step 4: Quality Score Calculation
|
||||
|
||||
**Violation Counting:**
|
||||
|
||||
- [ ] Critical (P0) violations counted
|
||||
- [ ] High (P1) violations counted
|
||||
- [ ] Medium (P2) violations counted
|
||||
- [ ] Low (P3) violations counted
|
||||
- [ ] Violation breakdown by criterion recorded
|
||||
|
||||
**Score Calculation:**
|
||||
|
||||
- [ ] Starting score: 100
|
||||
- [ ] Critical violations deducted (-10 each)
|
||||
- [ ] High violations deducted (-5 each)
|
||||
- [ ] Medium violations deducted (-2 each)
|
||||
- [ ] Low violations deducted (-1 each)
|
||||
- [ ] Bonus points added (max +30):
|
||||
- [ ] Excellent BDD structure (+5 if applicable)
|
||||
- [ ] Comprehensive fixtures (+5 if applicable)
|
||||
- [ ] Comprehensive data factories (+5 if applicable)
|
||||
- [ ] Network-first pattern (+5 if applicable)
|
||||
- [ ] Perfect isolation (+5 if applicable)
|
||||
- [ ] All test IDs present (+5 if applicable)
|
||||
- [ ] Final score calculated: max(0, min(100, Starting - Violations + Bonus))
|
||||
|
||||
**Quality Grade:**
|
||||
|
||||
- [ ] Grade assigned based on score:
|
||||
- 90-100: A+ (Excellent)
|
||||
- 80-89: A (Good)
|
||||
- 70-79: B (Acceptable)
|
||||
- 60-69: C (Needs Improvement)
|
||||
- <60: F (Critical Issues)
|
||||
|
||||
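A minimal TypeScript sketch of the scoring and grading described above; the deduction weights, bonus cap, and grade bands are taken directly from this checklist:

```typescript
interface Violations {
  critical: number; // P0
  high: number; // P1
  medium: number; // P2
  low: number; // P3
}

function qualityScore(v: Violations, bonusPoints: number): number {
  const deductions = v.critical * 10 + v.high * 5 + v.medium * 2 + v.low * 1;
  const bonus = Math.min(bonusPoints, 30);
  return Math.max(0, Math.min(100, 100 - deductions + bonus));
}

function grade(score: number): string {
  if (score >= 90) return 'A+ (Excellent)';
  if (score >= 80) return 'A (Good)';
  if (score >= 70) return 'B (Acceptable)';
  if (score >= 60) return 'C (Needs Improvement)';
  return 'F (Critical Issues)';
}
```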
---
|
||||
|
||||
### Step 5: Review Report Generation
|
||||
|
||||
**Report Sections Created:**
|
||||
|
||||
- [ ] **Header Section**:
|
||||
- [ ] Test file(s) reviewed listed
|
||||
- [ ] Review date recorded
|
||||
- [ ] Review scope noted (single/directory/suite)
|
||||
- [ ] Quality score and grade displayed
|
||||
|
||||
- [ ] **Executive Summary**:
|
||||
- [ ] Overall assessment (Excellent/Good/Needs Improvement/Critical)
|
||||
- [ ] Key strengths listed (3-5 bullet points)
|
||||
- [ ] Key weaknesses listed (3-5 bullet points)
|
||||
- [ ] Recommendation stated (Approve/Approve with comments/Request changes/Block)
|
||||
|
||||
- [ ] **Quality Criteria Assessment**:
|
||||
- [ ] Table with all criteria evaluated
|
||||
- [ ] Status for each criterion (PASS/WARN/FAIL)
|
||||
- [ ] Violation count per criterion
|
||||
|
||||
- [ ] **Critical Issues (Must Fix)**:
|
||||
- [ ] P0/P1 violations listed
|
||||
- [ ] Code location provided for each (file:line)
|
||||
- [ ] Issue explanation clear
|
||||
- [ ] Recommended fix provided with code example
|
||||
- [ ] Knowledge base reference provided
|
||||
|
||||
- [ ] **Recommendations (Should Fix)**:
|
||||
- [ ] P2/P3 violations listed
|
||||
- [ ] Code location provided for each (file:line)
|
||||
- [ ] Issue explanation clear
|
||||
- [ ] Recommended improvement provided with code example
|
||||
- [ ] Knowledge base reference provided
|
||||
|
||||
- [ ] **Best Practices Examples** (if good patterns found):
|
||||
- [ ] Good patterns highlighted from tests
|
||||
- [ ] Knowledge base fragments referenced
|
||||
- [ ] Examples provided for others to follow
|
||||
|
||||
- [ ] **Knowledge Base References**:
|
||||
- [ ] All fragments consulted listed
|
||||
- [ ] Links to detailed guidance provided
|
||||
|
||||
---
|
||||
|
||||
### Step 6: Optional Outputs Generation
|
||||
|
||||
**Inline Comments** (if `generate_inline_comments: true`):
|
||||
|
||||
- [ ] Inline comments generated at violation locations
|
||||
- [ ] Comment format: `// TODO (TEA Review): [Issue] - See test-review-{filename}.md`
|
||||
- [ ] Comments added to test files (no logic changes)
|
||||
- [ ] Test files remain valid and executable
|
||||
|
||||
**Quality Badge** (if `generate_quality_badge: true`):
|
||||
|
||||
- [ ] Badge created with quality score (e.g., "Test Quality: 87/100 (A)")
|
||||
- [ ] Badge format suitable for README or documentation
|
||||
- [ ] Badge saved to output folder
|
||||
|
||||
**Story Update** (if `append_to_story: true` and story file exists):
|
||||
|
||||
- [ ] "Test Quality Review" section created
|
||||
- [ ] Quality score included
|
||||
- [ ] Critical issues summarized
|
||||
- [ ] Link to full review report provided
|
||||
- [ ] Story file updated successfully
|
||||
|
||||
---
|
||||
|
||||
### Step 7: Save and Notify
|
||||
|
||||
**Outputs Saved:**
|
||||
|
||||
- [ ] Review report saved to `{output_file}`
|
||||
- [ ] Inline comments written to test files (if enabled)
|
||||
- [ ] Quality badge saved (if enabled)
|
||||
- [ ] Story file updated (if enabled)
|
||||
- [ ] All outputs are valid and readable
|
||||
|
||||
**Summary Message Generated:**
|
||||
|
||||
- [ ] Quality score and grade included
|
||||
- [ ] Critical issue count stated
|
||||
- [ ] Recommendation provided (Approve/Request changes/Block)
|
||||
- [ ] Next steps clarified
|
||||
- [ ] Message displayed to user
|
||||
|
||||
---
|
||||
|
||||
## Output Validation
|
||||
|
||||
### Review Report Completeness
|
||||
|
||||
- [ ] All required sections present
|
||||
- [ ] No placeholder text or TODOs in report
|
||||
- [ ] All code locations are accurate (file:line)
|
||||
- [ ] All code examples are valid and demonstrate fix
|
||||
- [ ] All knowledge base references are correct
|
||||
|
||||
### Review Report Accuracy
|
||||
|
||||
- [ ] Quality score matches violation breakdown
|
||||
- [ ] Grade matches score range
|
||||
- [ ] Violations correctly categorized by severity (P0/P1/P2/P3)
|
||||
- [ ] Violations correctly attributed to quality criteria
|
||||
- [ ] No false positives (violations are legitimate issues)
|
||||
- [ ] No false negatives (critical issues not missed)
|
||||
|
||||
### Review Report Clarity
|
||||
|
||||
- [ ] Executive summary is clear and actionable
|
||||
- [ ] Issue explanations are understandable
|
||||
- [ ] Recommended fixes are implementable
|
||||
- [ ] Code examples are correct and runnable
|
||||
- [ ] Recommendation (Approve/Request changes) is clear
|
||||
|
||||
---
|
||||
|
||||
## Quality Checks
|
||||
|
||||
### Knowledge-Based Validation
|
||||
|
||||
- [ ] All feedback grounded in knowledge base fragments
|
||||
- [ ] Recommendations follow proven patterns
|
||||
- [ ] No arbitrary or opinion-based feedback
|
||||
- [ ] Knowledge fragment references accurate and relevant
|
||||
|
||||
### Actionable Feedback
|
||||
|
||||
- [ ] Every issue includes recommended fix
|
||||
- [ ] Every fix includes code example
|
||||
- [ ] Code examples demonstrate correct pattern
|
||||
- [ ] Fixes reference knowledge base for more detail
|
||||
|
||||
### Severity Classification
|
||||
|
||||
- [ ] Critical (P0) issues are genuinely critical (hard waits, race conditions, no assertions)
|
||||
- [ ] High (P1) issues impact maintainability/reliability (missing IDs, hardcoded data)
|
||||
- [ ] Medium (P2) issues are nice-to-have improvements (long files, missing priorities)
|
||||
- [ ] Low (P3) issues are minor style/preference (verbose tests)
|
||||
|
||||
### Context Awareness
|
||||
|
||||
- [ ] Review considers project context (some patterns may be justified)
|
||||
- [ ] Violations with justification comments noted as acceptable
|
||||
- [ ] Edge cases acknowledged
|
||||
- [ ] Recommendations are pragmatic, not dogmatic
|
||||
|
||||
---
|
||||
|
||||
## Integration Points
|
||||
|
||||
### Story File Integration
|
||||
|
||||
- [ ] Story file discovered correctly (if available)
|
||||
- [ ] Acceptance criteria extracted and used for context
|
||||
- [ ] Test quality section appended to story (if enabled)
|
||||
- [ ] Link to review report added to story
|
||||
|
||||
### Test Design Integration
|
||||
|
||||
- [ ] Test design document discovered correctly (if available)
|
||||
- [ ] Priority context (P0/P1/P2/P3) extracted and used
|
||||
- [ ] Review validates tests align with prioritization
|
||||
- [ ] Misalignment flagged (e.g., P0 scenario missing tests)
|
||||
|
||||
### Knowledge Base Integration
|
||||
|
||||
- [ ] tea-index.csv loaded successfully
|
||||
- [ ] All required fragments loaded
|
||||
- [ ] Fragments applied correctly to validation
|
||||
- [ ] Fragment references in report are accurate
|
||||
|
||||
---
|
||||
|
||||
## Edge Cases and Special Situations
|
||||
|
||||
### Empty or Minimal Tests
|
||||
|
||||
- [ ] If test file is empty, report notes "No tests found"
|
||||
- [ ] If test file has only boilerplate, report notes "No meaningful tests"
|
||||
- [ ] Score reflects lack of content appropriately
|
||||
|
||||
### Legacy Tests
|
||||
|
||||
- [ ] Legacy tests acknowledged in context
|
||||
- [ ] Review provides practical recommendations for improvement
|
||||
- [ ] Recognizes that complete refactor may not be feasible
|
||||
- [ ] Prioritizes critical issues (flakiness) over style
|
||||
|
||||
### Test Framework Variations
|
||||
|
||||
- [ ] Review adapts to test framework (Playwright vs Jest vs Cypress)
|
||||
- [ ] Framework-specific patterns recognized (e.g., Playwright fixtures)
|
||||
- [ ] Framework-specific violations detected (e.g., Cypress anti-patterns)
|
||||
- [ ] Knowledge fragments applied appropriately for framework
|
||||
|
||||
### Justified Violations
|
||||
|
||||
- [ ] Violations with justification comments in code noted as acceptable
|
||||
- [ ] Justifications evaluated for legitimacy
|
||||
- [ ] Report acknowledges justified patterns
|
||||
- [ ] Score not penalized for justified violations
|
||||
|
||||
---
|
||||
|
||||
## Final Validation
|
||||
|
||||
### Review Completeness
|
||||
|
||||
- [ ] All enabled quality criteria evaluated
|
||||
- [ ] All test files in scope reviewed
|
||||
- [ ] All violations cataloged
|
||||
- [ ] All recommendations provided
|
||||
- [ ] Review report is comprehensive
|
||||
|
||||
### Review Accuracy
|
||||
|
||||
- [ ] Quality score is accurate
|
||||
- [ ] Violations are correct (no false positives)
|
||||
- [ ] Critical issues not missed (no false negatives)
|
||||
- [ ] Code locations are correct
|
||||
- [ ] Knowledge base references are accurate
|
||||
|
||||
### Review Usefulness
|
||||
|
||||
- [ ] Feedback is actionable
|
||||
- [ ] Recommendations are implementable
|
||||
- [ ] Code examples are correct
|
||||
- [ ] Review helps developer improve tests
|
||||
- [ ] Review educates on best practices
|
||||
|
||||
### Workflow Complete
|
||||
|
||||
- [ ] All checklist items completed
|
||||
- [ ] All outputs validated and saved
|
||||
- [ ] User notified with summary
|
||||
- [ ] Review ready for developer consumption
|
||||
- [ ] Follow-up actions identified (if any)
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
Record any issues, observations, or important context during workflow execution:
|
||||
|
||||
- **Test Framework**: [Playwright, Jest, Cypress, etc.]
|
||||
- **Review Scope**: [single file, directory, full suite]
|
||||
- **Quality Score**: [0-100 score, letter grade]
|
||||
- **Critical Issues**: [Count of P0/P1 violations]
|
||||
- **Recommendation**: [Approve / Approve with comments / Request changes / Block]
|
||||
- **Special Considerations**: [Legacy code, justified patterns, edge cases]
|
||||
- **Follow-up Actions**: [Re-review after fixes, pair programming, etc.]
|
||||
@@ -1,628 +0,0 @@
|
||||
# Test Quality Review - Instructions v4.0
|
||||
|
||||
**Workflow:** `testarch-test-review`
|
||||
**Purpose:** Review test quality using TEA's comprehensive knowledge base and validate against best practices for maintainability, determinism, isolation, and flakiness prevention
|
||||
**Agent:** Test Architect (TEA)
|
||||
**Format:** Pure Markdown v4.0 (no XML blocks)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This workflow performs comprehensive test quality reviews using TEA's knowledge base of best practices. It validates tests against proven patterns for fixture architecture, network-first safeguards, data factories, determinism, isolation, and flakiness prevention. The review generates actionable feedback with quality scoring.
|
||||
|
||||
**Key Capabilities:**
|
||||
|
||||
- **Knowledge-Based Review**: Applies patterns from tea-index.csv fragments
|
||||
- **Quality Scoring**: 0-100 score based on violations and best practices
|
||||
- **Multi-Scope**: Review single file, directory, or entire test suite
|
||||
- **Pattern Detection**: Identifies flaky patterns, hard waits, race conditions
|
||||
- **Best Practice Validation**: BDD format, test IDs, priorities, assertions
|
||||
- **Actionable Feedback**: Critical issues (must fix) vs recommendations (should fix)
|
||||
- **Integration**: Works with story files, test-design, acceptance criteria
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
**Required:**
|
||||
|
||||
- Test file(s) to review (auto-discovered or explicitly provided)
|
||||
- Test framework configuration (playwright.config.ts, jest.config.js, etc.)
|
||||
|
||||
**Recommended:**
|
||||
|
||||
- Story file with acceptance criteria (for context)
|
||||
- Test design document (for priority context)
|
||||
- Knowledge base fragments available in tea-index.csv
|
||||
|
||||
**Halt Conditions:**
|
||||
|
||||
- If test file path is invalid or file doesn't exist, halt and request correction
|
||||
- If test_dir is empty (no tests found), halt and notify user
|
||||
|
||||
---
|
||||
|
||||
## Workflow Steps
|
||||
|
||||
### Step 1: Load Context and Knowledge Base
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. Check playwright-utils flag:
|
||||
- Read `{config_source}` and check `config.tea_use_playwright_utils`
|
||||
|
||||
2. Load relevant knowledge fragments from `{project-root}/_bmad/bmm/testarch/tea-index.csv`:
|
||||
|
||||
**Core Patterns (Always load):**
|
||||
- `test-quality.md` - Definition of Done: deterministic tests, isolated with cleanup, explicit assertions, <300 lines, <1.5 min (658 lines, 5 examples)
|
||||
- `data-factories.md` - Factory functions with faker: overrides, nested factories, API-first setup (498 lines, 5 examples)
|
||||
- `test-levels-framework.md` - E2E vs API vs Component vs Unit appropriateness with decision matrix (467 lines, 4 examples)
|
||||
- `selective-testing.md` - Duplicate coverage detection with tag-based, spec filter, diff-based selection (727 lines, 4 examples)
|
||||
- `test-healing-patterns.md` - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
|
||||
- `selector-resilience.md` - Selector best practices: data-testid > ARIA > text > CSS hierarchy, anti-patterns (541 lines, 4 examples)
|
||||
- `timing-debugging.md` - Race condition prevention and async debugging techniques (370 lines, 3 examples)
|
||||
|
||||
**If `config.tea_use_playwright_utils: true` (All Utilities):**
|
||||
- `overview.md` - Playwright utils best practices
|
||||
- `api-request.md` - Validate apiRequest usage patterns
|
||||
- `network-recorder.md` - Review HAR record/playback implementation
|
||||
- `auth-session.md` - Check auth token management
|
||||
- `intercept-network-call.md` - Validate network interception
|
||||
- `recurse.md` - Review polling patterns
|
||||
- `log.md` - Check logging best practices
|
||||
- `file-utils.md` - Validate file operation patterns
|
||||
- `burn-in.md` - Review burn-in configuration
|
||||
- `network-error-monitor.md` - Check error monitoring setup
|
||||
- `fixtures-composition.md` - Validate mergeTests usage
|
||||
|
||||
**If `config.tea_use_playwright_utils: false`:**
|
||||
- `fixture-architecture.md` - Pure function → Fixture → mergeTests composition with auto-cleanup (406 lines, 5 examples)
|
||||
- `network-first.md` - Route intercept before navigate to prevent race conditions (489 lines, 5 examples)
|
||||
- `playwright-config.md` - Environment-based configuration with fail-fast validation (722 lines, 5 examples)
|
||||
- `component-tdd.md` - Red-Green-Refactor patterns with provider isolation (480 lines, 4 examples)
|
||||
- `ci-burn-in.md` - Flaky test detection with 10-iteration burn-in loop (678 lines, 4 examples)
|
||||
|
||||
3. Determine review scope:
|
||||
- **single**: Review one test file (`test_file_path` provided)
|
||||
- **directory**: Review all tests in directory (`test_dir` provided)
|
||||
- **suite**: Review entire test suite (discover all test files)
|
||||
|
||||
4. Auto-discover related artifacts (if `auto_discover_story: true`):
|
||||
- Extract test ID from filename (e.g., `1.3-E2E-001.spec.ts` → story 1.3)
|
||||
- Search for story file (`story-1.3.md`)
|
||||
- Search for test design (`test-design-story-1.3.md` or `test-design-epic-1.md`)
|
||||
|
||||
5. Read story file for context (if available):
|
||||
- Extract acceptance criteria
|
||||
- Extract priority classification
|
||||
- Extract expected test IDs
|
||||
|
||||
**Output:** Complete knowledge base loaded, review scope determined, context gathered
|
||||
|
||||
---
|
||||
|
||||
### Step 2: Discover and Parse Test Files
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Discover test files** based on scope:
|
||||
- **single**: Use `test_file_path` variable
|
||||
- **directory**: Use `glob` to find all test files in `test_dir` (e.g., `*.spec.ts`, `*.test.js`)
|
||||
- **suite**: Use `glob` to find all test files recursively from project root
|
||||
|
||||
2. **Parse test file metadata**:
|
||||
- File path and name
|
||||
- File size (warn if >15 KB or >300 lines)
|
||||
- Test framework detected (Playwright, Jest, Cypress, Vitest, etc.)
|
||||
- Imports and dependencies
|
||||
- Test structure (describe/context/it blocks)
|
||||
|
||||
3. **Extract test structure**:
|
||||
- Count of describe blocks (test suites)
|
||||
- Count of it/test blocks (individual tests)
|
||||
- Test IDs (if present, e.g., `test.describe('1.3-E2E-001')`)
|
||||
  - Priority markers (if present, e.g., a `@P0` tag in the test title or annotations)
|
||||
- BDD structure (Given-When-Then comments or steps)
|
||||
|
||||
4. **Identify test patterns**:
|
||||
- Fixtures used
|
||||
- Data factories used
|
||||
- Network interception patterns
|
||||
- Assertions used (expect, assert, toHaveText, etc.)
|
||||
- Waits and timeouts (page.waitFor, sleep, hardcoded delays)
|
||||
- Conditionals (if/else, switch, ternary)
|
||||
- Try/catch blocks
|
||||
- Shared state or globals
|
||||
|
||||
**Output:** Complete test file inventory with structure and pattern analysis
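
As a rough illustration of how the discovery step above maps scope to files, a sketch using the `fast-glob` package (the package choice and the exact patterns are assumptions; any glob-capable tool works):

```typescript
import fg from 'fast-glob';

type ReviewScope = 'single' | 'directory' | 'suite';

// Map the review scope to a concrete list of test files.
async function discoverTestFiles(
  scope: ReviewScope,
  opts: { testFilePath?: string; testDir?: string } = {},
): Promise<string[]> {
  if (scope === 'single') {
    return opts.testFilePath ? [opts.testFilePath] : [];
  }
  // 'directory' searches only test_dir; 'suite' recurses from the project root.
  const cwd = scope === 'directory' ? opts.testDir ?? 'tests' : '.';
  return fg(['**/*.spec.{ts,js}', '**/*.test.{ts,js}'], {
    cwd,
    ignore: ['**/node_modules/**'],
  });
}
```
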
|
||||
|
||||
---
|
||||
|
||||
### Step 3: Validate Against Quality Criteria
|
||||
|
||||
**Actions:**
|
||||
|
||||
For each test file, validate against quality criteria (configurable via workflow variables):
|
||||
|
||||
#### 1. BDD Format Validation (if `check_given_when_then: true`)
|
||||
|
||||
- ✅ **PASS**: Tests use Given-When-Then structure (comments or step organization)
|
||||
- ⚠️ **WARN**: Tests have some structure but not explicit GWT
|
||||
- ❌ **FAIL**: Tests lack clear structure, hard to understand intent
|
||||
|
||||
**Knowledge Fragment**: test-quality.md, tdd-cycles.md
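
For illustration only, a test that would PASS this check annotates its phases explicitly (route, labels, and credentials are hypothetical):

```typescript
import { test, expect } from '@playwright/test';

test('1.3-E2E-001: user can log in with valid credentials', async ({ page }) => {
  // Given: a registered user on the login page
  await page.goto('/login');

  // When: they submit valid credentials
  await page.getByLabel('Email').fill('user@example.com');
  await page.getByLabel('Password').fill('correct-horse-battery');
  await page.getByRole('button', { name: 'Log in' }).click();

  // Then: they land on the dashboard
  await expect(page).toHaveURL(/\/dashboard/);
});
```
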
|
||||
|
||||
---
|
||||
|
||||
#### 2. Test ID Conventions (if `check_test_ids: true`)
|
||||
|
||||
- ✅ **PASS**: Test IDs present and follow convention (e.g., `1.3-E2E-001`, `2.1-API-005`)
|
||||
- ⚠️ **WARN**: Some test IDs missing or inconsistent
|
||||
- ❌ **FAIL**: No test IDs, can't trace tests to requirements
|
||||
|
||||
**Knowledge Fragment**: traceability.md, test-quality.md
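
One simple way to enforce the convention is a pattern check derived from the examples above (the exact regex is an assumption; adjust it to the project's naming scheme):

```typescript
// Matches IDs such as "1.3-E2E-001" or "2.1-API-005".
const TEST_ID_PATTERN = /^\d+\.\d+-(E2E|API|COMPONENT|UNIT)-\d{3}$/;

function hasValidTestId(testTitle: string): boolean {
  // Assumes the ID prefixes the title, e.g. "1.3-E2E-001: user can log in".
  const candidate = testTitle.split(':')[0].trim();
  return TEST_ID_PATTERN.test(candidate);
}
```
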
|
||||
|
||||
---
|
||||
|
||||
#### 3. Priority Markers (if `check_priority_markers: true`)
|
||||
|
||||
- ✅ **PASS**: Tests classified as P0/P1/P2/P3 (via markers or test-design reference)
|
||||
- ⚠️ **WARN**: Some priority classifications missing
|
||||
- ❌ **FAIL**: No priority classification, can't determine criticality
|
||||
|
||||
**Knowledge Fragment**: test-priorities.md, risk-governance.md
|
||||
|
||||
---
|
||||
|
||||
#### 4. Hard Waits Detection (if `check_hard_waits: true`)
|
||||
|
||||
- ✅ **PASS**: No hard waits detected (no `sleep()`, `wait(5000)`, hardcoded delays)
|
||||
- ⚠️ **WARN**: Some hard waits used but with justification comments
|
||||
- ❌ **FAIL**: Hard waits detected without justification (flakiness risk)
|
||||
|
||||
**Patterns to detect:**
|
||||
|
||||
- `sleep(1000)`, `setTimeout()`, `delay()`
|
||||
- `page.waitForTimeout(5000)` without explicit reason
|
||||
- `await new Promise(resolve => setTimeout(resolve, 3000))`
|
||||
|
||||
**Knowledge Fragment**: test-quality.md, network-first.md
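
A minimal before/after sketch (the `/api/orders` URL and locators are hypothetical) showing how a hard wait is typically replaced with a deterministic wait on the real signal:

```typescript
// ❌ Bad: hopes the request has finished after two seconds
await page.waitForTimeout(2000);
await expect(page.getByTestId('order-list')).toBeVisible();

// ✅ Good: wait on the response the UI actually depends on
const ordersLoaded = page.waitForResponse('**/api/orders');
await page.getByRole('button', { name: 'Load orders' }).click();
await ordersLoaded;
await expect(page.getByTestId('order-list')).toBeVisible();
```
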
|
||||
|
||||
---
|
||||
|
||||
#### 5. Determinism Check (if `check_determinism: true`)
|
||||
|
||||
- ✅ **PASS**: Tests are deterministic (no conditionals, no try/catch abuse, no random values)
|
||||
- ⚠️ **WARN**: Some conditionals but with clear justification
|
||||
- ❌ **FAIL**: Tests use if/else, switch, or try/catch to control flow (flakiness risk)
|
||||
|
||||
**Patterns to detect:**
|
||||
|
||||
- `if (condition) { test logic }` - tests should work deterministically
|
||||
- `try { test } catch { fallback }` - tests shouldn't swallow errors
|
||||
- `Math.random()`, `Date.now()` without factory abstraction
|
||||
|
||||
**Knowledge Fragment**: test-quality.md, data-factories.md
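
For example (the `seedUserPreferences` helper is hypothetical), a conditional branch is usually a sign that test state should be pinned up front instead:

```typescript
// ❌ Bad: the test takes different paths depending on runtime state
if (await page.getByTestId('promo-banner').isVisible()) {
  await page.getByTestId('dismiss-promo').click();
}

// ✅ Good: pin the state before navigating so every run executes the same path
await seedUserPreferences({ promoDismissed: true }); // hypothetical API-first setup helper
await page.goto('/dashboard');
```
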
|
||||
|
||||
---
|
||||
|
||||
#### 6. Isolation Validation (if `check_isolation: true`)
|
||||
|
||||
- ✅ **PASS**: Tests clean up resources, no shared state, can run in any order
|
||||
- ⚠️ **WARN**: Some cleanup missing but isolated enough
|
||||
- ❌ **FAIL**: Tests share state, depend on execution order, leave resources
|
||||
|
||||
**Patterns to check:**
|
||||
|
||||
- afterEach/afterAll cleanup hooks present
|
||||
- No global variables mutated
|
||||
- Database/API state cleaned up after tests
|
||||
- Test data deleted or marked inactive
|
||||
|
||||
**Knowledge Fragment**: test-quality.md, data-factories.md
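
A self-cleaning sketch (the `/api/orders` endpoint is hypothetical; `request` is Playwright's built-in API fixture and a configured `baseURL` is assumed):

```typescript
import { test } from '@playwright/test';

let orderId: string;

test.beforeEach(async ({ request }) => {
  // Each test creates its own data via the API, so runs never depend on each other.
  const response = await request.post('/api/orders', { data: { items: ['sku-123'] } });
  orderId = (await response.json()).id;
});

test.afterEach(async ({ request }) => {
  // Clean up regardless of outcome so the suite can run in any order.
  await request.delete(`/api/orders/${orderId}`);
});
```
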
|
||||
|
||||
---
|
||||
|
||||
#### 7. Fixture Patterns (if `check_fixture_patterns: true`)
|
||||
|
||||
- ✅ **PASS**: Uses pure function → Fixture → mergeTests pattern
|
||||
- ⚠️ **WARN**: Some fixtures used but not consistently
|
||||
- ❌ **FAIL**: No fixtures, tests repeat setup code (maintainability risk)
|
||||
|
||||
**Patterns to check:**
|
||||
|
||||
- Fixtures defined (e.g., `test.extend({ customFixture: async ({}, use) => { ... }})`)
|
||||
- Pure functions used for fixture logic
|
||||
- mergeTests used to combine fixtures
|
||||
- No beforeEach with complex setup (should be in fixtures)
|
||||
|
||||
**Knowledge Fragment**: fixture-architecture.md
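
A compressed sketch of the expected shape; the factory and sibling-fixture imports are hypothetical:

```typescript
import { test as base, mergeTests, expect } from '@playwright/test';
import type { Page } from '@playwright/test';
import { createTestUser } from '../factories/user-factory'; // hypothetical
import { apiTest } from './api.fixture'; // hypothetical sibling fixture

// 1. Pure function holds the logic, with no test-lifecycle coupling.
async function loginAs(page: Page, user: { email: string; password: string }): Promise<void> {
  await page.goto('/login');
  await page.getByLabel('Email').fill(user.email);
  await page.getByLabel('Password').fill(user.password);
  await page.getByRole('button', { name: 'Log in' }).click();
  await expect(page.getByTestId('user-menu')).toBeVisible();
}

// 2. Fixture wires the pure function into setup/teardown with auto-cleanup.
const authTest = base.extend<{ authenticatedPage: Page }>({
  authenticatedPage: async ({ page }, use) => {
    await loginAs(page, createTestUser());
    await use(page);
    // Session cleanup would go here.
  },
});

// 3. mergeTests composes independent fixtures into one test object.
export const test = mergeTests(authTest, apiTest);
```
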
|
||||
|
||||
---
|
||||
|
||||
#### 8. Data Factories (if `check_data_factories: true`)
|
||||
|
||||
- ✅ **PASS**: Uses factory functions with overrides, API-first setup
|
||||
- ⚠️ **WARN**: Some factories used but also hardcoded data
|
||||
- ❌ **FAIL**: Hardcoded test data, magic strings/numbers (maintainability risk)
|
||||
|
||||
**Patterns to check:**
|
||||
|
||||
- Factory functions defined (e.g., `createUser()`, `generateInvoice()`)
|
||||
- Factories use faker.js or similar for realistic data
|
||||
- Factories accept overrides (e.g., `createUser({ email: 'custom@example.com' })`)
|
||||
- API-first setup (create via API, test via UI)
|
||||
|
||||
**Knowledge Fragment**: data-factories.md
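
A minimal factory sketch assuming `@faker-js/faker` is available (field names are illustrative):

```typescript
import { faker } from '@faker-js/faker';

export interface TestUser {
  email: string;
  password: string;
  role: 'admin' | 'member';
}

export function createTestUser(overrides: Partial<TestUser> = {}): TestUser {
  return {
    email: faker.internet.email(),
    password: faker.internet.password({ length: 16 }),
    role: 'member',
    ...overrides, // tests pin only the fields they actually care about
  };
}

// e.g. const admin = createTestUser({ role: 'admin' });
```
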
|
||||
|
||||
---
|
||||
|
||||
#### 9. Network-First Pattern (if `check_network_first: true`)
|
||||
|
||||
- ✅ **PASS**: Route interception set up BEFORE navigation (race condition prevention)
|
||||
- ⚠️ **WARN**: Some routes intercepted correctly, others after navigation
|
||||
- ❌ **FAIL**: Route interception after navigation (race condition risk)
|
||||
|
||||
**Patterns to check:**
|
||||
|
||||
- `page.route()` called before `page.goto()`
|
||||
- `page.waitForResponse()` used with explicit URL pattern
|
||||
- No navigation followed immediately by route setup
|
||||
|
||||
**Knowledge Fragment**: network-first.md
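
A before/after sketch of the ordering this check enforces (the `/api/widgets` route is hypothetical):

```typescript
// ❌ Bad: navigation may trigger the request before the route is registered
await page.goto('/dashboard');
await page.route('**/api/widgets', (route) => route.fulfill({ json: { widgets: [] } }));

// ✅ Good: intercept first, then navigate, then await the concrete response
await page.route('**/api/widgets', (route) => route.fulfill({ json: { widgets: [] } }));
const widgetsResponse = page.waitForResponse('**/api/widgets');
await page.goto('/dashboard');
await widgetsResponse;
```
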
|
||||
|
||||
---
|
||||
|
||||
#### 10. Assertions (if `check_assertions: true`)
|
||||
|
||||
- ✅ **PASS**: Explicit assertions present (expect, assert, toHaveText)
|
||||
- ⚠️ **WARN**: Some tests rely on implicit waits instead of assertions
|
||||
- ❌ **FAIL**: Missing assertions, tests don't verify behavior
|
||||
|
||||
**Patterns to check:**
|
||||
|
||||
- Each test has at least one assertion
|
||||
- Assertions are specific (not just truthy checks)
|
||||
- Assertions use framework-provided matchers (toHaveText, toBeVisible)
|
||||
|
||||
**Knowledge Fragment**: test-quality.md
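
For illustration (selectors and values are hypothetical), the difference between a truthy check and a specific assertion:

```typescript
// ❌ Weak: passes as long as *something* came back
const items = await page.locator('.cart-item').count();
expect(items).toBeTruthy();

// ✅ Specific: asserts the behaviour the story actually promises
await expect(page.locator('.cart-item')).toHaveCount(3);
await expect(page.getByTestId('cart-total')).toHaveText('$42.00');
```
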
|
||||
|
||||
---
|
||||
|
||||
#### 11. Test Length (if `check_test_length: true`)
|
||||
|
||||
- ✅ **PASS**: Test file ≤200 lines (ideal), ≤300 lines (acceptable)
|
||||
- ⚠️ **WARN**: Test file 301-500 lines (consider splitting)
|
||||
- ❌ **FAIL**: Test file >500 lines (too large, maintainability risk)
|
||||
|
||||
**Knowledge Fragment**: test-quality.md
|
||||
|
||||
---
|
||||
|
||||
#### 12. Test Duration (if `check_test_duration: true`)
|
||||
|
||||
- ✅ **PASS**: Individual tests ≤1.5 minutes (target: <30 seconds)
|
||||
- ⚠️ **WARN**: Some tests 1.5-3 minutes (consider optimization)
|
||||
- ❌ **FAIL**: Tests >3 minutes (too slow, impacts CI/CD)
|
||||
|
||||
**Note:** Duration estimation based on complexity analysis if execution data unavailable
|
||||
|
||||
**Knowledge Fragment**: test-quality.md, selective-testing.md
|
||||
|
||||
---
|
||||
|
||||
#### 13. Flakiness Patterns (if `check_flakiness_patterns: true`)
|
||||
|
||||
- ✅ **PASS**: No known flaky patterns detected
|
||||
- ⚠️ **WARN**: Some potential flaky patterns (e.g., tight timeouts, race conditions)
|
||||
- ❌ **FAIL**: Multiple flaky patterns detected (high flakiness risk)
|
||||
|
||||
**Patterns to detect:**
|
||||
|
||||
- Tight timeouts (e.g., `{ timeout: 1000 }`)
|
||||
- Race conditions (navigation before route interception)
|
||||
- Timing-dependent assertions (e.g., checking timestamps)
|
||||
- Retry logic in tests (hides flakiness)
|
||||
- Environment-dependent assumptions (hardcoded URLs, ports)
|
||||
|
||||
**Knowledge Fragment**: test-quality.md, network-first.md, ci-burn-in.md
|
||||
|
||||
---
|
||||
|
||||
### Step 4: Calculate Quality Score
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Count violations** by severity:
|
||||
- **Critical (P0)**: Hard waits without justification, no assertions, race conditions, shared state
|
||||
- **High (P1)**: Missing test IDs, no BDD structure, hardcoded data, missing fixtures
|
||||
- **Medium (P2)**: Long test files (>300 lines), missing priorities, some conditionals
|
||||
- **Low (P3)**: Minor style issues, incomplete cleanup, verbose tests
|
||||
|
||||
2. **Calculate quality score** (if `quality_score_enabled: true`):
|
||||
|
||||
```
|
||||
Starting Score: 100
|
||||
|
||||
Critical Violations: -10 points each
|
||||
High Violations: -5 points each
|
||||
Medium Violations: -2 points each
|
||||
Low Violations: -1 point each
|
||||
|
||||
Bonus Points:
|
||||
+ Excellent BDD structure: +5
|
||||
+ Comprehensive fixtures: +5
|
||||
+ Comprehensive data factories: +5
|
||||
+ Network-first pattern: +5
|
||||
+ Perfect isolation: +5
|
||||
+ All test IDs present: +5
|
||||
|
||||
Quality Score: max(0, min(100, Starting Score - Violations + Bonus))
|
||||
```
|
||||
|
||||
3. **Quality Grade**:
|
||||
- **90-100**: Excellent (A+)
|
||||
- **80-89**: Good (A)
|
||||
- **70-79**: Acceptable (B)
|
||||
- **60-69**: Needs Improvement (C)
|
||||
- **<60**: Critical Issues (F)
|
||||
|
||||
**Output:** Quality score calculated with violation breakdown
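
For reference, a direct transcription of the scoring rule above as a small function (a sketch only; the deduction weights are the ones listed, not configurable here):

```typescript
interface ViolationCounts {
  critical: number;
  high: number;
  medium: number;
  low: number;
}

function qualityScore(violations: ViolationCounts, bonus: number): number {
  const deductions =
    violations.critical * 10 + violations.high * 5 + violations.medium * 2 + violations.low;
  return Math.max(0, Math.min(100, 100 - deductions + bonus));
}

// 1 critical, 2 high, 1 low with a +10 bonus → 89 (Good, A)
qualityScore({ critical: 1, high: 2, medium: 0, low: 1 }, 10);
```
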
|
||||
|
||||
---
|
||||
|
||||
### Step 5: Generate Review Report
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Create review report** using `test-review-template.md`:
|
||||
|
||||
**Header Section:**
|
||||
- Test file(s) reviewed
|
||||
- Review date
|
||||
- Review scope (single/directory/suite)
|
||||
- Quality score and grade
|
||||
|
||||
**Executive Summary:**
|
||||
- Overall assessment (Excellent/Good/Needs Improvement/Critical)
|
||||
- Key strengths
|
||||
- Key weaknesses
|
||||
- Recommendation (Approve/Approve with comments/Request changes)
|
||||
|
||||
**Quality Criteria Assessment:**
|
||||
- Table with all criteria evaluated
|
||||
- Status for each (PASS/WARN/FAIL)
|
||||
- Violation count per criterion
|
||||
|
||||
**Critical Issues (Must Fix):**
|
||||
- Priority P0/P1 violations
|
||||
- Code location (file:line)
|
||||
- Explanation of issue
|
||||
- Recommended fix
|
||||
- Knowledge base reference
|
||||
|
||||
**Recommendations (Should Fix):**
|
||||
- Priority P2/P3 violations
|
||||
- Code location (file:line)
|
||||
- Explanation of issue
|
||||
- Recommended improvement
|
||||
- Knowledge base reference
|
||||
|
||||
**Best Practices Examples:**
|
||||
- Highlight good patterns found in tests
|
||||
- Reference knowledge base fragments
|
||||
- Provide examples for others to follow
|
||||
|
||||
**Knowledge Base References:**
|
||||
- List all fragments consulted
|
||||
- Provide links to detailed guidance
|
||||
|
||||
2. **Generate inline comments** (if `generate_inline_comments: true`):
|
||||
- Add TODO comments in test files at violation locations
|
||||
- Format: `// TODO (TEA Review): [Issue description] - See test-review-{filename}.md`
|
||||
- Never modify test logic, only add comments
|
||||
|
||||
3. **Generate quality badge** (if `generate_quality_badge: true`):
|
||||
- Create badge with quality score (e.g., "Test Quality: 87/100 (A)")
|
||||
- Format for inclusion in README or documentation
|
||||
|
||||
4. **Append to story file** (if `append_to_story: true` and story file exists):
|
||||
- Add "Test Quality Review" section to story
|
||||
- Include quality score and critical issues
|
||||
- Link to full review report
|
||||
|
||||
**Output:** Comprehensive review report with actionable feedback
|
||||
|
||||
---
|
||||
|
||||
### Step 6: Save Outputs and Notify
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Save review report** to `{output_file}`
|
||||
2. **Save inline comments** to test files (if enabled)
|
||||
3. **Save quality badge** to output folder (if enabled)
|
||||
4. **Update story file** (if enabled)
|
||||
5. **Generate summary message** for user:
|
||||
- Quality score and grade
|
||||
- Critical issue count
|
||||
- Recommendation
|
||||
|
||||
**Output:** All review artifacts saved and user notified
|
||||
|
||||
---
|
||||
|
||||
## Quality Criteria Decision Matrix
|
||||
|
||||
| Criterion | PASS | WARN | FAIL | Knowledge Fragment |
|
||||
| ------------------ | ------------------------- | -------------- | ------------------- | ----------------------- |
|
||||
| BDD Format | Given-When-Then present | Some structure | No structure | test-quality.md |
|
||||
| Test IDs | All tests have IDs | Some missing | No IDs | traceability.md |
|
||||
| Priority Markers | All classified | Some missing | No classification | test-priorities.md |
|
||||
| Hard Waits | No hard waits | Some justified | Hard waits present | test-quality.md |
|
||||
| Determinism | No conditionals/random | Some justified | Conditionals/random | test-quality.md |
|
||||
| Isolation | Clean up, no shared state | Some gaps | Shared state | test-quality.md |
|
||||
| Fixture Patterns | Pure fn → Fixture | Some fixtures | No fixtures | fixture-architecture.md |
|
||||
| Data Factories | Factory functions | Some factories | Hardcoded data | data-factories.md |
|
||||
| Network-First | Intercept before navigate | Some correct | Race conditions | network-first.md |
|
||||
| Assertions | Explicit assertions | Some implicit | Missing assertions | test-quality.md |
|
||||
| Test Length | ≤300 lines | 301-500 lines | >500 lines | test-quality.md |
|
||||
| Test Duration | ≤1.5 min | 1.5-3 min | >3 min | test-quality.md |
|
||||
| Flakiness Patterns | No flaky patterns | Some potential | Multiple patterns | ci-burn-in.md |
|
||||
|
||||
---
|
||||
|
||||
## Example Review Summary
|
||||
|
||||
````markdown
|
||||
# Test Quality Review: auth-login.spec.ts
|
||||
|
||||
**Quality Score**: 89/100 (A - Good)
|
||||
**Review Date**: 2025-10-14
|
||||
**Recommendation**: Approve with Comments
|
||||
|
||||
## Executive Summary
|
||||
|
||||
Overall, the test demonstrates good structure and coverage of the login flow. However, there are several areas for improvement to enhance maintainability and prevent flakiness.
|
||||
|
||||
**Strengths:**
|
||||
|
||||
- Excellent BDD structure with clear Given-When-Then comments
|
||||
- Good use of test IDs (1.3-E2E-001, 1.3-E2E-002)
|
||||
- Comprehensive assertions on authentication state
|
||||
|
||||
**Weaknesses:**
|
||||
|
||||
- Hard wait detected (page.waitForTimeout(2000)) - flakiness risk
|
||||
- Hardcoded test data (email: 'test@example.com') - use factories instead
|
||||
- Missing fixture for common login setup - DRY violation
|
||||
|
||||
**Recommendation**: Address critical issue (hard wait) before merging. Other improvements can be addressed in a follow-up PR.
|
||||
|
||||
## Critical Issues (Must Fix)
|
||||
|
||||
### 1. Hard Wait Detected (Line 45)
|
||||
|
||||
**Severity**: P0 (Critical)
|
||||
**Issue**: `await page.waitForTimeout(2000)` introduces flakiness
|
||||
**Fix**: Use explicit wait for element or network request instead
|
||||
**Knowledge**: See test-quality.md, network-first.md
|
||||
|
||||
```typescript
|
||||
// ❌ Bad (current)
|
||||
await page.waitForTimeout(2000);
|
||||
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible();
|
||||
|
||||
// ✅ Good (recommended)
|
||||
await expect(page.locator('[data-testid="user-menu"]')).toBeVisible({ timeout: 10000 });
|
||||
```
|
||||
|
||||
|
||||
## Recommendations (Should Fix)
|
||||
|
||||
### 1. Use Data Factory for Test User (Lines 23, 32, 41)
|
||||
|
||||
**Severity**: P1 (High)
|
||||
**Issue**: Hardcoded email `test@example.com` - maintainability risk
|
||||
**Fix**: Create factory function for test users
|
||||
**Knowledge**: See data-factories.md
|
||||
|
||||
```typescript
|
||||
// ✅ Good (recommended)
|
||||
import { createTestUser } from './factories/user-factory';
|
||||
|
||||
const testUser = createTestUser({ role: 'admin' });
|
||||
await loginPage.login(testUser.email, testUser.password);
|
||||
```
|
||||
|
||||
### 2. Extract Login Setup to Fixture (Lines 18-28)
|
||||
|
||||
**Severity**: P1 (High)
|
||||
**Issue**: Login setup repeated across tests - DRY violation
|
||||
**Fix**: Create fixture for authenticated state
|
||||
**Knowledge**: See fixture-architecture.md
|
||||
|
||||
```typescript
|
||||
// ✅ Good (recommended)
|
||||
const test = base.extend({
|
||||
authenticatedPage: async ({ page }, use) => {
|
||||
const user = createTestUser();
|
||||
await loginPage.login(user.email, user.password);
|
||||
await use(page);
|
||||
},
|
||||
});
|
||||
|
||||
test('user can access dashboard', async ({ authenticatedPage }) => {
|
||||
// Test starts already logged in
|
||||
});
|
||||
```
|
||||
|
||||
## Quality Score Breakdown
|
||||
|
||||
- Starting Score: 100
|
||||
- Critical Violations (1 × -10): -10
|
||||
- High Violations (2 × -5): -10
|
||||
- Medium Violations (0 × -2): 0
|
||||
- Low Violations (1 × -1): -1
|
||||
- Bonus (BDD +5, Test IDs +5): +10
|
||||
- **Final Score**: 89/100 (A)
|
||||
|
||||
````
|
||||
|
||||
---
|
||||
|
||||
## Integration with Other Workflows
|
||||
|
||||
### Before Test Review
|
||||
|
||||
- **atdd**: Generate acceptance tests (TEA reviews them for quality)
|
||||
- **automate**: Expand regression suite (TEA reviews new tests)
|
||||
- **dev story**: Developer writes implementation tests (TEA reviews them)
|
||||
|
||||
### After Test Review
|
||||
|
||||
- **Developer**: Addresses critical issues, improves based on recommendations
|
||||
- **gate**: Test quality review feeds into gate decision (high-quality tests increase confidence)
|
||||
|
||||
### Coordinates With
|
||||
|
||||
- **Story File**: Review links to acceptance criteria context
|
||||
- **Test Design**: Review validates tests align with prioritization
|
||||
- **Knowledge Base**: Review references fragments for detailed guidance
|
||||
|
||||
---
|
||||
|
||||
## Important Notes
|
||||
|
||||
1. **Non-Prescriptive**: Review provides guidance, not rigid rules
|
||||
2. **Context Matters**: Some violations may be justified for specific scenarios
|
||||
3. **Knowledge-Based**: All feedback grounded in proven patterns from tea-index.csv
|
||||
4. **Actionable**: Every issue includes recommended fix with code examples
|
||||
5. **Quality Score**: Use as indicator, not absolute measure
|
||||
6. **Continuous Improvement**: Review same tests periodically as patterns evolve
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
**Problem: No test files found**
|
||||
- Verify test_dir path is correct
|
||||
- Check test file extensions match glob pattern
|
||||
- Ensure test files exist in expected location
|
||||
|
||||
**Problem: Quality score seems too low/high**
|
||||
- Review violation counts - may need to adjust thresholds
|
||||
- Consider context - some projects have different standards
|
||||
- Focus on critical issues first, not just score
|
||||
|
||||
**Problem: Inline comments not generated**
|
||||
- Check generate_inline_comments: true in variables
|
||||
- Verify write permissions on test files
|
||||
- Review append_to_file: false (separate report mode)
|
||||
|
||||
**Problem: Knowledge fragments not loading**
|
||||
- Verify tea-index.csv exists in testarch/ directory
|
||||
- Check fragment file paths are correct
|
||||
- Ensure auto_load_knowledge: true in variables
|
||||
|
||||
@@ -1,390 +0,0 @@
|
||||
# Test Quality Review: {test_filename}
|
||||
|
||||
**Quality Score**: {score}/100 ({grade} - {assessment})
|
||||
**Review Date**: {YYYY-MM-DD}
|
||||
**Review Scope**: {single | directory | suite}
|
||||
**Reviewer**: {user_name or TEA Agent}
|
||||
|
||||
---
|
||||
|
||||
Note: This review audits existing tests; it does not generate tests.
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Overall Assessment**: {Excellent | Good | Acceptable | Needs Improvement | Critical Issues}
|
||||
|
||||
**Recommendation**: {Approve | Approve with Comments | Request Changes | Block}
|
||||
|
||||
### Key Strengths
|
||||
|
||||
✅ {strength_1}
|
||||
✅ {strength_2}
|
||||
✅ {strength_3}
|
||||
|
||||
### Key Weaknesses
|
||||
|
||||
❌ {weakness_1}
|
||||
❌ {weakness_2}
|
||||
❌ {weakness_3}
|
||||
|
||||
### Summary
|
||||
|
||||
{1-2 paragraph summary of overall test quality, highlighting major findings and recommendation rationale}
|
||||
|
||||
---
|
||||
|
||||
## Quality Criteria Assessment
|
||||
|
||||
| Criterion | Status | Violations | Notes |
|
||||
| ------------------------------------ | ------------------------------- | ---------- | ------------ |
|
||||
| BDD Format (Given-When-Then) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Test IDs | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Priority Markers (P0/P1/P2/P3) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Hard Waits (sleep, waitForTimeout) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Determinism (no conditionals) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Isolation (cleanup, no shared state) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Fixture Patterns | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Data Factories | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Network-First Pattern | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Explicit Assertions | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
| Test Length (≤300 lines) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {lines} | {brief_note} |
|
||||
| Test Duration (≤1.5 min) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {duration} | {brief_note} |
|
||||
| Flakiness Patterns | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
|
||||
|
||||
**Total Violations**: {critical_count} Critical, {high_count} High, {medium_count} Medium, {low_count} Low
|
||||
|
||||
---
|
||||
|
||||
## Quality Score Breakdown
|
||||
|
||||
```
|
||||
Starting Score: 100
|
||||
Critical Violations: -{critical_count} × 10 = -{critical_deduction}
|
||||
High Violations: -{high_count} × 5 = -{high_deduction}
|
||||
Medium Violations: -{medium_count} × 2 = -{medium_deduction}
|
||||
Low Violations: -{low_count} × 1 = -{low_deduction}
|
||||
|
||||
Bonus Points:
|
||||
Excellent BDD: +{0|5}
|
||||
Comprehensive Fixtures: +{0|5}
|
||||
Data Factories: +{0|5}
|
||||
Network-First: +{0|5}
|
||||
Perfect Isolation: +{0|5}
|
||||
All Test IDs: +{0|5}
|
||||
--------
|
||||
Total Bonus: +{bonus_total}
|
||||
|
||||
Final Score: {final_score}/100
|
||||
Grade: {grade}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Critical Issues (Must Fix)
|
||||
|
||||
{If no critical issues: "No critical issues detected. ✅"}
|
||||
|
||||
{For each critical issue:}
|
||||
|
||||
### {issue_number}. {Issue Title}
|
||||
|
||||
**Severity**: P0 (Critical)
|
||||
**Location**: `{filename}:{line_number}`
|
||||
**Criterion**: {criterion_name}
|
||||
**Knowledge Base**: [{fragment_name}]({fragment_path})
|
||||
|
||||
**Issue Description**:
|
||||
{Detailed explanation of what the problem is and why it's critical}
|
||||
|
||||
**Current Code**:
|
||||
|
||||
```typescript
|
||||
// ❌ Bad (current implementation)
|
||||
{code_snippet_showing_problem}
|
||||
```
|
||||
|
||||
**Recommended Fix**:
|
||||
|
||||
```typescript
|
||||
// ✅ Good (recommended approach)
|
||||
{code_snippet_showing_solution}
|
||||
```
|
||||
|
||||
**Why This Matters**:
|
||||
{Explanation of impact - flakiness risk, maintainability, reliability}
|
||||
|
||||
**Related Violations**:
|
||||
{If similar issue appears elsewhere, note line numbers}
|
||||
|
||||
---
|
||||
|
||||
## Recommendations (Should Fix)
|
||||
|
||||
{If no recommendations: "No additional recommendations. Test quality is excellent. ✅"}
|
||||
|
||||
{For each recommendation:}
|
||||
|
||||
### {rec_number}. {Recommendation Title}
|
||||
|
||||
**Severity**: {P1 (High) | P2 (Medium) | P3 (Low)}
|
||||
**Location**: `{filename}:{line_number}`
|
||||
**Criterion**: {criterion_name}
|
||||
**Knowledge Base**: [{fragment_name}]({fragment_path})
|
||||
|
||||
**Issue Description**:
|
||||
{Detailed explanation of what could be improved and why}
|
||||
|
||||
**Current Code**:
|
||||
|
||||
```typescript
|
||||
// ⚠️ Could be improved (current implementation)
|
||||
{code_snippet_showing_current_approach}
|
||||
```
|
||||
|
||||
**Recommended Improvement**:
|
||||
|
||||
```typescript
|
||||
// ✅ Better approach (recommended)
|
||||
{code_snippet_showing_improvement}
|
||||
```
|
||||
|
||||
**Benefits**:
|
||||
{Explanation of benefits - maintainability, readability, reusability}
|
||||
|
||||
**Priority**:
|
||||
{Why this is P1/P2/P3 - urgency and impact}
|
||||
|
||||
---
|
||||
|
||||
## Best Practices Found
|
||||
|
||||
{If good patterns found, highlight them}
|
||||
|
||||
{For each best practice:}
|
||||
|
||||
### {practice_number}. {Best Practice Title}
|
||||
|
||||
**Location**: `{filename}:{line_number}`
|
||||
**Pattern**: {pattern_name}
|
||||
**Knowledge Base**: [{fragment_name}]({fragment_path})
|
||||
|
||||
**Why This Is Good**:
|
||||
{Explanation of why this pattern is excellent}
|
||||
|
||||
**Code Example**:
|
||||
|
||||
```typescript
|
||||
// ✅ Excellent pattern demonstrated in this test
|
||||
{code_snippet_showing_best_practice}
|
||||
```
|
||||
|
||||
**Use as Reference**:
|
||||
{Encourage using this pattern in other tests}
|
||||
|
||||
---
|
||||
|
||||
## Test File Analysis
|
||||
|
||||
### File Metadata
|
||||
|
||||
- **File Path**: `{relative_path_from_project_root}`
|
||||
- **File Size**: {line_count} lines, {kb_size} KB
|
||||
- **Test Framework**: {Playwright | Jest | Cypress | Vitest | Other}
|
||||
- **Language**: {TypeScript | JavaScript}
|
||||
|
||||
### Test Structure
|
||||
|
||||
- **Describe Blocks**: {describe_count}
|
||||
- **Test Cases (it/test)**: {test_count}
|
||||
- **Average Test Length**: {avg_lines_per_test} lines per test
|
||||
- **Fixtures Used**: {fixture_count} ({fixture_names})
|
||||
- **Data Factories Used**: {factory_count} ({factory_names})
|
||||
|
||||
### Test Coverage Scope
|
||||
|
||||
- **Test IDs**: {test_id_list}
|
||||
- **Priority Distribution**:
|
||||
- P0 (Critical): {p0_count} tests
|
||||
- P1 (High): {p1_count} tests
|
||||
- P2 (Medium): {p2_count} tests
|
||||
- P3 (Low): {p3_count} tests
|
||||
- Unknown: {unknown_count} tests
|
||||
|
||||
### Assertions Analysis
|
||||
|
||||
- **Total Assertions**: {assertion_count}
|
||||
- **Assertions per Test**: {avg_assertions_per_test} (avg)
|
||||
- **Assertion Types**: {assertion_types_used}
|
||||
|
||||
---
|
||||
|
||||
## Context and Integration
|
||||
|
||||
### Related Artifacts
|
||||
|
||||
{If story file found:}
|
||||
|
||||
- **Story File**: [{story_filename}]({story_path})
|
||||
- **Acceptance Criteria Mapped**: {ac_mapped}/{ac_total} ({ac_coverage}%)
|
||||
|
||||
{If test-design found:}
|
||||
|
||||
- **Test Design**: [{test_design_filename}]({test_design_path})
|
||||
- **Risk Assessment**: {risk_level}
|
||||
- **Priority Framework**: P0-P3 applied
|
||||
|
||||
### Acceptance Criteria Validation
|
||||
|
||||
{If story file available, map tests to ACs:}
|
||||
|
||||
| Acceptance Criterion | Test ID | Status | Notes |
|
||||
| -------------------- | --------- | -------------------------- | ------- |
|
||||
| {AC_1} | {test_id} | {✅ Covered \| ❌ Missing} | {notes} |
|
||||
| {AC_2} | {test_id} | {✅ Covered \| ❌ Missing} | {notes} |
|
||||
| {AC_3} | {test_id} | {✅ Covered \| ❌ Missing} | {notes} |
|
||||
|
||||
**Coverage**: {covered_count}/{total_count} criteria covered ({coverage_percentage}%)
|
||||
|
||||
---
|
||||
|
||||
## Knowledge Base References
|
||||
|
||||
This review consulted the following knowledge base fragments:
|
||||
|
||||
- **[test-quality.md](../../../testarch/knowledge/test-quality.md)** - Definition of Done for tests (no hard waits, <300 lines, <1.5 min, self-cleaning)
|
||||
- **[fixture-architecture.md](../../../testarch/knowledge/fixture-architecture.md)** - Pure function → Fixture → mergeTests pattern
|
||||
- **[network-first.md](../../../testarch/knowledge/network-first.md)** - Route intercept before navigate (race condition prevention)
|
||||
- **[data-factories.md](../../../testarch/knowledge/data-factories.md)** - Factory functions with overrides, API-first setup
|
||||
- **[test-levels-framework.md](../../../testarch/knowledge/test-levels-framework.md)** - E2E vs API vs Component vs Unit appropriateness
|
||||
- **[tdd-cycles.md](../../../testarch/knowledge/tdd-cycles.md)** - Red-Green-Refactor patterns
|
||||
- **[selective-testing.md](../../../testarch/knowledge/selective-testing.md)** - Duplicate coverage detection
|
||||
- **[ci-burn-in.md](../../../testarch/knowledge/ci-burn-in.md)** - Flakiness detection patterns (10-iteration loop)
|
||||
- **[test-priorities.md](../../../testarch/knowledge/test-priorities.md)** - P0/P1/P2/P3 classification framework
|
||||
- **[traceability.md](../../../testarch/knowledge/traceability.md)** - Requirements-to-tests mapping
|
||||
|
||||
See [tea-index.csv](../../../testarch/tea-index.csv) for complete knowledge base.
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
### Immediate Actions (Before Merge)
|
||||
|
||||
1. **{action_1}** - {description}
|
||||
- Priority: {P0 | P1 | P2}
|
||||
- Owner: {team_or_person}
|
||||
- Estimated Effort: {time_estimate}
|
||||
|
||||
2. **{action_2}** - {description}
|
||||
- Priority: {P0 | P1 | P2}
|
||||
- Owner: {team_or_person}
|
||||
- Estimated Effort: {time_estimate}
|
||||
|
||||
### Follow-up Actions (Future PRs)
|
||||
|
||||
1. **{action_1}** - {description}
|
||||
- Priority: {P2 | P3}
|
||||
- Target: {next_sprint | backlog}
|
||||
|
||||
2. **{action_2}** - {description}
|
||||
- Priority: {P2 | P3}
|
||||
- Target: {next_sprint | backlog}
|
||||
|
||||
### Re-Review Needed?
|
||||
|
||||
{✅ No re-review needed - approve as-is}
|
||||
{⚠️ Re-review after critical fixes - request changes, then re-review}
|
||||
{❌ Major refactor required - block merge, pair programming recommended}
|
||||
|
||||
---
|
||||
|
||||
## Decision
|
||||
|
||||
**Recommendation**: {Approve | Approve with Comments | Request Changes | Block}
|
||||
|
||||
**Rationale**:
|
||||
{1-2 paragraph explanation of recommendation based on findings}
|
||||
|
||||
**For Approve**:
|
||||
|
||||
> Test quality is excellent/good with {score}/100 score. {Minor issues noted can be addressed in follow-up PRs.} Tests are production-ready and follow best practices.
|
||||
|
||||
**For Approve with Comments**:
|
||||
|
||||
> Test quality is acceptable with {score}/100 score. {High-priority recommendations should be addressed but don't block merge.} Critical issues resolved, but improvements would enhance maintainability.
|
||||
|
||||
**For Request Changes**:
|
||||
|
||||
> Test quality needs improvement with {score}/100 score. {Critical issues must be fixed before merge.} {X} critical violations detected that pose flakiness/maintainability risks.
|
||||
|
||||
**For Block**:
|
||||
|
||||
> Test quality is insufficient with {score}/100 score. {Multiple critical issues make tests unsuitable for production.} Recommend pairing session with QA engineer to apply patterns from knowledge base.
|
||||
|
||||
---
|
||||
|
||||
## Appendix
|
||||
|
||||
### Violation Summary by Location
|
||||
|
||||
{Table of all violations sorted by line number:}
|
||||
|
||||
| Line | Severity | Criterion | Issue | Fix |
|
||||
| ------ | ------------- | ----------- | ------------- | ----------- |
|
||||
| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
|
||||
| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
|
||||
|
||||
### Quality Trends
|
||||
|
||||
{If reviewing same file multiple times, show trend:}
|
||||
|
||||
| Review Date | Score | Grade | Critical Issues | Trend |
|
||||
| ------------ | ------------- | --------- | --------------- | ----------- |
|
||||
| {YYYY-MM-DD} | {score_1}/100 | {grade_1} | {count_1} | ⬆️ Improved |
|
||||
| {YYYY-MM-DD} | {score_2}/100 | {grade_2} | {count_2} | ⬇️ Declined |
|
||||
| {YYYY-MM-DD} | {score_3}/100 | {grade_3} | {count_3} | ➡️ Stable |
|
||||
|
||||
### Related Reviews
|
||||
|
||||
{If reviewing multiple files in directory/suite:}
|
||||
|
||||
| File | Score | Grade | Critical | Status |
|
||||
| -------- | ----------- | ------- | -------- | ------------------ |
|
||||
| {file_1} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
|
||||
| {file_2} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
|
||||
| {file_3} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
|
||||
|
||||
**Suite Average**: {avg_score}/100 ({avg_grade})
|
||||
|
||||
---
|
||||
|
||||
## Review Metadata
|
||||
|
||||
**Generated By**: BMad TEA Agent (Test Architect)
|
||||
**Workflow**: testarch-test-review v4.0
|
||||
**Review ID**: test-review-{filename}-{YYYYMMDD}
|
||||
**Timestamp**: {YYYY-MM-DD HH:MM:SS}
|
||||
**Version**: 1.0
|
||||
|
||||
---
|
||||
|
||||
## Feedback on This Review
|
||||
|
||||
If you have questions or feedback on this review:
|
||||
|
||||
1. Review patterns in knowledge base: `testarch/knowledge/`
|
||||
2. Consult tea-index.csv for detailed guidance
|
||||
3. Request clarification on specific violations
|
||||
4. Pair with QA engineer to apply patterns
|
||||
|
||||
This review is guidance, not rigid rules. Context matters - if a pattern is justified, document it with a comment.
|
||||
@@ -1,46 +0,0 @@
|
||||
# Test Architect workflow: test-review
|
||||
name: testarch-test-review
|
||||
description: "Review test quality using comprehensive knowledge base and best practices validation"
|
||||
author: "BMad"
|
||||
|
||||
# Critical variables from config
|
||||
config_source: "{project-root}/_bmad/bmm/config.yaml"
|
||||
output_folder: "{config_source}:output_folder"
|
||||
user_name: "{config_source}:user_name"
|
||||
communication_language: "{config_source}:communication_language"
|
||||
document_output_language: "{config_source}:document_output_language"
|
||||
date: system-generated
|
||||
|
||||
# Workflow components
|
||||
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/test-review"
|
||||
instructions: "{installed_path}/instructions.md"
|
||||
validation: "{installed_path}/checklist.md"
|
||||
template: "{installed_path}/test-review-template.md"
|
||||
|
||||
# Variables and inputs
|
||||
variables:
|
||||
test_dir: "{project-root}/tests" # Root test directory
|
||||
review_scope: "single" # single (one file), directory (folder), suite (all tests)
|
||||
|
||||
# Output configuration
|
||||
default_output_file: "{output_folder}/test-review.md"
|
||||
|
||||
# Required tools
|
||||
required_tools:
|
||||
- read_file # Read test files, story, test-design
|
||||
- write_file # Create review report
|
||||
- list_files # Discover test files in directory
|
||||
- search_repo # Find tests by patterns
|
||||
- glob # Find test files matching patterns
|
||||
|
||||
tags:
|
||||
- qa
|
||||
- test-architect
|
||||
- code-review
|
||||
- quality
|
||||
- best-practices
|
||||
|
||||
execution_hints:
|
||||
interactive: false # Minimize prompts
|
||||
autonomous: true # Proceed without user input unless blocked
|
||||
iterative: true # Can review multiple files
|
||||
@@ -1,655 +0,0 @@
|
||||
# Requirements Traceability & Gate Decision - Validation Checklist
|
||||
|
||||
**Workflow:** `testarch-trace`
|
||||
**Purpose:** Ensure complete traceability matrix with actionable gap analysis AND make deployment readiness decision (PASS/CONCERNS/FAIL/WAIVED)
|
||||
|
||||
This checklist covers **two sequential phases**:
|
||||
|
||||
- **PHASE 1**: Requirements Traceability (always executed)
|
||||
- **PHASE 2**: Quality Gate Decision (executed if `enable_gate_decision: true`)
|
||||
|
||||
---
|
||||
|
||||
# PHASE 1: REQUIREMENTS TRACEABILITY
|
||||
|
||||
## Prerequisites Validation
|
||||
|
||||
- [ ] Acceptance criteria are available (from story file OR inline)
|
||||
- [ ] Test suite exists (or gaps are acknowledged and documented)
|
||||
- [ ] If tests are missing, recommend `*atdd` (trace does not run it automatically)
|
||||
- [ ] Test directory path is correct (`test_dir` variable)
|
||||
- [ ] Story file is accessible (if using BMad mode)
|
||||
- [ ] Knowledge base is loaded (test-priorities, traceability, risk-governance)
|
||||
|
||||
---
|
||||
|
||||
## Context Loading
|
||||
|
||||
- [ ] Story file read successfully (if applicable)
|
||||
- [ ] Acceptance criteria extracted correctly
|
||||
- [ ] Story ID identified (e.g., 1.3)
|
||||
- [ ] `test-design.md` loaded (if available)
|
||||
- [ ] `tech-spec.md` loaded (if available)
|
||||
- [ ] `PRD.md` loaded (if available)
|
||||
- [ ] Relevant knowledge fragments loaded from `tea-index.csv`
|
||||
|
||||
---
|
||||
|
||||
## Test Discovery and Cataloging
|
||||
|
||||
- [ ] Tests auto-discovered using multiple strategies (test IDs, describe blocks, file paths)
|
||||
- [ ] Tests categorized by level (E2E, API, Component, Unit)
|
||||
- [ ] Test metadata extracted:
|
||||
- [ ] Test IDs (e.g., 1.3-E2E-001)
|
||||
- [ ] Describe/context blocks
|
||||
- [ ] It blocks (individual test cases)
|
||||
- [ ] Given-When-Then structure (if BDD)
|
||||
- [ ] Priority markers (P0/P1/P2/P3)
|
||||
- [ ] All relevant test files found (no tests missed due to naming conventions)
|
||||
|
||||
---
|
||||
|
||||
## Criteria-to-Test Mapping
|
||||
|
||||
- [ ] Each acceptance criterion mapped to tests (or marked as NONE)
|
||||
- [ ] Explicit references found (test IDs, describe blocks mentioning criterion)
|
||||
- [ ] Test level documented (E2E, API, Component, Unit)
|
||||
- [ ] Given-When-Then narrative verified for alignment
|
||||
- [ ] Traceability matrix table generated:
|
||||
- [ ] Criterion ID
|
||||
- [ ] Description
|
||||
- [ ] Test ID
|
||||
- [ ] Test File
|
||||
- [ ] Test Level
|
||||
- [ ] Coverage Status
|
||||
|
||||
---
|
||||
|
||||
## Coverage Classification
|
||||
|
||||
- [ ] Coverage status classified for each criterion:
|
||||
- [ ] **FULL** - All scenarios validated at appropriate level(s)
|
||||
- [ ] **PARTIAL** - Some coverage but missing edge cases or levels
|
||||
- [ ] **NONE** - No test coverage at any level
|
||||
- [ ] **UNIT-ONLY** - Only unit tests (missing integration/E2E validation)
|
||||
- [ ] **INTEGRATION-ONLY** - Only API/Component tests (missing unit confidence)
|
||||
- [ ] Classification justifications provided
|
||||
- [ ] Edge cases considered in FULL vs PARTIAL determination
|
||||
|
||||
---
|
||||
|
||||
## Duplicate Coverage Detection
|
||||
|
||||
- [ ] Duplicate coverage checked across test levels
|
||||
- [ ] Acceptable overlap identified (defense in depth for critical paths)
|
||||
- [ ] Unacceptable duplication flagged (same validation at multiple levels)
|
||||
- [ ] Recommendations provided for consolidation
|
||||
- [ ] Selective testing principles applied
|
||||
|
||||
---
|
||||
|
||||
## Gap Analysis
|
||||
|
||||
- [ ] Coverage gaps identified:
|
||||
- [ ] Criteria with NONE status
|
||||
- [ ] Criteria with PARTIAL status
|
||||
- [ ] Criteria with UNIT-ONLY status
|
||||
- [ ] Criteria with INTEGRATION-ONLY status
|
||||
- [ ] Gaps prioritized by risk level using test-priorities framework:
|
||||
- [ ] **CRITICAL** - P0 criteria without FULL coverage (BLOCKER)
|
||||
- [ ] **HIGH** - P1 criteria without FULL coverage (PR blocker)
|
||||
- [ ] **MEDIUM** - P2 criteria without FULL coverage (nightly gap)
|
||||
- [ ] **LOW** - P3 criteria without FULL coverage (acceptable)
|
||||
- [ ] Specific test recommendations provided for each gap:
|
||||
- [ ] Suggested test level (E2E, API, Component, Unit)
|
||||
- [ ] Test description (Given-When-Then)
|
||||
- [ ] Recommended test ID (e.g., 1.3-E2E-004)
|
||||
- [ ] Explanation of why test is needed
|
||||
|
||||
---
|
||||
|
||||
## Coverage Metrics
|
||||
|
||||
- [ ] Overall coverage percentage calculated (FULL coverage / total criteria)
|
||||
- [ ] P0 coverage percentage calculated
|
||||
- [ ] P1 coverage percentage calculated
|
||||
- [ ] P2 coverage percentage calculated (if applicable)
|
||||
- [ ] Coverage by level calculated:
|
||||
- [ ] E2E coverage %
|
||||
- [ ] API coverage %
|
||||
- [ ] Component coverage %
|
||||
- [ ] Unit coverage %
|
||||
|
||||
---
|
||||
|
||||
## Test Quality Verification
|
||||
|
||||
For each mapped test, verify:
|
||||
|
||||
- [ ] Explicit assertions are present (not hidden in helpers)
|
||||
- [ ] Test follows Given-When-Then structure
|
||||
- [ ] No hard waits or sleeps (deterministic waiting only)
|
||||
- [ ] Self-cleaning (test cleans up its data)
|
||||
- [ ] File size < 300 lines
|
||||
- [ ] Test duration < 90 seconds
|
||||
|
||||
Quality issues flagged:
|
||||
|
||||
- [ ] **BLOCKER** issues identified (missing assertions, hard waits, flaky patterns)
|
||||
- [ ] **WARNING** issues identified (large files, slow tests, unclear structure)
|
||||
- [ ] **INFO** issues identified (style inconsistencies, missing documentation)
|
||||
|
||||
Knowledge fragments referenced:
|
||||
|
||||
- [ ] `test-quality.md` for Definition of Done
|
||||
- [ ] `fixture-architecture.md` for self-cleaning patterns
|
||||
- [ ] `network-first.md` for Playwright best practices
|
||||
- [ ] `data-factories.md` for test data patterns
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 Deliverables Generated
|
||||
|
||||
### Traceability Matrix Markdown
|
||||
|
||||
- [ ] File created at `{output_folder}/traceability-matrix.md`
|
||||
- [ ] Template from `trace-template.md` used
|
||||
- [ ] Full mapping table included
|
||||
- [ ] Coverage status section included
|
||||
- [ ] Gap analysis section included
|
||||
- [ ] Quality assessment section included
|
||||
- [ ] Recommendations section included
|
||||
|
||||
### Coverage Badge/Metric (if enabled)
|
||||
|
||||
- [ ] Badge markdown generated
|
||||
- [ ] Metrics exported to JSON for CI/CD integration
|
||||
|
||||
### Updated Story File (if enabled)
|
||||
|
||||
- [ ] "Traceability" section added to story markdown
|
||||
- [ ] Link to traceability matrix included
|
||||
- [ ] Coverage summary included
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 Quality Assurance
|
||||
|
||||
### Accuracy Checks
|
||||
|
||||
- [ ] All acceptance criteria accounted for (none skipped)
|
||||
- [ ] Test IDs correctly formatted (e.g., 1.3-E2E-001)
|
||||
- [ ] File paths are correct and accessible
|
||||
- [ ] Coverage percentages calculated correctly
|
||||
- [ ] No false positives (tests incorrectly mapped to criteria)
|
||||
- [ ] No false negatives (existing tests missed in mapping)
|
||||
|
||||
### Completeness Checks
|
||||
|
||||
- [ ] All test levels considered (E2E, API, Component, Unit)
|
||||
- [ ] All priorities considered (P0, P1, P2, P3)
|
||||
- [ ] All coverage statuses used appropriately (FULL, PARTIAL, NONE, UNIT-ONLY, INTEGRATION-ONLY)
|
||||
- [ ] All gaps have recommendations
|
||||
- [ ] All quality issues have severity and remediation guidance
|
||||
|
||||
### Actionability Checks
|
||||
|
||||
- [ ] Recommendations are specific (not generic)
|
||||
- [ ] Test IDs suggested for new tests
|
||||
- [ ] Given-When-Then provided for recommended tests
|
||||
- [ ] Impact explained for each gap
|
||||
- [ ] Priorities clear (CRITICAL, HIGH, MEDIUM, LOW)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 Documentation
|
||||
|
||||
- [ ] Traceability matrix is readable and well-formatted
|
||||
- [ ] Tables render correctly in markdown
|
||||
- [ ] Code blocks have proper syntax highlighting
|
||||
- [ ] Links are valid and accessible
|
||||
- [ ] Recommendations are clear and prioritized
|
||||
|
||||
---
|
||||
|
||||
# PHASE 2: QUALITY GATE DECISION
|
||||
|
||||
**Note**: Phase 2 executes only if `enable_gate_decision: true` in workflow.yaml
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Evidence Gathering
|
||||
|
||||
- [ ] Test execution results obtained (CI/CD pipeline, test framework reports)
|
||||
- [ ] Story/epic/release file identified and read
|
||||
- [ ] Test design document discovered or explicitly provided (if available)
|
||||
- [ ] Traceability matrix discovered or explicitly provided (available from Phase 1)
|
||||
- [ ] NFR assessment discovered or explicitly provided (if available)
|
||||
- [ ] Code coverage report discovered or explicitly provided (if available)
|
||||
- [ ] Burn-in results discovered or explicitly provided (if available)
|
||||
|
||||
### Evidence Validation
|
||||
|
||||
- [ ] Evidence freshness validated (warn if >7 days old, recommend re-running workflows)
|
||||
- [ ] All required assessments available or user acknowledged gaps
|
||||
- [ ] Test results are complete (not partial or interrupted runs)
|
||||
- [ ] Test results match current codebase (not from outdated branch)
|
||||
|
||||
### Knowledge Base Loading
|
||||
|
||||
- [ ] `risk-governance.md` loaded successfully
|
||||
- [ ] `probability-impact.md` loaded successfully
|
||||
- [ ] `test-quality.md` loaded successfully
|
||||
- [ ] `test-priorities.md` loaded successfully
|
||||
- [ ] `ci-burn-in.md` loaded (if burn-in results available)
|
||||
|
||||
---
|
||||
|
||||
## Process Steps
|
||||
|
||||
### Step 1: Context Loading
|
||||
|
||||
- [ ] Gate type identified (story/epic/release/hotfix)
|
||||
- [ ] Target ID extracted (story_id, epic_num, or release_version)
|
||||
- [ ] Decision thresholds loaded from workflow variables
|
||||
- [ ] Risk tolerance configuration loaded
|
||||
- [ ] Waiver policy loaded
|
||||
|
||||
### Step 2: Evidence Parsing
|
||||
|
||||
**Test Results:**
|
||||
|
||||
- [ ] Total test count extracted
|
||||
- [ ] Passed test count extracted
|
||||
- [ ] Failed test count extracted
|
||||
- [ ] Skipped test count extracted
|
||||
- [ ] Test duration extracted
|
||||
- [ ] P0 test pass rate calculated
|
||||
- [ ] P1 test pass rate calculated
|
||||
- [ ] Overall test pass rate calculated
|
||||
|
||||
**Quality Assessments:**
|
||||
|
||||
- [ ] P0/P1/P2/P3 scenarios extracted from test-design.md (if available)
|
||||
- [ ] Risk scores extracted from test-design.md (if available)
|
||||
- [ ] Coverage percentages extracted from traceability-matrix.md (available from Phase 1)
|
||||
- [ ] Coverage gaps extracted from traceability-matrix.md (available from Phase 1)
|
||||
- [ ] NFR status extracted from nfr-assessment.md (if available)
|
||||
- [ ] Security issues count extracted from nfr-assessment.md (if available)
|
||||
|
||||
**Code Coverage:**
|
||||
|
||||
- [ ] Line coverage percentage extracted (if available)
|
||||
- [ ] Branch coverage percentage extracted (if available)
|
||||
- [ ] Function coverage percentage extracted (if available)
|
||||
- [ ] Critical path coverage validated (if available)
|
||||
|
||||
**Burn-in Results:**
|
||||
|
||||
- [ ] Burn-in iterations count extracted (if available)
|
||||
- [ ] Flaky tests count extracted (if available)
|
||||
- [ ] Stability score calculated (if available)
|
||||
|
||||
### Step 3: Decision Rules Application
|
||||
|
||||
**P0 Criteria Evaluation:**
|
||||
|
||||
- [ ] P0 test pass rate evaluated (must be 100%)
|
||||
- [ ] P0 acceptance criteria coverage evaluated (must be 100%)
|
||||
- [ ] Security issues count evaluated (must be 0)
|
||||
- [ ] Critical NFR failures evaluated (must be 0)
|
||||
- [ ] Flaky tests evaluated (must be 0 if burn-in enabled)
|
||||
- [ ] P0 decision recorded: PASS or FAIL
|
||||
|
||||
**P1 Criteria Evaluation:**
|
||||
|
||||
- [ ] P1 test pass rate evaluated (threshold: min_p1_pass_rate)
|
||||
- [ ] P1 acceptance criteria coverage evaluated (threshold: 95%)
|
||||
- [ ] Overall test pass rate evaluated (threshold: min_overall_pass_rate)
|
||||
- [ ] Code coverage evaluated (threshold: min_coverage)
|
||||
- [ ] P1 decision recorded: PASS or CONCERNS
|
||||
|
||||
**P2/P3 Criteria Evaluation:**
|
||||
|
||||
- [ ] P2 failures tracked (informational, don't block if allow_p2_failures: true)
|
||||
- [ ] P3 failures tracked (informational, don't block if allow_p3_failures: true)
|
||||
- [ ] Residual risks documented
|
||||
|
||||
**Final Decision:**
|
||||
|
||||
- [ ] Decision determined: PASS / CONCERNS / FAIL / WAIVED
|
||||
- [ ] Decision rationale documented
|
||||
- [ ] Decision is deterministic (follows rules, not arbitrary)
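
For orientation, a sketch of how these rules might combine into a decision. The concrete thresholds stand in for the `min_p1_pass_rate`, `min_overall_pass_rate`, and `min_coverage` workflow variables and are assumptions; WAIVED is always a documented manual override, never computed:

```typescript
type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL';

interface GateEvidence {
  p0PassRate: number;          // 0–100
  p0CriteriaCoverage: number;  // 0–100
  securityIssues: number;
  criticalNfrFailures: number;
  flakyTests: number;
  p1PassRate: number;
  p1CriteriaCoverage: number;
  overallPassRate: number;
  codeCoverage: number;
}

function decideGate(e: GateEvidence): GateDecision {
  // P0 rules: any miss is an immediate FAIL (unless a documented waiver applies).
  if (
    e.p0PassRate < 100 ||
    e.p0CriteriaCoverage < 100 ||
    e.securityIssues > 0 ||
    e.criticalNfrFailures > 0 ||
    e.flakyTests > 0
  ) {
    return 'FAIL';
  }
  // P1 rules: misses downgrade to CONCERNS rather than blocking outright.
  if (
    e.p1PassRate < 95 ||
    e.p1CriteriaCoverage < 95 ||
    e.overallPassRate < 90 ||
    e.codeCoverage < 80
  ) {
    return 'CONCERNS';
  }
  return 'PASS';
}
```
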
|
||||
|
||||
### Step 4: Documentation
|
||||
|
||||
**Gate Decision Document Created:**
|
||||
|
||||
- [ ] Story/epic/release info section complete (ID, title, description, links)
|
||||
- [ ] Decision clearly stated (PASS / CONCERNS / FAIL / WAIVED)
|
||||
- [ ] Decision date recorded
|
||||
- [ ] Evaluator recorded (user or agent name)
|
||||
|
||||
**Evidence Summary Documented:**
|
||||
|
||||
- [ ] Test results summary complete (total, passed, failed, pass rates)
|
||||
- [ ] Coverage summary complete (P0/P1 criteria, code coverage)
|
||||
- [ ] NFR validation summary complete (security, performance, reliability, maintainability)
|
||||
- [ ] Flakiness summary complete (burn-in iterations, flaky test count)
|
||||
|
||||
**Rationale Documented:**
|
||||
|
||||
- [ ] Decision rationale clearly explained
|
||||
- [ ] Key evidence highlighted
|
||||
- [ ] Assumptions and caveats noted (if any)
|
||||
|
||||
**Residual Risks Documented (if CONCERNS or WAIVED):**
|
||||
|
||||
- [ ] Unresolved P1/P2 issues listed
|
||||
- [ ] Probability × impact estimated for each risk
|
||||
- [ ] Mitigations or workarounds described
|
||||
|
||||
**Waivers Documented (if WAIVED):**
|
||||
|
||||
- [ ] Waiver reason documented (business justification)
|
||||
- [ ] Waiver approver documented (name, role)
|
||||
- [ ] Waiver expiry date documented
|
||||
- [ ] Remediation plan documented (fix in next release, due date)
|
||||
- [ ] Monitoring plan documented
|
||||
|
||||
**Critical Issues Documented (if FAIL or CONCERNS):**
|
||||
|
||||
- [ ] Top 5-10 critical issues listed
|
||||
- [ ] Priority assigned to each issue (P0/P1/P2)
|
||||
- [ ] Owner assigned to each issue
|
||||
- [ ] Due date assigned to each issue
|
||||
|
||||
**Recommendations Documented:**
|
||||
|
||||
- [ ] Next steps clearly stated for decision type
|
||||
- [ ] Deployment recommendation provided
|
||||
- [ ] Monitoring recommendations provided (if applicable)
|
||||
- [ ] Remediation recommendations provided (if applicable)
|
||||
|
||||
### Step 5: Status Updates and Notifications
|
||||
|
||||
**Status File Updated:**
|
||||
|
||||
- [ ] Gate decision appended to bmm-workflow-status.md (if append_to_history: true)
|
||||
- [ ] Format correct: `[DATE] Gate Decision: DECISION - Target {ID} - {rationale}`
|
||||
- [ ] Status file committed or staged for commit
|
||||
|
||||

**Gate YAML Created:**

- [ ] Gate YAML snippet generated with decision and criteria
- [ ] Evidence references included in YAML
- [ ] Next steps included in YAML
- [ ] YAML file saved to output folder

**Stakeholder Notification Generated:**

- [ ] Notification subject line created
- [ ] Notification body created with summary
- [ ] Recipients identified (PM, SM, DEV lead, stakeholders)
- [ ] Notification ready for delivery (if `notify_stakeholders: true`)

**Outputs Saved:**

- [ ] Gate decision document saved to `{output_file}`
- [ ] Gate YAML saved to `{output_folder}/gate-decision-{target}.yaml`
- [ ] All outputs are valid and readable

---

## Phase 2 Output Validation

### Gate Decision Document

**Completeness:**

- [ ] All required sections present (info, decision, evidence, rationale, next steps)
- [ ] No placeholder text or TODOs left in document
- [ ] All evidence references are accurate and complete
- [ ] All links to artifacts are valid

**Accuracy:**

- [ ] Decision matches applied criteria rules
- [ ] Test results match CI/CD pipeline output
- [ ] Coverage percentages match reports
- [ ] NFR status matches assessment document
- [ ] No contradictions or inconsistencies

**Clarity:**

- [ ] Decision rationale is clear and unambiguous
- [ ] Technical jargon is explained or avoided
- [ ] Stakeholders can understand next steps
- [ ] Recommendations are actionable

### Gate YAML

**Format:**

- [ ] YAML is valid (no syntax errors)
- [ ] All required fields present (target, decision, date, evaluator, criteria, evidence)
- [ ] Field values are correct data types (numbers, strings, dates)

**Content:**

- [ ] Criteria values match decision document
- [ ] Evidence references are accurate
- [ ] Next steps align with decision type

---

## Phase 2 Quality Checks

### Decision Integrity

- [ ] Decision is deterministic (follows rules, not arbitrary; see the sketch after this list)
- [ ] P0 failures result in FAIL decision (unless waived)
- [ ] Security issues result in FAIL decision (and are never waived)
- [ ] Waivers have business justification and approver (if WAIVED)
- [ ] Residual risks are documented (if CONCERNS or WAIVED)
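
The determinism check above is easiest to audit when the decision matrix is written down as code. The following TypeScript sketch is illustrative only — the type and function names are assumptions, not part of this workflow — but it shows how P0 failures, security issues, waivers, and P1 thresholds could be applied mechanically:

```typescript
// Illustrative sketch of a deterministic gate evaluation.
// All names (GateEvidence, decideGate) and default thresholds are hypothetical.
interface GateEvidence {
  p0CoveragePct: number;      // must be 100
  p0PassRatePct: number;      // must be 100
  securityIssues: number;     // must be 0 (never waivable)
  criticalNfrFailures: number;
  flakyTests: number;
  p1CoveragePct: number;
  p1PassRatePct: number;
  waiverApproved?: boolean;   // business-approved waiver on file
}

type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';

function decideGate(e: GateEvidence, minP1Coverage = 90, minP1PassRate = 95): GateDecision {
  // Security issues always fail and are never converted to WAIVED.
  if (e.securityIssues > 0) return 'FAIL';

  const p0Failed =
    e.p0CoveragePct < 100 ||
    e.p0PassRatePct < 100 ||
    e.criticalNfrFailures > 0 ||
    e.flakyTests > 0;
  if (p0Failed) return e.waiverApproved ? 'WAIVED' : 'FAIL';

  const p1Short = e.p1CoveragePct < minP1Coverage || e.p1PassRatePct < minP1PassRate;
  return p1Short ? 'CONCERNS' : 'PASS';
}
```

With the rules in this form, two evaluators given the same evidence reach the same decision, which is exactly what the checklist items above are verifying.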

### Evidence-Based

- [ ] Decision is based on actual test results (not guesses)
- [ ] All claims are supported by evidence
- [ ] No assumptions without documentation
- [ ] Evidence sources are cited (CI run IDs, report URLs)

### Transparency

- [ ] Decision rationale is transparent and auditable
- [ ] Criteria evaluation is documented step-by-step
- [ ] Any deviations from standard process are explained
- [ ] Waiver justifications are clear (if applicable)

### Consistency

- [ ] Decision aligns with risk-governance knowledge fragment
- [ ] Priority framework (P0/P1/P2/P3) applied consistently
- [ ] Terminology consistent with test-quality knowledge fragment
- [ ] Decision matrix followed correctly

---

## Phase 2 Integration Points

### BMad Workflow Status

- [ ] Gate decision added to `bmm-workflow-status.md`
- [ ] Format matches existing gate history entries
- [ ] Timestamp is accurate
- [ ] Decision summary is concise (<80 chars)

### CI/CD Pipeline

- [ ] Gate YAML is CI/CD-compatible
- [ ] YAML can be parsed by pipeline automation (see the sketch after this list)
- [ ] Decision can be used to block/allow deployments
- [ ] Evidence references are accessible to pipeline
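
As a sketch of the integration above — assuming a Node.js runner with the `js-yaml` package installed, and a gate file named as in the Outputs Saved checklist (both assumptions) — a CI step could parse the gate YAML and block deployment on a failing decision:

```typescript
// check-gate.ts - hypothetical CI step; the file name and env flag are assumptions.
import { readFileSync } from 'fs';
import { load } from 'js-yaml';

// Structure mirrors the gate YAML this workflow emits (traceability_and_gate.gate_decision).
const doc = load(readFileSync('gate-decision-1.3.yaml', 'utf8')) as any;
const decision: string = doc?.traceability_and_gate?.gate_decision?.decision ?? 'FAIL';

console.log(`Gate decision: ${decision}`);

// Block the deployment job on FAIL, and optionally on CONCERNS.
const blockOnConcerns = process.env.BLOCK_ON_CONCERNS === 'true';
if (decision === 'FAIL' || (decision === 'CONCERNS' && blockOnConcerns)) {
  process.exit(1);
}
```

A deploy job can then depend on this step, so PASS (and an approved WAIVED) proceed while FAIL stops the pipeline.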

### Stakeholders

- [ ] Notification message is clear and actionable
- [ ] Decision is explained in non-technical terms
- [ ] Next steps are specific and time-bound
- [ ] Recipients are appropriate for decision type

---

## Phase 2 Compliance and Audit

### Audit Trail

- [ ] Decision date and time recorded
- [ ] Evaluator identified (user or agent)
- [ ] All evidence sources cited
- [ ] Decision criteria documented
- [ ] Rationale clearly explained

### Traceability

- [ ] Gate decision traceable to story/epic/release
- [ ] Evidence traceable to specific test runs
- [ ] Assessments traceable to workflows that created them
- [ ] Waiver traceable to approver (if applicable)

### Compliance

- [ ] Security requirements validated (no unresolved vulnerabilities)
- [ ] Quality standards met or waived with justification
- [ ] Regulatory requirements addressed (if applicable)
- [ ] Documentation sufficient for external audit

---

## Phase 2 Edge Cases and Exceptions

### Missing Evidence

- [ ] If test-design.md missing, decision still possible with test results + trace
- [ ] If traceability-matrix.md missing, decision still possible with test results (but Phase 1 should provide it)
- [ ] If nfr-assessment.md missing, NFR validation marked as NOT ASSESSED
- [ ] If code coverage missing, coverage criterion marked as NOT ASSESSED
- [ ] User acknowledged gaps in evidence or provided alternative proof

### Stale Evidence

- [ ] Evidence freshness checked (if `validate_evidence_freshness: true`)
- [ ] Warnings issued for assessments >7 days old
- [ ] User acknowledged stale evidence or re-ran workflows
- [ ] Decision document notes any stale evidence used

### Conflicting Evidence

- [ ] Conflicts between test results and assessments resolved
- [ ] Most recent/authoritative source identified
- [ ] Conflict resolution documented in decision rationale
- [ ] User consulted if conflict cannot be resolved

### Waiver Scenarios

- [ ] Waiver only used for FAIL decision (not PASS or CONCERNS)
- [ ] Waiver has business justification (not technical convenience)
- [ ] Waiver has named approver with authority (VP/CTO/PO)
- [ ] Waiver has expiry date (does NOT apply to future releases)
- [ ] Waiver has remediation plan with concrete due date
- [ ] Security vulnerabilities are NOT waived (enforced)

---

# FINAL VALIDATION (Both Phases)

## Non-Prescriptive Validation

- [ ] Traceability format adapted to team needs (not rigid template)
- [ ] Examples are minimal and focused on patterns
- [ ] Teams can extend with custom classifications
- [ ] Integration with external systems supported (JIRA, Azure DevOps)
- [ ] Compliance requirements considered (if applicable)

---

## Documentation and Communication

- [ ] All documents are readable and well-formatted
- [ ] Tables render correctly in markdown
- [ ] Code blocks have proper syntax highlighting
- [ ] Links are valid and accessible
- [ ] Recommendations are clear and prioritized
- [ ] Gate decision is prominent and unambiguous (Phase 2)

---

## Final Validation

**Phase 1 (Traceability):**

- [ ] All prerequisites met
- [ ] All acceptance criteria mapped or gaps documented
- [ ] P0 coverage is 100% OR documented as BLOCKER
- [ ] Gap analysis is complete and prioritized
- [ ] Test quality issues identified and flagged
- [ ] Deliverables generated and saved

**Phase 2 (Gate Decision):**

- [ ] All quality evidence gathered
- [ ] Decision criteria applied correctly
- [ ] Decision rationale documented
- [ ] Gate YAML ready for CI/CD integration
- [ ] Status file updated (if enabled)
- [ ] Stakeholders notified (if enabled)

**Workflow Complete:**

- [ ] Phase 1 completed successfully
- [ ] Phase 2 completed successfully (if enabled)
- [ ] All outputs validated and saved
- [ ] Ready to proceed based on gate decision

---

## Sign-Off

**Phase 1 - Traceability Status:**

- [ ] ✅ PASS - All quality gates met, no critical gaps
- [ ] ⚠️ WARN - P1 gaps exist, address before PR merge
- [ ] ❌ FAIL - P0 gaps exist, BLOCKER for release

**Phase 2 - Gate Decision Status (if enabled):**

- [ ] ✅ PASS - Deploy to production
- [ ] ⚠️ CONCERNS - Deploy with monitoring
- [ ] ❌ FAIL - Block deployment, fix issues
- [ ] 🔓 WAIVED - Deploy with business approval and remediation plan

**Next Actions:**

- If PASS (both phases): Proceed to deployment
- If WARN/CONCERNS: Address gaps/issues, proceed with monitoring
- If FAIL (either phase): Run `*atdd` for missing tests, fix issues, re-run `*trace`
- If WAIVED: Deploy with approved waiver, schedule remediation

---

## Notes

Record any issues, deviations, or important observations during workflow execution:

- **Phase 1 Issues**: [Note any traceability mapping challenges, missing tests, quality concerns]
- **Phase 2 Issues**: [Note any missing, stale, or conflicting evidence]
- **Decision Rationale**: [Document any nuanced reasoning or edge cases]
- **Waiver Details**: [Document waiver negotiations or approvals]
- **Follow-up Actions**: [List any actions required after gate decision]

---

<!-- Powered by BMAD-CORE™ -->
File diff suppressed because it is too large
@@ -1,675 +0,0 @@

# Traceability Matrix & Gate Decision - Story {STORY_ID}

**Story:** {STORY_TITLE}
**Date:** {DATE}
**Evaluator:** {user_name or TEA Agent}

---

Note: This workflow does not generate tests. If gaps exist, run `*atdd` or `*automate` to create coverage.

## PHASE 1: REQUIREMENTS TRACEABILITY

### Coverage Summary

| Priority  | Total Criteria | FULL Coverage | Coverage % | Status       |
| --------- | -------------- | ------------- | ---------- | ------------ |
| P0        | {P0_TOTAL}     | {P0_FULL}     | {P0_PCT}%  | {P0_STATUS}  |
| P1        | {P1_TOTAL}     | {P1_FULL}     | {P1_PCT}%  | {P1_STATUS}  |
| P2        | {P2_TOTAL}     | {P2_FULL}     | {P2_PCT}%  | {P2_STATUS}  |
| P3        | {P3_TOTAL}     | {P3_FULL}     | {P3_PCT}%  | {P3_STATUS}  |
| **Total** | **{TOTAL}**    | **{FULL}**    | **{PCT}%** | **{STATUS}** |

**Legend:**

- ✅ PASS - Coverage meets quality gate threshold
- ⚠️ WARN - Coverage below threshold but not critical
- ❌ FAIL - Coverage below minimum threshold (blocker)
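
The Coverage % and Status columns are derived values. A minimal TypeScript sketch of how they could be computed is shown below; the thresholds (P0 100%, P1 90%, P2/P3 80%) and all names are assumptions for illustration, not fixed by this template:

```typescript
// Illustrative only: deriving Coverage % and Status per priority.
type Priority = 'P0' | 'P1' | 'P2' | 'P3';
type Coverage = 'FULL' | 'PARTIAL' | 'UNIT-ONLY' | 'INTEGRATION-ONLY' | 'NONE';

interface CriterionRow { priority: Priority; coverage: Coverage; }

function summarize(rows: CriterionRow[], priority: Priority, threshold: number) {
  const scoped = rows.filter((r) => r.priority === priority);
  const full = scoped.filter((r) => r.coverage === 'FULL').length;
  const pct = scoped.length ? Math.round((full / scoped.length) * 100) : 100;
  // P0 shortfalls are blockers; lower priorities degrade to a warning.
  const status = pct >= threshold ? '✅ PASS' : priority === 'P0' ? '❌ FAIL' : '⚠️ WARN';
  return { total: scoped.length, full, pct, status };
}
```

Driving the Status column from the same thresholds the Phase 2 gate uses keeps the two sections from contradicting each other.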

---

### Detailed Mapping

#### {CRITERION_ID}: {CRITERION_DESCRIPTION} ({PRIORITY})

- **Coverage:** {COVERAGE_STATUS} {STATUS_ICON}
- **Tests:**
  - `{TEST_ID}` - {TEST_FILE}:{LINE}
    - **Given:** {GIVEN}
    - **When:** {WHEN}
    - **Then:** {THEN}
  - `{TEST_ID_2}` - {TEST_FILE_2}:{LINE}
    - **Given:** {GIVEN_2}
    - **When:** {WHEN_2}
    - **Then:** {THEN_2}
- **Gaps:** (if PARTIAL or UNIT-ONLY or INTEGRATION-ONLY)
  - Missing: {MISSING_SCENARIO_1}
  - Missing: {MISSING_SCENARIO_2}
- **Recommendation:** {RECOMMENDATION_TEXT}

---

#### Example: AC-1: User can login with email and password (P0)

- **Coverage:** FULL ✅
- **Tests:**
  - `1.3-E2E-001` - tests/e2e/auth.spec.ts:12
    - **Given:** User has valid credentials
    - **When:** User submits login form
    - **Then:** User is redirected to dashboard
  - `1.3-UNIT-001` - tests/unit/auth-service.spec.ts:8
    - **Given:** Valid email and password hash
    - **When:** validateCredentials is called
    - **Then:** Returns user object

---

#### Example: AC-3: User can reset password via email (P1)

- **Coverage:** PARTIAL ⚠️
- **Tests:**
  - `1.3-E2E-003` - tests/e2e/auth.spec.ts:44
    - **Given:** User requests password reset
    - **When:** User clicks reset link in email
    - **Then:** User can set new password
- **Gaps:**
  - Missing: Email delivery validation
  - Missing: Expired token handling (error path)
  - Missing: Invalid token handling (security test)
  - Missing: Unit test for token generation logic
- **Recommendation:** Add `1.3-API-001` for email service integration testing and `1.3-UNIT-003` for token generation logic. Add `1.3-E2E-004` for error path validation (expired/invalid tokens).

---

### Gap Analysis

#### Critical Gaps (BLOCKER) ❌

{CRITICAL_GAP_COUNT} gaps found. **Do not release until resolved.**

1. **{CRITERION_ID}: {CRITERION_DESCRIPTION}** (P0)
   - Current Coverage: {COVERAGE_STATUS}
   - Missing Tests: {MISSING_TEST_DESCRIPTION}
   - Recommend: {RECOMMENDED_TEST_ID} ({RECOMMENDED_TEST_LEVEL})
   - Impact: {IMPACT_DESCRIPTION}

---

#### High Priority Gaps (PR BLOCKER) ⚠️

{HIGH_GAP_COUNT} gaps found. **Address before PR merge.**

1. **{CRITERION_ID}: {CRITERION_DESCRIPTION}** (P1)
   - Current Coverage: {COVERAGE_STATUS}
   - Missing Tests: {MISSING_TEST_DESCRIPTION}
   - Recommend: {RECOMMENDED_TEST_ID} ({RECOMMENDED_TEST_LEVEL})
   - Impact: {IMPACT_DESCRIPTION}

---

#### Medium Priority Gaps (Nightly) ⚠️

{MEDIUM_GAP_COUNT} gaps found. **Address in nightly test improvements.**

1. **{CRITERION_ID}: {CRITERION_DESCRIPTION}** (P2)
   - Current Coverage: {COVERAGE_STATUS}
   - Recommend: {RECOMMENDED_TEST_ID} ({RECOMMENDED_TEST_LEVEL})

---

#### Low Priority Gaps (Optional) ℹ️

{LOW_GAP_COUNT} gaps found. **Optional - add if time permits.**

1. **{CRITERION_ID}: {CRITERION_DESCRIPTION}** (P3)
   - Current Coverage: {COVERAGE_STATUS}

---

### Quality Assessment

#### Tests with Issues

**BLOCKER Issues** ❌

- `{TEST_ID}` - {ISSUE_DESCRIPTION} - {REMEDIATION}

**WARNING Issues** ⚠️

- `{TEST_ID}` - {ISSUE_DESCRIPTION} - {REMEDIATION}

**INFO Issues** ℹ️

- `{TEST_ID}` - {ISSUE_DESCRIPTION} - {REMEDIATION}

---

#### Example Quality Issues

**WARNING Issues** ⚠️

- `1.3-E2E-001` - 145 seconds (exceeds 90s target) - Optimize fixture setup to reduce test duration
- `1.3-UNIT-005` - 320 lines (exceeds 300 line limit) - Split into multiple focused test files

**INFO Issues** ℹ️

- `1.3-E2E-002` - Missing Given-When-Then structure - Refactor describe block to use BDD format

---

#### Tests Passing Quality Gates

**{PASSING_TEST_COUNT}/{TOTAL_TEST_COUNT} tests ({PASSING_PCT}%) meet all quality criteria** ✅

---

### Duplicate Coverage Analysis

#### Acceptable Overlap (Defense in Depth)

- {CRITERION_ID}: Tested at unit (business logic) and E2E (user journey) ✅

#### Unacceptable Duplication ⚠️

- {CRITERION_ID}: Same validation at E2E and Component level
  - Recommendation: Remove {TEST_ID} or consolidate with {OTHER_TEST_ID}

---

### Coverage by Test Level

| Test Level | Tests             | Criteria Covered     | Coverage %       |
| ---------- | ----------------- | -------------------- | ---------------- |
| E2E        | {E2E_COUNT}       | {E2E_CRITERIA}       | {E2E_PCT}%       |
| API        | {API_COUNT}       | {API_CRITERIA}       | {API_PCT}%       |
| Component  | {COMP_COUNT}      | {COMP_CRITERIA}      | {COMP_PCT}%      |
| Unit       | {UNIT_COUNT}      | {UNIT_CRITERIA}      | {UNIT_PCT}%      |
| **Total**  | **{TOTAL_TESTS}** | **{TOTAL_CRITERIA}** | **{TOTAL_PCT}%** |

---

### Traceability Recommendations

#### Immediate Actions (Before PR Merge)

1. **{ACTION_1}** - {DESCRIPTION}
2. **{ACTION_2}** - {DESCRIPTION}

#### Short-term Actions (This Sprint)

1. **{ACTION_1}** - {DESCRIPTION}
2. **{ACTION_2}** - {DESCRIPTION}

#### Long-term Actions (Backlog)

1. **{ACTION_1}** - {DESCRIPTION}

---

#### Example Recommendations

**Immediate Actions (Before PR Merge)**

1. **Add P1 Password Reset Tests** - Implement `1.3-API-001` for email service integration and `1.3-E2E-004` for error path validation. P1 coverage currently at 80%, target is 90%.
2. **Optimize Slow E2E Test** - Refactor `1.3-E2E-001` to use faster fixture setup. Currently 145s, target is <90s.

**Short-term Actions (This Sprint)**

1. **Enhance P2 Coverage** - Add E2E validation for session timeout (`1.3-E2E-005`). Currently UNIT-ONLY coverage.
2. **Split Large Test File** - Break `1.3-UNIT-005` (320 lines) into multiple focused test files (<300 lines each).

**Long-term Actions (Backlog)**

1. **Enrich P3 Coverage** - Add tests for edge cases in P3 criteria if time permits.

---

## PHASE 2: QUALITY GATE DECISION

**Gate Type:** {story | epic | release | hotfix}
**Decision Mode:** {deterministic | manual}

---

### Evidence Summary

#### Test Execution Results

- **Total Tests**: {total_count}
- **Passed**: {passed_count} ({pass_percentage}%)
- **Failed**: {failed_count} ({fail_percentage}%)
- **Skipped**: {skipped_count} ({skip_percentage}%)
- **Duration**: {total_duration}

**Priority Breakdown:**

- **P0 Tests**: {p0_passed}/{p0_total} passed ({p0_pass_rate}%) {✅ | ❌}
- **P1 Tests**: {p1_passed}/{p1_total} passed ({p1_pass_rate}%) {✅ | ⚠️ | ❌}
- **P2 Tests**: {p2_passed}/{p2_total} passed ({p2_pass_rate}%) {informational}
- **P3 Tests**: {p3_passed}/{p3_total} passed ({p3_pass_rate}%) {informational}

**Overall Pass Rate**: {overall_pass_rate}% {✅ | ⚠️ | ❌}

**Test Results Source**: {CI_run_id | test_report_url | local_run}

---

#### Coverage Summary (from Phase 1)

**Requirements Coverage:**

- **P0 Acceptance Criteria**: {p0_covered}/{p0_total} covered ({p0_coverage}%) {✅ | ❌}
- **P1 Acceptance Criteria**: {p1_covered}/{p1_total} covered ({p1_coverage}%) {✅ | ⚠️ | ❌}
- **P2 Acceptance Criteria**: {p2_covered}/{p2_total} covered ({p2_coverage}%) {informational}
- **Overall Coverage**: {overall_coverage}%

**Code Coverage** (if available):

- **Line Coverage**: {line_coverage}% {✅ | ⚠️ | ❌}
- **Branch Coverage**: {branch_coverage}% {✅ | ⚠️ | ❌}
- **Function Coverage**: {function_coverage}% {✅ | ⚠️ | ❌}

**Coverage Source**: {coverage_report_url | coverage_file_path}

---

#### Non-Functional Requirements (NFRs)

**Security**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}

- Security Issues: {security_issue_count}
- {details_if_issues}

**Performance**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}

- {performance_metrics_summary}

**Reliability**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}

- {reliability_metrics_summary}

**Maintainability**: {PASS | CONCERNS | FAIL | NOT_ASSESSED} {✅ | ⚠️ | ❌}

- {maintainability_metrics_summary}

**NFR Source**: {nfr_assessment_file_path | not_assessed}

---

#### Flakiness Validation

**Burn-in Results** (if available):

- **Burn-in Iterations**: {iteration_count} (e.g., 10)
- **Flaky Tests Detected**: {flaky_test_count} {✅ if 0 | ❌ if >0}
- **Stability Score**: {stability_percentage}%

**Flaky Tests List** (if any):

- {flaky_test_1_name} - {failure_rate}
- {flaky_test_2_name} - {failure_rate}

**Burn-in Source**: {CI_burn_in_run_id | not_available}

---

### Decision Criteria Evaluation

#### P0 Criteria (Must ALL Pass)

| Criterion             | Threshold | Actual                    | Status              |
| --------------------- | --------- | ------------------------- | ------------------- |
| P0 Coverage           | 100%      | {p0_coverage}%            | {✅ PASS / ❌ FAIL} |
| P0 Test Pass Rate     | 100%      | {p0_pass_rate}%           | {✅ PASS / ❌ FAIL} |
| Security Issues       | 0         | {security_issue_count}    | {✅ PASS / ❌ FAIL} |
| Critical NFR Failures | 0         | {critical_nfr_fail_count} | {✅ PASS / ❌ FAIL} |
| Flaky Tests           | 0         | {flaky_test_count}        | {✅ PASS / ❌ FAIL} |

**P0 Evaluation**: {✅ ALL PASS | ❌ ONE OR MORE FAILED}

---

#### P1 Criteria (Required for PASS, May Accept for CONCERNS)

| Criterion              | Threshold                 | Actual               | Status                            |
| ---------------------- | ------------------------- | -------------------- | --------------------------------- |
| P1 Coverage            | ≥{min_p1_coverage}%       | {p1_coverage}%       | {✅ PASS / ⚠️ CONCERNS / ❌ FAIL} |
| P1 Test Pass Rate      | ≥{min_p1_pass_rate}%      | {p1_pass_rate}%      | {✅ PASS / ⚠️ CONCERNS / ❌ FAIL} |
| Overall Test Pass Rate | ≥{min_overall_pass_rate}% | {overall_pass_rate}% | {✅ PASS / ⚠️ CONCERNS / ❌ FAIL} |
| Overall Coverage       | ≥{min_coverage}%          | {overall_coverage}%  | {✅ PASS / ⚠️ CONCERNS / ❌ FAIL} |

**P1 Evaluation**: {✅ ALL PASS | ⚠️ SOME CONCERNS | ❌ FAILED}

---

#### P2/P3 Criteria (Informational, Don't Block)

| Criterion         | Actual          | Notes                                                        |
| ----------------- | --------------- | ------------------------------------------------------------ |
| P2 Test Pass Rate | {p2_pass_rate}% | {allow_p2_failures ? "Tracked, doesn't block" : "Evaluated"} |
| P3 Test Pass Rate | {p3_pass_rate}% | {allow_p3_failures ? "Tracked, doesn't block" : "Evaluated"} |

---

### GATE DECISION: {PASS | CONCERNS | FAIL | WAIVED}

---

### Rationale

{Explain decision based on criteria evaluation}

{Highlight key evidence that drove decision}

{Note any assumptions or caveats}

**Example (PASS):**

> All P0 criteria met with 100% coverage and pass rates across critical tests. All P1 criteria exceeded thresholds with 98% overall pass rate and 92% coverage. No security issues detected. No flaky tests in validation. Feature is ready for production deployment with standard monitoring.

**Example (CONCERNS):**

> All P0 criteria met, ensuring critical user journeys are protected. However, P1 coverage (88%) falls below threshold (90%) due to missing E2E test for AC-5 edge case. Overall pass rate (96%) is excellent. Issues are non-critical and have acceptable workarounds. Risk is low enough to deploy with enhanced monitoring.

**Example (FAIL):**

> CRITICAL BLOCKERS DETECTED:
>
> 1. P0 coverage incomplete (80%) - AC-2 security validation missing
> 2. P0 test failures (75% pass rate) in core search functionality
> 3. Unresolved SQL injection vulnerability in search filter (CRITICAL)
>
> Release MUST BE BLOCKED until P0 issues are resolved. Security vulnerability cannot be waived.

**Example (WAIVED):**

> Original decision was FAIL due to P0 test failure in legacy Excel 2007 export module (affects <1% of users). However, release contains critical GDPR compliance features required by regulatory deadline (Oct 15). Business has approved waiver given:
>
> - Regulatory priority overrides legacy module risk
> - Workaround available (use Excel 2010+)
> - Issue will be fixed in v2.4.1 hotfix (due Oct 20)
> - Enhanced monitoring in place

---

### {Section: Delete if not applicable}

#### Residual Risks (For CONCERNS or WAIVED)

List unresolved P1/P2 issues that don't block release but should be tracked:

1. **{Risk Description}**
   - **Priority**: P1 | P2
   - **Probability**: Low | Medium | High
   - **Impact**: Low | Medium | High
   - **Risk Score**: {probability × impact}
   - **Mitigation**: {workaround or monitoring plan}
   - **Remediation**: {fix in next sprint/release}

**Overall Residual Risk**: {LOW | MEDIUM | HIGH}

---

#### Waiver Details (For WAIVED only)

**Original Decision**: ❌ FAIL

**Reason for Failure**:

- {list_of_blocking_issues}

**Waiver Information**:

- **Waiver Reason**: {business_justification}
- **Waiver Approver**: {name}, {role} (e.g., Jane Doe, VP Engineering)
- **Approval Date**: {YYYY-MM-DD}
- **Waiver Expiry**: {YYYY-MM-DD} (**NOTE**: Does NOT apply to next release)

**Monitoring Plan**:

- {enhanced_monitoring_1}
- {enhanced_monitoring_2}
- {escalation_criteria}

**Remediation Plan**:

- **Fix Target**: {next_release_version} (e.g., v2.4.1 hotfix)
- **Due Date**: {YYYY-MM-DD}
- **Owner**: {team_or_person}
- **Verification**: {how_fix_will_be_verified}

**Business Justification**:

{detailed_explanation_of_why_waiver_is_acceptable}

---

#### Critical Issues (For FAIL or CONCERNS)

Top blockers requiring immediate attention:

| Priority | Issue         | Description         | Owner        | Due Date     | Status             |
| -------- | ------------- | ------------------- | ------------ | ------------ | ------------------ |
| P0       | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
| P0       | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |
| P1       | {issue_title} | {brief_description} | {owner_name} | {YYYY-MM-DD} | {OPEN/IN_PROGRESS} |

**Blocking Issues Count**: {p0_blocker_count} P0 blockers, {p1_blocker_count} P1 issues

---

### Gate Recommendations

#### For PASS Decision ✅

1. **Proceed to deployment**
   - Deploy to staging environment
   - Validate with smoke tests
   - Monitor key metrics for 24-48 hours
   - Deploy to production with standard monitoring

2. **Post-Deployment Monitoring**
   - {metric_1_to_monitor}
   - {metric_2_to_monitor}
   - {alert_thresholds}

3. **Success Criteria**
   - {success_criterion_1}
   - {success_criterion_2}

---

#### For CONCERNS Decision ⚠️

1. **Deploy with Enhanced Monitoring**
   - Deploy to staging with extended validation period
   - Enable enhanced logging/monitoring for known risk areas:
     - {risk_area_1}
     - {risk_area_2}
   - Set aggressive alerts for potential issues
   - Deploy to production with caution

2. **Create Remediation Backlog**
   - Create story: "{fix_title_1}" (Priority: {priority})
   - Create story: "{fix_title_2}" (Priority: {priority})
   - Target sprint: {next_sprint}

3. **Post-Deployment Actions**
   - Monitor {specific_areas} closely for {time_period}
   - Weekly status updates on remediation progress
   - Re-assess after fixes deployed

---

#### For FAIL Decision ❌

1. **Block Deployment Immediately**
   - Do NOT deploy to any environment
   - Notify stakeholders of blocking issues
   - Escalate to tech lead and PM

2. **Fix Critical Issues**
   - Address P0 blockers listed in Critical Issues section
   - Owner assignments confirmed
   - Due dates agreed upon
   - Daily standup on blocker resolution

3. **Re-Run Gate After Fixes**
   - Re-run full test suite after fixes
   - Re-run `bmad tea *trace` workflow
   - Verify decision is PASS before deploying

---

#### For WAIVED Decision 🔓

1. **Deploy with Business Approval**
   - Confirm waiver approver has signed off
   - Document waiver in release notes
   - Notify all stakeholders of waived risks

2. **Aggressive Monitoring**
   - {enhanced_monitoring_plan}
   - {escalation_procedures}
   - Daily checks on waived risk areas

3. **Mandatory Remediation**
   - Fix MUST be completed by {due_date}
   - Issue CANNOT be waived in next release
   - Track remediation progress weekly
   - Verify fix in next gate

---

### Next Steps

**Immediate Actions** (next 24-48 hours):

1. {action_1}
2. {action_2}
3. {action_3}

**Follow-up Actions** (next sprint/release):

1. {action_1}
2. {action_2}
3. {action_3}

**Stakeholder Communication**:

- Notify PM: {decision_summary}
- Notify SM: {decision_summary}
- Notify DEV lead: {decision_summary}

---

## Integrated YAML Snippet (CI/CD)

```yaml
traceability_and_gate:
  # Phase 1: Traceability
  traceability:
    story_id: "{STORY_ID}"
    date: "{DATE}"
    coverage:
      overall: {OVERALL_PCT}%
      p0: {P0_PCT}%
      p1: {P1_PCT}%
      p2: {P2_PCT}%
      p3: {P3_PCT}%
    gaps:
      critical: {CRITICAL_COUNT}
      high: {HIGH_COUNT}
      medium: {MEDIUM_COUNT}
      low: {LOW_COUNT}
    quality:
      passing_tests: {PASSING_COUNT}
      total_tests: {TOTAL_TESTS}
      blocker_issues: {BLOCKER_COUNT}
      warning_issues: {WARNING_COUNT}
    recommendations:
      - "{RECOMMENDATION_1}"
      - "{RECOMMENDATION_2}"

  # Phase 2: Gate Decision
  gate_decision:
    decision: "{PASS | CONCERNS | FAIL | WAIVED}"
    gate_type: "{story | epic | release | hotfix}"
    decision_mode: "{deterministic | manual}"
    criteria:
      p0_coverage: {p0_coverage}%
      p0_pass_rate: {p0_pass_rate}%
      p1_coverage: {p1_coverage}%
      p1_pass_rate: {p1_pass_rate}%
      overall_pass_rate: {overall_pass_rate}%
      overall_coverage: {overall_coverage}%
      security_issues: {security_issue_count}
      critical_nfrs_fail: {critical_nfr_fail_count}
      flaky_tests: {flaky_test_count}
    thresholds:
      min_p0_coverage: 100
      min_p0_pass_rate: 100
      min_p1_coverage: {min_p1_coverage}
      min_p1_pass_rate: {min_p1_pass_rate}
      min_overall_pass_rate: {min_overall_pass_rate}
      min_coverage: {min_coverage}
    evidence:
      test_results: "{CI_run_id | test_report_url}"
      traceability: "{trace_file_path}"
      nfr_assessment: "{nfr_file_path}"
      code_coverage: "{coverage_report_url}"
    next_steps: "{brief_summary_of_recommendations}"
    waiver: # Only if WAIVED
      reason: "{business_justification}"
      approver: "{name}, {role}"
      expiry: "{YYYY-MM-DD}"
      remediation_due: "{YYYY-MM-DD}"
```

---

## Related Artifacts

- **Story File:** {STORY_FILE_PATH}
- **Test Design:** {TEST_DESIGN_PATH} (if available)
- **Tech Spec:** {TECH_SPEC_PATH} (if available)
- **Test Results:** {TEST_RESULTS_PATH}
- **NFR Assessment:** {NFR_FILE_PATH} (if available)
- **Test Files:** {TEST_DIR_PATH}

---

## Sign-Off

**Phase 1 - Traceability Assessment:**

- Overall Coverage: {OVERALL_PCT}%
- P0 Coverage: {P0_PCT}% {P0_STATUS}
- P1 Coverage: {P1_PCT}% {P1_STATUS}
- Critical Gaps: {CRITICAL_COUNT}
- High Priority Gaps: {HIGH_COUNT}

**Phase 2 - Gate Decision:**

- **Decision**: {PASS | CONCERNS | FAIL | WAIVED} {STATUS_ICON}
- **P0 Evaluation**: {✅ ALL PASS | ❌ ONE OR MORE FAILED}
- **P1 Evaluation**: {✅ ALL PASS | ⚠️ SOME CONCERNS | ❌ FAILED}

**Overall Status:** {STATUS} {STATUS_ICON}

**Next Steps:**

- If PASS ✅: Proceed to deployment
- If CONCERNS ⚠️: Deploy with monitoring, create remediation backlog
- If FAIL ❌: Block deployment, fix critical issues, re-run workflow
- If WAIVED 🔓: Deploy with business approval and aggressive monitoring

**Generated:** {DATE}
**Workflow:** testarch-trace v4.0 (Enhanced with Gate Decision)

---

<!-- Powered by BMAD-CORE™ -->
@@ -1,55 +0,0 @@
# Test Architect workflow: trace (enhanced with gate decision)
name: testarch-trace
description: "Generate requirements-to-tests traceability matrix, analyze coverage, and make quality gate decision (PASS/CONCERNS/FAIL/WAIVED)"
author: "BMad"

# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
document_output_language: "{config_source}:document_output_language"
date: system-generated

# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/testarch/trace"
instructions: "{installed_path}/instructions.md"
validation: "{installed_path}/checklist.md"
template: "{installed_path}/trace-template.md"

# Variables and inputs
variables:
  # Directory paths
  test_dir: "{project-root}/tests" # Root test directory
  source_dir: "{project-root}/src" # Source code directory

  # Workflow behavior
  coverage_levels: "e2e,api,component,unit" # Which test levels to trace
  gate_type: "story" # story | epic | release | hotfix - determines gate scope
  decision_mode: "deterministic" # deterministic (rule-based) | manual (team decision)

  # Output configuration
  default_output_file: "{output_folder}/traceability-matrix.md"

# Required tools
required_tools:
  - read_file # Read story, test files, BMad artifacts
  - write_file # Create traceability matrix, gate YAML
  - list_files # Discover test files
  - search_repo # Find tests by test ID, describe blocks
  - glob # Find test files matching patterns

tags:
  - qa
  - traceability
  - test-architect
  - coverage
  - requirements
  - gate
  - decision
  - release

execution_hints:
  interactive: false # Minimize prompts
  autonomous: true # Proceed without user input unless blocked
  iterative: true