Testing
manual
Claude Code, Cursor

Flaky Test Triage

Run failing tests repeatedly, classify each failure as flaky or real, and fix only confirmed regressions.

by loops!

Use loop copies the kickoff prompt. Share copies the loop link. Open in Cursor and Open in Claude Code only pre-fill that prompt; they do not install hook files. Download loop saves a README and the kickoff into .cursor/loops/flaky-test-triage/ and requires no hooks.

Guardrails
Hardened
Anti-gaming rules
Rules the agent must follow so it cannot cheat the exit condition.
  • Do not modify the check command or exit criteria to force success
  • Do not skip, disable, or bypass checks to pass the exit condition
  • If stuck after several iterations, stop and report blockers instead of gaming metrics
  • Do not weaken, delete, or skip tests to make the suite pass
  • Do not replace real assertions with trivial always-pass tests
  • Prefer fixing production code over patching tests to go green
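
The rules about skipping or weakening tests can be partly checked mechanically. The sketch below is one possible reviewer-side check, not something the loop itself ships: it scans the added lines of a unified diff for Jest/Mocha-style skip or focus markers. The function name `findWeakenedTests` and the pattern list are illustrative assumptions, and the list is not exhaustive.

```typescript
// Patterns that commonly indicate a skipped or narrowed test in
// Jest/Mocha-style suites. Illustrative only; extend for your runner.
const SUSPICIOUS_PATTERNS: RegExp[] = [
  /\b(it|test|describe)\.skip\s*\(/,
  /\bx(it|test|describe)\s*\(/,
  /\b(it|test|describe)\.only\s*\(/,
];

// Returns the added lines of a unified diff that match a suspicious pattern.
// (Hypothetical helper; the name and signature are assumptions.)
function findWeakenedTests(unifiedDiff: string): string[] {
  const hits: string[] = [];
  for (const line of unifiedDiff.split("\n")) {
    // Added lines start with "+"; "+++" is the file header, not content.
    const isAdded = line.charAt(0) === "+" && line.slice(0, 3) !== "+++";
    if (!isAdded) continue;
    const added = line.slice(1);
    for (const pattern of SUSPICIOUS_PATTERNS) {
      if (pattern.test(added)) {
        hits.push(added.trim());
        break;
      }
    }
  }
  return hits;
}
```

A non-empty result does not prove gaming; it only flags lines a human should look at before accepting the agent's changes.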
How to run this loop
Prompt only
Run “Flaky Test Triage” in your agent
Deeplinks and “Open in Cursor” only paste the kickoff prompt. They do not install hook files — your agent cannot tell whether files are on disk until you add them yourself.

Two separate pieces

  • Kickoff prompt — tells the agent the goal, check command, exit condition, and how to self-pace between passes.
  1. Copy or open the kickoff prompt

     Click Use loop to copy the kickoff into your clipboard. Open in Cursor and Open in Claude Code only open the agent with that prompt; they do not configure automation for you.

  2. Paste into your coding agent

     Start a chat in Cursor, Claude Code, Codex, or any agent. Paste the kickoff. The prompt includes the goal, iteration limit, shell check, and first step.

  3. Agent self-paces until done

     The agent runs the loop: act → run check command → read output → repeat until the exit condition is met or max iterations is reached. No install step is required for prompt-only loops.
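
The act → check → repeat cycle described above can be sketched as a small driver. This is only an illustration of the control flow, assuming the agent's work and the check command are supplied as callbacks; `runLoop`, `CheckResult`, and `LoopOutcome` are hypothetical names, not part of any loops! tooling.

```typescript
// Result of one run of the check command (hypothetical shape).
interface CheckResult {
  exitConditionMet: boolean;
  summary: string;
}

interface LoopOutcome {
  iterations: number;
  stoppedBecause: "exit-condition" | "max-iterations";
  statusUpdates: string[];
}

// act:   makes one round of changes (hypothetical callback).
// check: runs the check command and interprets its output (hypothetical callback).
function runLoop(
  act: (iteration: number) => void,
  check: () => CheckResult,
  maxIterations: number
): LoopOutcome {
  const statusUpdates: string[] = [];
  for (let i = 1; i <= maxIterations; i++) {
    act(i);
    const result = check();
    // One short status update per pass, as the kickoff prompt asks.
    statusUpdates.push("pass " + i + ": " + result.summary);
    if (result.exitConditionMet) {
      return { iterations: i, stoppedBecause: "exit-condition", statusUpdates };
    }
  }
  // The iteration cap is a hard stop even if the exit condition was never met.
  return { iterations: maxIterations, stoppedBecause: "max-iterations", statusUpdates };
}
```

The point of the sketch is that the loop has two independent stop conditions: success, or running out of iterations, at which point the agent should report blockers rather than keep going.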

Full guide with Cursor /loop notes: How to install loops

Loop diagram (Testing, manual trigger): Manual start → Run failing tests → Classify failures → Fix real failures → Confirm stability → Exit: all failures classified; real regressions fixed; flaky tests documented or stabilized.
Steps
What the agent does on each pass.

1. Run failing tests

Run the failing test file or suite 3–5 times. Record the pass/fail pattern for each test.

npm test -- --testPathPattern=<failing-suite>
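
Recording the pass/fail pattern per test can be sketched as follows. The sketch assumes a callback, here called `runOnce`, that runs the command above once and returns each test's outcome by name; how that callback parses the runner's output is left open. `recordRuns`, `RunOutcome`, and `PassFailHistory` are hypothetical names.

```typescript
// Outcome of each test in one run, keyed by test name (hypothetical shape).
type RunOutcome = Record<string, "pass" | "fail">;

// Per-test pattern of marks, one character per run, e.g. { "saves draft": "FPF" }.
type PassFailHistory = Record<string, string>;

// Runs the suite `times` times via the supplied callback and appends
// "P" or "F" to each test's pattern after every run.
function recordRuns(runOnce: () => RunOutcome, times: number): PassFailHistory {
  const patterns: PassFailHistory = {};
  for (let i = 0; i < times; i++) {
    const outcome = runOnce();
    for (const testName of Object.keys(outcome)) {
      const mark = outcome[testName] === "pass" ? "P" : "F";
      // Note: a test missing from some runs ends up with a shorter pattern.
      patterns[testName] = (patterns[testName] || "") + mark;
    }
  }
  return patterns;
}
```

Keeping the full pattern, rather than only a failure count, preserves ordering information such as "fails only on the first run", which can hint at setup or caching problems.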

2. Classify failures

Label each failure as flaky (intermittent) or real (consistent). Note timing, ordering, or env dependencies.
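
The classification rule reduces to a simple check on each test's run history. The sketch below assumes each test's results are encoded as a string of "P" and "F" marks, one per run; that encoding and the names `classify` and `Verdict` are assumptions made for illustration.

```typescript
type Verdict = "passing" | "flaky" | "real";

// pattern: one "P" or "F" per run for a single test (an assumed encoding).
function classify(pattern: string): Verdict {
  const failures = pattern.split("").filter((mark) => mark === "F").length;
  if (failures === 0) return "passing";           // never failed in the sample
  if (failures === pattern.length) return "real"; // failed on every run
  return "flaky";                                 // mixed results
}

const verdictMixed = classify("PFPPF");     // "flaky"
const verdictConsistent = classify("FFFF"); // "real"
```

With only a few runs the labels are provisional: a test that failed every time may still depend on ordering or environment, which is why the step also asks for notes on timing, ordering, and env dependencies.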

3. Fix real failures

Fix confirmed real failures with minimal changes. For flaky tests, propose stabilization (retries, isolation, mocks).
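
As one illustration of the "mocks" option, a test that depends on wall-clock time can often be stabilized by letting the caller supply the clock. This is a minimal sketch under that assumption; `isExpired`, `Clock`, and the timestamps are hypothetical and stand in for whatever time-dependent code the flaky test exercises.

```typescript
// A clock the caller can swap out; production code uses the real one by default.
type Clock = () => number;

const systemClock: Clock = () => Date.now();

// Hypothetical production function: has something created at `createdAtMs`
// reached its time-to-live?
function isExpired(createdAtMs: number, ttlMs: number, now: Clock = systemClock): boolean {
  return now() - createdAtMs >= ttlMs;
}

// In a test, a fixed clock removes the dependence on machine speed.
const fixedClock: Clock = () => 10000;
const justBeforeTtl = isExpired(9001, 1000, fixedClock); // false: 999 ms elapsed
const exactlyAtTtl = isExpired(9000, 1000, fixedClock);  // true: 1000 ms elapsed
```

This keeps the assertion itself unchanged and makes the production code testable, which fits the guardrail of preferring production fixes over patching tests. Retries, by contrast, hide the nondeterminism rather than remove it, so they are better treated as a documented stopgap.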

4. Confirm stability

Re-run the suite multiple times to confirm real failures are gone and flakiness is reduced or documented.

npm test -- --testPathPattern=<failing-suite>
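
How many re-runs count as "confirmed" depends on how rare the flake is. As a rough guide, and assuming runs are independent (shared state between runs can break this), a test that fails with probability p per run passes n runs in a row with probability (1 - p)^n. The helper names below are hypothetical.

```typescript
// Probability that a test with per-run failure probability p passes n runs
// in a row, assuming independent runs.
function chanceAllPass(p: number, n: number): number {
  return Math.pow(1 - p, n);
}

// Smallest n such that a flake with failure rate p would fail at least once
// in n runs with probability >= confidence.
function runsNeeded(p: number, confidence: number): number {
  if (p <= 0 || p >= 1 || confidence <= 0 || confidence >= 1) {
    throw new RangeError("p and confidence must be strictly between 0 and 1");
  }
  return Math.ceil(Math.log(1 - confidence) / Math.log(1 - p));
}
```

For example, a test that fails 10% of the time still passes five runs in a row about 59% of the time (0.9^5 ≈ 0.59), and `runsNeeded(0.1, 0.95)` gives 29. A clean streak of 3–5 runs is therefore good evidence that a consistent failure is fixed, but only weak evidence that a low-rate flake is gone, so such tests should stay documented.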
Kickoff prompt
Copy this into your coding agent to start the loop.
Start the "Flaky Test Triage" loop.

Goal: classify failing tests as flaky vs real and fix only real regressions
Max iterations: 5
Between iterations run: npm test -- --testPathPattern=<failing-suite>
Exit when: every failure is classified and real regressions are fixed or explicitly deferred

Step 1: Run the failing suite multiple times. Classify each failure, fix real ones, and document flaky behavior.

Self-pace this loop. After each iteration, run the check command, read the output, and only continue if the exit condition is not met. Stop when the exit condition passes or max iterations is reached. Give a short status update each pass.
