Map what the test touches
Flaky tests waste engineering hours and erode trust in CI. Codna detects flaky tests through the dependency graph: it finds the shared-state, ordering, or timing cause deterministically, then fixes it and verifies that the test passes.
The problem
A test that passes alone but fails in the suite almost never has a bug in the test itself. The cause is usually shared state left behind by a neighbor, async work that resolves in a different order under load, or a global side effect set several files away. Engineers lose hours rerunning the suite, bisecting test order, and staring at logs that change between runs. The easy escape — adding a retry or a sleep — hides the flake instead of fixing it, and the next regression lands on top of it. Codna's blast-radius graph surfaces what the test actually depends on, so an AI debugging tool can fix the cause instead of masking the symptom.
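To make the failure mode concrete, here is a minimal sketch of a check that passes alone but fails in a suite because a neighbor leaves shared state behind. Everything in it (`_cache`, `get_discount`, the `check_*` functions, and the tiny `run` helper) is hypothetical and written for illustration; it is not Codna code or output. The `reset=True` path shows the kind of root-cause fix (isolating state per test) that a retry or a sleep would only paper over.

```python
# Hypothetical module-level state shared by two checks.
_cache = {}


def get_discount(user_id):
    """Return the cached discount, defaulting to 0 for unknown users."""
    return _cache.setdefault(user_id, 0)


def check_promo_sets_discount():
    _cache["alice"] = 20  # side effect that is never cleaned up
    assert get_discount("alice") == 20


def check_default_discount():
    # Passes alone; fails when check_promo_sets_discount ran first.
    assert get_discount("alice") == 0


def run(checks, reset=False):
    """Run checks in order, optionally clearing shared state before each."""
    results = {}
    for check in checks:
        if reset:
            _cache.clear()  # root-cause fix: isolate state per check
        try:
            check()
            results[check.__name__] = "pass"
        except AssertionError:
            results[check.__name__] = "fail"
    return results


_cache.clear()
alone = run([check_default_discount])
_cache.clear()
in_suite = run([check_promo_sets_discount, check_default_discount])
_cache.clear()
fixed = run([check_promo_sets_discount, check_default_discount], reset=True)

print("alone:   ", alone)
print("in suite:", in_suite)
print("fixed:   ", fixed)
```

In a real pytest suite the same isolation would normally live in a fixture rather than a hand-written loop; the loop is used here only to keep the sketch self-contained.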
How Codna fixes it
Codna's deterministic engine resolves the whole repo into a dependency and blast-radius graph in about 60 ms, using zero LLM tokens, and surfaces the fixtures, globals, and ordering the test really depends on.
The agent works from a ~600-token evidence bundle scoped to the suspect cause, 162x less context than reading the whole suite, so it targets the shared state or timing problem instead of reaching for a retry.
Codna re-runs the test and its neighbors against your own suite, so the fix lands only after the flake is actually gone.
codna fix . --issue "test_orders is flaky under parallel run"
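The verification step described above, re-running a test together with its neighbors, can be approximated by hand. The sketch below is an assumption about how such a check might look, not Codna's implementation: `rerun_shuffled`, `registry`, and the `check_*` functions are hypothetical names. It runs the checks repeatedly in seeded random orders and records every failure, so a fix counts as verified only when no ordering fails.

```python
import random

registry = []  # hypothetical shared state that leaks between checks


def reset_state():
    registry.clear()


def check_register_order():
    registry.append("order-1")
    assert registry == ["order-1"]


def check_registry_starts_empty():
    assert registry == []


def rerun_shuffled(checks, setup, rounds=20, seed=0):
    """Run the checks `rounds` times in seeded random orders.

    Returns one (round, order, failing check name) tuple per failure.
    """
    rng = random.Random(seed)  # seeded so a failing order can be replayed
    failures = []
    for round_no in range(rounds):
        order = list(checks)
        rng.shuffle(order)
        for check in order:
            setup()
            try:
                check()
            except AssertionError:
                names = [c.__name__ for c in order]
                failures.append((round_no, names, check.__name__))
    return failures


checks = [check_register_order, check_registry_starts_empty]

# Before the fix: nothing resets the shared list between checks.
before_fix = rerun_shuffled(checks, setup=lambda: None)

# After the fix: state is reset before every check.
registry.clear()
after_fix = rerun_shuffled(checks, setup=reset_state)

print("failures before fix:", len(before_fix))
print("failures after fix: ", len(after_fix))
```

Seeding the shuffle is a deliberate choice: an order that exposes the flake can be reproduced exactly instead of being lost on the next run.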
What you get
A zero-token map of the repo as a dependency and blast-radius graph in ~60 ms, with no RAG and no embeddings, so the real cause of the flake surfaces instead of a guess.
Every patch re-runs the failing test and its neighbors against your suite before it lands, so a flake is proven gone rather than masked with a retry.
The agent fixes from a ~600-token evidence bundle instead of reading the whole suite, so a stabilized test costs pennies.
The proof
Codna targets the root cause (shared state, async ordering, or timing) by tracing what the test depends on in the blast-radius graph, then verifies the fix by re-running the test. It does not mask the flake with a retry or a sleep.
Codna maps the repo deterministically and surfaces the fixtures, globals, and ordering a test depends on for zero tokens. That points straight at the likely cause, so you skip the rerun-and-bisect loop.
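As a rough illustration of what a dependency and blast-radius graph can answer, the sketch below walks a small hand-written graph. It is not Codna's engine: the file names, the `DEPENDS_ON` table, and both helper functions are hypothetical, and a real tool would derive the edges from source code rather than from a literal dictionary. The idea is simply that two tests sharing a transitive dependency have a candidate channel for leaked state.

```python
from collections import deque

# Hypothetical edges: each key depends on the names in its list.
DEPENDS_ON = {
    "tests/test_orders.py": ["orders.py", "conftest.py"],
    "tests/test_billing.py": ["billing.py", "conftest.py"],
    "orders.py": ["db.py", "settings.py"],
    "billing.py": ["db.py"],
    "conftest.py": ["db.py"],
    "db.py": ["settings.py"],
    "settings.py": [],
}


def transitive_dependencies(node, graph):
    """Return everything reachable from `node` along graph edges (BFS)."""
    seen, queue = set(), deque([node])
    while queue:
        current = queue.popleft()
        for dep in graph.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def blast_radius(node, graph):
    """Return everything that transitively depends on `node`."""
    reverse = {}
    for src, deps in graph.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(src)
    return transitive_dependencies(node, reverse)


touched = transitive_dependencies("tests/test_orders.py", DEPENDS_ON)
neighbor = transitive_dependencies("tests/test_billing.py", DEPENDS_ON)
shared = touched & neighbor  # candidate channels for leaked state
affected = blast_radius("conftest.py", DEPENDS_ON)

print("test_orders touches:  ", sorted(touched))
print("shared with billing:  ", sorted(shared))
print("conftest blast radius:", sorted(affected))
```

Because the traversal is plain graph search with no model calls, the same input always yields the same answer, which is the property the deterministic map relies on.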
The native GitHub App can triage a flaky check and open a test-verified fix PR with the root cause attached, with no log copy-paste required. You can also run Codna from the CLI or via the MCP server in Cursor or Claude.
A verified fix costs about $0.04. The deterministic map costs zero LLM tokens, so the agent works from a ~600-token evidence bundle instead of the whole suite.
Codna builds the dependency graph from the source itself, so it works across languages and whatever runner you already use — pytest, Vitest, Jest, node:test, go test, and others. Your existing runner is what verifies the fix.
Codna supports self-hosting with your own keys (BYOK), fail-closed egress, and no training on your code.
Related