Autonomous PR Review Loop
AgentArmy's pull requests are reviewed automatically by three remote bots — Gemini Code Assist, the GitHub Copilot reviewer, and Codex. The autonomous review loop lets Claude close the loop on that feedback: it reads the bots' comments, fixes the code, pushes, the bots re-review, and it repeats until the PR is clean — escalating to a human only when it stalls.
It is opt-in per PR (you add a label) and bounded (a round cap), so it never runs away.
How to start it
Add the review-loop label to an open PR:
gh pr edit <PR> --add-label review-loop
That's it. From then on, every time a review bot posts feedback, the loop runs.
To stop it at any time, remove the label:
gh pr edit <PR> --remove-label review-loop
The flow
You add the `review-loop` label
↓
A review bot (Gemini / Copilot / Codex) submits review comments
↓
review-loop.yml (the "brain") runs:
• already asked Claude for this commit? → wait
• bots re-reviewed with nothing actionable? → ✅ converged (label review-loop:done)
• round cap reached? → 🙋 escalate to a HITL Decision artifact (label review-loop:escalated)
• otherwise → post ONE aggregated @claude comment listing the feedback
↓
claude.yml (the "hands") runs:
• Claude addresses each item, commits, and pushes (with the PAT)
↓
the push re-triggers Gemini / Copilot → they re-review
↓
back to the top
Two workflows split the work so each stays simple:
| Workflow | Role |
|---|---|
| `review-loop.yml` | The brain: decides whether to iterate, counts rounds, checks convergence, escalates. Never edits code. |
| `claude.yml` | The hands: the `@claude` responder. Reads the aggregated feedback, edits, commits, pushes, replies to threads. Also handles manual `@claude` mentions. |
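The brain's decision order from the flow above can be sketched as a small pure function. This is a minimal illustration, not the workflow's actual code: `LoopState`, `Action`, and `decide` are hypothetical names, and the fields are assumed to be derived from the GitHub API before the decision is made.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WAIT = "wait"              # Claude was already asked about this commit
    CONVERGED = "converged"    # swap review-loop -> review-loop:done
    ESCALATE = "escalate"      # open a Decision artifact, label review-loop:escalated
    ASK_CLAUDE = "ask_claude"  # post ONE aggregated @claude comment


@dataclass
class LoopState:
    # Hypothetical inputs; the real workflow would compute these from PR data.
    already_asked_for_head: bool  # an @claude request already exists for the head commit
    bot_reviewed_head: bool       # at least one bot has reviewed the head commit
    actionable_comments: int      # non-outdated inline bot comments
    rounds_so_far: int            # number of round markers already posted
    max_rounds: int = 3           # documented default round cap


def decide(state: LoopState) -> Action:
    """Apply the four checks in the same order as the flow diagram."""
    if state.already_asked_for_head:
        return Action.WAIT
    if state.bot_reviewed_head and state.actionable_comments == 0:
        return Action.CONVERGED
    if state.rounds_so_far >= state.max_rounds:
        return Action.ESCALATE
    return Action.ASK_CLAUDE
```

Keeping the decision free of side effects (no API calls, no label edits) is one way to make the ordering easy to test; the caller would then act on the returned `Action`.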
Rounds, convergence, and escalation
Round tracking. Each time the loop asks Claude to fix something, it posts a hidden marker (<!-- review-loop round=N -->) in its comment. The round number is just how many of those markers exist.
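As a rough sketch of that counting rule, the round number can be recovered by scanning comment bodies for the marker. The helper names are hypothetical, and the assumption that the next request is numbered "count + 1" is mine rather than something the workflow documents.

```python
import re

# Matches the hidden marker described above, e.g. <!-- review-loop round=2 -->
ROUND_MARKER = re.compile(r"<!--\s*review-loop round=(\d+)\s*-->")


def count_rounds(comment_bodies):
    """Return how many round markers appear across the PR's comments."""
    return sum(len(ROUND_MARKER.findall(body)) for body in comment_bodies)


def next_marker(comment_bodies):
    """Build the marker for the next request (assumed numbering: count + 1)."""
    return f"<!-- review-loop round={count_rounds(comment_bodies) + 1} -->"
```

Because the count lives in the PR's own comments, no external state store is needed: deleting a marker comment would lower the round number on the next run.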
Convergence (the happy path). There is no single "approved by all bots" signal from GitHub, so the loop keys off what's still unaddressed. An inline bot comment counts as actionable until its line changes — once Claude's fix moves that code, GitHub marks the comment outdated, and the loop treats it as resolved. The loop converges when no configured bot has any unaddressed (non-outdated) inline comment, and no bot's latest review is CHANGES_REQUESTED — and at least one bot has re-reviewed the latest commit. It then posts a ✅ summary and swaps review-loop → review-loop:done.
An unaddressed comment from any configured bot keeps the loop running: convergence is not declared just because one bot is happy while another still has open comments. The loop does not hard-block on every bot re-running after each push (a bot may skip a round); the round cap is the backstop. And `review-loop:done` means "no outstanding bot feedback," not "merge it": the final merge stays a human (or auto-merge) decision. Top-level review summaries are treated as descriptive, not actionable.
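The convergence rule can be written as a predicate over inline comments and reviews. The sketch below is an illustration under assumptions: `InlineComment`, `Review`, and `is_converged` are hypothetical names, and the fields are a simplified stand-in for what the GitHub API returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InlineComment:
    author: str
    outdated: bool     # True once the commented line has changed


@dataclass(frozen=True)
class Review:
    author: str
    state: str         # e.g. "COMMENTED", "CHANGES_REQUESTED", "APPROVED"
    commit_sha: str    # commit the review was submitted against
    submitted_at: str  # ISO 8601, assumed uniform UTC so strings sort by time


def is_converged(bots, comments, reviews, head_sha):
    bots = set(bots)
    # 1. No configured bot may have a non-outdated inline comment.
    if any(c.author in bots and not c.outdated for c in comments):
        return False
    # 2. No bot's latest review may be CHANGES_REQUESTED.
    latest = {}
    for review in sorted(reviews, key=lambda r: r.submitted_at):
        if review.author in bots:
            latest[review.author] = review  # later reviews overwrite earlier ones
    if any(r.state == "CHANGES_REQUESTED" for r in latest.values()):
        return False
    # 3. At least one bot must have reviewed the latest commit.
    return any(r.author in bots and r.commit_sha == head_sha for r in reviews)
```

Note that comments from accounts outside `bots` are ignored entirely, and top-level review bodies never enter the check, which mirrors the "descriptive, not actionable" treatment described above.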
Escalation (the safety net). If the loop reaches the round cap (default 3) while the bots still have actionable comments, it does not keep going. It opens a Decision artifact — a Human-in-the-Loop issue (hitl-decision label, Status Awaiting Decision) whose ## Blocks section lists the PR — and swaps review-loop → review-loop:escalated. From there the normal HITL flow takes over: a human comments their choice and closes the issue, which unblocks the PR.
Escalation also fires implicitly when Claude can't satisfy the bots: if a round produces no fix, the bots keep their comments, and the cap is eventually hit.
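To make the shape of the Decision artifact concrete, here is a hedged sketch of the issue payload an escalation might build. Only the `hitl-decision` label and the `## Blocks` section listing the PR come from the description above; the function name, the title wording, the "Open feedback" section, and the `- #<PR>` bullet format are all assumptions for illustration.

```python
def build_decision_issue(pr_number: int, rounds: int, open_items: list[str]) -> dict:
    """Assemble a hypothetical issue payload for an escalated PR."""
    lines = [
        f"The review loop on #{pr_number} reached its round cap "
        f"({rounds}) with bot feedback still open.",
        "",
        "## Open feedback",  # assumed section, not documented
        *[f"- {item}" for item in open_items],
        "",
        "## Blocks",         # documented: this section lists the blocked PR
        f"- #{pr_number}",   # assumed bullet format
    ]
    return {
        "title": f"Decision needed: review loop stalled on #{pr_number}",  # assumed wording
        "labels": ["hitl-decision"],
        "body": "\n".join(lines),
    }
```

Setting the board Status to Awaiting Decision is a separate Projects step and is not covered by this sketch.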
Labels
| Label | Color | Meaning |
|---|---|---|
| `review-loop` | blue | Opt-in: run the loop on this PR |
| `review-loop:done` | green | Converged: no actionable bot comments remain |
| `review-loop:escalated` | red | Hit the round cap: a Decision artifact was opened |
The workflow creates these labels automatically on first run.
Configuration
| Setting | Where | Default |
|---|---|---|
| Round cap | repo variable `REVIEW_LOOP_MAX_ROUNDS` | `3` |
| Reviewed bot accounts | `BOT_LOGINS` env in `review-loop.yml` | `gemini-code-assist[bot]`, `copilot-pull-request-reviewer[bot]`, `chatgpt-codex-connector[bot]` |
To change the cap:
gh variable set REVIEW_LOOP_MAX_ROUNDS --body "5"
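For illustration, the two settings could be read with fallbacks to the documented defaults as shown below. This is a sketch under assumptions: `load_config` is a hypothetical helper, `BOT_LOGINS` is assumed to be a comma-separated string, and an unset repository variable is assumed to arrive as an empty string.

```python
import os

DEFAULT_MAX_ROUNDS = 3
DEFAULT_BOT_LOGINS = (
    "gemini-code-assist[bot]",
    "copilot-pull-request-reviewer[bot]",
    "chatgpt-codex-connector[bot]",
)


def load_config(env=None):
    """Return (max_rounds, bot_logins), falling back to the documented defaults."""
    env = os.environ if env is None else env

    # Treat a missing or empty value as "use the default".
    raw_cap = env.get("REVIEW_LOOP_MAX_ROUNDS", "").strip()
    max_rounds = int(raw_cap) if raw_cap else DEFAULT_MAX_ROUNDS
    if max_rounds < 1:
        raise ValueError("REVIEW_LOOP_MAX_ROUNDS must be at least 1")

    # Assumed format: comma-separated logins, whitespace tolerated.
    raw_bots = env.get("BOT_LOGINS", "")
    bot_logins = [b.strip() for b in raw_bots.split(",") if b.strip()]
    return max_rounds, bot_logins or list(DEFAULT_BOT_LOGINS)
```

Rejecting a cap below 1 is a design choice of this sketch: a cap of 0 would escalate before Claude is ever asked.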
Cost and safety guards
- Opt-in only: nothing happens until you add the `review-loop` label.
- Round cap: bounded iterations, then escalation. Never an infinite loop.
- Debounced: the loop posts at most one `@claude` request per commit; extra bot reviews of the same commit don't trigger duplicate fixes.
- Bot-scoped: only the three review-bot accounts count as feedback. Claude's own commits and comments can't re-arm the loop.
- Event-driven: the loop reacts to review events; it never polls or sleeps.
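Two of the guards above, bot scoping and debouncing, reduce to small checks. The sketch below shows one possible way to express them; how the workflow actually detects "already asked for this commit" is not documented, so comparing request timestamps against the head commit's timestamp is an assumption, as are the function names.

```python
from datetime import datetime

BOT_LOGINS = (
    "gemini-code-assist[bot]",
    "copilot-pull-request-reviewer[bot]",
    "chatgpt-codex-connector[bot]",
)


def is_bot_feedback(author_login, bot_logins=BOT_LOGINS):
    """Bot-scoped: only configured review-bot accounts count as feedback."""
    return author_login in set(bot_logins)


def already_asked(request_times, head_commit_time):
    """Debounce (assumed mechanism): a request posted at or after the
    head commit means Claude was already asked about this commit."""
    return any(t >= head_commit_time for t in request_times)
```

A usage example with timezone-aware timestamps:

```python
head = datetime.fromisoformat("2024-01-01T10:00:00+00:00")
asked = [datetime.fromisoformat("2024-01-01T10:05:00+00:00")]
print(already_asked(asked, head))   # True: skip, a fix is already in flight
print(is_bot_feedback("octocat"))   # False: not a configured review bot
```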
Prerequisites (one-time)
The loop needs the Claude responder wired up. See the Setup Guide for full steps:
- Install the Claude GitHub App on the repo (provides the `@claude` trigger).
- Add the `CLAUDE_CODE_OAUTH_TOKEN` secret: generate it locally with `claude setup-token`, then `gh secret set CLAUDE_CODE_OAUTH_TOKEN`.
- `PROJECT_TOKEN` (the classic PAT, or personal access token, already used by the board workflows) must exist. Claude pushes with it so that its fix-commits re-trigger the review bots; a push made with the default `GITHUB_TOKEN` would not re-trigger them, and the loop would never close.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Nothing happens after adding the label | Bots haven't reviewed yet | The loop runs on the next bot review — wait, or re-request a bot review |
| `@claude` comment posts but nothing fixes | `claude.yml` not installed or `CLAUDE_CODE_OAUTH_TOKEN` missing | Check the Claude App + secret (see Prerequisites) |
| Claude fixes but bots never re-review | Claude pushed with `GITHUB_TOKEN`, not the PAT | Confirm `claude.yml` passes `github_token: ${{ secrets.PROJECT_TOKEN }}` |
| Loop escalates immediately | Round cap too low, or feedback is genuinely subjective | Raise REVIEW_LOOP_MAX_ROUNDS, or resolve via the Decision artifact |
| Loop seems stuck on round N | A bot re-reviews slowly | It's event-driven; the next review advances it |
Related
- Human-in-the-Loop: the escalation target when the loop stalls
- GitHub Copilot Army: how the review bots fit the two-army model
- GitHub Projects v2: board fields and the `Awaiting Decision` status
- Setup Guide: installing the Claude App and tokens