The AI reviewer's produce pass re-runs the entire review up to 8 times per PR because it has no completion signal. A review is a single-pass job — read the diff, check acceptance criteria, emit a verdict — so the redundant iterations waste tokens/time and, by lengthening each review, widen the window to hit the transient claude-code exited with code 1 that already parked PR #77 (#76's review) via the fail-safe. This makes the gate (ADR-0020) fragile in practice.
Evidence
A live run of ReviewerAdapter.review() against an open PR produced this log:
Iteration 1/8 … Agent started … [complete review + verdict] … Agent stopped
Iteration 2/8 … Agent started … [re-reviews the whole diff again]
reviewer-adapter.ts pass-1 calls run({ …, maxIterations: 8 }) with no completionSignal, so sandcastle keeps re-invoking the agent toward a completion it never declares. The "final" verdict becomes whatever the last iteration emitted, not the first complete one.
Goal (end-to-end value)
The reviewer's produce pass runs exactly one complete review, then hands off to the existing extraction pass. Reviews are ~8× faster/cheaper and far less likely to hit a mid-run transient, while the verdict and fail-safe behavior are unchanged.
Acceptance criteria
Testing (highest, single seam — logic in TypeScript)
The adapter spawns a live sandbox, so the testable seam is the pure run-config builder, mirroring buildAgentInput in sandbox-runner.ts:
- Extract a small pure function that returns the pass-1
run options (model, sandbox, promptFile, name, and the iteration/completion config) from the adapter's inputs — no Docker, no network.
- Unit-test (in a
*.test.ts using node:test + node:assert/strict) that the produce-pass config caps at a single pass (maxIterations === 1 or a non-empty completionSignal). Prior art: buildAgentInput is already a pure builder tested without spawning.
- Keep live reviewer execution behind the gated
integration.test.ts (SANDCASTLE_INTEGRATION=1); default npm test and the sandcastle CI job stay green. Do not add a new seam beyond the pure config builder.
Unchanged (do not touch)
reduce.ts (verdict mapping / review gate, #74/ADR-0020), the extraction pass, parseDiffLines/filterInlineComments, the noSandbox() + permissionMode: "auto" choices, and the SandboxFailed retry (#76).
Constraints (per repo workflow)
Implement with /tdd, run shell via /exec, scope to reviewer-adapter.ts + the new pure builder + its test (+ a one-line code comment), no new dependencies, no pushing to main.
The AI reviewer's produce pass re-runs the entire review up to 8 times per PR because it has no completion signal. A review is a single-pass job — read the diff, check acceptance criteria, emit a verdict — so the redundant iterations waste tokens/time and, by lengthening each review, widen the window to hit the transient
claude-code exited with code 1that already parked PR #77 (#76's review) via the fail-safe. This makes the gate (ADR-0020) fragile in practice.Evidence
A live run of
ReviewerAdapter.review()against an open PR produced this log:reviewer-adapter.tspass-1 callsrun({ …, maxIterations: 8 })with nocompletionSignal, so sandcastle keeps re-invoking the agent toward a completion it never declares. The "final" verdict becomes whatever the last iteration emitted, not the first complete one.Goal (end-to-end value)
The reviewer's produce pass runs exactly one complete review, then hands off to the existing extraction pass. Reviews are ~8× faster/cheaper and far less likely to hit a mid-run transient, while the verdict and fail-safe behavior are unchanged.
Acceptance criteria
reviewer-adapter.tspass-1 stops after one complete review — eithermaxIterations: 1, or acompletionSignalthe prompt emits (reference run-with-extraction pattern). Prefer whichever keeps the adapter simplest; document the choice in a code comment.Output.objectextraction (and the!reviewResult.resumefallback to a freshrun) is untouched.null→reviewVerdict(null)→changes-requested(never auto-merge). Do not touchreduce.ts.Iteration 1/1(or one completion-signalled pass) in.sandcastle/logs/…-review-issue-*.log, not a re-review.Testing (highest, single seam — logic in TypeScript)
The adapter spawns a live sandbox, so the testable seam is the pure run-config builder, mirroring
buildAgentInputinsandbox-runner.ts:runoptions (model, sandbox, promptFile, name, and the iteration/completion config) from the adapter's inputs — no Docker, no network.*.test.tsusingnode:test+node:assert/strict) that the produce-pass config caps at a single pass (maxIterations === 1or a non-emptycompletionSignal). Prior art:buildAgentInputis already a pure builder tested without spawning.integration.test.ts(SANDCASTLE_INTEGRATION=1); defaultnpm testand thesandcastleCI job stay green. Do not add a new seam beyond the pure config builder.Unchanged (do not touch)
reduce.ts(verdict mapping / review gate, #74/ADR-0020), the extraction pass,parseDiffLines/filterInlineComments, thenoSandbox()+permissionMode: "auto"choices, and the SandboxFailed retry (#76).Constraints (per repo workflow)
Implement with
/tdd, run shell via/exec, scope toreviewer-adapter.ts+ the new pure builder + its test (+ a one-line code comment), no new dependencies, no pushing to main.