Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 40 additions & 0 deletions docs/adr/0021-durable-agent-lifecycle-labels.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
# Durable agent lifecycle labels

Two GitHub labels β€” `agent:in-progress` and `agent:in-review` β€” mark the two work-item lifecycle phases that otherwise leave no durable artifact on GitHub: a sandbox actively implementing an issue, and the AI reviewer (ADR-0020) running on an open PR. They make the orchestrator's in-flight state **durable for crash recovery**: `State.inFlight` stays the runtime source of truth, and the labels are a mirror written at each transition and read only at startup to reconcile work a crashed predecessor left mid-flight.

## Why

The work-item lifecycle has four phases, but only the endpoints carry a GitHub artifact:

| Phase | Durable artifact before this change |
| --- | --- |
| unblocked, workable | `ready-for-agent` label |
| **sandbox implementing** | none β€” `State.inFlight` (in-memory) only |
| **reviewer running on the open PR** | the PR exists, but nothing marks that a review is in flight |
| done | closed issue |

If the orchestrator is killed mid-flight, the in-memory `inFlight` is lost and the issue β€” claimed (i.e. `ready-for-agent` removed) but unlabelled β€” is silently abandoned. `#76` re-queues on a `SandboxFailed` *event*, but process death is not an event, so the crash gap is real and otherwise uncovered.

## Lifecycle

- **Claim:** `ready-for-agent` β†’ `agent:in-progress`.
- **PR opened (SandboxFinished):** `agent:in-progress` β†’ `agent:in-review`.
- **Verdict (ReviewFinished):** `agent:in-review` β†’ removed. From here the PR + CI status are the artifact; `pass` β†’ EnableAutoMerge, `changes-requested`/fail-safe β†’ WaitForHuman (review posted).
- **SandboxFailed (#76):** `agent:in-progress` β†’ `ready-for-agent` (retry) or unlabelled + comment (retries exhausted).

## Startup reconcile (the point of the durability)

- `agent:in-progress` found (no PR yet) β†’ **re-queue** to `ready-for-agent`; the normal tick re-claims and starts fresh. `resetAgentBranch` (#23) already wipes the stale `agent/issue-N` branch.
- `agent:in-review` found (PR already open) β†’ **re-run the read-only review** on the existing PR; do **not** re-queue (that would spawn a duplicate sandbox and a second PR) and do not leave it stuck (the pending-PR reconcile loop does not re-review). The reviewer is read-only and runs off the PR diff, so re-reviewing is idempotent.

## Considered and rejected

- **Labels as the hot-path source of truth** β€” rejected. GitHub-API latency on every tick is not worth it; `State.inFlight` stays authoritative at runtime and the labels are a durable mirror, read only at boot.
- **One label (in-progress only)** β€” rejected: it misses the reviewer window, which has no artifact until the review posts. **Full pipeline as labels** (`agent:ready`/`…in-progress`/`…in-review`/`…merging`) β€” also rejected: the open PR + CI already are the artifact for the post-review phases, so extra labels just duplicate state to keep consistent. Two labels fill exactly the two artifact-less phases.
- **A hard cross-process lock** β€” rejected as a non-goal. Preventing double-claim is a *defensive guardrail* only (a running orchestrator skips `agent:in-progress`/`agent:in-review` issues); there is no concurrent-orchestrator scenario today, and GitHub labels have no atomic compare-and-swap (two writers both see `ready-for-agent` and both claim β€” unsolvable with labels). Single-writer-per-project holds; **do not** build atomic locking on top of labels.
- **Renaming `ready-for-agent` β†’ `agent:ready`** for namespace consistency β€” rejected: it is the `READY_LABEL` constant referenced across `reduce.ts`, `CLAUDE.md`, `to-issues`, and existing open issues; a breaking rename + migration is not worth prefix symmetry. Keep `ready-for-agent`; add the `agent:*` pair alongside it.

## Notes

- The label transitions belong in the pure reducer (`reduce.ts`, emitting `Relabel`/`SetLabel` actions) so they are unit-testable, consistent with the repo's reducer-is-the-seam philosophy. The orchestrator carries them out.
- Relates to ADR-0020 (the review gate whose window `agent:in-review` marks) and #76 (the `SandboxFailed` retry that is `agent:in-progress`'s failure exit).
Loading