Consolidate AI review CI quality gate#3550
Conversation
Consolidate the AI reviewers into a single required CI job with workaround for merge queue.
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
PR SummaryMedium Risk Overview The job runs on PR open/sync/reopen/ready, on completion of Cursor Bugbot or Claude Code Review check runs, and on merge queue entries. It uses a GitHub App token to read those checks and write a single For merge queue, it auto-posts success ( Reviewed by Cursor Bugbot for commit 341b6f5. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c19330e836
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| with: | ||
| script: | | ||
| const sha = context.payload.pull_request.head.sha | ||
| const deadline = Date.now() + 8 * 60 * 1000 |
There was a problem hiding this comment.
Allow enough time for Claude reviews
Claude Code's Code Review docs state that reviews complete in 20 minutes on average, so this 8-minute deadline (with a 10-minute job timeout) will make normal PRs fail this required gate while Claude Code Review is still pending. In repositories where branch protection requires this consolidated check, those PRs remain blocked until a manual rerun or new push even though the review would later complete successfully.
Useful? React with 👍 / 👎.
| name: AI Review | ||
|
|
||
| on: | ||
| pull_request: |
There was a problem hiding this comment.
Trigger the gate when draft PRs become ready
With a bare pull_request trigger, GitHub only runs this workflow for the default opened/synchronize/reopened activity types, not ready_for_review. For PRs opened as drafts, Claude's automatic review does not run until the PR is marked ready, so this gate can time out on the draft and publishing it won't start a new gate run; the required check stays failed until a new push or manual rerun.
Useful? React with 👍 / 👎.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3550 +/- ##
==========================================
- Coverage 59.18% 58.27% -0.92%
==========================================
Files 2222 2144 -78
Lines 183555 174144 -9411
==========================================
- Hits 108639 101475 -7164
+ Misses 65132 63645 -1487
+ Partials 9784 9024 -760
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
| const blockers = [] // completed with a rejecting conclusion | ||
| const inconclusive = [] // completed without a verdict; rerun needed | ||
| const inProgress = [] // started but not finished | ||
| const notStarted = [] // no run yet | ||
| const advisory = [] // accepted but non-blocking (neutral/skipped) | ||
| REQUIRED.forEach((r, i) => { | ||
| const run = runs[i] | ||
| if (!run) { notStarted.push(r.name); return } | ||
| if (run.status !== "completed") { inProgress.push(r.name); return } | ||
| if (REJECTING.includes(run.conclusion)) { | ||
| blockers.push({ name: r.name, conclusion: run.conclusion }); return | ||
| } | ||
| if (PASSING.includes(run.conclusion)) return | ||
| if (ADVISORY.includes(run.conclusion)) { | ||
| advisory.push(run.conclusion === "neutral" ? r.name : `${r.name} (${run.conclusion})`) | ||
| return | ||
| } | ||
| inconclusive.push({ name: r.name, conclusion: run.conclusion ?? "unknown" }) | ||
| }) | ||
| const stillOut = inProgress.length + notStarted.length + inconclusive.length |
There was a problem hiding this comment.
🔴 The conclusion classification at lines 89-106 handles success (PASSING), neutral/skipped (ADVISORY), and failure/action_required (REJECTING), but the other standard GitHub check_run conclusions — cancelled, timed_out, and stale — fall through to the inconclusive bucket at line 106 and route the script into the pending branch (lines 128+). For check_run events that branch only calls core.info (line 151) and never posts a status, so once a reviewer terminates with one of those conclusions no further events fire to re-evaluate and the AI Review status is stuck at whatever was previously posted. Once AI Review is wired into branch protection, a timed_out Claude review (its documented behavior on internal deadlines), a maintainer-cancelled reviewer rerun, or a GitHub-marked stale check permanently blocks merge with no automatic recovery — fix by adding cancelled/timed_out/stale to REJECTING (block visibly with a clear message) or treating them as a distinct "needs rerun" state that posts a failure with an actionable description.
Extended reasoning...
What the bug is
The classification step at lines 41-43 + 94-107 enumerates three buckets of conclusion values: REJECTING = ["failure","action_required"], PASSING = ["success"], and ADVISORY = ["neutral","skipped"]. Per the GitHub Apps Checks spec, valid check_run.conclusion values also include cancelled, timed_out, and stale. None of these three appear in any of the buckets, so they fall through every includes() check and land in inconclusive.push(...) at line 106.
stillOut = inProgress.length + notStarted.length + inconclusive.length (line 108) is then nonzero. blockers.length === 0, so the failure branch at line 112 is skipped. stillOut !== 0, so the success branch at line 120 is skipped. Execution falls into the pending branch at lines 128-152. For a check_run event the if (context.eventName === "pull_request") guard at line 143 is false, so the script lands in the else at line 150 and only logs via core.info — no setStatus call is made.
The reviewer is already in status === "completed" on GitHubs side, so no further check_run.completedevent will fire for it on this SHA.pull_request.synchronizerequires a new push;merge_grouprequires entering the merge queue (which is blocked by this very gate). The script has no internal loop. TheAI Reviewcommit status is therefore frozen at whatever was last written — typically the originalpendingposted from the initialpull_request.openedseed, but it could equally be a priorsuccessorfailure`.
Why existing code doesn`t prevent it
The author labelled the bucket "completed without a verdict; rerun needed" (line 90), acknowledging that this needs human intervention — but the workflow does not update the status to make that need visible. The pull_request branchs own guard at line 145 (current.state !== "pending"skips) also prevents recovery via a subsequent push: even if asynchronize` event later fires, the script sees a non-pending state and skips, so any prior terminal state sticks.
Earlier review history on this PR (inline comment 3363327769) explicitly listed skipped/cancelled/timed_out/stale as needing the same handling. The author addressed skipped (moved into ADVISORY) but the remaining three were left in the inconclusive bucket — a partial fix.
Step-by-step proof (timed_out)
- Fresh PR opens. The
pull_request.openedseed runssetStatus("pending", "Waiting for reviewers to start ...")on commit X. - Cursor Bugbot completes with
conclusion: "neutral"→check_run.completedfires → script runs → Bugbot lands inadvisory, Claude stillin_progress→ falls into pending branch →eventName === "check_run"→ onlycore.info. No status update.AI Reviewremainspending. - Claude hits its internal deadline. The GitHub App posts the check with
status: "completed",conclusion: "timed_out"→check_run.completedfires. - Line 97:
status === "completed", skipinProgress. - Line 98:
REJECTING.includes("timed_out")isfalse, skipblockers. - Line 101:
PASSING.includes("timed_out")isfalse. - Line 102:
ADVISORY.includes("timed_out")isfalse. - Line 106:
inconclusive.push({name: "Claude Code Review", conclusion: "timed_out"}). - Line 108:
stillOut = 0 + 0 + 1 = 1. - Line 112:
blockers.length === 0, skip. - Line 120:
stillOut !== 0, skip. - Line 143:
eventName === "check_run"(notpull_request) → line 150 →core.infoonly. NosetStatuscall. - Both reviewers are now in terminal
completedstate on commit X; no furthercheck_run.completedevents fire for this SHA.AI Reviewis stuck at the seedspending`.
The same trace applies verbatim with cancelled (maintainer cancels a reviewer rerun via the Checks UI; GitHub Actions cancellation due to workflow concurrency) and stale (GitHub marks the check stale after certain branch operations) substituted at step 3.
Impact
Once AI Review is wired into branch protection (the explicit goal of this PR), any of these three conclusions permanently blocks merge with no automatic recovery path. The most realistic trigger is timed_out: Claude Code Review has internal deadlines and a long-running review hitting that deadline produces exactly this conclusion. A maintainer cancelling a reviewer rerun mid-flight produces cancelled. Both are normal user actions, not edge cases. This directly defeats the consolidated-gate goal of providing a single reliable required check.
A separate (and arguably worse) failure mode: if AI Review was previously written as success (e.g. both reviewers passed, then a rerun was triggered), and the rerun completes as timed_out/cancelled/stale, the gate stays at the stale success and merge is incorrectly allowed even though the latest reviewer run did not produce a verdict.
How to fix
Two acceptable options:
- Treat these conclusions as REJECTING — adds
cancelled/timed_out/staleto the REJECTING list so the gate posts a visiblefailurewith a message that tells the maintainer to rerun:
const REJECTING = ["failure", "action_required", "cancelled", "timed_out", "stale"]- Treat the inconclusive bucket as a posted-failure state — change the pending branch so that when
inconclusive.length > 0, the script always callssetStatus("failure", "Rerun needed — <reviewer> (<conclusion>)")rather than silently logging oncheck_runevents. This preserves the visible "needs human action" signal that the bucket comment already promises but the code never delivers.
Either is a small change and removes the stuck-pending failure mode before this gate becomes required.
| // PENDING: still waiting on someone. Only (re)post from a | ||
| // pull_request event. Posting pending from a check_run evaluation | ||
| // lets a stale parallel run clobber a success a newer run already | ||
| // wrote (commit statuses are last-write-wins), so we skip it here | ||
| // and just wait for the next reviewer to complete. | ||
| const done = REQUIRED.length - stillOut | ||
| const waiting = [...inProgress.map(n => `${n} (running)`), | ||
| ...notStarted.map(n => `${n} (not started)`), | ||
| ...inconclusive.map(r => `${r.name} (${r.conclusion}; rerun needed)`)].join(", ") | ||
| const msg = inconclusive.length | ||
| ? `Review incomplete (${done}/${REQUIRED.length} accepted) — ${waiting}` | ||
| : inProgress.length | ||
| ? `Still reviewing (${done}/${REQUIRED.length} in) — ${waiting}` | ||
| : `Waiting for reviewers to start — ${waiting}` | ||
|
|
||
| if (context.eventName === "pull_request") { | ||
| const current = await latestAiReviewStatus() | ||
| if (current && current.state !== "pending") { | ||
| core.info(`Skipping pending because ${statusContext} is already ${current.state}`) | ||
| return | ||
| } | ||
| await setStatus("pending", msg) | ||
| } else { | ||
| core.info(`Pending: ${msg}`) | ||
| } No newline at end of file |
There was a problem hiding this comment.
🔴 The PR added check_run types [created, rerequested] (lines 9-10) so the workflow now wakes up during reviewer re-runs, but the pending branch at lines 143-151 still refuses to call setStatus for check_run events — it only emits core.info. When a maintainer clicks Re-run on Cursor Bugbot or Claude, the rerun fires check_run.created, latestRun returns the new in_progress run, stillOut === 1, the script falls into the pending else branch, and the previous AI Review = success (or failure) stays attached to the commit during the entire rerun window. Once AI Review is wired into branch protection, a maintainer or auto-merge can merge the PR before the rerun's verdict lands. Fix: when eventName === 'check_run', post pending if the existing AI Review status's created_at predates the newest reviewer run's started_at (i.e. the terminal status is genuinely stale); the pull_request branch's current.state !== 'pending' guard at line 145 has the same gap and never downgrades terminal → pending either.
Extended reasoning...
What the bug is
The PR explicitly subscribes to check_run types [completed, created, rerequested] at line 10. The created and rerequested types only exist on the subscription so that reviewer reruns (a maintainer clicking the Re-run button, or a /claude review request) cause the workflow to re-evaluate while the new check is still in_progress. But the pending branch at lines 143-151 explicitly refuses to call setStatus for check_run events:
if (context.eventName === "pull_request") {
const current = await latestAiReviewStatus()
if (current && current.state !== "pending") { /* skip */ return }
await setStatus("pending", msg)
} else {
core.info(`Pending: ${msg}`) // ← check_run/merge_group fall here
}The pending-overwrite path for check_run is unreachable. The two trigger types this PR added for the rerun case are effectively dead code for that case.
The code path that triggers it
- t=0 — Both reviewers complete with accepted conclusions on commit X.
stillOut === 0⇒ line 124 postsAI Review = success. - t=5m — Maintainer clicks Re-run on Cursor Bugbot via the Checks UI. GitHub creates a fresh
check_runwithstatus: in_progresson the samehead_sha = Xand firescheck_run.created(andrerequestedon the prior run). - The workflow dispatches. Job-level
ifat line 23 matches (check_run.nameis in the allowlist). The script runs. latestRun('Cursor Bugbot')sorts bystarted_at desc(line 84) and returns the new in_progress run.latestRun('Claude Code Review')returns the prior completed success.- forEach at line 94: the in-progress run hits
run.status !== 'completed'⇒ pushed toinProgress.stillOut = 1. blockers.length === 0(skip line 112).stillOut !== 0(skip line 120 success).- Falls into the pending branch at 143.
context.eventName === 'check_run'⇒ theelseat line 150 fires:core.info(Pending: ...). NosetStatuscall. - The commit-status context
AI Reviewremainssuccessfrom t=0 throughout the rerun window.
Why existing safeguards don't prevent it
The comment at lines 128-132 documents the author's rationale: "Posting pending from a check_run evaluation lets a stale parallel run clobber a success a newer run already wrote." That rationale is sound for the parallel evaluation race (two reviewers completing within seconds and racing on createCommitStatus), and the workflow-level concurrency: block at lines 24-29 with cancel-in-progress: false already serializes evaluations on the same SHA, substantially mitigating that concern.
But the rationale doesn't cover the rerun race, where the existing terminal status is genuinely stale and needs to be downgraded. The parallel pull_request branch at lines 144-148 has the matching gap: if (current && current.state !== 'pending') return refuses to downgrade terminal → pending. And pull_request events don't fire on manual reviewer reruns anyway, so the check_run path is the load-bearing failure here.
Step-by-step proof
| Time | Event | AI Review status | Note |
|---|---|---|---|
| 0:00 | Both reviewers completed (success) | success | line 124 posted |
| 5:00 | Maintainer clicks Re-run on Bugbot | success | new check_run.created, status in_progress |
| 5:00+ε | Workflow runs, hits line 150 core.info only |
success | no setStatus call — gate looks green |
| 5:01 | Branch protection re-evaluates | success | required gate passes |
| 5:01 | Maintainer / auto-merge merges PR | success | merged before rerun's verdict lands |
| 5:30 | Bugbot rerun completes failure |
failure | too late — already in main |
Impact
Once AI Review is wired into branch protection (the explicit goal of this PR per the description), this gap allows a PR to be merged while a reviewer is actively re-evaluating. If the rerun would have surfaced a regression the prior run missed, it lands in main. The window is bounded by how long the rerun takes to complete (typically minutes), but it overlaps exactly the time a maintainer is most likely to merge after the rerun was kicked off.
This is the exact failure mode that was flagged pre-merge by both Cursor Bugbot (inline comment 3363146365 — "Stale success during reviewer re-runs") and Claude (inline comment 3363327757 — which specifically proposed adding [created, rerequested] as a one-line fix). The PR adopted the one-line YAML change but did not wire the corresponding script update; the regression is partially fixed.
How to fix it
The minimal correct fix is to detect the rerun case in the pending branch and post pending when the existing terminal status is older than the newest reviewer run. Sketch:
} else {
// check_run / rerequested: downgrade terminal → pending only if it's stale
const current = await latestAiReviewStatus()
const newestStart = Math.max(...runs.filter(Boolean).map(r => new Date(r.started_at).getTime()))
if (current && current.state !== 'pending' &&
new Date(current.created_at).getTime() < newestStart) {
await setStatus('pending', msg)
} else {
core.info(`Pending: ${msg}`)
}
}This preserves the parallel-clobber protection (the existing current.state !== 'pending' short-circuit still prevents post-success clobber within a single review cycle) while letting the newly-subscribed created/rerequested events actually do the job they were added for.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 8602d56. Configure here.
| - name: Generate GitHub App token | ||
| id: generate-token | ||
| uses: actions/create-github-app-token@v2 | ||
| with: | ||
| app-id: ${{ secrets.PLATFORM_CODE_AGENT_APP_ID }} | ||
| private-key: ${{ secrets.PLATFORM_CODE_AGENT_APP_PK }} |
There was a problem hiding this comment.
🔴 🔴 The Generate GitHub App token step at lines 33-38 reads secrets.PLATFORM_CODE_AGENT_APP_ID and secrets.PLATFORM_CODE_AGENT_APP_PK, but per GitHub's documented security model secrets are not passed to workflows triggered by pull_request events from forks — both interpolations evaluate to empty strings and actions/create-github-app-token@v2 fails with a missing-required-input error before the github-script step ever runs. Once AI Review is wired into branch protection as the consolidated required gate (the stated goal of this PR), every external-contributor fork PR is permanently blocked at token generation. Fix by switching the trigger to pull_request_target (safe here since no fork code is checked out), gating the workflow on github.event.pull_request.head.repo.full_name == github.repository to skip fork PRs, or wrapping the token step in continue-on-error with a try/catch that falls back to GITHUB_TOKEN.
Extended reasoning...
What the bug is
Lines 33-38 mint a Platform Code Agent App installation token via actions/create-github-app-token@v2, supplying app-id and private-key from secrets.PLATFORM_CODE_AGENT_APP_ID and secrets.PLATFORM_CODE_AGENT_APP_PK. The workflow is triggered by the bare pull_request event (line 5), not pull_request_target.
Per GitHub's documented security model (Using secrets in GitHub Actions): "With the exception of GITHUB_TOKEN, secrets are not passed to the runner when a workflow is triggered from a forked repository." So for any PR opened from a fork (the dominant contribution pattern for a public open-source repo like sei-chain), both secret references at lines 37-38 evaluate to empty strings.
Why actions/create-github-app-token@v2 fails on those empty values
Both app-id and private-key are declared as required inputs in the action's action.yml. The action either fails its own input validation or fails downstream when constructing the JWT for the installation token request to GitHub's API (empty private key cannot be parsed as a PEM, empty app-id cannot resolve an installation). Either way the step exits non-zero with a missing-required-input error and the entire evaluate job fails before the actions/github-script@v9 step at line 41 ever runs.
Why existing safeguards don't prevent it
There is no if: condition restricting the workflow to non-fork PRs, no continue-on-error: true on the token step, and no try/catch wrapper. The token-generation step is the first step in the only job, so its failure terminates the entire workflow run. The permissions: statuses: write block at lines 14-15 is irrelevant here — the failure happens before any API call.
This is a distinct failure mode from prior comments on the PR:
- Inline comment 3357982423 flagged the read-only
GITHUB_TOKEN403 oncreateCommitStatusfor fork PRs. The author addressed that by adding the App token step. - Inline comment 3363785995 flagged that the App token wasn't wired into github-script. The author fixed that in commit 341b6f5.
- This bug points out that the App token fix itself depends on secrets that aren't available in fork PR contexts — the App token can't even be minted.
Impact
Once AI Review is wired into branch protection as the consolidated required gate (the stated goal in the PR description), every fork PR is permanently blocked from merging until a maintainer overrides branch protection or pushes the work through a maintainer-owned branch. For sei-chain as a public OSS chain with external contributors, this is a real operational blocker, not an edge case.
Note: other workflows in this repo that need to operate on fork PR context (.github/workflows/uci-release-check.yml, uci-backport.yml) use pull_request_target precisely to access secrets — confirming the standard fix pattern.
Step-by-step proof
- External contributor opens a PR from
theirfork/sei-chain:feature→sei-protocol/sei-chain:main. pull_request.openedfires; theevaluatejob dispatches on a fresh runner.- The runner's environment for fork-PR workflows has access to
GITHUB_TOKENonly; thesecrets.*context returns empty strings for every other secret reference. - Step
Generate GitHub App tokenruns:uses: actions/create-github-app-token@v2withapp-id: ""andprivate-key: "". - The action fails — either at
core.getInput('app-id', { required: true })(which throwsInput required and not supplied: app-idwhen the value is empty), or downstream when it tries to sign a JWT with an empty PEM and GitHub responds 401. - Step exits non-zero. Job fails.
AI Reviewcommit status is never written (the seed at line 100+ of the github-script body never executes because github-script never runs). - Branch protection (once
AI Reviewis required) blocks the PR from merging. - Every subsequent fork PR repeats this — the gate becomes an unconditional block on external contributions.
How to fix it
Three viable options, in roughly increasing order of complexity:
(a) Switch to pull_request_target — secrets are available on this trigger, and this workflow is safe under it because no fork code is checked out (the script only calls the GitHub REST API):
on:
pull_request_target:
types: [opened, synchronize, reopened, ready_for_review]
check_run:
types: [completed, created, rerequested]
merge_group:(b) Skip fork PRs entirely — keeps fork PRs unblocked by simply not running the gate for them (requires the branch-protection setup to mark the check as not-required for fork PRs, or to accept that fork PRs land via a different path):
if: >
github.event_name == 'merge_group' ||
(github.event_name == 'pull_request' && github.event.pull_request.draft == false &&
github.event.pull_request.head.repo.full_name == github.repository) ||
(github.event_name == 'check_run' && ...)(c) Wrap token generation in continue-on-error with GITHUB_TOKEN fallback — least preferred because it inherits the read-only-token 403 problem on fork PRs that the App token was added to solve in the first place.
🔬 also observed by verify-runtime

Consolidate the AI reviewers into a single required CI job with workaround for merge queue.