feat(control-plane,web): persist + surface the runner failure detail (#137)#169
Conversation
#137) The runner already reported a free-text `detail` on POST /internal/tasks/{id}/status, but `set_status` only `info!`-logged it and dropped it ("not persisted yet"), and `tasks` had no column to hold it. So a review that failed or posted nothing showed green on the dashboard with no reason — a 14-day prod audit found 98 of 144 (~68%) "succeeded" PR-review tasks had posted nothing, failures swallowed as success. - Migration 0016: add nullable `error_detail TEXT` to `tasks`. - `set_task_status` takes `detail: Option<&str>` and writes it via `COALESCE($4, error_detail)` so a later detail-less report can't erase a recorded reason; threaded through `set_status` (stop dropping it) and both reaper sites. - `TaskRow.error_detail` (selected by `TASK_SELECT`'s `t.*`) so GET /tasks and GET /tasks/{id} return it. - Web run detail: surface `error_detail` as a calm inline status line — "Review did not post: <detail>" — for failed runs and for succeeded-but-empty no-ops, so a silent no-op is tellable apart from a real clean review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
✅ AI Governance check passedThis PR declares AI usage, references a source of truth, and provides verification evidence. Thank you. |
There was a problem hiding this comment.
Code Review
This pull request introduces persistence and frontend visualization for the runner's free-text status reason (error_detail), preventing silent failures or no-ops from being mistaken for successful reviews. However, a critical issue was identified in the database status update logic: when a task is retried and transitions back to running, the previous attempt's error_detail is not cleared. This can lead to successful retried runs being incorrectly flagged as silent no-ops on the dashboard. A suggestion has been provided to clear the error detail upon transitioning to the running state.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
|
@lightbridge-assistant please review this |
|
@lightbridge-assistant please review this again |
#169 review) Gemini (high): COALESCE($4, error_detail) kept the prior attempt's failure reason forever, so a retried run that now succeeds was still flagged as a silent no-op on the dashboard. Clear error_detail on any 'running' transition (fresh attempt = clean slate); preserve it on detail-less terminal reports. Added a regression assertion. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1. Summary
This PR changes:
0016_task_error_detail.sql— a nullableerror_detail TEXTcolumn ontasks.detail:set_task_statusnow takesdetail: Option<&str>and writes it viaCOALESCE($4, error_detail);set_statuspassesupdate.detailthrough (it previously only logged and dropped it — the code literally said "not persisted yet"); both reaper call sites updated (the reaper's terminal-fail path records a reason).TaskRow.error_detail(selected byTASK_SELECT'st.*) soGET /tasksandGET /tasks/{id}return it.error_detailas a calm inline status line ("Review did not post: ") for failed runs and for succeeded-but-empty no-ops, so a silent no-op is tellable apart from a real clean review.It solves:
detailonlyinfo!-logged and then dropped ("not persisted yet").2. Intent
The intent of this PR is:
3. Scope
In Scope
services/control-plane: migration0016,db.rs(set_task_statussignature +TaskRow.error_detail),http/internal.rs(set_statusthreadingdetail),queue/reaper.rs(call-site updates).apps/web:Task.error_detailtype + run detail page rendering.Out of Scope
webhook.rsidempotency logic and the agent-runner itself (separate concurrent PRs).4. Verification
I verified this change by:
A new DB test (
set_task_status_persists_and_preserves_detail) covers the error case: a reporteddetailis persisted, and a later detail-less report does not erase the recorded reason (COALESCE).Commands run:
Results:
5. Screenshots / Evidence
http/internal.rs(StatusUpdate.detail— "not persisted yet") is now removed and the value is persisted.6. Risk Assessment
Risk level:
Potential risks:
ADD COLUMN IF NOT EXISTS— idempotent and backward-compatible (older rows readNULL).Mitigation:
COALESCEwrite semantics ensure a later report can't erase a recorded reason; covered by a test.error_detailtruthy), so existing clean runs are unaffected.7. AI Usage Declaration
AI was used for:
Human verification:
8. Reviewer Focus
Please focus your review on: