Skip to content

fix(control-plane): @mention re-reviews must always create a task (#137)#168

Merged
stephane-segning merged 2 commits into
mainfrom
claude/fix-mention-idempotency
Jun 23, 2026
Merged

fix(control-plane): @mention re-reviews must always create a task (#137)#168
stephane-segning merged 2 commits into
mainfrom
claude/fix-mention-idempotency

Conversation

@stephane-segning

Copy link
Copy Markdown
Contributor

1. Summary

This PR changes:

  • Adds create_explicit_task in db.rs — an always-lands-a-row, race-free insert for the @mention path that folds the next run_epoch into the INSERT (COALESCE(MAX(run_epoch), -1) + 1 over the same columns as the idempotency unique index, using the REAL command_text).
  • Changes handle_issue_comment (webhook.rs) to parse the instruction once and use create_explicit_task; drops the buggy next_run_epoch(…, "review", …) call from this path.
  • Removes the now-unused next_run_epoch and its test; adds two #[sqlx::test] regression tests.

It solves:


2. Intent

The intent of this PR is:

An @mention is an explicit human command — it must ALWAYS create a task. The old code computed run_epoch with the hardcoded literal "review" but created the task with the REAL command_text (e.g. "review this"). The idempotency unique index keys on the real command_text, so the epoch was computed against the wrong group, returned the same value every time, and a repeated identical mention on an unchanged head collided → create_task returned Ok(None) → logged "review task already exists; skipping (idempotent)"no task, no review. Varying the wording sidestepped it (different command_text), which is why it was intermittent. There was ALSO a non-atomic TOCTOU race: next_run_epoch then create_task were separate statements, so two near-simultaneous deliveries could compute the same epoch and one was dropped. True webhook redeliveries are already deduped upstream by the github_deliveries delivery-id PRIMARY KEY, so content-idempotency on the comment path adds nothing and only causes this bug.

Prod evidence: the "review task already exists; skipping (idempotent)" log on a legitimate re-request, observed on PR izhub#198.


3. Scope

In Scope

  • services/control-plane/src/db.rs — new create_explicit_task; remove next_run_epoch + its test; add regression tests.
  • services/control-plane/src/http/webhook.rs — comment/@mention path uses the new explicit insert.

Out of Scope

  • The auto pull_request (opened) path is unchanged — it keeps content-idempotency via create_task so GitHub's duplicate opened+synchronize deliveries for one head still collapse to a single review.
  • agent-runner, internal.rs status/detail handling, migrations, and web (separate concurrent PRs); deployment / ai-helm.

4. Verification

I verified this change by:

  • Running automated tests
  • Running manual tests
  • Checking logs
  • Checking metrics
  • Testing error cases
  • Testing permissions/security behavior
  • Testing rollback or failure behavior, if relevant

Commands run:

cargo fmt -p control-plane -- --check
cargo clippy -p control-plane --all-targets -- -D warnings
DATABASE_URL=postgresql://lightbridge:lightbridge@localhost:5432/lightbridge cargo test -p control-plane

Results:

$ cargo fmt -p control-plane -- --check
(no output — formatting OK)

$ cargo clippy -p control-plane --all-targets -- -D warnings
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.38s   (no warnings)

$ cargo test -p control-plane
test db::tests::explicit_mention_always_creates_a_task_at_the_next_epoch ... ok
test db::tests::auto_review_collapses_duplicate_opened_and_synchronize ... ok
test result: ok. 61 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 6.10s

The DB tests ran against a local pgvector at postgresql://lightbridge:lightbridge@localhost:5432/lightbridge (the repo CI DB).


5. Screenshots / Evidence

Add evidence here:

  • Logs: prod log line "review task already exists; skipping (idempotent)" on a legitimate re-request (PR izhub#198).
  • New regression test explicit_mention_always_creates_a_task_at_the_next_epoch proves two identical "review this" mentions on the same (repo, target, head) now create TWO tasks at epochs N and N+1.
  • New regression test auto_review_collapses_duplicate_opened_and_synchronize proves the auto path still collapses a duplicate to ONE task.

6. Risk Assessment

Risk level:

  • Low
  • Medium
  • High

Potential risks:

  • The comment path no longer content-dedupes. This is intentional: redeliveries are deduped upstream by the github_deliveries PK, so a genuine re-request now always lands.

Mitigation:

  • Auto opened path is untouched and still deduped; new tests cover both paths.

7. AI Usage Declaration

AI was used for:

  • Understanding existing code
  • Generating code
  • Generating tests
  • Reviewing the diff
  • Not used

Human verification:

  • I understand every meaningful change in this PR
  • I checked generated code manually
  • I checked generated tests manually
  • I removed unsupported AI assumptions
  • I accept responsibility for this PR

8. Reviewer Focus

Please focus your review on:

  • Correctness
  • Architecture
  • Security
  • Performance
  • Tests
  • Maintainability
  • Product intent
  • Edge cases

🤖 Generated with Claude Code

An explicit `@lightbridge-assistant` re-review was silently dropped when
the same wording landed on an unchanged head. `handle_issue_comment`
computed `run_epoch` via `next_run_epoch(…, "review", …)` — the hardcoded
literal "review" — but created the task with the REAL `command_text`
(e.g. "review this"). The idempotency unique index keys on the real
`command_text`, so the epoch was computed against the wrong group, came
back identical every time, and a repeated identical mention collided →
`create_task` returned `Ok(None)` → "skipping (idempotent)" → no task,
no review. Varying the wording sidestepped it, hence the intermittence.
There was also a TOCTOU race: `next_run_epoch` then `create_task` were
separate statements, so two near-simultaneous deliveries could compute
the same epoch and one was dropped.

An @mention is an explicit human command — it must ALWAYS create a task.
True webhook redeliveries are already deduped upstream by the
`github_deliveries` delivery-id PRIMARY KEY, so content-idempotency on
the comment path adds nothing and only causes this bug.

- Add `create_explicit_task` that folds the epoch into the INSERT
  (`COALESCE(MAX(run_epoch), -1) + 1` over the same columns as the
  unique index, minus run_epoch, using the REAL command_text). Atomic,
  collision-free, race-free; notifies TASK_QUEUED_CHANNEL like create_task.
- `handle_issue_comment` parses the instruction once and uses
  `create_explicit_task`; drop the `next_run_epoch(…, "review", …)` call.
- Leave the auto `pull_request` (opened) path on `create_task` so GitHub's
  duplicate opened/synchronize deliveries for one head still collapse.
- Remove now-unused `next_run_epoch` and its test.
- Regression tests: two identical "review this" mentions create two tasks
  at epochs N and N+1; the auto path still collapses a duplicate to one.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@github-actions

Copy link
Copy Markdown

✅ AI Governance check passed

This PR declares AI usage, references a source of truth, and provides verification evidence. Thank you.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a dedicated path for explicit @mention commands (create_explicit_task) to prevent them from being content-deduplicated, ensuring manual re-requests are always processed. It replaces the separate next_run_epoch query with an atomic subquery inside the INSERT statement. The review feedback identifies a potential concurrency race condition in create_explicit_task under the default READ COMMITTED isolation level, where concurrent inserts could compute the same run_epoch and trigger a unique constraint violation. A retry mechanism is suggested to handle this scenario robustly.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread services/control-plane/src/db.rs

@lightbridge-assistant lightbridge-assistant Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lightbridge review

The change correctly implements the core intent — explicit @mention tasks always land a row, never silently dropped by content-idempotency — and the SQL subquery is a clean design for atomic epoch computation. The test coverage is good. However, there is a P0 concurrency bug: two simultaneous create_explicit_task calls for the same natural key can compute the same epoch and either collide on a unique constraint (losing one task with an error) or silently produce duplicate epochs, contradicting the "no TOCTOU window" claim. This needs a retry loop or advisory lock. There are also P1 issues with a reused github_delivery_id in the test (may hit a UK) and silently swallowed errors in create_explicit_review_task.


🤖 AI-generated review — treat it as untrusted, verify before acting; a human owns the final decision (AI governance).

Comment thread services/control-plane/src/db.rs Outdated
Comment thread services/control-plane/src/db.rs
Comment thread services/control-plane/src/db.rs
Comment thread services/control-plane/src/db.rs
Comment thread services/control-plane/src/http/webhook.rs
…168 review)

Two round-2 fixes:
- Gemini (high): create_explicit_task's MAX(run_epoch)+1 subquery isn't atomic
  under READ COMMITTED — concurrent mentions could collide on the idempotency
  index (23505) and silently drop a task. Add a bounded retry on 23505 (fresh
  epoch each attempt); corrected the misleading 'collision-free' doc.
- Maintainer feedback: stop stripping the @handle — pass the WHOLE comment body
  as command_text (trimmed + capped). The agent knows its own name from the
  system prompt; stripping mangled co-mentions like '@bot & /gemini please
  review this'. Replaced parse_instruction with command_from_comment + tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@stephane-segning stephane-segning merged commit a85bd5e into main Jun 23, 2026
13 checks passed
@stephane-segning stephane-segning deleted the claude/fix-mention-idempotency branch June 23, 2026 10:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant