fix(agent-runner): round-2 review-agent fixes — never discard buffered findings, symlink-safe read_file, tunable budget (#137) #170

Merged
stephane-segning merged 1 commit into main from claude/review-agent-round2
Jun 23, 2026
@stephane-segning

Contributor

1. Summary

This PR changes (round-2 fixes to the native review agent, all in services/agent-runner):

  • read_file symlink escape (SECURITY-HIGH): resolve_in_root was purely lexical (rejects ../absolute) but followed symlinks — a malicious PR could plant an in-repo symlink to /etc/passwd or the SA token and have the model read it. It now canonicalizes both the checkout root and the resolved path and verifies the real target stays inside the real root, rejecting otherwise.
  • read_file truncated line-count: a sliced read of an over-64 KiB file reported a total line count computed from the truncated content, so "(of {total})" lied. The message now says the file was truncated at the byte cap instead of quoting a bogus total.
  • MAX_TURNS → config-driven: the hardcoded const MAX_TURNS = 16 becomes a tunable review.max_turns knob (DEFAULT_MAX_TURNS = 40). 16 was far too tight: turns are ~6s on the deepseek model and the agent records roughly one finding per turn.
  • Never discard buffered work, never end silently: run_native_agent now returns a ReviewOutcome { Finished, Exhausted, Aborted(reason) } instead of bailing on exhaustion. main.rs finalizes on Finished AND Exhausted AND Aborted so buffered findings are always posted (with a truncation note on Exhausted, an abort note on Aborted). Only a true transport error posts nothing.
  • Failure detail on the FINAL status: the failure/exhaustion/abort detail is carried to the terminal succeeded/failed report instead of a mid-run running report (which the control plane clears on every running transition).
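The symlink fix in the first bullet can be illustrated with a minimal sketch. It is not the PR's actual `resolve_in_root`: the function name `resolve_in_root_sketch`, the error kinds, and the messages are assumptions chosen for illustration. Only the approach (canonicalize both the root and the target, then check containment) comes from the description above.

```rust
use std::io;
use std::path::{Component, Path, PathBuf};

/// Resolve `rel` inside `root`, rejecting anything whose real target leaves the root.
/// Hypothetical helper for illustration; the PR's `resolve_in_root` may differ.
pub fn resolve_in_root_sketch(root: &Path, rel: &str) -> io::Result<PathBuf> {
    let rel_path = Path::new(rel);

    // Lexical check first: only plain segments are accepted, so `..` and
    // absolute paths are rejected before touching the filesystem.
    let lexically_safe = rel_path
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
    if !lexically_safe {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path must be relative and must not contain `..`",
        ));
    }

    // canonicalize follows symlinks and fails with NotFound if the path is missing.
    let real_root = root.canonicalize()?;
    let real_target = root.join(rel_path).canonicalize()?;

    // Path::starts_with compares whole components, so `/repo-evil` is not inside `/repo`.
    if !real_target.starts_with(&real_root) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path resolves outside the checkout root",
        ));
    }
    Ok(real_target)
}
```

Canonicalizing the root as well as the target matters because the checkout directory itself may sit behind a symlink; comparing a real target against a non-canonical root would reject legitimate files.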

It solves:


2. Intent

The intent of this PR is:

Make every review run leave a visible artifact (findings, a clean "no issues" note, a truncation note, or an abort note) unless the gateway was unreachable, so that buffered work is never silently dropped, and close the read_file symlink-escape hole. Also make the turn budget operator-tunable so a large PR isn't truncated by a budget that was set too low.


3. Scope

In Scope

  • services/agent-runner/src/review/native/tools.rs — symlink-safe read_file (canonicalize + containment check) + honest truncation message + a symlink-escape unit test.
  • services/agent-runner/src/review/native/agent.rs — ReviewOutcome enum, config-driven max_turns, finish-nudge, no longer bails on exhaustion/abort.
  • services/agent-runner/src/bootstrap/config.rs — max_turns on ReviewFile/ReviewConfig + DEFAULT_MAX_TURNS.
  • services/agent-runner/src/main.rs — finalize on all three outcomes, set truncation/abort summaries, carry detail to the terminal status.
  • services/agent-runner/src/review/mod.rs — re-export ReviewOutcome.
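The interaction between agent.rs and main.rs listed above could look roughly like the following sketch. The three variant names come from this PR; the payload type of `Aborted` (a `String` here), the helper names `outcome_note` and `render_review`, and the note wording are assumptions for illustration only.

```rust
/// Outcome of the agent loop, mirroring the variants named in this PR.
/// The `String` payload on `Aborted` is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ReviewOutcome {
    Finished,
    Exhausted,
    Aborted(String),
}

/// Hypothetical helper: choose the note appended to the posted review.
/// Returns None when the run finished normally and needs no extra note.
pub fn outcome_note(outcome: &ReviewOutcome, max_turns: u32) -> Option<String> {
    match outcome {
        ReviewOutcome::Finished => None,
        ReviewOutcome::Exhausted => Some(format!(
            "Review truncated: the turn budget ({max_turns}) ran out before the agent finished."
        )),
        ReviewOutcome::Aborted(reason) => Some(format!("Review aborted: {reason}")),
    }
}

/// Hypothetical finalize step: buffered findings are rendered for every outcome,
/// so exhaustion or an abort never drops them.
pub fn render_review(findings: &[String], outcome: &ReviewOutcome, max_turns: u32) -> String {
    let mut body = if findings.is_empty() {
        String::from("No issues found.")
    } else {
        findings.join("\n")
    };
    if let Some(note) = outcome_note(outcome, max_turns) {
        body.push_str("\n\n");
        body.push_str(&note);
    }
    body
}
```

Modelling exhaustion and abort as values rather than errors is what lets the caller reach the finalize step on all three paths; only a transport failure would still surface as an `Err` and skip it.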

Out of Scope

  • The ai-helm chart wiring to expose the max_turns knob (handled separately; this is the code side only).
  • The control-plane change that clears error_detail on running transitions (separate PR; this PR adapts to it by reporting detail on the terminal status).
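Because the control plane clears error_detail on running transitions, the detail has to travel with the terminal report. A minimal sketch of that rule follows; `RunStatus`, `StatusReport`, and `build_report` are hypothetical names, not the agent-runner's real types.

```rust
/// Hypothetical status values; the real control-plane contract may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunStatus {
    Running,
    Succeeded,
    Failed,
}

/// Hypothetical report payload sent to the control plane.
#[derive(Debug, PartialEq, Eq)]
pub struct StatusReport {
    pub status: RunStatus,
    pub error_detail: Option<String>,
}

/// Attach the pending detail only to terminal reports.
pub fn build_report(status: RunStatus, pending_detail: Option<&str>) -> StatusReport {
    let error_detail = match status {
        // A detail sent with a running report would be cleared on the next
        // running transition, so it is withheld here.
        RunStatus::Running => None,
        RunStatus::Succeeded | RunStatus::Failed => pending_detail.map(str::to_owned),
    };
    StatusReport { status, error_detail }
}
```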

4. Verification

I verified this change by:

  • Running automated tests
  • Running manual tests
  • Checking logs
  • Checking metrics
  • Testing error cases
  • Testing permissions/security behavior
  • Testing rollback or failure behavior, if relevant

Commands run:

cargo fmt -p agent-runner -- --check
cargo clippy -p agent-runner --all-targets -- -D warnings
cargo test -p agent-runner

Results:

$ cargo fmt -p agent-runner -- --check
FMT OK   (no diff)

$ cargo clippy -p agent-runner --all-targets -- -D warnings
    Checking agent-runner v0.1.1 (.../services/agent-runner)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 11.97s
   (no warnings)

$ cargo test -p agent-runner
test review::native::tools::tests::read_file_rejects_symlink_escape ... ok
test review::native::agent::tests::native_loop_exhausts_budget_without_discarding ... ok
test review::native::agent::tests::native_loop_abort_returns_aborted_outcome ... ok
test review::native::agent::tests::native_loop_searches_records_and_finishes ... ok
test review::native::agent::tests::native_loop_fails_over_to_secondary_model ... ok
...
test result: ok. 45 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
test result: ok. 8 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out   (control_plane_contract)

5. Screenshots / Evidence


6. Risk Assessment

Risk level:

  • Low
  • Medium
  • High

Potential risks:

  • Finalizing on Exhausted/Aborted now posts a review where previously nothing was posted. This is a behavioural change, but it is the purpose of the PR, and it remains gated by the existing finalize empty-run backstop.
  • canonicalize requires the path to exist; a non-existent file now maps to "not found" (already the prior behaviour for a missing file).

Mitigation:

  • New unit tests cover the symlink-escape rejection, the exhaustion-without-discard path, and the abort outcome; existing finish/failover/circuit-breaker tests still pass.
  • max_turns defaults generously (40) and is config-gated, so behaviour is unchanged where unset except the higher ceiling.
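How the max_turns knob might fall back to its default can be sketched as below. The value 40 and the names `DEFAULT_MAX_TURNS`, `review.max_turns`, and `LLM_MAX_TURNS` come from this PR; the `u32` type, the precedence (environment over file), the handling of zero, and the function `resolve_max_turns` are assumptions, not confirmed behaviour.

```rust
/// Default from the PR description; the old hardcoded budget was 16.
pub const DEFAULT_MAX_TURNS: u32 = 40;

/// Hypothetical resolution order (an assumption, not confirmed by the PR):
/// a valid LLM_MAX_TURNS value wins, then review.max_turns from the config
/// file, then the default. Zero and unparsable values are ignored.
pub fn resolve_max_turns(file_value: Option<u32>, env_value: Option<&str>) -> u32 {
    let from_env = env_value
        .and_then(|raw| raw.trim().parse::<u32>().ok())
        .filter(|turns| *turns > 0);
    from_env
        .or(file_value.filter(|turns| *turns > 0))
        .unwrap_or(DEFAULT_MAX_TURNS)
}
```

With nothing configured this returns 40, which matches the "unchanged where unset except the higher ceiling" statement above.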

7. AI Usage Declaration

AI was used for:

  • Understanding existing code
  • Generating code
  • Refactoring
  • Generating tests
  • Drafting documentation
  • Reviewing the diff
  • Not used

Human verification:

  • I understand every meaningful change in this PR
  • I checked generated code manually
  • I checked generated tests manually
  • I removed unsupported AI assumptions
  • I accept responsibility for this PR

8. Reviewer Focus

Please focus your review on:

  • Correctness
  • Architecture
  • Security
  • Performance
  • Tests
  • Maintainability
  • Product intent
  • Edge cases

🤖 Generated with Claude Code

…d findings, symlink-safe read_file, tunable budget (#137)

Five round-2 fixes to the native review agent (all in services/agent-runner):

1. read_file symlink escape (SECURITY-HIGH): resolve_in_root was purely
   lexical and followed symlinks — a planted in-repo symlink to /etc/passwd
   or the SA token could be read. Now canonicalize both the checkout root and
   the resolved path and verify the target stays inside the root; reject
   otherwise. Non-existent path → clean "not found". New unit test.

2. read_file truncated line-count: a sliced read of an over-cap file reported
   a total line count computed from truncated content. Now the message says
   the file was truncated at the byte cap instead of quoting a bogus total.

3. MAX_TURNS → config: max_turns is now a ReviewConfig knob (review.max_turns
   / LLM_MAX_TURNS), DEFAULT_MAX_TURNS = 40 (was a hardcoded 16, far too tight
   on the ~6s/turn deepseek model). ai-helm chart wiring handled separately.

4. Never discard buffered work, never end silently: run_native_agent returns a
   ReviewOutcome { Finished, Exhausted, Aborted(reason) } instead of bailing on
   exhaustion. main.rs finalizes on Finished AND Exhausted AND Aborted, with a
   truncation note on Exhausted and an abort note on Aborted. A real prod run
   lost 5 findings when exhaustion used to drop the buffer at turn 16. Only a
   true transport Err posts nothing. Light one-time nudge to call finish once a
   finding is recorded.

5. Failure detail on the FINAL status: the review-failure/exhaustion/abort
   detail is carried to the terminal succeeded/failed report instead of a
   mid-run running report (the control plane clears error_detail on running).
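Item 2 in the list above (the truncated line count) can be sketched as follows. The 64 KiB cap is taken from the PR description; the constant name `READ_CAP_BYTES`, the helper `slice_header`, and the exact message wording are assumptions for illustration.

```rust
/// Byte cap from the PR description (64 KiB); the constant name is hypothetical.
pub const READ_CAP_BYTES: usize = 64 * 1024;

/// Hypothetical helper: header line for a sliced read of lines `start..=end` (1-based).
/// `content` is whatever was read, which is only a prefix when `truncated` is true.
pub fn slice_header(content: &str, start: usize, end: usize, truncated: bool) -> String {
    if truncated {
        // Counting lines in a capped buffer would under-report the real total,
        // so the message states the truncation instead of a number.
        format!(
            "lines {}-{} (file truncated at {} bytes; total line count unknown)",
            start, end, READ_CAP_BYTES
        )
    } else {
        let total = content.lines().count();
        format!("lines {}-{} (of {})", start, end, total)
    }
}
```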

Verified: cargo fmt --check, clippy -D warnings, cargo test (45+8 pass).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@chatgpt-codex-connector


You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@github-actions


✅ AI Governance check passed

This PR declares AI usage, references a source of truth, and provides verification evidence. Thank you.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces an operator-tunable turn budget (max_turns) for the review agent, defaulting generously to 40 turns. It refactors the agent loop to return a structured ReviewOutcome (Finished, Exhausted, or Aborted) rather than treating non-fatal outcomes as errors, which ensures that buffered findings are finalized and posted instead of being discarded. Additionally, it resolves a security vulnerability in the read_file tool by canonicalizing paths to prevent symlink escapes outside the repository checkout root. There are no review comments provided, so I have no feedback to address.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@stephane-segning stephane-segning merged commit 85d1122 into main Jun 23, 2026
6 checks passed

Development

Successfully merging this pull request may close these issues.

[Ticket]: Implement real authN — better-auth sessions + Rust user store
