feat(review-pr): incremental reviews of new commits since last review #37

Merged
Sayt-0 merged 5 commits into main from feat/incremental-review
Jul 3, 2026

@Sayt-0 Sayt-0 commented Jul 2, 2026

Member

Implements Phase 2D (incremental review): on PRs with multiple review cycles, re-reviews only the commits pushed since the last completed review instead of the full PR diff. Saves tokens, avoids duplicate comments, and skips unchanged code. Every unsafe or ambiguous state falls back to a full review (fail-open), so a review is never blocked or degraded.

Issue mapping

| # | Item | Implementation |
|---|------|----------------|
| 31 | Incremental review state tracking | Last reviewed SHA read from the `commit_id` GitHub records on every posted review (`GET /pulls/{n}/reviews`), which is the PR head SHA at posting time. Survives across workflow runs with no extra writes and cannot be edited away like an HTML comment marker. Only completed runs count (`### Assessment:` or LGTM bodies); timeout and failure fallbacks leave commits unreviewed. |
| 32 | Commit-range diffing | `git diff <last-reviewed-sha>..HEAD`, restricted at file level to the full PR diff so inline comments never anchor outside what GitHub accepts (422). Full diff fallback when no previous review exists or the base branch has been merged in (detected by comparing merge-bases at last review vs now). |
| 33 | Rebases and force-pushes | `git cat-file -e` (SHA absent from the clone) and `git merge-base --is-ancestor` (history rewritten) both force a full review. |
| 34 | Finding deduplication across cycles | `src/dedupe-findings/`: new comments matching an existing bot comment by file path + line proximity (tolerance 3) + finding-heading Jaccard similarity (threshold 0.5) are dropped before posting. Human comments and bot replies never suppress a finding. Runs on every posting path (staged at `/tmp/dedupe-findings.js`, existing comments pre-fetched by the action). |
| 35 | `incremental` input | Added to `review-pr/action.yml` (default `"true"`) and to the reusable workflow (`type: boolean`, default `true`). `false` forces a full review. |
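
For readers following row 31, the state lookup can be pictured as a pure function over the `GET /pulls/{n}/reviews` response. This is only a sketch under assumptions: the `Review` shape is trimmed to the fields used, `BOT_LOGIN` is a hypothetical login, `lastReviewedSha` is an illustrative name, and the completed-body test is a guess at how "`### Assessment:` or LGTM bodies" might be matched. It is not the code in `src/incremental-review/`.

```typescript
// Trimmed shape of one item from GET /repos/{owner}/{repo}/pulls/{n}/reviews.
interface Review {
  user: { login: string } | null;
  body: string | null;
  commit_id: string | null;
  submitted_at?: string;
}

// Hypothetical bot login; the real action may identify its own reviews differently.
const BOT_LOGIN = "docker-agent[bot]";

// A run counts as completed only if its body carries an assessment or an LGTM.
function isCompletedReview(body: string | null): boolean {
  if (!body) return false;
  return body.includes("### Assessment:") || /\bLGTM\b/.test(body);
}

// Returns the commit_id of the newest completed bot review, or null if none exists.
function lastReviewedSha(reviews: Review[]): string | null {
  let best: { sha: string; at: number } | null = null;
  for (const r of reviews) {
    if (r.user?.login !== BOT_LOGIN || !r.commit_id) continue;
    if (!isCompletedReview(r.body)) continue;
    const at = r.submitted_at ? Date.parse(r.submitted_at) : 0;
    // ">=" lets a later entry win ties, because the API lists reviews oldest first.
    if (best === null || at >= best.at) best = { sha: r.commit_id, at };
  }
  return best ? best.sha : null;
}
```

A timeout or failure fallback body matches neither marker, so the commits it covered stay unreviewed and are picked up by the next run.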

Flow

pr.diff (full)  ->  incremental-review CLI
                      |- no previous review / rebase / force-push /
                      |  base merged in / net-zero / any error  ->  full review (unchanged)
                      |- otherwise:
                           pr.diff      <- git diff last-sha..HEAD (restricted to PR files)
                           pr_full.diff <- preserved full diff
                                           (stale-thread resolution + suggestion validation)

Fallback reasons (all covered by unit tests)

| Reason | Trigger |
|--------|---------|
| no-previous-review | no completed docker-agent review on the PR |
| no-new-commits | HEAD is the last reviewed commit (explicit re-request) |
| unknown-sha | last reviewed SHA not in the local clone (force-push) |
| history-rewritten | last reviewed SHA not an ancestor of HEAD (rebase) |
| base-unresolved / base-merged-in | base ref missing, or merge-base moved since last review |
| net-zero-changes | changes since last review cancel out against the base |
| disabled / error / no-diff | input off, API/git failure, missing pr.diff |
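
The table reads naturally as a decision function. The sketch below is one way to arrange it, assuming the facts have already been gathered into a `Probe` object; `Probe`, `decide`, `decideSafely` and the order of the checks are all hypothetical and are not taken from the PR's source.

```typescript
type FallbackReason =
  | "disabled" | "no-diff" | "no-previous-review" | "no-new-commits"
  | "unknown-sha" | "history-rewritten" | "base-unresolved"
  | "base-merged-in" | "net-zero-changes" | "error";

type Decision =
  | { mode: "incremental"; since: string }
  | { mode: "full"; reason: FallbackReason };

// Facts gathered from inputs, the API and git before deciding (hypothetical shape).
interface Probe {
  enabled: boolean;                      // the `incremental` input
  hasDiff: boolean;                      // pr.diff exists
  lastSha: string | null;                // last completed review's commit_id
  headSha: string;
  shaInClone: boolean;                   // git cat-file -e <lastSha>
  isAncestorOfHead: boolean;             // git merge-base --is-ancestor <lastSha> HEAD
  mergeBaseAtLastReview: string | null;
  mergeBaseNow: string | null;
  incrementalDiffEmpty: boolean;         // restricted diff has no content
}

function decide(p: Probe): Decision {
  const full = (reason: FallbackReason): Decision => ({ mode: "full", reason });
  const { lastSha } = p;
  if (!p.enabled) return full("disabled");
  if (!p.hasDiff) return full("no-diff");
  if (lastSha === null) return full("no-previous-review");
  if (lastSha === p.headSha) return full("no-new-commits");
  if (!p.shaInClone) return full("unknown-sha");
  if (!p.isAncestorOfHead) return full("history-rewritten");
  if (p.mergeBaseAtLastReview === null || p.mergeBaseNow === null) return full("base-unresolved");
  if (p.mergeBaseAtLastReview !== p.mergeBaseNow) return full("base-merged-in");
  if (p.incrementalDiffEmpty) return full("net-zero-changes");
  return { mode: "incremental", since: lastSha };
}

// Fail-open wrapper: any exception while gathering facts becomes a full review.
function decideSafely(gather: () => Probe): Decision {
  try {
    return decide(gather());
  } catch {
    return { mode: "full", reason: "error" };
  }
}
```

Keeping the decision pure means each row of the table maps to one unit test that flips a single field.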

Design notes

  • Anchor safety: in incremental mode the original full diff is preserved as pr_full.diff; stale-thread resolution and the suggestion validator run against it (GitHub validates anchors against the full PR diff, not the incremental range).
  • Per AGENTS.md, all decision logic lives in TypeScript (src/incremental-review/, src/dedupe-findings/) with Vitest coverage; the action YAML only orchestrates CLI calls.
  • The agent prompt gains an "Incremental Review" section instructing it to review only pr.diff contents and to scope the assessment to the new commits.
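
To illustrate the anchor-safety note, here is a rough sketch of restricting an incremental diff to the files present in the full PR diff. The names `splitSections`, `sectionPaths` and `restrictToPrFiles` are hypothetical. The parsing is deliberately simplified: it assumes unquoted paths and reads only `---`/`+++` and `rename from`/`rename to` header lines. The PR's own `restrictDiffToFiles` may behave differently, particularly around renames.

```typescript
// Split a unified diff into per-file sections, each starting at a "diff --git" line.
function splitSections(diff: string): string[] {
  const sections: string[] = [];
  let current: string[] = [];
  for (const line of diff.split("\n")) {
    if (line.startsWith("diff --git ") && current.length > 0) {
      sections.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.some((l) => l.length > 0)) sections.push(current.join("\n"));
  return sections;
}

// Collect the old and new paths named in a section's header lines.
function sectionPaths(section: string): string[] {
  const paths: string[] = [];
  for (const line of section.split("\n")) {
    if (line.startsWith("@@")) break; // headers end at the first hunk
    const m = /^(?:--- a\/|\+\+\+ b\/|rename from |rename to )(.+)$/.exec(line);
    if (m) paths.push(m[1]);
  }
  return paths;
}

// Keep only the incremental sections that touch a file in the full PR diff.
function restrictToPrFiles(incremental: string, fullPrDiff: string): string {
  const allowed = new Set(splitSections(fullPrDiff).flatMap(sectionPaths));
  const kept = splitSections(incremental).filter((s) =>
    sectionPaths(s).some((p) => allowed.has(p)),
  );
  const out = kept.join("\n");
  return out === "" || out.endsWith("\n") ? out : out + "\n";
}
```

A file can appear in `last-sha..HEAD` yet be absent from the full PR diff when the new commits bring it back to its base-branch content. Dropping that section keeps inline comments from anchoring on a file GitHub would reject.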

Validation

| Check | Result |
|-------|--------|
| pnpm build (tsup) | pass |
| tsc --noEmit | pass |
| biome ci . | pass |
| actionlint | pass |
| Unit tests | 742 passed (61 new) |
| tests/*.sh shell integration | pass |
| Bundled CLI smoke tests (dedupe drop/keep, disabled and fail-open paths) | pass |

Sayt-0 and others added 2 commits July 2, 2026 17:36
Re-reviews only the range from the last reviewed SHA to HEAD instead
of the full PR diff, tracking state via the commit_id GitHub records
on completed reviews. Falls back to a full review on force-push or
rebase, base merge-ins, or any error, and dedupes findings against
already-posted comments so full re-reviews never duplicate threads.
Adds the incremental input (default true) to the action and the
reusable workflow.
@Sayt-0 Sayt-0 requested a review from derekmisler July 3, 2026 11:49

@docker-agent docker-agent left a comment

Contributor

Assessment: 🟡 NEEDS ATTENTION

One medium-severity finding in the incremental-review CLI: a writeFileSync failure in the catch fallback path could leave pr.diff partially corrupted without restoring the backup copy. All other hypotheses were verified and dismissed.

`full diff preserved at ${preservedPath})`,
);
setOutputs('incremental', 'ok', sha);
} catch (err: unknown) {

Contributor

[medium] writeFileSync failure leaves pr.diff corrupted without restoring the backup

After copying diffPath to preservedPath (line 170) and then overwriting diffPath with the incremental diff (line 171), the catch block correctly reports mode=full — but it does not restore diffPath from preservedPath.

If writeFileSync at line 171 throws (e.g. disk-full or a permission error mid-write), pr.diff may be partially overwritten with incomplete content. The downstream review steps will then read a corrupted diff even though mode=full was reported, producing either a broken review or a misleading one. The good original copy exists at preservedPath but is never used to recover.

Fix: restore diffPath in the catch block before reporting mode=full:

Suggested change
-} catch (err: unknown) {
+} catch (err: unknown) {
+  core.warning(
+    `incremental-review failed (${err instanceof Error ? err.message : String(err)}) — falling back to full review`,
+  );
+  // Restore the original diff if the incremental write may have corrupted it.
+  try {
+    const preservedPath = fullDiffPath(diffPath);
+    copyFileSync(preservedPath, diffPath);
+  } catch {
+    // Best-effort; if restoring also fails, the full review will error too.
+  }
+  setOutputs('full', 'error', sha);
Confidence Score
🟡 moderate 67/100

@derekmisler derekmisler left a comment

Collaborator

ci is green and the fail-open design is solid. two things worth a look: in src/incremental-review/index.ts around line 170, if writeFileSync throws after copyFileSync has already written pr_full.diff, the catch block warns and falls back to full mode but doesn't restore pr.diff from the backup, so a downstream consumer could read a partially-written diff. easy fix: add a copyFileSync(preservedPath, diffPath) inside the catch before returning. separately, in findingSignature() (dedupe-findings.ts:81) the regex [^*]+ matches newlines, which can pull in extra text from multi-line bold blocks and inflate the jaccard similarity denominator enough to miss a dupe. swapping it for [^*\n]+ tightens it up.

@derekmisler derekmisler left a comment

Collaborator

two other things i meant to flag.


const preservedPath = fullDiffPath(diffPath);
copyFileSync(diffPath, preservedPath);
writeFileSync(diffPath, result.restricted, 'utf-8');

Collaborator

issue: if writeFileSync throws here (disk full, permission error, etc.), pr.diff is left partially written. the catch block at line 181 falls back to full mode, but it doesn't restore pr.diff from preservedPath, so anything downstream that reads pr.diff gets a corrupted file. preservedPath is already written at this point, so the fix is one line in the catch:

} catch (err: unknown) {
  try { copyFileSync(preservedPath, diffPath); } catch {}
  core.warning(...)
  setOutputs('full', 'error', sha);
}

alternatively, write to a temp file and atomically rename over diffPath.

Confidence 🟡 LIKELY 70/100
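
For reference, the temp-file-plus-rename alternative mentioned above could look roughly like this. `writeFileAtomic` is a hypothetical helper, not something in this PR, and the sketch assumes the temp file sits in the same directory (and so the same filesystem) as the target, which is what makes the rename an atomic replacement on POSIX systems.

```typescript
import { mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Write content to a sibling temp file, then rename it over the target.
// Readers see either the old file or the new one, never a partial write.
function writeFileAtomic(target: string, content: string): void {
  const tmp = join(dirname(target), `.${process.pid}.${Date.now()}.tmp`);
  try {
    writeFileSync(tmp, content, "utf-8");
    renameSync(tmp, target);
  } finally {
    // No-op after a successful rename; removes the leftover after a failure.
    rmSync(tmp, { force: true });
  }
}

// Usage: replace a full diff with an incremental one in a scratch directory.
const dir = mkdtempSync(join(tmpdir(), "atomic-"));
const diffPath = join(dir, "pr.diff");
writeFileSync(diffPath, "full diff\n", "utf-8");
writeFileAtomic(diffPath, "incremental diff\n");
const written = readFileSync(diffPath, "utf-8");
console.log(JSON.stringify(written));
```

With this approach a failed write never touches `pr.diff`, so the catch block has nothing to restore.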

Comment thread src/dedupe-findings/dedupe-findings.ts Outdated
* Returns null when no usable text exists.
*/
export function findingSignature(body: string): string[] | null {
const bold = body.match(/\*\*([^*]+)\*\*/);

Collaborator

suggestion (non-blocking): [^*]+ matches newlines in js, so if a comment body has a bold span that wraps lines (e.g. **[high] title\ncontinued**), the captured group pulls in all lines up to the closing **. that inflates the token set and can drop jaccard similarity below 0.5, causing a missed dedupe. the fallback branch already handles this correctly (takes only the first non-empty line). tightening the regex to match the same behavior:

Suggested change
-const bold = body.match(/\*\*([^*]+)\*\*/);
+const bold = body.match(/\*\*([^*\n]+)\*\*/);
Confidence 🟡 LIKELY 62/100
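
A quick way to see the difference between the two patterns is to run both against a bold span that wraps across a line. The sample body is made up for illustration.

```typescript
// A comment body whose bold heading wraps onto a second line (made-up example).
const body = "**[high] Missing null check\nin the retry path**\n\nDetails follow.";

// Original pattern: [^*] also matches "\n", so the capture spans both lines.
const loose = body.match(/\*\*([^*]+)\*\*/);

// Tightened pattern: excluding "\n" means a wrapped bold span does not match at all.
const strict = body.match(/\*\*([^*\n]+)\*\*/);

console.log(JSON.stringify(loose?.[1])); // "[high] Missing null check\nin the retry path"
console.log(strict); // null
```

A `null` result is what sends a multi-line heading to the first-non-empty-line fallback.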

Sayt-0 added 2 commits July 3, 2026 21:30
If writeFileSync throws mid-write after the full diff was already
preserved, pr.diff could be left partially written while mode=full is
reported, so downstream steps would review a corrupted diff. The catch
block now restores pr.diff from the preserved copy and removes
pr_full.diff, whose presence would otherwise signal incremental mode to
downstream steps. The restore only runs when the backup copy completed,
so a partial backup can never overwrite an intact pr.diff.
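
The recovery behaviour described in this commit message can be sketched as follows. This is an illustration only: `applyIncremental` and its injectable `write` parameter are hypothetical, added here so that a failing write can be simulated. The actual code in `src/incremental-review/` is not shown in this thread.

```typescript
import { copyFileSync, existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type Writer = (path: string, data: string) => void;

function applyIncremental(
  diffPath: string,
  preservedPath: string,
  incremental: string,
  write: Writer = (p, d) => writeFileSync(p, d, "utf-8"),
): "incremental" | "full" {
  let backupDone = false;
  try {
    copyFileSync(diffPath, preservedPath);
    backupDone = true; // set only after the backup copy has completed
    write(diffPath, incremental);
    return "incremental";
  } catch {
    if (backupDone) {
      // Restore only from a complete backup, never from a partial one.
      try { copyFileSync(preservedPath, diffPath); } catch { /* best effort */ }
    }
    // A leftover pr_full.diff would signal incremental mode downstream.
    rmSync(preservedPath, { force: true });
    return "full";
  }
}

// Usage: simulate a write that fails halfway through.
const dir = mkdtempSync(join(tmpdir(), "incr-"));
const diffPath = join(dir, "pr.diff");
const preservedPath = join(dir, "pr_full.diff");
writeFileSync(diffPath, "full diff\n", "utf-8");

const failingWrite: Writer = (path, data) => {
  writeFileSync(path, data.slice(0, 4), "utf-8");
  throw new Error("ENOSPC: no space left on device");
};

const mode = applyIncremental(diffPath, preservedPath, "incremental diff\n", failingWrite);
console.log(mode, JSON.stringify(readFileSync(diffPath, "utf-8")), existsSync(preservedPath));
```
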
…ures

In findingSignature, [^*]+ also matches newlines, so a bold span
wrapping across lines pulled every line up to the closing marker into
the token set. That inflates the Jaccard union enough to fall below the
similarity threshold and miss a duplicate. Restrict the match to
single-line bold blocks; multi-line headings now use the existing
first-non-empty-line fallback.
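
To make the effect on deduplication concrete, the sketch below combines the three matching criteria from the PR description (same path, line within 3, Jaccard similarity of heading tokens at or above 0.5). `Finding`, `jaccard` and `isDuplicate` are hypothetical names, the `>=` comparison is an assumption, and the token lists are made up.

```typescript
interface Finding {
  path: string;
  line: number;
  tokens: string[]; // heading tokens, as a finding signature would provide
}

const LINE_TOLERANCE = 3;
const SIMILARITY_THRESHOLD = 0.5;

// Jaccard similarity over token sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: string[], b: string[]): number {
  const setA = Array.from(new Set(a));
  const setB = new Set(b);
  let intersection = 0;
  for (const t of setA) {
    if (setB.has(t)) intersection++;
  }
  const union = setA.length + setB.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

function isDuplicate(candidate: Finding, existing: Finding): boolean {
  return (
    candidate.path === existing.path &&
    Math.abs(candidate.line - existing.line) <= LINE_TOLERANCE &&
    jaccard(candidate.tokens, existing.tokens) >= SIMILARITY_THRESHOLD
  );
}

const posted: Finding = { path: "src/a.ts", line: 40, tokens: ["high", "missing", "null", "check"] };

// Heading captured from a single line: identical tokens, similarity 1.
const clean: Finding = { path: "src/a.ts", line: 42, tokens: ["high", "missing", "null", "check"] };

// Heading captured across wrapped lines: 4 shared tokens out of 11, about 0.36.
const inflated: Finding = {
  path: "src/a.ts",
  line: 42,
  tokens: ["high", "missing", "null", "check", "in", "the", "retry", "path", "when", "backoff", "expires"],
};

console.log(isDuplicate(clean, posted));    // true
console.log(isDuplicate(inflated, posted)); // false
```
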
@Sayt-0 Sayt-0 requested a review from derekmisler July 3, 2026 20:19

@docker-agent docker-agent left a comment

Contributor

Assessment: 🟢 APPROVE

The incremental review implementation is well-structured with thorough fail-open semantics throughout. All error paths correctly fall back to a full review, SHA40 regex guards prevent shell injection via git arguments, and temporary files are cleaned up with force: true in finally blocks. The deduplication logic (dedupe-findings) is sound — findingSignature always deduplicates tokens before similarity scoring, and the Jaccard formula is correct for set inputs. The findLastReviewedSha ordering logic (newest wins via >= on oldest-first API responses) is consistent with the GitHub Reviews API contract and is explicitly commented. No bugs were confirmed in the new code.
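
For context, a SHA guard of the kind mentioned here might look like the sketch below. `SHA40`, `assertSha40` and `diffRangeArgs` are illustrative names and the exact pattern used in the PR is not shown in this thread. The idea is that anything other than 40 lowercase hex characters is rejected before it can reach a git command line.

```typescript
// Accept only a full 40-character lowercase hex object name.
const SHA40 = /^[0-9a-f]{40}$/;

function assertSha40(value: string): string {
  if (!SHA40.test(value)) {
    throw new Error(`not a 40-character hex SHA: ${JSON.stringify(value)}`);
  }
  return value;
}

// Build the argument vector for `git diff <sha>..HEAD`. Passing an array to a
// process-spawning API (rather than a shell string) avoids shell parsing as well.
function diffRangeArgs(lastSha: string): string[] {
  return ["diff", `${assertSha40(lastSha)}..HEAD`];
}
```

Because the guard throws, a malformed SHA lands in the same error path as any other failure and the run falls back to a full review.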

@derekmisler derekmisler left a comment

Collaborator

solid implementation overall; the fallback coverage and test suite are thorough. two minor suggestions around signatureSimilarity's exported contract and a rename-path asymmetry in restrictDiffToFiles, both non-blocking. ci is green.

return tokens.length > 0 ? [...new Set(tokens)] : null;
}

export function signatureSimilarity(a: string[], b: string[]): number {

Collaborator

suggestion (non-blocking): signatureSimilarity can return > 1.0 if a has duplicate tokens. the loop increments intersection for each occurrence of a duplicate, but union = a.length + b.length - intersection uses the inflated raw length, so the ratio overflows. in production this is fine because findingSignature always deduplicates before calling this, but the exported function has no precondition documented or enforced. suggest deduping at entry:

Suggested change
-export function signatureSimilarity(a: string[], b: string[]): number {
+export function signatureSimilarity(aIn: string[], b: string[]): number {
+  const a = [...new Set(aIn)];
+  const setB = new Set(b);
+  let intersection = 0;
+  for (const token of a) {
+    if (setB.has(token)) intersection++;
+  }
+  const union = a.length + setB.size - intersection;
+  return union === 0 ? 0 : intersection / union;
+}
Confidence 🟡 LIKELY 70/100
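
The overflow is easy to reproduce. In the sketch below, `similarityUnguarded` is reconstructed from the description in this comment (the original function body is not shown in the thread, so treat it as an assumption), while `signatureSimilarity` follows the suggested change.

```typescript
// Unguarded variant, reconstructed from the review comment's description.
function similarityUnguarded(a: string[], b: string[]): number {
  const setB = new Set(b);
  let intersection = 0;
  for (const token of a) {
    if (setB.has(token)) intersection++; // counts every occurrence of a duplicate
  }
  const union = a.length + b.length - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Guarded variant, as in the suggested change: dedupe `a` and size `b` as a set.
function signatureSimilarity(aIn: string[], b: string[]): number {
  const a = Array.from(new Set(aIn));
  const setB = new Set(b);
  let intersection = 0;
  for (const token of a) {
    if (setB.has(token)) intersection++;
  }
  const union = a.length + setB.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

const withDuplicate = ["null", "null", "check"];
const other = ["null", "check"];

console.log(similarityUnguarded(withDuplicate, other)); // 3 / (3 + 2 - 3) = 1.5
console.log(signatureSimilarity(withDuplicate, other)); // 2 / (2 + 2 - 2) = 1
```
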

Comment thread src/incremental-review/incremental-review.ts
Co-authored-by: Derek Misler <derek.misler@docker.com>
Signed-off-by: Sayt0 <138035894+Sayt-0@users.noreply.github.com>
@Sayt-0 Sayt-0 merged commit ddb0139 into main Jul 3, 2026
8 checks passed
@Sayt-0 Sayt-0 deleted the feat/incremental-review branch July 3, 2026 20:55