Add OPC desktop hardening and audit docs#13
Important: review skipped (too many files). This PR contains 295 files, which is 145 over the limit of 150. To get a review, narrow the scope of the PR.
Run configuration: defaults. Review profile: CHILL. Plan: Pro.
Files ignored due to path filters: 5
Files selected for processing: 295
- Add a "Dernière mise à jour" (last updated) header: 2026-06-22, v6.0.0.
- New "Index rapide" section (quick index: a navigation table organized by need).
- §7 Agent loop core: pitfalls P1-P10 are now spelled out inline (they previously lived in learnings.md).
- New §8: profile of a successful session, plus a drift indicator.
- New R12: never modify CLAUDE.md speculatively; record a change only after explicit validation or two occurrences.
- §9 renamed "Design & UX" (for consistency with the new §8).
- Learning loop in §5.1 made explicit with a Produce → Observe → Capture → Correct diagram.
All cross-section references point to existing sections. The content of the preserved sections (§7.2 truth table, §8.2 drift indicator, §12 numbered rules, §14 decisions) is intact.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…t scripts
- Add tsx ^4.22.4 to devDependencies (used to run TS sources without a bundler in node --test for the loop tests).
- Split the monolithic verify:ci into per-step scripts so each gate can run in isolation: typecheck:refactor, typecheck:utils, typecheck:loop-tests, test:loop.
- Set the root package.json type to "commonjs" to match the existing layout (no behaviour change at runtime).
Lockfiles bumped accordingly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…mory + tasks
The OPC agent loop is a ReAct-style cycle (act -> observe -> reason -> repeat) with budget bounds, human-in-the-loop escalation, and a stop/abort policy. The files added in src/utils/ are the runtime heart of that loop; they are pure TypeScript with no runtime dependencies outside Node, so they run directly under tsx without a bundler.
Core modules:
- loopRunner.ts orchestrator (runLoop, dispatchReason)
- loopBudget.ts BudgetTracker + sterile-action detection
- loopStop.ts evaluateStop central decision
- loopReasoner.ts ReAct reasoner (LLM or heuristic fallback)
- loopEvents.ts typed event union + emitter
- humanInTheLoop.ts HumanGate interface
- loopEventJsonl.ts newline-delimited event stream
- contextBudget.ts context-window accounting
- contextCompactor.ts compaction strategy
- criteria.ts task completion criteria
- verification.ts task verification
- toolRegistry.ts tool dispatch table
- taskChecklist.ts persistent task list (memory + tasks context)
- memory.ts long-term project memory
The 82 node:test cases in src/utils/__tests__/ cover the stop policy, budget tracker, JSONL framing, and event emission. tsconfig.utils.json + tsconfig.loop-tests.json keep typechecks scoped to the loop without pulling the full Claude Code source tree into the typecheck.
src/context.ts now injects the long-term memory and the active task checklist into the user context, and src/utils/plans.ts initializes the task checklist on plan copy for resume support.
docs/OPC_LOOP_ENGINEERING_AUDIT_2026-06-21.md is the engineering audit that describes the design, P1-P10 traps, and guarantees of the loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
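The cycle described above can be sketched in a few lines. This is a minimal illustration in plain JavaScript, not the code in loopRunner.ts or loopStop.ts: the state fields, the limits object (maxTurns, maxBudgetUsd, maxSterile), the stop reasons, and the shape of the act/reason callbacks are all assumptions made for the example.

```javascript
// Hypothetical stop policy: one central decision, checked before every turn.
// The real evaluateStop may weigh more signals than these.
function evaluateStop(state, limits) {
  if (state.aborted) return { stop: true, reason: 'aborted' };
  if (state.done) return { stop: true, reason: 'completed' };
  if (state.turns >= limits.maxTurns) return { stop: true, reason: 'max_turns' };
  if (state.spentUsd >= limits.maxBudgetUsd) return { stop: true, reason: 'budget_exhausted' };
  if (state.sterileStreak >= limits.maxSterile) return { stop: true, reason: 'sterile_actions' };
  return { stop: false, reason: null };
}

// Minimal ReAct-style loop: act -> observe -> reason -> repeat, bounded by limits.
async function runLoop({ act, reason, limits, onEvent = () => {} }) {
  const state = { turns: 0, spentUsd: 0, sterileStreak: 0, done: false, aborted: false };
  let plan = { action: 'start' };
  for (;;) {
    const verdict = evaluateStop(state, limits);
    if (verdict.stop) {
      onEvent({ type: 'stop', reason: verdict.reason, turns: state.turns });
      return { reason: verdict.reason, turns: state.turns };
    }
    // Act and observe: the callback returns what happened and what it cost.
    const observation = await act(plan);
    state.turns += 1;
    state.spentUsd += observation.costUsd || 0;
    // A turn that changed nothing counts toward the sterile-action streak.
    state.sterileStreak = observation.changed ? 0 : state.sterileStreak + 1;
    onEvent({ type: 'turn', turn: state.turns, action: plan.action });
    // Reason: decide the next action, or declare the task done.
    plan = await reason(observation);
    state.done = plan.done === true;
  }
}
```

Passing `onEvent: (e) => process.stdout.write(JSON.stringify(e) + '\n')` would turn the events into a newline-delimited JSON stream, which is the general idea behind a JSONL event log.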
…witch resume to chdir
- plan.tsx: drop the c as _c from react/compiler-runtime memoization blocks; the compiler cache scaffolding was a vendoring artifact that adds noise without changing behavior. PlanDisplay now reads as plain JSX.
- resume.tsx: replace the cross-project "copy a command to the clipboard and ask the user to run it" path with a direct process.chdir + setOriginalCwd/setProjectRoot/setCwdState when the resumed conversation is from a different directory. The renderer already tells us the target project path, so changing cwd directly is simpler and matches how the rest of the resume flow navigates the workspace.
Behaviour change scoped to the cross-project resume branch: same-project resumes are unchanged.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ler wrapper
New IPC modules (desktop/electron/ipc/):
- sessionIpc.cjs opc:session-list, opc:session-mark-done,
opc:session-config (read/patch maxTurns and
maxBudgetUsd by session id).
- humanGateIpc.cjs opc:human-gate-ask, opc:human-gate-respond.
Fail-safe by construction: every code path resolves
to 'timeout' on error, so a renderer crash or a
stale main-process handler can never leave the
agent loop blocked waiting forever.
- loopEventIpc.cjs opc:loop-event broadcast bridge; subscribers
receive every LoopEvent from the runner.
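The "every code path resolves to 'timeout'" property of the human gate can be illustrated with a small promise wrapper. This is a sketch under assumptions, not the code in humanGateIpc.cjs: askHumanGate, its askFn callback, and the timeoutMs parameter are hypothetical names chosen for the example.

```javascript
// Hypothetical helper: resolves with the human's answer, or with 'timeout'
// on any failure. It never rejects, so a caller awaiting it cannot hang
// or crash because of the gate.
function askHumanGate(askFn, request, timeoutMs) {
  return new Promise((resolve) => {
    let settled = false;
    const finish = (answer) => {
      if (settled) return; // first outcome wins
      settled = true;
      clearTimeout(timer);
      resolve(answer);
    };
    const timer = setTimeout(() => finish('timeout'), timeoutMs);
    try {
      Promise.resolve(askFn(request)).then(
        (answer) => finish(typeof answer === 'string' ? answer : 'timeout'),
        () => finish('timeout'), // rejected: e.g. the renderer went away
      );
    } catch (_err) {
      finish('timeout'); // askFn threw synchronously
    }
  });
}
```

The design choice is that the gate has a single exit (finish) and a single failure value, so no error path needs its own handling in the agent loop.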
New persistence module:
- sessionStore.cjs SQLite-backed session store for incomplete runs
(running|interrupted) so the renderer can offer resume on next
launch. findIncomplete + markOrphanedAsInterrupted + pruneOld
drive the recovery flow on the renderer.
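The recovery flow named above (markOrphanedAsInterrupted, then findIncomplete, then pruneOld) can be sketched with an in-memory stand-in. The method names come from the commit message; everything else is an assumption for illustration, including the row shape (id, status, updatedAt), the upsert method, and the use of a Map instead of SQLite.

```javascript
// In-memory stand-in for the SQLite-backed store. The semantics shown here
// are a plausible reading of the commit message, not the actual module.
function createMemorySessionStore() {
  const rows = new Map();
  const isIncomplete = (row) => row.status === 'running' || row.status === 'interrupted';
  return {
    upsert(row) {
      rows.set(row.id, { ...row });
    },
    // At startup nothing from the previous launch can still be running,
    // so every 'running' row is treated as orphaned.
    markOrphanedAsInterrupted() {
      let changed = 0;
      for (const row of rows.values()) {
        if (row.status === 'running') {
          row.status = 'interrupted';
          changed += 1;
        }
      }
      return changed;
    },
    // Rows the renderer could offer to resume.
    findIncomplete() {
      return [...rows.values()].filter(isIncomplete).map((row) => ({ ...row }));
    },
    // Drop rows last updated before the cutoff (milliseconds since epoch).
    pruneOld(cutoffMs) {
      let removed = 0;
      for (const [id, row] of rows) {
        if (row.updatedAt < cutoffMs) {
          rows.delete(id);
          removed += 1;
        }
      }
      return removed;
    },
  };
}
```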
ipcHandlers.cjs (M1):
- Wrap every ipcMain.handle in a try/catch that logs a
channel-scoped error and re-throws. Without this, an uncaught
exception inside a user handler crashes the main process or leaves
the renderer with no visible cause.
- Track registered channels so unregisterIpcHandlers() can call
removeHandler on each one cleanly.
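A wrapper of this kind might look like the sketch below. It relies only on Electron's ipcMain.handle(channel, listener) and ipcMain.removeHandler(channel); the createIpcRegistry factory, the register/unregisterAll names, and the log message format are assumptions for the example, and ipcMain is injected so the code can be exercised without Electron.

```javascript
// Hypothetical registry: wraps each handler so failures are logged with the
// channel name and then re-thrown, and remembers which channels it registered.
function createIpcRegistry(ipcMain, log) {
  const channels = new Set();

  function register(channel, handler) {
    ipcMain.handle(channel, async (event, ...args) => {
      try {
        return await handler(event, ...args);
      } catch (err) {
        // Channel-scoped log line, then re-throw so the renderer's
        // invoke() promise still rejects.
        log(`[ipc:${channel}] handler failed: ${err && err.message}`);
        throw err;
      }
    });
    channels.add(channel);
  }

  function unregisterAll() {
    for (const channel of channels) ipcMain.removeHandler(channel);
    channels.clear();
  }

  return { register, unregisterAll };
}
```

Re-throwing matters here: swallowing the error would make a failed call look like a successful one that returned undefined.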
sessionIpc.cjs (Sprint 2 hardening):
- normalizeSessionId strips null bytes, trims, and caps length at
MAX_SESSION_ID_CHARS (256) for every id routed through session
IPC (smuggling control chars or padding the column is no longer
possible from the renderer).
- payload.taskId and existing.taskId go through the same gate to
avoid a stale taskId row leaking into an unrelated session.
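The normalization steps listed above (strip null bytes, trim, cap length) can be sketched as follows. The order of operations and the empty-string result for non-string input are assumptions; only the function name and the 256-character cap come from the commit message.

```javascript
const MAX_SESSION_ID_CHARS = 256;

// Sketch of the id gate: non-strings collapse to '' (an assumption),
// null bytes are removed, surrounding whitespace is trimmed, and the
// result is capped at MAX_SESSION_ID_CHARS.
function normalizeSessionId(value) {
  if (typeof value !== 'string') return '';
  return value.replace(/\0/g, '').trim().slice(0, MAX_SESSION_ID_CHARS);
}
```

For example, `normalizeSessionId('  ab\0c  ')` yields `'abc'`, and a 300-character id comes back with 256 characters.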
Tests (desktop/test/):
- ipcHandlers.test.cjs covers the M1 wrapper (logs and re-throws on
failure, leaves the original handler untouched on success).
- ipcLoopGuards.test.cjs covers the human gate queue (ask / respond
/ mark-done / shutdown / queue full / malformed JSON / channel
constant) and the session IPC (list / mark-done / config / respond
flow / shutdown).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…onical whitelist
Three coordinated changes close the gap where the sandboxed preload could declare its own set of allowed channels and the main process would trust it.
ipcContract.cjs:
- Export validateIpcContract() that asserts every invoke/event method has a non-empty name, a recognized channel kind, and that the channel is in the canonical whitelist. The whitelist is the single source of truth; the contract now mirrors it (no drift possible).
windowManager.cjs:
- preloadIpcContractArgument() takes the preload-supplied contract and the canonical whitelist, validates it before embed, and returns the payload the renderer will see. Any unknown channel is rejected with a clear error; the window is not created.
preload.cjs:
- parseIpcContract() reads the whitelist embedded in the HTML and rejects the contract on the first unknown channel. The renderer can no longer enumerate channels that the main process did not pre-approve.
Tests (desktop/test/):
- ipcContract.test.cjs covers the shape contract (non-empty names, valid kinds, valid channels) and round-trips a canonical contract.
- windowManager.test.cjs now also covers the contract validation (rejects invoke channels outside the whitelist, rejects event channels outside the whitelist, accepts the canonical preload contract).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
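The three checks that validateIpcContract() is described as making (non-empty name, recognized kind, whitelisted channel) can be sketched as below. The contract shape ({ methods: [{ name, kind, channel }] }), the whitelist layout, and the assignment of channels to kinds are assumptions for the example; only the channel names themselves appear elsewhere in this PR.

```javascript
// Hypothetical canonical whitelist, keyed by channel kind.
const CHANNEL_WHITELIST = Object.freeze({
  invoke: Object.freeze(['opc:session-list', 'opc:human-gate-respond']),
  event: Object.freeze(['opc:loop-event']),
});

// Throws on the first violation; returns the contract unchanged when valid.
function validateIpcContract(contract, whitelist = CHANNEL_WHITELIST) {
  if (!contract || !Array.isArray(contract.methods)) {
    throw new TypeError('IPC contract: methods must be an array');
  }
  for (const method of contract.methods) {
    if (!method || typeof method !== 'object') {
      throw new TypeError('IPC contract: each method must be an object');
    }
    if (typeof method.name !== 'string' || method.name.trim() === '') {
      throw new Error('IPC contract: method name must be non-empty');
    }
    if (!Object.prototype.hasOwnProperty.call(whitelist, method.kind)) {
      throw new Error(`IPC contract: unknown channel kind for ${method.name}`);
    }
    if (!whitelist[method.kind].includes(method.channel)) {
      throw new Error(`IPC contract: channel not whitelisted: ${method.channel}`);
    }
  }
  return contract;
}
```

Because the same function can run in both the main process and the preload, a contract that passes one side and fails the other would indicate that the two whitelists have drifted.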
…rden send()
cliRunner.constants.cjs (new):
- 19 named constants for the CLI runner: idle bounds, payload caps, runtime activity cadence, linger kill/term delays, exit codes, bundled plugin list. Centralized so every cap is testable and discoverable from a single module.
cliRunner.cjs:
- H1: stdoutBuffer is now appended through appendBoundedText(..., MAX_STDOUT_CHARS). A run-away CLI that streams forever can no longer OOM the main process; the buffer is silently truncated and the remainder is dropped.
- H2: clearLingerTimers(run) is called from both finishRun and the child close handler. Without this, a child that exits before the linger timer fires left dangling references that prevented GC of the run object.
- Re-export CLI_RUNNER_CONSTANTS so callers and tests can pin the bounds.
main.cjs (M1):
- send(channel, payload) guards against a destroyed webContents (window alive, renderer torn down after a crash). Without this, calling send on a destroyed webContents throws synchronously and pollutes the main-process logs as a misleading error. Implemented inline; the testable extraction lives in the next commit (sendChannel.cjs).
renderer (desktop/renderer/):
- confirmationBanner.js: non-blocking banner service for destructive confirmations (session delete, loop kill, budget reset). The user can still read the stream while the banner is open; Enter confirms, Esc cancels.
- sessionBanner.js: a separate, lower-priority banner service for session resume proposals surfaced after a relaunch.
- runController.js, app.js, index.html, styles.css: minor wiring to surface the banners and stream JSONL events live (not buffered until the end of a loop).
Tests (desktop/test/):
- cliRunner.test.cjs: 138 new lines covering H1 (stdoutBuffer cap) and H2 (clearLingerTimers called on both finish and close).
- cliRunnerConstants.test.cjs (new): shape + exact values for every exported constant.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
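H1 and H2 can be illustrated with two small helpers. This is a sketch, not the code in cliRunner.cjs: the appendBoundedText signature (current, next, maxChars) and the keep-the-tail behaviour follow the P11 note later in this PR, while the timer field names on the run object (lingerTermTimer, lingerKillTimer) are hypothetical.

```javascript
// H1 sketch: append, then keep only the last maxChars characters so the
// buffer cannot grow without bound. slice(-0) would return the whole
// string, so a non-positive cap is handled explicitly.
function appendBoundedText(current, next, maxChars) {
  if (maxChars <= 0) return '';
  const combined = String(current) + String(next);
  return combined.length > maxChars ? combined.slice(-maxChars) : combined;
}

// H2 sketch: idempotent cleanup, safe to call from both the finish path
// and the child 'close' handler. Field names are assumptions.
function clearLingerTimers(run) {
  for (const key of ['lingerTermTimer', 'lingerKillTimer']) {
    if (run[key]) {
      clearTimeout(run[key]);
      run[key] = null; // drop the reference so the run object can be collected
    }
  }
}
```

Keeping the tail rather than the head is a deliberate trade-off in this sketch: for a long-running stream, the most recent output is usually what a reader needs when diagnosing a failure.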
…tants + extract send()
ipcValidation.constants.cjs (new):
- 41 named constants for the IPC validation layer, grouped by kind:
- text length caps (MAX_PROMPT_CHARS, MAX_STRING_CHARS,
MAX_REFINE_PROMPT_CHARS, MAX_TOOL_NAME_CHARS, MAX_COMMAND_CHARS,
MAX_SHORT_ID_CHARS, MAX_MEDIUM_TEXT_CHARS, MAX_LONG_TEXT_CHARS,
MAX_TAG_CHARS, MAX_REASON_CHARS, MAX_DESCRIPTION_CHARS,
MAX_QUERY_CHARS, MAX_ERROR_CHARS, MAX_SUMMARY_CHARS,
MAX_INDEXED_AT_CHARS, MAX_SNIPPET_CHARS, MAX_SNIPPETS,
MAX_KEYWORD_CHARS, MAX_KEYWORDS, MAX_HEADING_CHARS, MAX_HEADINGS,
MAX_SERVICE_NAME_CHARS, MAX_MODEL_CHARS)
- array length caps (MAX_TASKS, MAX_SERVICES_PER_TASK,
MAX_PROJECT_CONTEXT_FILES, MAX_ALLOWED_TOOLS, MAX_PAUSE_MODELS,
STATE_SEARCH_LIMIT_MIN/MAX/DEFAULT)
- numeric clamps (MAX_TURNS, MAX_BUDGET_USD, PID_MIN, PID_MAX)
- whitelists (TRUSTED_BYPASS_COMMANDS, PERMISSION_MODES,
BYPASS_PERMISSION_MODES, COMMAND_INTENT_SOURCES, STATE_SEARCH_TYPES,
LOCAL_HTTP_HOSTNAMES, DEFAULT_PERMISSION_MODE)
- Whitelists are frozen so a caller cannot mutate them by accident.
- These are part of the IPC contract; changing a value shifts what
the renderer is allowed to send. Treat as a breaking change.
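One detail worth illustrating is what "frozen" can mean for a Set-based whitelist. Object.freeze on a Set does not stop add(), delete(), or clear(), because those methods operate on internal state rather than on properties. The sketch below shows one way to get a Set that resists accidental mutation; freezeSet and the example values are hypothetical, and the constants module in this PR may well use a different approach.

```javascript
// Hypothetical helper: a Set whose mutators throw. The own-property
// overrides shadow Set.prototype.add/delete/clear, and Object.freeze
// then prevents the overrides themselves from being replaced.
function freezeSet(values) {
  const set = new Set(values);
  const reject = () => {
    throw new TypeError('whitelist is read-only');
  };
  set.add = reject;
  set.delete = reject;
  set.clear = reject;
  return Object.freeze(set);
}

// Example values only; the real whitelist contents are not shown in this PR.
const EXAMPLE_MODES = freezeSet(['mode-a', 'mode-b']);
```

This guards against mutation by accident, which is the stated goal. A caller that deliberately invokes Set.prototype.add.call(EXAMPLE_MODES, 'x') can still bypass it.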
ipcValidation.cjs:
- All inline magic numbers (40, 80, 100, 120, 240, 256, 300, 400, 600,
800, 1000, 1200, 20000) are replaced with the named constants;
references to 65535 are kept where they are protocol constants
rather than business bounds. The values themselves are unchanged;
this is a pure refactor.
- Re-exports the constants module so callers and tests can align
their bounds with what the IPC layer actually enforces.
sendChannel.cjs (new) + main.cjs:
- Extract send(channel, payload) from main.cjs into a testable
factory. createSendChannel({ getMainWindow, log }) returns the
send function. Lazy accessor means getMainWindow can return null
while the window is being created and a different value once
createWindow() assigns it. Any throw from webContents.send is
swallowed (and optionally logged) so a renderer crash never
surfaces in the main-process logs as a misleading error.
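A factory with these properties might look like the sketch below. It uses only Electron methods that exist on the real objects (BrowserWindow.isDestroyed(), webContents.isDestroyed(), webContents.send()); the boolean return value and the log message format are assumptions added for the example, and the window is supplied through getMainWindow so the code runs without Electron.

```javascript
// Sketch of a guarded send. Returns true when the message was forwarded
// and false in every other case (the return value is an assumption).
function createSendChannel({ getMainWindow, log }) {
  return function send(channel, payload) {
    try {
      const win = getMainWindow(); // lazy: may be null until the window exists
      if (!win || win.isDestroyed()) return false;
      const contents = win.webContents;
      if (!contents || contents.isDestroyed()) return false;
      contents.send(channel, payload);
      return true;
    } catch (err) {
      try {
        if (typeof log === 'function') log(`send(${channel}) failed: ${err && err.message}`);
      } catch (_logErr) {
        // The logger itself failed; there is nothing safe left to do.
      }
      return false;
    }
  };
}
```

Passing an accessor instead of the window itself is what lets the same send function be created once at startup and keep working after the window is assigned or recreated.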
Tests (desktop/test/):
- ipcValidationConstants.test.cjs (new): shape (every key present),
exact values, re-export consistency with ipcValidation.cjs, and
frozen-Set immutability.
- sendChannel.test.cjs (new): 9 cases including all four guard
conditions (no window, window destroyed, no webContents, webContents
destroyed), happy path forwarding, exception swallowing, and the
"log also throws" defensive path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Following hardening sprints 1-4 (bounded stdout/stderr buffer, sendChannel
wrapper around webContents.send), capture the emerging patterns:
- P11: appendBoundedText(current, next, maxChars) keeps the tail via
slice(-maxChars). An unbounded stream (stdout of an agentic loop) means
OOM, and useful information is lost if the bound is applied badly.
- P12: createSendChannel({ getMainWindow, log }) is a no-op in the 4 unsafe
cases (mainWindow null/destroyed, webContents null/destroyed) and swallows
any exception from send(). Otherwise a destroyed window causes a cascading crash.
§14 records the obligation to use these helpers systematically.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Test Plan
Notes