
feat(lfg): add compounding workflow and bring the skill to parity #879

Open: kieranklaassen wants to merge 6 commits into main from feat/lfg-compounding-workflow

@kieranklaassen

Collaborator

Summary

/lfg was linear and amnesiac: every run started from zero and taught the next one nothing. This branch reimagines it as a compounding multi-agent workflow (.claude/workflows/lfg.js) and brings the original linear skill up to the same capabilities, so both surfaces behave consistently.

The workflow keeps lfg's end-to-end autonomy and adds the two halves that make it compound: institutional recall in, and a durable learning out. It runs in an isolated git worktree so it does not touch your checkout (falling back to the main checkout, with a notice, only if worktree creation fails), and it fans out the phases that the Workflow engine can parallelize.

What's new

The compounding workflow runs 16 named phases:

Worktree → Riffrec → Research → Ideate → Plan → Doc Review → Work → Code Review → Autofix → Re-review → Simplify → Test → Dogfood → Commit & PR → Compound → Cleanup

  • Compound IN — Riffrec folds in a product recording when present; four read-only researchers (learnings, repo, git history, best practices) run in parallel and distill one brief, so the run starts from accumulated knowledge instead of a blank slate.
  • Compound OUT — a ce-compound phase captures the non-obvious learning into docs/solutions/ (committed to the branch) so the next run's Research starts ahead.
  • Verified review — persona reviewers per dimension, deduped by file:line, then adversarially verified by skeptics before any autofix; a re-review pass catches regressions the fixes themselves introduce.
  • Isolation — everything from Plan onward happens in the worktree; if creation fails it falls back to the main checkout and says so. { dryRun: true } stops before Commit & PR / CI / Compound.
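
As a rough illustration of the "deduped by file:line" step in Verified review, the sketch below collapses findings from several persona reviewers that point at the same location. The finding shape (`file`, `line`, `message`, `reviewer`) and the function name `dedupeFindings` are assumptions made for this example; the actual schema in lfg.js may differ.

```javascript
// Hypothetical finding shape: { file, line, message, reviewer }.
// Findings that share a file:line key are merged into one entry that keeps
// the first message seen and records every reviewer that raised it.
function dedupeFindings(findings) {
  const byLocation = new Map();
  for (const finding of findings) {
    const key = `${finding.file}:${finding.line}`;
    const existing = byLocation.get(key);
    if (existing) {
      if (!existing.reviewers.includes(finding.reviewer)) {
        existing.reviewers.push(finding.reviewer);
      }
    } else {
      byLocation.set(key, {
        file: finding.file,
        line: finding.line,
        message: finding.message,
        reviewers: [finding.reviewer],
      });
    }
  }
  // Map preserves insertion order, so output order follows first occurrence.
  return [...byLocation.values()];
}
```

Keeping the list of reviewers on the merged entry is one possible design choice: it lets a later skeptic pass see how many personas independently flagged the same line.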

Skill parity — the linear /lfg skill gained the capabilities it was missing:

  • A Dogfood step that exercises the changed journeys as a real user before shipping.
  • Feedback as input — pass a Riffrec bundle, video/audio, or screenshots instead of a typed description; lfg analyzes it before planning.
  • Reproduce-before-fix for bug fixes, with synthetic, anonymized state.
  • Non-interactive demo capture via ce-demo-reel, plus a fixed PR template for feedback-sourced runs.

Design decisions

  • Worktree isolation by default so a background run never mutates the user's working branch; a non-dry run still opens a real PR, but on the worktree branch.
  • disable-model-invocation skills are inlined, not invoked. ce-dogfood-beta (and ce-test-xcode) can't be called from a model-driven pipeline, so their behavior is inlined into Dogfood/Test on both surfaces.
  • Structured agent output — agents that return data are forced through a JSON schema, so the script gets validated objects rather than prose to parse.
  • Args normalization — args arriving as a JSON-encoded string are parsed so dryRun is honored (a stringified object once opened a real PR during a dry run).
  • Screenshots route through direct viewing, not the riffrec analyzer (which only accepts zip/video/audio/md). R2 frame upload stays environment-specific to the erf harness rather than leaking into general lfg.
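
The args normalization described above could look roughly like the following. This is a minimal sketch, not the code in lfg.js: the function name `normalizeArgs` and the choice to wrap a plain string as `{ task }` are assumptions for illustration.

```javascript
// Accepts args as an object, a JSON-encoded string, or a plain task string.
// A string that parses to a plain object is replaced by the parsed object,
// so flags such as dryRun are honored instead of being buried in the string.
function normalizeArgs(args) {
  if (typeof args === 'string') {
    try {
      const parsed = JSON.parse(args);
      if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
        return parsed;
      }
    } catch {
      // Not JSON: fall through and treat the string as the task text.
    }
    // ASSUMPTION: a plain string is the task description.
    return { task: args };
  }
  return args ?? {};
}
```

With this shape, `normalizeArgs('{"task":"x","dryRun":true}').dryRun` evaluates to `true`, which is the case that previously slipped through and opened a real PR.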

Testing

bun test (frontmatter + shell-safety), bun run release:validate, and a wrapped-parse check of the workflow JS all pass. The LLM-behavior changes to the skill/workflow are not exercised by bun test — validate those via the skill-creator eval workflow or a fresh session, per the repo's plugin-caching rule.


Compound Engineering
Claude Code

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 46b93a0b09

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .claude/workflows/lfg.js
Comment on lines +513 to +515
if (!planPath) {
  log('Plan phase produced no plan file — aborting before work (mirrors the /lfg plan gate).')
  return { error: 'no-plan', research: researchBrief, direction, worktree: worktreePath }


P2: Clean up worktrees on abort paths

If the workflow has already created an isolated worktree and this gate aborts, the direct return bypasses the Cleanup phase, so the sibling worktree (and its branch) is left behind. The same pattern exists for the Doc Review fatal and no-changes gates; users who hit these normal failure paths have to manually discover and remove the stale worktree/branch. Route aborts through shared cleanup/finally logic before returning.
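
One way to route every exit through shared cleanup is a `try`/`finally` wrapper, sketched below. The names `runWithCleanup`, `createWorktree`, `runPhases`, and `cleanup` are hypothetical and stand in for whatever lfg.js actually uses; the point is only that an early `return` inside `try` still triggers the `finally` block.

```javascript
// Hypothetical wrapper: all three callbacks are injected for illustration.
// Any early return or thrown error inside runPhases still reaches finally,
// so abort gates (no plan, fatal doc review, no changes) cannot skip cleanup.
async function runWithCleanup(createWorktree, runPhases, cleanup) {
  const worktree = await createWorktree();
  try {
    return await runPhases(worktree);
  } finally {
    // Only clean up if a worktree was actually created.
    if (worktree) {
      await cleanup(worktree);
    }
  }
}
```

A gate such as the no-plan check would then simply `return { error: 'no-plan', ... }` from inside `runPhases`, and the worktree would be removed on the way out.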


kieranklaassen and others added 6 commits June 12, 2026 13:18
A 13-phase re-imagining of the linear /lfg skill as a deterministic
multi-agent Workflow: Riffrec intake, parallel institutional recall,
judge-panel shaping, plan + adversarial stress-test, build, dedup +
adversarially-verified review, fix + re-review, simplify, verify,
always-on dogfood, ship, CI-monitor-until-green, and gated compound-out.

Includes a dryRun mode (args { dryRun: true } or a [dry-run] marker) that
stops before Ship/CI/Compound for safe testing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Each phase now reads as the CE step/skill it runs (Riffrec, Research,
Ideate, Plan, Doc Review, Work, Code Review, Autofix, Re-review, Simplify,
Test, Dogfood, Commit & PR, Compound). Splits the plan stress-test out as
its own Doc Review phase. Orchestration unchanged — relabel only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

A dry-run test still opened a real PR because args arrived as a
JSON-encoded string ('{"task":...,"dryRun":true}'), landed in the
string branch, and dryRun was never read. Normalize args: if it's a
string that parses to an object, use the parsed object. Also document
that a non-dry run does real git ops on the live checkout.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Adds a Worktree phase that creates lfg/<slug> in a sibling worktree, and a
Cleanup phase that removes it (keeping the branch/PR on real runs, dropping
the throwaway branch on dry runs). Every mutating phase from Plan onward is
told to operate inside the worktree; Riffrec/Research/Ideate stay in the main
checkout (read-only, and Riffrec must search the user's real ~/Downloads).
Compound now commits+pushes its learning doc to the branch so worktree
cleanup doesn't discard it.

Fixes the test-surfaced problem where the Commit & PR phase switched the
user's live branch mid-run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Bring the linear /lfg skill and the lfg workflow to parity and extend
both:

- Add a Dogfood step to the skill, inlining ce-dogfood-beta's diff-scoped
  behavior (it is disable-model-invocation and can't be invoked from the
  pipeline), mirroring the workflow's Dogfood phase.
- Accept a Riffrec bundle, video, audio, or screenshots as input and
  analyze it before planning, on both surfaces.
- Reproduce bugs locally with synthetic, anonymized state before writing
  the fix.
- Capture demos non-interactively via ce-demo-reel for observable
  changes, and emit a fixed PR template for feedback-sourced runs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

… anchors

The lfg workflow reimplements review inline with its own severity
(blocker/high/medium/low) and confidence (low/medium/high) scales,
forking from the canonical ce-code-review model that the ce-* persona
reviewers it dispatches natively emit. Map both to P0-P3 and the
0/25/50/75/100 confidence anchors so the workflow speaks the plugin's
review vocabulary and the personas aren't forced through a lossy remap.

Vocabulary only: dedup, severity-gated autofix, and the report-then-apply
architecture are unchanged.

@tmchow force-pushed the feat/lfg-compounding-workflow branch from 46b93a0 to 999f526 on June 13, 2026 05:27
@tmchow

tmchow commented Jun 13, 2026

Collaborator

Heads up @kieranklaassen — I rebased this branch onto current main and pushed a small follow-up commit (force-with-lease), since it was ~2 weeks behind its base (#874) and main has moved a lot. History was rewritten, so re-pull before any further local work.

Why the rebase was needed (not just a courtesy): #881 (lean apply model) removed mode:autofix from ce-code-review and rewrote the linear lfg skill's review/apply steps to use mode:agent + references/review-followup.md. This branch was authored before that, so its lfg/SKILL.md still drove review with the now-deleted mode:autofix — and the branch didn't carry review-followup.md (that's #881's file). A plain merge would have silently overwritten steps 3–4; the rebase makes the branch deliberately adopt the current model. Verified: no mode:autofix vocabulary remains, review-followup.md is present, step renumbering is consistent.

One follow-up commit — refactor(lfg): align workflow review vocabulary to P0-P3 + confidence anchors: the workflow reimplements review inline (it doesn't call ce-code-review), but it had forked to its own blocker/high/medium/low severity and low/medium/high confidence scales, while the ce-* personas it dispatches natively emit P0–P3. I mapped both to P0–P3 + the 0/25/50/75/100 anchors so the workflow speaks the plugin's review vocabulary and the personas aren't forced through a lossy remap. Vocabulary only — dedup, severity-gated autofix, and the report-then-apply architecture are unchanged.
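
For readers unfamiliar with the two vocabularies, the remap might be sketched as below. The specific pairings (blocker to P0 through low to P3, and snapping a numeric confidence to the nearest anchor) are assumptions made for illustration; the comment above does not spell out the exact mapping used in the commit.

```javascript
// ASSUMPTION: this one-to-one pairing is illustrative, not taken from the PR.
const SEVERITY_TO_PRIORITY = { blocker: 'P0', high: 'P1', medium: 'P2', low: 'P3' };

// The confidence anchors named in the PR.
const CONFIDENCE_ANCHORS = [0, 25, 50, 75, 100];

// Translate a workflow-local severity label into the P0-P3 vocabulary.
function toPriority(severity) {
  const priority = SEVERITY_TO_PRIORITY[severity];
  if (!priority) {
    throw new Error(`Unknown severity: ${severity}`);
  }
  return priority;
}

// Snap a numeric confidence to the nearest anchor.
// Ties resolve to the lower anchor because the comparison is strict.
function snapConfidence(value) {
  return CONFIDENCE_ANCHORS.reduce((best, anchor) =>
    Math.abs(anchor - value) < Math.abs(best - value) ? anchor : best
  );
}
```

Throwing on an unknown label, rather than defaulting to P3, is a deliberate choice in this sketch so that a stray vocabulary term surfaces immediately instead of being silently downgraded.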

Deliberately not changed: since the workflow runs its own review, #900 (lean SKILL.md / on-demand references) doesn't apply to a JS workflow, and #845 (triage grouping) is a human-presentation lens that adds little to a machine auto-applier. #881's report-then-apply principle was already embodied via the worktree Autofix phase.

Verification on the rebased + reconciled branch: rebase applied with zero conflicts; full bun test passes including #930's new content-convention checks; wrapped-parse of lfg.js OK; release:validate in sync. The pre-existing CLI test failures are environmental (they fail identically on clean main).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 999f526bdd


Comment thread .claude/workflows/lfg.js
compound = await agent(
wt +
`Invoke the ce-compound skill with mode:headless to capture the durable learning from this task into docs/solutions/. Focus on what was NON-OBVIOUS: the approach chosen and why, any gotcha hit during work, review, simplify, test, dogfood, or CI, and how it connects to the learnings surfaced in research. ${NON_INTERACTIVE}\n\n` +
(worktreeBranch ? `IMPORTANT: after ce-compound writes the doc, commit it (\`docs(compound): capture learning\`) and push to ${worktreeBranch} so it lands on the PR — otherwise Cleanup will discard it.\n\n` : '') +


P2: Re-watch CI after pushing compound docs

When worthCompounding is true on a real run, this instruction commits and pushes a new docs/solutions/ change after the Commit & PR phase has already finished its CI monitor, so the PR head changes after the workflow decided whether CI was green. I checked the workflow ordering: the CI loop runs in Commit & PR before Compound, which means the returned ciGreen value can refer to the previous head while the latest pushed commit is still pending or failing; move Compound before the CI loop or rerun/watch CI after this push.
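
The "rerun/watch CI after this push" option can be sketched as a staleness check on the commit the CI result refers to. Everything here is hypothetical: the `{ green, headSha }` result shape and the injected `getHeadSha` and `watchCi` callbacks are assumptions, not functions from lfg.js.

```javascript
// Hypothetical CI result shape: { green: boolean, headSha: string }.
// A result is only trustworthy if it was computed for the current PR head.
function isCiResultCurrent(ciResult, currentHeadSha) {
  return Boolean(ciResult) && ciResult.headSha === currentHeadSha;
}

// Returns the existing result when it still matches the head; otherwise
// watches CI again for the new head (for example after the compound push).
async function finalCiStatus(ciResult, getHeadSha, watchCi) {
  const headSha = await getHeadSha();
  if (isCiResultCurrent(ciResult, headSha)) {
    return ciResult;
  }
  return watchCi(headSha);
}
```

Recording the SHA alongside the green/red flag keeps the check cheap: the workflow only pays for a second CI watch when a later phase actually moved the head.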


Comment thread .claude/workflows/lfg.js
`Commit, push, and open a pull request for this work. This repo requires a PR — NEVER push to main directly. Invoke the ce-commit-push-pr skill. ${NON_INTERACTIVE}\n\n` +
(worktreeBranch ? `You are already on branch ${worktreeBranch} in the worktree — commit ALL work to THIS branch, push it, and open the PR from it. Do NOT create another branch or switch branches.\n\n` : '') +
(observable
? `This change has an observable surface (${surfaceList}) — also invoke the ce-demo-reel skill to capture visual/CLI proof and include its markdown in the PR body.\n\n`


P2: Avoid invoking interactive demo capture here

For observable web/CLI/iOS changes, this background workflow delegates to ce-demo-reel, but I checked plugins/compound-engineering/skills/ce-demo-reel/SKILL.md and its tier-selection/upload flow requires asking the user and even says batch/background mode should wait for a reply. In those runs the supposedly non-interactive Commit & PR phase can block before CI; add a headless demo mode/default tier or inline a non-interactive capture path instead of invoking the interactive skill as-is.

