feat(runtime): publish the optimization suite + corpus flywheel wiring by drewstone · Pull Request #209 · tangle-network/agent-runtime

drewstone · 2026-06-09T23:21:00Z

What

Closes packaging gap G1 from the integration playbook and wires the corpus flywheel (the write side of the across-run layer). These are the two parallel tracks.

Published suite (graduates from bench/ R&D into src/, exported via /loops):

- src/runtime/strategy.ts: the domain-blind core. It holds the Environment seam (AgenticSurface), the canonical depth/breadth drivers (Supervisor + observe(), the +16.4pp configuration), open Strategy + sample/refine, defineStrategy (author a loop in ~20 lines from shot()+critique()), adaptiveRefine, and runAgentic.
- src/runtime/run-benchmark.ts: runBenchmark/printBenchmarkReport. Paired lift is now computed via agent-eval's pairedBootstrap (substrate stats, not a bench-local copy).
- bench consumers are rewired to package imports; bench/src/agentic.ts + run-benchmark.mts are deleted (−1000 LOC of R&D duplication). Products can now import { Environment, defineStrategy, runBenchmark } from '@tangle-network/agent-runtime/loops', which unblocks the playbook's step 3/4.
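To make the shot()+critique() idea concrete, here is a minimal, self-contained sketch of a refine loop built from those two steps. It does not import from the package: `defineLoopSketch`, `LoopSpec`, `Critique`, and the keyword task are all hypothetical names invented for illustration, and the real `defineStrategy` signature may differ.

```typescript
// Hypothetical shapes for illustration only; the published types may differ.
interface Critique {
  score: number; // 0..1, higher is better
  feedback: string; // hint passed to the next shot
}

interface LoopSpec<Task, Attempt> {
  maxRounds: number;
  targetScore: number;
  shot: (task: Task, feedback: string | null) => Attempt;
  critique: (task: Task, attempt: Attempt) => Critique;
}

interface LoopResult<Attempt> {
  best: Attempt;
  bestScore: number;
  rounds: number;
}

// Builds a refine loop: shoot, critique, feed the critique into the next
// shot, and keep the best-scoring attempt seen so far.
function defineLoopSketch<Task, Attempt>(spec: LoopSpec<Task, Attempt>) {
  return (task: Task): LoopResult<Attempt> => {
    let feedback: string | null = null;
    let best: Attempt | undefined;
    let bestScore = -Infinity;
    let rounds = 0;
    while (rounds < spec.maxRounds) {
      rounds += 1;
      const attempt = spec.shot(task, feedback);
      const verdict = spec.critique(task, attempt);
      if (verdict.score > bestScore) {
        best = attempt;
        bestScore = verdict.score;
      }
      if (verdict.score >= spec.targetScore) break; // good enough, stop early
      feedback = verdict.feedback;
    }
    if (best === undefined) throw new Error("maxRounds must be at least 1");
    return { best, bestScore, rounds };
  };
}

// Toy task (made up): the attempt must mention every required keyword.
interface KeywordTask {
  draft: string;
  required: string[];
}

const coverKeywords = defineLoopSketch<KeywordTask, string>({
  maxRounds: 3,
  targetScore: 1,
  // The first shot returns the draft; later shots append what the critique asked for.
  shot: (task, feedback) =>
    feedback === null ? task.draft : `${task.draft} ${feedback}`,
  // Score is the fraction of required keywords present; feedback lists the missing ones.
  critique: (task, attempt) => {
    const words = new Set(attempt.split(" "));
    const missing = task.required.filter((w) => !words.has(w));
    return {
      score: (task.required.length - missing.length) / task.required.length,
      feedback: missing.join(" "),
    };
  },
});

const demo = coverKeywords({ draft: "alpha", required: ["alpha", "beta", "gamma"] });
console.log(demo.best, demo.bestScore, demo.rounds);
```

The sketch keeps the loop driver separate from the two domain-specific callbacks, which is one plausible reading of "domain-blind core"; in a real strategy the callbacks would presumably be asynchronous model calls.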

Corpus flywheel wiring (Track A's substrate):

- AgenticOptions.corpus/corpusTags: the analyst's observe() pass appends trace-derived facts (zero extra LLM calls). Priming is done by the caller, who folds corpus.query() facts into the task prompt.
- bench/src/eops-corpus-ab.mts: THE across-run experiment (primed vs. cold, same stream, equal compute; slope + paired lift + frozen holdout; the four falsifiers from layer-across-run.md designed in). A smoke run verified the loop: run 1 wrote 3 facts, run 2 injected them. The full n=16+holdout run is in flight; its result will be reported separately.
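The write side (append after a run) and the read side (query, then fold facts into the prompt) described above can be sketched as follows. This is an in-memory stand-in under stated assumptions: `InMemoryCorpus`, `Fact`, `primePrompt`, the tag-overlap ranking, and the example facts are all hypothetical, and the real corpus interface behind `corpus.query()` is not shown in this PR.

```typescript
// Hypothetical fact shape; the real corpus may store more metadata.
interface Fact {
  text: string;
  tags: string[];
}

// In-memory stand-in for a corpus with an append (write) and query (read) side.
class InMemoryCorpus {
  private facts: Fact[] = [];

  // Write side: append a fact, skipping exact duplicates.
  append(fact: Fact): void {
    if (!this.facts.some((f) => f.text === fact.text)) this.facts.push(fact);
  }

  // Read side: return the top-k facts ranked by tag overlap (assumed ranking rule).
  query(tags: string[], k: number): Fact[] {
    return this.facts
      .map((fact) => ({ fact, overlap: fact.tags.filter((t) => tags.includes(t)).length }))
      .filter((entry) => entry.overlap > 0)
      .sort((a, b) => b.overlap - a.overlap)
      .slice(0, k)
      .map((entry) => entry.fact);
  }
}

// Priming: the caller folds queried facts into the task prompt.
function primePrompt(basePrompt: string, facts: Fact[]): string {
  if (facts.length === 0) return basePrompt;
  const lines = facts.map((f) => `- ${f.text}`).join("\n");
  return `${basePrompt}\n\nFacts from earlier runs:\n${lines}`;
}

const corpus = new InMemoryCorpus();
const basePrompt = "Resolve the failing deploy.";
const taskTags = ["deploy", "config"];

// Run 1: the corpus is empty, so the prompt is unchanged.
const run1Prompt = primePrompt(basePrompt, corpus.query(taskTags, 2));

// After run 1, the observe step appends facts (made-up examples).
corpus.append({ text: "Staging config lives in a separate repo", tags: ["deploy", "config"] });
corpus.append({ text: "Deploys are frozen on Fridays", tags: ["deploy"] });
corpus.append({ text: "Invoices are generated nightly", tags: ["billing"] });

// Run 2: the two most relevant facts are injected into the prompt.
const run2Prompt = primePrompt(basePrompt, corpus.query(taskTags, 2));
console.log(run2Prompt);
```

Keeping priming in the caller, as the PR describes, means the loop itself never needs to know how facts are ranked or formatted.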

Test

typecheck clean (both packages) · 680 tests pass · strategy-demo runs all four strategies end-to-end through the published exports · corpus A/B smoke verified write+read.

AgenticOptions gains corpus/corpusTags. The analyst's observe() pass now appends trace-derived facts (the flywheel write side) with zero extra LLM calls; priming (the read side) is the caller folding corpus.query() facts into the task systemPrompt.

eops-corpus-ab.mts is THE across-run experiment (layer-across-run.md). It runs two arms over the same task stream, in the same order, at canonical depth: cold (fresh every run) vs. primed (query top-k facts before, observe-append after). Compute is equal by construction. It reports paired lift, the SLOPE (first half vs. second half; the flywheel signature is a growing advantage), fact-injection counts, and a frozen holdout (fresh tasks, corpus read-only).

Smoke verified: run 1 wrote 3 facts, run 2 injected them.
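As an illustration of the paired lift and the first-half vs. second-half slope described above, the following sketch computes both from per-task scores of the two arms. The names `PairedRun`, `pairedLift`, and `halfSlope` and the scores are hypothetical; the actual report in eops-corpus-ab.mts may define the slope differently.

```typescript
// One task in the stream, scored under both arms (same task, same order).
interface PairedRun {
  cold: number;
  primed: number;
}

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;

// Paired lift: mean of per-task differences (primed minus cold).
function pairedLift(runs: PairedRun[]): number {
  if (runs.length === 0) throw new Error("need at least one paired run");
  return mean(runs.map((r) => r.primed - r.cold));
}

// Slope signature: lift over the second half of the stream minus lift over
// the first half. A positive value means the primed advantage is growing.
function halfSlope(runs: PairedRun[]): { firstHalf: number; secondHalf: number; slope: number } {
  if (runs.length < 2) throw new Error("need at least two paired runs");
  const mid = Math.floor(runs.length / 2);
  const firstHalf = pairedLift(runs.slice(0, mid));
  const secondHalf = pairedLift(runs.slice(mid));
  return { firstHalf, secondHalf, slope: secondHalf - firstHalf };
}

// Made-up pass/fail scores for four tasks in stream order.
const stream: PairedRun[] = [
  { cold: 0, primed: 0 },
  { cold: 1, primed: 1 },
  { cold: 0, primed: 1 },
  { cold: 1, primed: 1 },
];

console.log(pairedLift(stream)); // 0.25
console.log(halfSlope(stream)); // { firstHalf: 0, secondHalf: 0.5, slope: 0.5 }
```

A flat or negative slope with a positive overall lift would suggest the gain comes from something other than accumulated facts, which is presumably why the slope is reported alongside the lift.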

…defineStrategy/runBenchmark

Closes packaging gap G1 (docs/research/product-integration-playbook.md): the suite graduates from bench/ R&D into the published package, importable by any product.

- src/runtime/strategy.ts: the domain-blind core moved from bench/src/agentic.ts. It holds the AgenticSurface seam (Environment), the canonical depth/breadth drivers (Supervisor + observe(), the +16.4pp configuration), open Strategy + sample/refine, defineStrategy with the shot()/critique() steps, adaptiveRefine, runAgentic, and the corpus flywheel threading (AgenticOptions.corpus → the analyst's observe() appends trace-derived facts).
- src/runtime/run-benchmark.ts: runBenchmark/printBenchmarkReport over an Environment; the paired lift now uses agent-eval's pairedBootstrap (the substrate's stats) instead of a bench-local copy.
- Exported via /loops. bench consumers (agentic-eops, agentic-run, eops-gepa, eops-corpus-ab, examples/strategy-demo) now import from the package; bench/src/agentic.ts + run-benchmark.mts deleted (-1000 LOC of R&D duplication).

Verified: typecheck clean both packages; 680 tests pass; strategy-demo runs all four strategies end-to-end through the published exports.
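For readers unfamiliar with the technique, a paired bootstrap for lift can be sketched as below. This is a generic percentile bootstrap over per-task differences, not agent-eval's `pairedBootstrap`: the function name `pairedBootstrapSketch`, its parameters, the seeded generator, and the scores are assumptions made for illustration.

```typescript
interface BootstrapResult {
  lift: number; // mean per-task difference (treatment minus baseline)
  ciLow: number; // 2.5th percentile of resampled means
  ciHigh: number; // 97.5th percentile of resampled means
}

// Small seeded PRNG (mulberry32-style) so the sketch is reproducible.
function seededRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Paired bootstrap: resample the per-task differences with replacement,
// so each task's two scores always stay paired.
function pairedBootstrapSketch(
  baseline: number[],
  treatment: number[],
  resamples = 2000,
  seed = 1,
): BootstrapResult {
  if (baseline.length === 0 || baseline.length !== treatment.length) {
    throw new Error("baseline and treatment must be non-empty and equal length");
  }
  if (resamples < 1) throw new Error("resamples must be at least 1");
  const diffs = baseline.map((b, i) => treatment[i] - b);
  const n = diffs.length;
  const rng = seededRng(seed);
  const means: number[] = [];
  for (let r = 0; r < resamples; r++) {
    let sum = 0;
    for (let j = 0; j < n; j++) sum += diffs[Math.floor(rng() * n)];
    means.push(sum / n);
  }
  means.sort((x, y) => x - y);
  // Percentile lookup, clamped to the last index.
  const at = (q: number): number =>
    means[Math.min(means.length - 1, Math.floor(q * means.length))];
  const lift = diffs.reduce((x, y) => x + y, 0) / n;
  return { lift, ciLow: at(0.025), ciHigh: at(0.975) };
}

// Made-up per-task scores for two arms over the same eight tasks.
const baselineScores = [0, 1, 0, 1, 0, 0, 1, 0];
const treatmentScores = [1, 1, 0, 1, 1, 0, 1, 1];
console.log(pairedBootstrapSketch(baselineScores, treatmentScores));
```

Resampling differences rather than the two arms independently is what makes the estimate paired: task difficulty cancels within each pair, which typically tightens the interval when both arms run the same tasks.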

tangletools

✅ Auto-approved PR — `d879ac93`

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-09T23:21:07Z

drewstone added 2 commits June 9, 2026 17:16

tangletools approved these changes Jun 9, 2026


drewstone merged commit 8bb31a0 into main Jun 9, 2026
1 check passed

drewstone deleted the feat/publish-strategy-suite branch June 9, 2026 23:22
