Replace streams Sequence/Flow/Reaktive backends with a round-based Step engine#2740

Draft
slisson wants to merge 7 commits into main from feature/streams-round-based-engine

@slisson slisson commented Jun 13, 2026

Member

What

Replaces the streams module's three interchangeable backends (Sequence, Flow, Reaktive) and deferred builder with a single round-based Step interpreter, and removes the Reaktive third-party dependency — while keeping the entire public API unchanged.

A new streams2 module is added as a standalone, dependency-free reference implementation of the engine (with its own tests and README). The same design is then ported into streams to back its richer legacy API.

Why

streams carried three backends because of one capability plain coroutines/Flow can't give cheaply: automatic bulk-request batching for lazy loading of large content-addressed models. Reaktive was there only for the push semantics that let the runtime collect a set of independent requests before forcing any of them.

That requirement is an applicative property (independent branches → same batch round; flatMap dependency → next round). It can be satisfied structurally by one interpreter, eliminating the three backends and the dependency. Prior art: Haxl, ZIO Query, Stitch.
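The round structure described above can be illustrated with a minimal sketch. This is not the PR's StepEngine.kt: the names `Step`, `Done`, `Blocked`, `fetch`, `zip`, `flatMap` and `runRounds` are hypothetical, keys and values are plain strings, and failure handling is left out.

```kotlin
// Hypothetical miniature of a round-based interpreter (illustration only).
sealed interface Step<out T>

data class Done<T>(val value: T) : Step<T>

// Blocked: the keys this step still needs, plus how to continue once they are loaded.
class Blocked<T>(
    val keys: Set<String>,
    val resume: (Map<String, String>) -> Step<T>,
) : Step<T>

// A fetch leaf: blocked on exactly one key.
fun fetch(key: String): Step<String> =
    Blocked(setOf(key)) { loaded -> Done(loaded.getValue(key)) }

private fun keysOf(s: Step<*>): Set<String> = if (s is Blocked) s.keys else emptySet()

private fun <T> advance(s: Step<T>, loaded: Map<String, String>): Step<T> =
    if (s is Blocked) s.resume(loaded) else s

// Applicative: both sides are independent, so their keys join the same round.
fun <A, B, R> zip(a: Step<A>, b: Step<B>, f: (A, B) -> R): Step<R> = when {
    a is Done && b is Done -> Done(f(a.value, b.value))
    else -> Blocked(keysOf(a) + keysOf(b)) { loaded ->
        zip(advance(a, loaded), advance(b, loaded), f)
    }
}

// Monadic: the continuation needs a's value, so its requests land in a later round.
fun <A, B> flatMap(a: Step<A>, f: (A) -> Step<B>): Step<B> = when (a) {
    is Done -> f(a.value)
    is Blocked -> Blocked(a.keys) { loaded -> flatMap(a.resume(loaded), f) }
}

// Driver: one bulk call per round until the step is Done.
// Returns the final value together with the batches that were sent.
fun <T> runRounds(
    step: Step<T>,
    bulkLoad: (Set<String>) -> Map<String, String>,
): Pair<T, List<Set<String>>> {
    val batches = mutableListOf<Set<String>>()
    var current = step
    while (true) {
        when (val s = current) {
            is Done -> return s.value to batches
            is Blocked -> {
                batches += s.keys
                current = s.resume(bulkLoad(s.keys))
            }
        }
    }
}
```

In this sketch, zipping `flatMap(fetch("a")) { fetch(it) }` with `fetch("c")` sends `a` and `c` in the first round and the dependent key in the second, which is the batching shape the paragraph describes.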

How

  • engine/StepEngine.kt: Step IR (Done/Blocked/Failed), applicative/monadic combinators, blocking + suspending round drivers, fetch leaves, and async leaves (for fromFlow/singleFromCoroutine).
  • StreamImpl.kt: one backing impl for every IStream cardinality, plus CompletableImpl and the single StreamBuilderImpl.
  • SimpleStreamExecutor / BlockingStreamExecutor / BulkRequestStreamExecutor are now thin wrappers over the driver. enqueue(key) is a fetch leaf; batching is structural (per source, per round) rather than tied to an executor-context queue.
  • Deleted the four backend builders, the concrete stream classes, the Reaktive helpers, and the Reaktive dependency.

API compatibility / migration

The public surface (IStream.*, IStreamBuilder, IStreamExecutor, IExecutableStream, IStreamInternal, IBulkExecutor, BulkRequestStreamExecutor, all operators/extensions) is unchanged. No downstream module needed API-driven changes. The only other edits remove dead code — unused Reaktive imports in three modelql-untyped files and two unused private helpers in NodeAsAsyncNode.kt — which compiled solely because streams used to re-export Reaktive via api.

Verification

  • streams compiles JVM + JS; unit tests pass.
  • datastructures and model-datastructure compile untouched; full JVM suites pass.
  • modelql-core / modelql-untyped tests pass.
  • model-client, model-server, model-server-api, modelql-typed/-html/-client/-server, bulk-model-sync-lib compile (JVM + JS where applicable).
  • model-server LazyLoadingTest passes — the access-pattern / batching-correctness test, confirming structural batching reproduces the old Reaktive collect-and-batch behavior.

Reviewer notes — behavioral tradeoffs (intentional, "no incremental emission")

  1. iterate / iterateSuspending now fully materialize before visiting (Reaktive previously drained between batches). This raises peak memory for very large server-side iterations; a clean follow-up would be per-round streaming in just the iterate* drivers.
  2. cached() is currently a no-op — fetch-level dedup (the expensive part) is handled by the per-run cache; only pure-recompute memoization is lost.
  3. take / skip operate on materialized results (don't prune upstream fetches).
  4. SimpleStreamExecutor now also batches per source/round → never more round-trips than before, and usually fewer.

See streams-redesign.md (repo root) and streams2/README.md for the full design writeup and known limitations.

🤖 Generated with Claude Code

@slisson slisson marked this pull request as draft June 13, 2026 18:25
…ased Step engine

The streams module previously carried three interchangeable backends (Sequence,
Flow, Reaktive) plus a deferred builder. Reaktive existed solely to provide the
push semantics needed for bulk-request batching: collecting a set of independent
data requests before forcing any of them.

That batching requirement is an *applicative* property and can be satisfied
structurally by a single round-based interpreter, removing the need for three
backends and the third-party Reaktive dependency.

This change:
- Adds streams2: a standalone, dependency-free reference implementation of the
  Step-based engine (prototype + tests + README).
- Ports the engine into the streams module behind its existing public API:
  - engine/StepEngine.kt: Step IR, applicative/monadic combinators, blocking +
    suspending round drivers, fetch leaves, async leaves (fromFlow/
    singleFromCoroutine).
  - StreamImpl.kt: one backing impl for every IStream cardinality, plus
    CompletableImpl and the single StreamBuilderImpl.
  - SimpleStreamExecutor / BlockingStreamExecutor / BulkRequestStreamExecutor are
    now thin wrappers over the driver; enqueue() is a fetch leaf and batching is
    structural (per source, per round).
- Deletes the four backend builders, the concrete stream classes, the Reaktive
  helpers, and the Reaktive dependency.

The public API (IStream.*, IStreamBuilder, IStreamExecutor, IExecutableStream,
IStreamInternal, IBulkExecutor, BulkRequestStreamExecutor, all operators) is
unchanged. No downstream module required API-driven changes; the only other edits
remove dead code (unused Reaktive imports/helpers in model-api and
modelql-untyped) that compiled only via the former transitive Reaktive export.

Verified: streams compiles JVM+JS and tests pass; datastructures and
model-datastructure compile untouched and their suites pass; modelql-core/-untyped
tests pass; model-client/model-server/modelql-* compile; model-server
LazyLoadingTest (batching/access-pattern correctness) passes.

See streams-redesign.md and streams2/README.md for details and tradeoffs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@slisson slisson force-pushed the feature/streams-round-based-engine branch from 4c9c626 to e7b72ae Compare June 13, 2026 18:30
The streams2 module duplicated both the IStream interface hierarchy and the Step
engine. Since the engine was already ported into streams in a more complete form
(Pending with async leaves, recover/doOnError, the suspending driver, batch
chunking) and nothing depended on streams2, the separate module was pure
redundancy.

- Remove the streams2 module and its settings.gradle.kts registration.
- Keep its batching/dedup/stack-safety tests, rewritten against the real
  BulkRequestStreamExecutor / enqueue API (BulkRequestBatchingTest).
- Add streams/README.md with the design overview (previously in streams2/README).
- Update streams-redesign.md to describe the prototype-then-merge history.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
github-actions Bot commented Jun 13, 2026

Contributor

Test Results

265 files, 265 suites, 43m 29s ⏱️
1 528 tests: 1 517 ✅, 11 💤, 0 ❌
1 538 runs: 1 527 ✅, 11 💤, 0 ❌

Results for commit fbe71a6.

♻️ This comment has been updated with latest results.

slisson and others added 2 commits June 13, 2026 21:11
…ource

The IStreamExecutor passed to getBlocking/getSuspending/iterate*/execute* no
longer carried any batching context: fetch leaves carry their own IBulkExecutor
source and the per-run Execution provides caching/dedup. Two changes remove the
residual coupling.

B. Batch size on the source: IBulkExecutor now declares `val batchSize`
   (default DEFAULT_BULK_REQUEST_BATCH_SIZE = 5000). The round driver chunks each
   source's keys to its own batchSize and no longer takes a batch-size parameter.
   BulkRequestStreamExecutor(source, batchSize) still works (it exposes the
   constructor batch size on the source the leaves bind to).

A. Executor-less terminals: add getBlocking()/getSuspending()/iterateBlocking{}/
   iterateSuspending{}/executeBlocking()/executeSuspending(); deprecate the
   executor-taking overloads with ReplaceWith, keeping their original behavior for
   compatibility. All ~180 in-repo terminal call sites migrated to the
   executor-less form.

IStreamExecutor / IStreamExecutorProvider and BulkRequestStreamExecutor.enqueue
remain (enqueue creates fetch leaves; CONTEXT is still set during
BulkRequestStreamExecutor runs because ModelQL resolves the current executor via
IStreamExecutor.getInstance()).
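The per-source batch size from item B can be sketched roughly as follows. `BulkSource`, `loadAll` and `loadRound` are hypothetical stand-ins rather than the real IBulkExecutor API; only the default of 5000 is taken from the message above.

```kotlin
const val DEFAULT_BATCH_SIZE = 5000 // value quoted in the commit message

// Hypothetical stand-in for a bulk source that declares its own batch size.
interface BulkSource<K, V> {
    val batchSize: Int get() = DEFAULT_BATCH_SIZE
    fun loadAll(keys: List<K>): Map<K, V>
}

// One round for one source: dedupe the keys, then send chunks no larger than
// the source's own limit. The driver needs no batch-size parameter of its own.
fun <K, V> loadRound(source: BulkSource<K, V>, keys: Collection<K>): Map<K, V> {
    val result = LinkedHashMap<K, V>()
    for (chunk in keys.distinct().chunked(source.batchSize)) {
        result.putAll(source.loadAll(chunk))
    }
    return result
}
```

With a source whose `batchSize` is 3, nine requested keys containing two duplicates would go out as chunks of 3, 3 and 1 in this sketch.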

Verified: streams JVM+JS compile and tests pass; datastructures and
model-datastructure compile + full suites pass; model-client (JVM+JS),
model-server, modelql-* compile; model-server LazyLoadingTest (batching
correctness) passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
cached() was a no-op, which broke ModelQLClientTest.testCaching: a query step
consumed by multiple branches (with a side-effecting setProperty) was evaluated
once per consumer instead of once.

Implement real multicast in the engine: cached() resolves its inner stream through
a shared MemoCell stored per-Execution and keyed by a stable token. Every consumer
gets a view of the same cell, and the cell's inner step is advanced at most once
per round, so the underlying work and any side effects run exactly once and the
result is shared. Pending.union now dedupes async leaves by token so a memoized
leaf shared across branches isn't run twice in a round.
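A much-simplified sketch of the memoization idea: here a "stream" is modeled as a plain function of the per-run state, and the cell is computed once per run rather than advanced round by round. `Execution`, `MemoCell` and `cached` below are hypothetical miniatures that borrow the names from the message, not the PR's classes.

```kotlin
// Hypothetical cell: computes its value at most once.
class MemoCell<T>(compute: () -> T) {
    val value: T by lazy(compute)
}

// Hypothetical per-run state holding one cell per token.
class Execution {
    private val cells = HashMap<Any, MemoCell<*>>()

    @Suppress("UNCHECKED_CAST")
    fun <T> cell(token: Any, compute: () -> T): MemoCell<T> =
        cells.getOrPut(token) { MemoCell(compute) } as MemoCell<T>
}

// cached(): every consumer of the returned function shares one cell per Execution,
// so the work and its side effects run once per run.
fun <T> cached(compute: () -> T): (Execution) -> T {
    val token = Any() // stable identity for this cached() call
    return { execution -> execution.cell(token, compute).value }
}
```

Two consumers evaluating the same cached function within one `Execution` trigger the computation once; a fresh `Execution` triggers it again, which mirrors the per-run scope described above.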

Verified: new CachedStreamTest (side-effect-once across consumers, including via
zip); ModelQLClientTest passes (16/16, incl. testCaching); datastructures,
model-datastructure, modelql-core/-untyped suites pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@slisson slisson force-pushed the feature/streams-round-based-engine branch from aade9f4 to 051474e Compare June 13, 2026 19:21
github-actions Bot commented Jun 13, 2026

Contributor

JVM coverage report

| Scope | Coverage | Change |
|---|---|---|
| Overall Project | 57.98% | -0.62% |
| Files changed | 74.2% | |

| File | Coverage | Change | Status |
|---|---|---|---|
| SimpleStreamExecutor.kt | 100% | | 🍏 |
| ModelTreeBuilder.kt | 97.86% | | 🍏 |
| RandomModelChangeGenerator.kt | 97.02% | | 🍏 |
| DeleteNodeOp.kt | 96.75% | | 🍏 |
| SetReferenceOp.kt | 95.61% | | 🍏 |
| AbstractOperation.kt | 95.08% | -0.82% | 🍏 |
| SetPropertyOp.kt | 94.95% | | 🍏 |
| StepEngine.kt | 94.66% | -5.34% | 🍏 |
| MoveNodeOp.kt | 90.73% | | 🍏 |
| SetConceptOp.kt | 90.32% | | 🍏 |
| VersionMerger.kt | 89.84% | | 🍏 |
| RepositoriesManager.kt | 89.48% | -1.14% | |
| AddNewChildOp.kt | 89.43% | | 🍏 |
| BTree.kt | 88.82% | -3.29% | 🍏 |
| AllChildrenTraversalStep.kt | 88.31% | | 🍏 |
| HistoryQueries.kt | 88.17% | -2.86% | 🍏 |
| BulkRequestStreamExecutor.kt | 88% | -11.11% | 🍏 |
| BindingWorker.kt | 85.92% | | 🍏 |
| DescendantsTraversalStep.kt | 82.07% | | 🍏 |
| CLVersion.kt | 78.53% | -0.47% | |
| AllReferencesTraversalStep.kt | 78.48% | | 🍏 |
| RolesMigration.kt | 78.47% | | 🍏 |
| HistoryIndexNode.kt | 78.41% | | 🍏 |
| SingleThreadMutableModelTree.kt | 72.53% | | 🍏 |
| CLTree.kt | 67.98% | | 🍏 |
| IStream.kt | 63.73% | -9.84% | 🍏 |
| IModelClientV2.kt | 62.2% | | 🍏 |
| OperationsCompressor.kt | 59.56% | -6.01% | |
| StreamImpl.kt | 57.9% | -42.1% | |
| IStreamExecutor.kt | 31.49% | -11.95% | |
| NodeAsAsyncNode.kt | 11.76% | | 🍏 |
| AddNewChildSubtreeOp.kt | 6.82% | -32.05% | |
| BlockingStreamExecutor.kt | 0% | | |
| BulkUpdateOp.kt | 0% | -10% | |
| LionwebApiImpl.kt | 0% | | 🍏 |

sarif-multitool 5.x removed --merge-runs; merging runs by tool is now
the default merge behavior, so the flag caused an unknown-option error.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@slisson slisson force-pushed the feature/streams-round-based-engine branch from 7440151 to 23790a0 Compare June 13, 2026 20:24
Add frequently-needed operators to IStream as extension functions
composed from existing engine primitives (no StreamImpl changes):
boolean reductions (any/all/none), filterNot/filterIsInstance/
filterIndexed, first/last accessors, mapIndexed, collection
conversions (toSet/groupBy/associateBy/associateWith/toMap), sorting,
distinctBy, reduce/sum/sumOf/min-maxBy, startWith/endWith, joinToString.
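As an illustration of composing operators purely from an existing primitive, the sketch below builds a few of the listed reductions on top of a single `fold`. `MiniStream` and `streamOf` are hypothetical; the real IStream primitives and the actual extension signatures may differ.

```kotlin
// Hypothetical minimal stream with one primitive (illustration only).
interface MiniStream<out T> {
    fun <R> fold(initial: R, operation: (R, T) -> R): R
}

fun <T> streamOf(vararg items: T): MiniStream<T> = object : MiniStream<T> {
    override fun <R> fold(initial: R, operation: (R, T) -> R): R =
        items.fold(initial, operation)
}

// Boolean reductions expressed through fold; no changes to the backing type needed.
fun <T> MiniStream<T>.any(predicate: (T) -> Boolean): Boolean =
    fold(false) { found, item -> found || predicate(item) }

fun <T> MiniStream<T>.all(predicate: (T) -> Boolean): Boolean = !any { !predicate(it) }

fun <T> MiniStream<T>.none(predicate: (T) -> Boolean): Boolean = !any(predicate)

fun <T> MiniStream<T>.sumOf(selector: (T) -> Int): Int =
    fold(0) { acc, item -> acc + selector(item) }

// Later items win on key collisions, matching Kotlin's Iterable.associateBy.
fun <T, K> MiniStream<T>.associateBy(keyOf: (T) -> K): Map<K, T> =
    fold(LinkedHashMap<K, T>()) { map, item -> map[keyOf(item)] = item; map }
```

Note that a fold-based `any` visits every element instead of short-circuiting; whether the PR's operators short-circuit is not stated in the message.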

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@slisson slisson force-pushed the feature/streams-round-based-engine branch from 284bb68 to 30f7a44 Compare June 13, 2026 20:47
Rework the Step engine from a Blocked/Pending/resume lockstep model to a
lazily-evaluated node graph (Done/FetchStep/MapStep/FlatMapStep/ZipStep/
FanStep/AsyncStep/Recover/OnError/MemoStep) driven by a depth-first walk.

Each round collects the first-unmet request of every pending branch into a
shared pool capped at the source's batchSize, and FanStep builds its children
lazily (left-to-right, dropped once resolved). The peak request frontier is
therefore <= batchSize regardless of how wide a tree level is, restoring the
property the old BulkRequestStreamExecutor.RequestQueue.sendNextBatch provided
(a depth-first traversal that bounds the live request set). Sibling requests
still share a round when they fit under the cap, so applicative batching is
preserved; an unbounded cap collapses to one round per level.
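The bounded-frontier behavior can be approximated with a small model in which fetching a key yields its child keys. This is a simplification under stated assumptions, not the PR's FanStep/Execution code: `traverse`, `RoundStats` and `fetchAll` are hypothetical, and children are kept as lazy iterators so they only become requests when a round reaches them.

```kotlin
// Hypothetical result record for the model below.
class RoundStats(val rounds: Int, val peakBatch: Int, val visited: Int)

fun traverse(
    root: Int,
    cap: Int,
    fetchAll: (List<Int>) -> Map<Int, List<Int>>,
): RoundStats {
    // Each stack entry lazily yields the not-yet-requested children of a loaded node.
    val stack = ArrayDeque<Iterator<Int>>()
    stack.addLast(listOf(root).iterator())
    var rounds = 0
    var peak = 0
    var visited = 0
    while (stack.isNotEmpty()) {
        // Collect the next requests depth-first (newest branch first), stopping at the cap.
        val batch = ArrayList<Int>()
        while (batch.size < cap && stack.isNotEmpty()) {
            val top = stack.last()
            if (top.hasNext()) batch.add(top.next()) else stack.removeLast()
        }
        if (batch.isEmpty()) break
        rounds++
        peak = maxOf(peak, batch.size)
        visited += batch.size
        val loaded = fetchAll(batch)
        // Push children lazily; they turn into requests only in a later round.
        for (key in batch) stack.addLast(loaded.getValue(key).iterator())
    }
    return RoundStats(rounds, peak, visited)
}
```

On a complete tree with fan-out 4 and five levels (341 nodes), an unbounded cap gives one round per level in this model, while a cap of 10 keeps every batch at 10 keys or fewer at the cost of more rounds.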

Combinators evaluate eagerly when their inputs are already Done, so a
synchronous prefix and its side effects run at build time in build order. Some
call sites depend on this (e.g. HamtLeafNode.getChanges sets a var in one
synchronous stream and reads it in a later deferZeroOrOne); fully-lazy
evaluation broke tree diffing until eager-on-Done was restored.

The public IStream API and combinator signatures are unchanged; only
flatMapOrdered/flatMapZeroOrOne switch to the new lazy fanOut, and the driver
functions became Execution members.

Tests: FrontierSizeTest asserts the frontier stays within batchSize on a
wide+deep tree (widest level 16384, cap 1000) and collapses to one round per
level when uncapped. HamtGetChangesTest adds direct coverage of the HAMT diff
(guards the eager-evaluation dependency). Validated across streams (JVM + JS),
datastructures, model-datastructure, modelql-core/untyped/typed, model-api,
and bulk-model-sync-lib.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>