perf: speed up manifest JSON rendering by He-Pin · Pull Request #879 · databricks/sjsonnet

He-Pin · 2026-05-30T12:40:20Z

Motivation

std.manifestJson, std.manifestJsonMinified, and std.manifestJsonEx routed through java.io.StringWriter, paying StringBuffer synchronization per write/flush on the hot manifestation path. Source-built jrsonnet comparisons showed sjsonnet trailing on object-heavy manifest workloads.

Modification

Add StringBuilderWriter: an unsynchronized Writer over a StringBuilder.
Add package-private FastMaterializeJsonRenderer backed by StringBuilderWriter; route the three std.manifestJson* builtins through it. Public MaterializeJsonRenderer ABI/shape unchanged.
Use an in-place codepoint sort for sortedVisibleKeyNames / maybeSortKeys (avoids .sorted boxing).
Fix codepoint comparison for raw surrogate prefixes; UnicodeHandlingTests extended.
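
The first item above can be illustrated with a minimal sketch of an unsynchronized `Writer` over a `StringBuilder`. This is an assumption about the general shape only: the constructor parameter, the set of overridden methods, and the `toString` accessor are illustrative, and the PR's actual `StringBuilderWriter` may differ.

```scala
import java.io.Writer

// Hypothetical sketch, not the PR's source: a Writer that appends straight
// into a java.lang.StringBuilder, which performs no locking.
final class StringBuilderWriter(initialCapacity: Int = 16) extends Writer {
  private val sb = new java.lang.StringBuilder(initialCapacity)

  override def write(c: Int): Unit = { sb.append(c.toChar); () }

  override def write(cbuf: Array[Char], off: Int, len: Int): Unit = {
    sb.append(cbuf, off, len); ()
  }

  override def write(str: String): Unit = { sb.append(str); () }

  // StringBuilder.append(CharSequence, start, end) takes an end index,
  // while Writer.write(String, off, len) takes a length.
  override def write(str: String, off: Int, len: Int): Unit = {
    sb.append(str, off, off + len); ()
  }

  // Nothing is buffered outside the builder, so flush and close are no-ops.
  override def flush(): Unit = ()
  override def close(): Unit = ()

  override def toString: String = sb.toString
}
```

A renderer that accepts any `java.io.Writer` could be handed an instance of this class in place of a `java.io.StringWriter`, with the result read back through `toString`.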

Result

Scala Native hyperfine on kube-prometheus, jrsonnet HEAD 2d7eed05:

Workload (native)	Before	After	Δ
kube-prometheus, sjsonnet	158.4 ± 16.8 ms	143.7 ± 3.2 ms	−9.3%
`manifestJsonEx`, sjsonnet	—	5.09 ± 1.01 ms	new

Test plan

./mill __.reformat
./mill 'sjsonnet.jvm[3.3.7]'.test — 518/518 pass

This PR is the base for the stacked follow-up #875 (TomlRenderer reuses StringBuilderWriter) and for the independent #876/#877/#878.

Motivation: std.manifestJson* still contributed to the local Scala Native gap versus source-built jrsonnet, especially in real-world object-heavy rendering.

Modification: Add an internal StringBuilder-backed FastMaterializeJsonRenderer for std.manifestJson, std.manifestJsonMinified, and std.manifestJsonEx while preserving the public MaterializeJsonRenderer StringWriter API. Reuse an in-place codepoint key sorter backed by java.util.Arrays.sort, and fix raw-surrogate prefix ordering in compareStringsByCodepoint.

Result: Full validation passed: ./mill --no-server --ticker false --color false __.reformat and ./mill --no-server --ticker false --color false -j 1 __.test reported 451/451 tests passing. JMH regression benchmarks: manifestJsonEx 0.055 ms/op, realistic2 43.596 ms/op, gen_big_object 0.842 ms/op. Direct hyperfine against source-built jrsonnet: manifestJsonEx sjsonnet-native 5.090 ms vs jrsonnet 4.075 ms; kube-prometheus sjsonnet-native 143.738 ms vs jrsonnet 97.385 ms.
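
To make the codepoint ordering concrete, here is a hedged sketch of comparing strings by Unicode code point and sorting keys in place with `java.util.Arrays.sort`. The object and method names are hypothetical, and treating an unpaired surrogate as its own code point value (which is what `String.codePointAt` returns for it) is an assumption about the intended ordering, not a copy of `compareStringsByCodepoint`.

```scala
import java.util.{Arrays, Comparator}

// Hypothetical sketch; names and surrogate handling are assumptions.
object CodepointOrder {
  // Compare by Unicode code point instead of by UTF-16 code unit.
  def compare(a: String, b: String): Int = {
    val n = math.min(a.length, b.length)
    var i = 0
    while (i < n) {
      val ca = a.codePointAt(i) // an unpaired surrogate is returned as-is
      val cb = b.codePointAt(i)
      if (ca != cb) return Integer.compare(ca, cb)
      i += Character.charCount(ca) // 2 for a surrogate pair, otherwise 1
    }
    // One string is a prefix of the other: the shorter one sorts first.
    Integer.compare(a.length, b.length)
  }

  private val comparator: Comparator[String] = new Comparator[String] {
    def compare(a: String, b: String): Int = CodepointOrder.compare(a, b)
  }

  // Sorts the given array in place instead of building a sorted copy.
  def sortInPlace(keys: Array[String]): Unit = Arrays.sort(keys, comparator)
}
```

With this comparison, `"\uFF5E"` (U+FF5E) sorts before `"\uD83D\uDE00"` (U+1F600), whereas `String.compareTo` orders them the other way round because the high surrogate `0xD83D` is smaller than `0xFF5E` as a UTF-16 code unit.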

## Motivation

`std.manifestTomlEx` had three sources of avoidable overhead on the hot manifestation path:

1. **Synchronized writer.** `TomlRenderer` and `ManifestModule.evalRhs` rendered into a `java.io.StringWriter`, whose backing `StringBuffer` pays a monitor enter/exit on every `write`/`flush`. The `FastMaterializeJsonRenderer` already uses the unsynchronized `StringBuilderWriter` (#874); TOML did not.
2. **Redundant field lookups in `renderTableInternal`.** Each key's `Val.Obj.value(k)` was resolved twice: once to classify scalar vs section, then again to render or recurse. The cache deduplicates the result, but the lookup itself still costs.
3. **Wasted indexing work.** `visibleKeyNames` was iterated and each key binary-searched back into `sortedVisibleKeyNames`. `sortedVisibleKeyNames` can be iterated directly, skipping `O(n log n)` compares per table.

## Modification

Two commits:

- **`perf: use unsynchronized StringBuilderWriter in TomlRenderer`**: Swap `TomlRenderer` and the `manifestTomlEx` render path in `ManifestModule` from `java.io.StringWriter` to the package-private `StringBuilderWriter`. `std.deepJoin` keeps `StringWriter` (separate concern).
- **`perf: cache resolved field values and skip binary search in renderTableInternal`**: Resolve each field once into a `resolved: Array[Val]` during section classification and reuse it during render/recurse; iterate `sortedVisibleKeyNames` directly (removes the now-unused `sortedKeyIndex` binary search); hoist `childIndent = cumulatedIndent + indent` out of the section loop (was an identical allocation per sibling section); pre-size the output `StringBuilderWriter` to 1 KiB so small/medium outputs skip the first ~6 doublings.

Output is byte-identical (verified at 1,228,186 bytes on the benchmark workload).

## Result

Scala Native, hyperfine A/B against `master` (`fc292fa6`). Workload: object comprehension over 8000 small tables → ~1.2 MB TOML output (render-dominated).

Four interleaved-order passes, `--warmup 10 --min-runs 100 --shell=none`:

| pass | order | before mean | after mean | before min | after min | **min ratio** |
|---|---|---:|---:|---:|---:|---:|
| 1 | before → after | 59.4 ± 2.7 ms | 53.2 ± 23.4 ms | 55.4 ms | 43.8 ms | **1.27×** |
| 2 | after → before | 64.1 ± 7.7 ms | 51.8 ± 12.2 ms | 56.4 ms | 43.7 ms | **1.29×** |
| 3 | before → after | 64.1 ± 8.1 ms | 53.2 ± 14.3 ms | 56.4 ms | 42.0 ms | **1.34×** |
| 4 | after → before | 63.3 ± 14.3 ms | 49.2 ± 3.7 ms | 57.2 ms | 42.8 ms | **1.34×** |

Mean is noisy on the host (1.12× – 1.29×), but **after is faster in every one of the 4 passes** and the **min values are tight at ~1.27–1.34× faster** (best observed: 42.0 ms vs 56.4 ms, ~25.5% reduction). Output byte-identical, 1,228,186 bytes both sides.

For comparison, the StringBuilderWriter swap alone (commit 1) measures ~1.08–1.14× min; the cache + binary-search elimination + childIndent hoist (commit 2) lifts that to ~1.27–1.34× min.

## Test plan

- [x] `./mill __.reformat`
- [x] `./mill 'sjsonnet.jvm[3.3.7]'.test`: 519/519 pass
- [x] Scala Native A/B hyperfine: 4 interleaved-order passes, all positive; output byte-identical

---

> Rebased onto current `master` (`fc292fa6`). The companion commit "speed up manifest JSON rendering" was merged separately as #879, so this PR now contains only the TomlRenderer / ManifestModule changes.
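
The resolve-once idea described for `renderTableInternal` can be sketched on a toy model. Everything here is hypothetical (the `Value`, `Scalar`, and `Table` types, the `lookup` counter, and the simplified TOML-like output); the real code operates on `Val.Obj` and is not reproduced. The sketch only shows the pattern: classify and cache each field in one pass over the sorted keys, then reuse the cached values when recursing into sections.

```scala
// Hypothetical toy model, not sjsonnet's TomlRenderer.
object ResolveOnce {
  sealed trait Value
  final case class Scalar(text: String) extends Value
  final case class Table(fields: Map[String, Value]) extends Value

  // Counts field lookups so the effect of caching is observable.
  var lookups = 0
  private def lookup(t: Table, key: String): Value = {
    lookups += 1
    t.fields(key)
  }

  def render(t: Table, prefix: String, out: java.lang.StringBuilder): Unit = {
    // Stands in for iterating sortedVisibleKeyNames directly.
    val keys = t.fields.keys.toArray.sorted
    val resolved = new Array[Value](keys.length)

    // Pass 1: resolve every field exactly once, cache it, emit scalars.
    var i = 0
    while (i < keys.length) {
      val v = lookup(t, keys(i))
      resolved(i) = v
      v match {
        case Scalar(text) =>
          out.append(keys(i)).append(" = ").append(text).append('\n')
        case _: Table => ()
      }
      i += 1
    }

    // Pass 2: recurse into sections using the cached values, no new lookups.
    i = 0
    while (i < keys.length) {
      resolved(i) match {
        case sub: Table =>
          val name = if (prefix.isEmpty) keys(i) else prefix + "." + keys(i)
          out.append('[').append(name).append("]\n")
          render(sub, name, out)
        case _ => ()
      }
      i += 1
    }
  }
}
```

In this model, rendering a table with one scalar field and one nested single-field section performs three lookups in total, one per field, rather than one for classification plus one more for rendering.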

He-Pin mentioned this pull request May 30, 2026

perf: use unsynchronized StringBuilderWriter in TomlRenderer #875

Merged

3 tasks

stephenamar-db approved these changes Jun 2, 2026

stephenamar-db merged commit 6dc684f into databricks:master Jun 2, 2026
10 checks passed

stephenamar-db merged 1 commit into databricks:master from He-Pin:perf/manifest-json-rendering
