perf: use StringBuilderWriter across renderers and stdlib escape paths #891

Open
He-Pin wants to merge 2 commits into databricks:master from He-Pin:perf/escapeStringJson-direct-escape

@He-Pin He-Pin commented Jun 4, 2026

Motivation

Continues the elimination of the synchronized java.io.StringWriter started by PRs #875 (TomlRenderer) and #889 (std.deepJoin). Each StringWriter write goes through StringBuffer's synchronized monitor, which is pure overhead for the single-threaded code paths every renderer takes.

This PR consists of two commits:

  1. The original commit covers std.escapeStringJson / std.escapeStringPython.
  2. A follow-up commit extends the same StringBuilderWriter substitution across the remaining stdlib escape path (std.escapeStringXML), the YAML / PrettyYAML / Python / JSON renderers, and Renderer / PythonRenderer default out parameters (which transitively benefit Val.Obj.renderString, Format.scala's %r complex-type path, etc.).

Modification

Part 1: stdlib escape entry points

  • std.escapeStringJson: for Val.Str input, call BaseRenderer.escape directly with a StringBuilderWriter, bypassing Materializer.stringify dispatch, Renderer allocation, and CharBuilder allocation. For non-string input, use StringBuilderWriter instead of java.io.StringWriter.
  • std.escapeStringPython: same StringBuilderWriter substitution.
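
To illustrate the direct path described above, here is a minimal sketch, not the code in this PR. `SketchStringBuilderWriter` and `EscapeSketch` are hypothetical stand-ins for sjsonnet's `StringBuilderWriter` and `BaseRenderer.escape`, and the escaping rules are simplified. The point it shows is that a string input can be escaped straight into an unsynchronized, pre-sized buffer with no renderer or visitor in between.

```scala
import java.io.Writer

// Hypothetical stand-in for sjsonnet's StringBuilderWriter: a Writer backed by
// an unsynchronized StringBuilder rather than StringWriter's StringBuffer.
final class SketchStringBuilderWriter(initialCapacity: Int = 16) extends Writer {
  private val sb = new java.lang.StringBuilder(initialCapacity)
  override def write(c: Int): Unit = sb.append(c.toChar)
  override def write(cbuf: Array[Char], off: Int, len: Int): Unit = sb.append(cbuf, off, len)
  override def write(str: String): Unit = sb.append(str)
  override def flush(): Unit = ()
  override def close(): Unit = ()
  override def toString: String = sb.toString
}

object EscapeSketch {
  // Simplified JSON string escaping, written directly to the given writer.
  def escapeJson(out: Writer, s: String): Unit = {
    out.write("\"")
    var i = 0
    while (i < s.length) {
      s.charAt(i) match {
        case '"'          => out.write("\\\"")
        case '\\'         => out.write("\\\\")
        case '\n'         => out.write("\\n")
        case '\r'         => out.write("\\r")
        case '\t'         => out.write("\\t")
        case c if c < ' ' => out.write("\\u%04x".format(c.toInt)) // other control chars
        case c            => out.write(c.toInt)
      }
      i += 1
    }
    out.write("\"")
  }

  // Direct path for string input: one pre-sized writer, no renderer in between.
  def escapeDirect(s: String): String = {
    val out = new SketchStringBuilderWriter(s.length + 2)
    escapeJson(out, s)
    out.toString
  }
}
```

Because the writer is created and consumed inside a single call, nothing else can observe the buffer, so dropping the synchronization is safe in this sketch.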

Part 2: rest of the renderer surface

  • std.escapeStringXML: replace internal StringWriter with a pre-sized StringBuilderWriter.
  • Renderer / PythonRenderer: default out parameter changes from java.io.StringWriter to StringBuilderWriter. Call sites that rely on the default (Val.Obj.renderString in Val.scala, Format.scala's complex-type %r path) inherit the optimization with no signature change.
  • YamlRenderer: full visitor type-parameter migration from StringWriter to StringBuilderWriter; outBuffer switches from StringBuffer to StringBuilder (same length / setLength / charAt interface). StringBuilderWriter gains a small getBuilder accessor so YamlRenderer's trailing-space trim can still reach into the buffer.
  • PrettyYamlRenderer: default out parameter to StringBuilderWriter. Also drops a dead StringWriter allocation in the quoted-string branch — the writer's result was never consumed (quotedStr is used instead).
  • bench: MaterializerBenchmark.renderWith updated to construct a StringBuilderWriter to match the new constructor signatures.
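
The following sketch shows, under stated assumptions, the two mechanisms the list above relies on: a default `out` parameter, and a `getBuilder` accessor used for trimming trailing spaces. `SketchStringBuilderWriter`, `SketchRenderer`, and `TrimSketch` are hypothetical names. Only the accessor name `getBuilder` and the `length` / `charAt` / `setLength` calls come from the description; everything else is illustrative.

```scala
import java.io.Writer

// Hypothetical stand-in for StringBuilderWriter, including a getBuilder accessor.
final class SketchStringBuilderWriter(initialCapacity: Int = 16) extends Writer {
  private val sb = new java.lang.StringBuilder(initialCapacity)
  def getBuilder: java.lang.StringBuilder = sb
  override def write(c: Int): Unit = sb.append(c.toChar)
  override def write(cbuf: Array[Char], off: Int, len: Int): Unit = sb.append(cbuf, off, len)
  override def write(str: String): Unit = sb.append(str)
  override def flush(): Unit = ()
  override def close(): Unit = ()
  override def toString: String = sb.toString
}

// Hypothetical renderer: callers that omit `out` pick up the default writer,
// so swapping the default type needs no change at those call sites.
class SketchRenderer(out: SketchStringBuilderWriter = new SketchStringBuilderWriter()) {
  def emit(s: String): SketchRenderer = { out.write(s); this }
  def result: String = out.toString
}

object TrimSketch {
  // Drop trailing spaces in place using the StringBuilder operations that
  // StringBuffer also offers (length, charAt, setLength).
  def trimTrailingSpaces(out: SketchStringBuilderWriter): Unit = {
    val sb = out.getBuilder
    var n = sb.length()
    while (n > 0 && sb.charAt(n - 1) == ' ') n -= 1
    sb.setLength(n)
  }
}
```

A caller that writes `new SketchRenderer()` never names the writer type, which is why changing a default can reach existing call sites without a signature change.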

Result

  • Eliminates StringBuffer monitor acquisition on every char / string write across all remaining renderers (YAML / PrettyYAML / Python / default-JSON) and all three std.escapeString* entry points.
  • Removes one dead StringWriter allocation per quoted-string in PrettyYamlRenderer.

Benchmarks (Scala Native, Apple M3 Pro)

std.escapeStringJson micro-benchmark via hyperfine:

| Benchmark | Before (ms) | After (ms) | jrsonnet (ms) | Gap vs. jrsonnet |
| --- | --- | --- | --- | --- |
| std.escapeStringJson | 7.0 ± 3.1 | 6.9 ± 0.7 | 4.6 ± 1.6 | 1.48× |

Tests

  • All 465 native tests + 48 JVM tests pass.

For Val.Str inputs, directly call BaseRenderer.escape with a
StringBuilderWriter instead of going through the full
Materializer.stringify -> Renderer -> StringWriter pipeline.
This avoids: Renderer allocation, CharBuilder allocation,
StringWriter synchronization, and the visitor dispatch overhead.

For non-string inputs, use StringBuilderWriter instead of
java.io.StringWriter to avoid synchronization overhead.
@He-Pin He-Pin marked this pull request as draft June 4, 2026 05:18
… stdlib

Motivation:
Continues the synchronized StringWriter elimination started by PR databricks#875
(TomlRenderer), databricks#889 (std.deepJoin), and the existing commit in this PR
(escapeStringJson/Python). Remaining hot paths still allocate a
java.io.StringWriter with its synchronized StringBuffer.

Modification:
- StringBuilderWriter: expose getBuilder for direct StringBuilder access
  (YamlRenderer trims trailing spaces via length/charAt/setLength).
- Renderer / PythonRenderer: default `out` constructor parameter changes
  from java.io.StringWriter to StringBuilderWriter. Format.scala's
  complex-type Renderer paths and Val.Obj.renderString automatically
  benefit via the new default.
- YamlRenderer: full visitor type-parameter migration from StringWriter
  to StringBuilderWriter; outBuffer switches from StringBuffer to
  StringBuilder (same length/setLength/charAt interface).
- PrettyYamlRenderer: default `out` parameter to StringBuilderWriter;
  also drops a dead StringWriter allocation in the quoted-string branch
  (the writer's result was never consumed — `quotedStr` is used instead).
- std.escapeStringXML: replace internal StringWriter with
  StringBuilderWriter, pre-sized to input length + 16.
- bench: MaterializerBenchmark switches its renderWith helper to
  StringBuilderWriter to match the new constructor signatures.

Result:
- Eliminates StringBuffer monitor acquisition on every char/string write
  across all remaining renderers (Yaml/PrettyYaml/Python/Json default
  paths) and std.escapeStringXML.
- Removes one dead StringWriter allocation per quoted-string in
  PrettyYamlRenderer.
- All JVM tests pass (sjsonnet.jvm[3.3.7].test).
@He-Pin He-Pin changed the title perf: use StringBuilderWriter in std.escapeStringJson/escapeStringPython perf: use StringBuilderWriter across renderers and stdlib escape paths Jun 4, 2026
@He-Pin He-Pin marked this pull request as ready for review June 4, 2026 10:24