Render graph completion + TAA sky fix + froxel light clustering (audit tasks #15, #23)#66

Merged
proggeramlug merged 7 commits into main from audit/render-graph-completion
Jun 12, 2026

proggeramlug commented Jun 12, 2026

Completes the final two open items from the architecture audit: the render-graph migration (#15) and froxel light clustering (#23). Also fixes a real TAA bug the new coverage caught.

Render graph (#15)

  • The frame runs on the graph: 15 PassNodes from froxel-assign/shadow through auto-exposure, with declared reads/writes + order pins reproducing the hand-tuned sequence exactly (pin-loosening documented as the next refinement). Context-owns-renderer pattern at full scale.
  • Final extractions: GTAO, translucent, auto-exposure become record_* methods.
  • TAA bug fix: zero-velocity sky pixels reprojected positionally through a degenerate far-plane reconstruction; the luma-only history clamp then locked the wrong chroma in permanently — TAA-on games had skies tinted toward whatever bright object the broken UV landed on. Far-depth pixels now reproject the view direction (rotation-exact, translation-invariant).
  • New TAA-on golden pins the fix and the TAA branch of the post-FX cascade.

Froxel light clustering (#23)

The 256-light cap raise (task #14) removed the capability ceiling but left the scene shader paying O(live lights) per fragment. Now:

  • 16×9×24 froxel grid, log-distributed depth slices. A compute pass assigns lights per cluster (sphere vs froxel AABB); cluster counts + index list live in storage buffers, light data stays in the existing UBO.
  • Clustered loop spliced into SCENE_SHADER at pipeline build between BEGIN/END-POINT-LIGHT-LOOP markers; the plain loop remains the WebGL2 fallback and the semantic reference. Capability-gated on fragment-stage storage buffers (FroxelPass::supported), BLOOM_DISABLE_FROXEL=1 kill-switch for field bisection.
  • Bit-exact parity, enforced: new golden many_point_lights_clustered_scene drives the retained scene path under 40 lights; its golden was generated with the reference loop and the clustered path reproduces it with mean/max diff 0.0 (the assignment is strictly conservative, so the paths are mathematically identical). The test asserts the clustered path is actually active so it can't silently regress to testing the fallback against itself.
  • lighting.rs extraction: the group-1 bind-group entry list existed in four hand-synced copies (a binding-drift hazard); every lighting bind group now goes through one builder. mod.rs ratchets 13,956 → 11,775 since the audit began.
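
As an illustration of the grid addressing described above, here is a minimal sketch of mapping a fragment to a cluster in a 16×9×24 grid with log-distributed depth slices. The function names, the `near`/`far` parameters, and the flattening order are assumptions made for this example; the PR does not show the actual indexing code.

```rust
/// Grid dimensions taken from the PR description.
const GRID_X: u32 = 16;
const GRID_Y: u32 = 9;
const GRID_Z: u32 = 24;

/// Map a positive view-space depth to a log-distributed slice index.
/// `near` and `far` are hypothetical camera planes. Slice i covers
/// [near * (far/near)^(i/GRID_Z), near * (far/near)^((i+1)/GRID_Z)).
fn depth_slice(view_z: f32, near: f32, far: f32) -> u32 {
    let z = view_z.clamp(near, far);
    // t is 0 at the near plane and 1 at the far plane, linear in log(z).
    let t = (z / near).ln() / (far / near).ln();
    ((t * GRID_Z as f32) as u32).min(GRID_Z - 1)
}

/// Flatten (screen UV in [0, 1], slice) into a linear cluster index.
/// The x-fastest layout is an assumption, not the PR's confirmed layout.
fn cluster_index(u: f32, v: f32, slice: u32) -> u32 {
    let x = ((u * GRID_X as f32) as u32).min(GRID_X - 1);
    let y = ((v * GRID_Y as f32) as u32).min(GRID_Y - 1);
    (slice * GRID_Y + y) * GRID_X + x
}
```

With a logarithmic distribution each slice spans the same depth ratio rather than the same depth interval, so slices near the camera are thin and distant ones are thick.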

All goldens pixel-identical (except the two new ones, by definition), 97 unit tests green, validators clean.

Cluster 9 prep: the GTAO dispatch (temporal EMA ping-pong, Halton-5
rotation) leaves end_frame_with_scene; p22/p32 derive inside the
method. Goldens pixel-identical.

Cluster 10 prep: the Phase-4b translucent/refractive/additive dispatch
(back-to-front sort, scene-color snapshot for reads_scene materials,
impulse-field update) becomes record_translucent_pass. Goldens
pixel-identical.

Cluster 11 prep: the measure+adapt pass becomes record_auto_exposure;
its luminance source derives from composite_source_view() internally
(it measures whatever the composite will read), and the caller keeps
the src/dst ping-pong indices because the composite binds the same dst
view. Goldens pixel-identical.
…jection

With zero sky velocity, TAA reprojected sky pixels POSITIONALLY: the
far-plane world reconstruction divides by a near-zero w, landing
prev_uv on arbitrary scene geometry. The luma-only YCoCg history clamp
(correct for sparkle) then locks the wrong CHROMA in forever — skies
tinted toward whatever bright object the broken UV hit (uniform green
with a green sphere in frame; reddish with a red cube). Every TAA-on
game shipped with this.
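
To make the chroma lock-in concrete, the following sketch clamps only the luma of a history sample in YCoCg space. It is a CPU-side illustration under assumptions: the helper names are hypothetical, the standard YCoCg transform is used, and the real shader's neighborhood statistics are reduced here to a plain `[y_min, y_max]` range.

```rust
/// Standard RGB -> YCoCg transform: returns [Y, Co, Cg].
fn rgb_to_ycocg(c: [f32; 3]) -> [f32; 3] {
    let [r, g, b] = c;
    [
        0.25 * r + 0.5 * g + 0.25 * b,
        0.5 * r - 0.5 * b,
        -0.25 * r + 0.5 * g - 0.25 * b,
    ]
}

/// Inverse of `rgb_to_ycocg`.
fn ycocg_to_rgb(c: [f32; 3]) -> [f32; 3] {
    let [y, co, cg] = c;
    [y + co - cg, y + cg, y - co - cg]
}

/// Clamp only the history luma to the neighborhood luma range.
/// Co and Cg pass through untouched, so a wrong history hue survives.
fn clamp_history_luma_only(history: [f32; 3], y_min: f32, y_max: f32) -> [f32; 3] {
    let [y, co, cg] = rgb_to_ycocg(history);
    ycocg_to_rgb([y.clamp(y_min, y_max), co, cg])
}
```

Feeding a pure green history sample `[0.0, 1.0, 0.0]` through the clamp with the luma range of a blue sky pixel yields a colour that matches the sky's brightness but is still green-dominant, which matches the tinting described above.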

Sky is at infinity, so far-depth pixels now reproject the view
DIRECTION (w=0 through prev_vp): exact under camera rotation,
translation-invariant by definition, with an in.uv fallback for
degenerate w.
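
The direction reprojection can be sketched on the CPU as follows. This is not the shader from the PR: the row-major matrix layout, the function names, the epsilon, the choice to treat non-positive w as degenerate, and the top-left UV origin are all assumptions made for the example.

```rust
/// Row-major 4x4 matrix, m[row][col], multiplying column vectors.
type Mat4 = [[f32; 4]; 4];

fn mul(m: &Mat4, v: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for r in 0..4 {
        out[r] = (0..4).map(|c| m[r][c] * v[c]).sum::<f32>();
    }
    out
}

/// Reproject a far-depth (sky) pixel by its world-space view direction.
/// `prev_vp` is the previous frame's view-projection matrix.
fn reproject_sky(prev_vp: &Mat4, view_dir: [f32; 3], current_uv: [f32; 2]) -> [f32; 2] {
    // w = 0 marks a point at infinity, so the translation column of
    // prev_vp is multiplied by zero and drops out of the result.
    let clip = mul(prev_vp, [view_dir[0], view_dir[1], view_dir[2], 0.0]);
    // Assumption: near-zero or negative clip w counts as degenerate.
    if clip[3] < 1e-6 {
        return current_uv;
    }
    let ndc = [clip[0] / clip[3], clip[1] / clip[3]];
    // Assumption: UV origin is top-left, so NDC y is flipped.
    [ndc[0] * 0.5 + 0.5, 0.5 - ndc[1] * 0.5]
}
```

Because the input has w = 0, two `prev_vp` matrices that differ only in camera translation give the same `prev_uv`, which is the translation invariance the commit message refers to.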

Found by the new TAA-on golden (lit_primitives_taa), which now pins
both the fix and the TAA branch of the post-FX cascade. TAA-off
goldens unaffected (the change is inside the far-depth else-branch).
…b complete

Every render pass between geometry upload and the terminal composite
now executes as a PassNode in one frame Graph: shadow → hdr_scene →
translucent → hiz_build → occlusion_capture → gtao → ssao_blur →
ssr_march → ssr_temporal → ssgi → bloom → compose → postfx_tail →
auto_exposure. Reads/writes declare the real data dependencies
(Shadow/HdrColor/MRT/Depth named outputs; Transient tokens for the
intermediates the enum doesn't name); each node additionally carries a
with_after pin to its predecessor so the schedule reproduces the
hand-tuned order exactly. Loosening those pins so the scheduler can
interleave independent passes is the documented next refinement — do
it dependency-by-dependency with the goldens watching.
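
The combination of declared reads/writes and predecessor pins can be sketched with a toy scheduler. The `PassNode` and `with_after` names come from the PR, but every signature below is hypothetical, and the sketch only tracks read-after-write dependencies; the real graph presumably does more.

```rust
use std::collections::HashMap;

struct PassNode {
    name: &'static str,
    reads: Vec<&'static str>,
    writes: Vec<&'static str>,
    after: Vec<&'static str>,
}

impl PassNode {
    fn new(name: &'static str) -> Self {
        PassNode { name, reads: vec![], writes: vec![], after: vec![] }
    }
    fn reads(mut self, r: &'static str) -> Self { self.reads.push(r); self }
    fn writes(mut self, w: &'static str) -> Self { self.writes.push(w); self }
    fn with_after(mut self, p: &'static str) -> Self { self.after.push(p); self }
}

/// Order passes so that every reader follows the last declared writer of
/// its resource and every `with_after` pin is honoured.
fn schedule(nodes: &[PassNode]) -> Result<Vec<&'static str>, String> {
    let index: HashMap<&str, usize> =
        nodes.iter().enumerate().map(|(i, n)| (n.name, i)).collect();
    let mut deps: Vec<Vec<usize>> = vec![Vec::new(); nodes.len()];
    let mut last_writer: HashMap<&str, usize> = HashMap::new();
    for (i, n) in nodes.iter().enumerate() {
        for r in &n.reads {
            if let Some(&w) = last_writer.get(r) {
                deps[i].push(w);
            }
        }
        for p in &n.after {
            // A pin on a node that was never added is a schedule error.
            let j = *index
                .get(p)
                .ok_or_else(|| format!("{}: with_after on missing node {}", n.name, p))?;
            deps[i].push(j);
        }
        for w in &n.writes {
            last_writer.insert(*w, i);
        }
    }
    // Repeatedly emit the first unscheduled node whose dependencies are done.
    let mut done = vec![false; nodes.len()];
    let mut order = Vec::new();
    while order.len() < nodes.len() {
        let next = (0..nodes.len())
            .find(|&i| !done[i] && deps[i].iter().all(|&d| done[d]))
            .ok_or("cycle in pass dependencies")?;
        done[next] = true;
        order.push(nodes[next].name);
    }
    Ok(order)
}
```

When every node is pinned to its predecessor, the pins alone fix a total order, so the read/write declarations only become load-bearing once pins are removed. That is why loosening them one dependency at a time, with the goldens as a check, is a reasonable migration path.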

The context owns &mut Renderer (the cluster-1 pattern at full scale):
closures borrow nothing at build time and call the record_* methods.
Feature toggles are checked inside closures/methods, never by omitting
nodes (with_after on a missing node is a schedule error by design).
Immediate-mode uploads moved ahead of the graph build — queue
write-buffer ordering is submission-scoped, so within this one encoder
the move is semantically identical.

All 6 goldens (incl. the new TAA-on scene) pixel-identical.

Point lights were O(live lights) per fragment since the 256-light cap
raise. A 16x9x24 froxel grid now restores O(cluster lights) on every
backend with fragment-stage storage buffers:

- froxel.rs: compute pass assigns lights to clusters (sphere vs
  froxel AABB, log-distributed depth slices); cluster counts + index
  list in storage buffers; light data stays in the existing UBO.
- The clustered point-light loop is spliced into SCENE_SHADER between
  the BEGIN/END-POINT-LIGHT-LOOP markers at pipeline build; the plain
  loop remains the WebGL2 fallback and semantic reference.
- lighting_layout grows bindings 10-12 when clustered; pipeline_3d
  shares the layout unaffected (immediate-mode keeps the plain loop).
- froxel_assign runs as a render-graph node before hdr_scene.
- BLOOM_DISABLE_FROXEL=1 kill-switch forces the reference loop.
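
A marker-based splice of this kind could look like the sketch below. The exact marker text, the function names, and the way the kill-switch value is interpreted are assumptions; the PR only states that the markers are named BEGIN/END-POINT-LIGHT-LOOP and that `BLOOM_DISABLE_FROXEL=1` forces the reference loop.

```rust
// Hypothetical marker spelling; the PR only gives the marker names.
const BEGIN: &str = "// BEGIN-POINT-LIGHT-LOOP";
const END: &str = "// END-POINT-LIGHT-LOOP";

/// Replace everything between the markers with `clustered`, keeping the
/// markers themselves. Returns None if either marker is missing, so the
/// caller can fall back to the unmodified shader.
fn splice_light_loop(shader: &str, clustered: &str) -> Option<String> {
    let begin = shader.find(BEGIN)? + BEGIN.len();
    let end = begin + shader[begin..].find(END)?;
    Some(format!(
        "{}\n{}\n{}",
        &shader[..begin],
        clustered.trim_end(),
        &shader[end..]
    ))
}

/// Decide whether to use the clustered loop. `supported` stands in for a
/// capability check such as FroxelPass::supported; `kill_switch` is the
/// value of the BLOOM_DISABLE_FROXEL environment variable, if set.
fn use_clustered(supported: bool, kill_switch: Option<&str>) -> bool {
    supported && kill_switch != Some("1")
}
```

A caller would typically pass `std::env::var("BLOOM_DISABLE_FROXEL").ok().as_deref()` as `kill_switch` and build the pipeline from the plain `SCENE_SHADER` source whenever `use_clustered` is false or the splice returns `None`.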

Parity: new golden many_point_lights_clustered_scene drives the
retained scene path (scene_pipeline) under 40 lights; its golden was
generated with the reference loop and the clustered path reproduces
it bit-exactly (mean/max diff 0.0). The assignment is strictly
conservative, so the two paths are mathematically identical.
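
The sphere-versus-AABB test named above is commonly written as a closest-point distance check. The sketch below shows that standard form; the function name and argument layout are hypothetical, and the PR's compute shader is not reproduced here.

```rust
/// True if a sphere (light position plus range) touches an axis-aligned
/// box given by its `min` and `max` corners, all in the same space.
fn sphere_intersects_aabb(
    center: [f32; 3],
    radius: f32,
    min: [f32; 3],
    max: [f32; 3],
) -> bool {
    let mut dist_sq = 0.0f32;
    for i in 0..3 {
        // Closest point on the box to the sphere centre, per axis.
        let closest = center[i].clamp(min[i], max[i]);
        let d = center[i] - closest;
        dist_sq += d * d;
    }
    dist_sq <= radius * radius
}
```

The test can only over-include: a box that bounds a froxel is at least as large as the froxel, so a light may land in a cluster it does not actually reach, but it is never dropped from one it does reach. Assuming a light contributes exactly zero beyond its range, the extra entries add nothing, which is consistent with the reported 0.0 diff.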

Also extracts lighting.rs: the group-1 bind-group entry list existed
in four hand-synced copies (a binding-drift hazard); all lighting
bind groups now go through one builder. mod.rs ratchets 11871->11775.

proggeramlug changed the title from "Render graph completion + TAA sky fix (finishes audit task #15)" to "Render graph completion + TAA sky fix + froxel light clustering (audit tasks #15, #23)" on Jun 12, 2026
proggeramlug merged commit bab3f07 into main on Jun 12, 2026
9 checks passed