
feat(storage): diff-layer state storage with bounded pruning #444

Draft
MegaRedHand wants to merge 5 commits into main from feat/state-diff-layers

MegaRedHand (Collaborator) commented Jun 18, 2026

Summary

Replaces aggressive state pruning with a diff-layer storage model so the full state history stays available cheaply. Builds on the block-signature pruning from #453 (now merged to main): keeping block headers/bodies forever is what lets historical states be reconstructed.

State storage

  • Every non-genesis state is stored as a parent-linked StateDiff (StateDiffs table, never pruned) plus a full-state snapshot (States) at anchors and hot states.
  • StateDiff stores slot, both checkpoints, and the justification fields in full, plus the appended historical_block_hashes tail. config/validators come from the nearest snapshot (they never change); latest_block_header is read back from BlockHeaders (the stored state caches the real state_root there, so it matches byte-for-byte).
  • get_state returns a snapshot directly, else reconstructs by walking base_root to the nearest ancestor snapshot and replaying appended tails forward.
  • 1024-slot anchors (StateAnchors, permanent) bound the reconstruction walk.
  • Snapshot eviction (prune_old_states) keeps the last SNAPSHOT_HOT_WINDOW = 300 slots + anchors + finalized/justified/head; evicted snapshots leave their diff behind. The States table is never written alone (always paired with a StateDiffs or StateAnchors entry).
  • DiffBase captures the parent (root, hbh_len, slot) before it is consumed into the post-state; StateDiff/DiffBase live in the storage crate.
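
The read path described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Snapshot`, `Diff`, `Store`, and the field `appended_hashes` are hypothetical, reduced stand-ins for the real `State`, `StateDiff`, and table types, and only `slot` and `historical_block_hashes` are modeled. The walk collects diffs from the requested root back to the nearest snapshot, then replays the appended tails oldest-first.

```rust
use std::collections::HashMap;

type Root = [u8; 32];

/// Hypothetical, reduced stand-in for a full state snapshot.
#[derive(Clone, Debug, PartialEq)]
struct Snapshot {
    slot: u64,
    historical_block_hashes: Vec<Root>,
}

/// Hypothetical, reduced stand-in for a parent-linked diff.
#[derive(Clone, Debug)]
struct Diff {
    base_root: Root,            // root of the parent state
    slot: u64,                  // stored in full
    appended_hashes: Vec<Root>, // tail appended relative to the parent
}

#[derive(Default)]
struct Store {
    snapshots: HashMap<Root, Snapshot>,
    diffs: HashMap<Root, Diff>,
}

impl Store {
    /// Returns a snapshot directly if one exists; otherwise walks
    /// `base_root` links back to the nearest snapshot and replays tails.
    fn get_state(&self, root: Root) -> Option<Snapshot> {
        let mut chain: Vec<&Diff> = Vec::new();
        let mut cursor = root;
        let mut state = loop {
            if let Some(snapshot) = self.snapshots.get(&cursor) {
                break snapshot.clone();
            }
            let diff = self.diffs.get(&cursor)?;
            chain.push(diff);
            // A chain longer than the diff table must contain a cycle
            // (for example a self-referential diff), so give up.
            if chain.len() > self.diffs.len() {
                return None;
            }
            cursor = diff.base_root;
        };
        // `chain` is newest-first; replay it oldest-first.
        for diff in chain.into_iter().rev() {
            state.slot = diff.slot;
            state
                .historical_block_hashes
                .extend_from_slice(&diff.appended_hashes);
        }
        Some(state)
    }
}
```

The length of `chain` is what the 1024-slot anchors bound: the walk never needs to go further back than the nearest anchor snapshot.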

Tests

  • StateDiff build/SSZ round-trip; state reconstruction (single + multi-diff after eviction); anchor recording; snapshot eviction (window/protected/anchors).
  • Storage + blockchain suites green (49 storage tests); clippy -D warnings clean.

Status / follow-ups

  • prune_old_data runs on the node's finalization path (blockchain/src/lib.rs), so snapshot eviction + reconstruction are exercised after ~300 slots.
  • Validated on a local 4-node devnet (finalizing healthily, ~1000 evictions, zero errors over ~2h).
  • Migration: existing DBs keep their full states as snapshots; diffs start accruing going forward (no backfill).

MegaRedHand force-pushed the feat/state-diff-layers branch 4 times, most recently from 2cb4ca3 to a342fa7, June 23, 2026 18:56
MegaRedHand changed the base branch from main to feat/prune-block-signatures, June 23, 2026 18:57
MegaRedHand force-pushed the feat/prune-block-signatures branch from 6600cb2 to fb1cf37, June 23, 2026 19:27
MegaRedHand force-pushed the feat/state-diff-layers branch from a342fa7 to 34d6345, June 23, 2026 19:31
MegaRedHand added a commit that referenced this pull request Jun 23, 2026
… forever (#453)

## Summary

Relaxes block pruning: instead of deleting old blocks wholesale, keep
block headers and bodies **forever** and prune only the signatures of
old finalized blocks. This preserves the full block history (for
debugging, re-org safety, and state reconstruction) while still
reclaiming the heavy signature data (~3KB+ per block).

## Change

- Replaces `prune_old_blocks` (deleted headers, bodies, and signatures
beyond a fixed `BLOCKS_TO_KEEP` window) with
`prune_old_block_signatures(finalized_slot, tip_slot)`.
- Policy, with `cutoff = tip_slot - SIGNATURE_PRUNING_RANGE`:
  - **healthy finality** (`cutoff <= finalized_slot`): delete signatures
    for `slot < cutoff` (entirely within finalized history);
  - **deep non-finality** (non-finalized range exceeds the window): prune
    nothing, so non-finalized signatures are never touched.
- `BlockHeaders` and `BlockBodies` are kept forever; all non-finalized
signatures are always retained.
- `get_signed_block` returns `None` when a signature is absent, now
including a pruned finalized block (deep historical signed-block serving
via BlocksByRoot is lost; peers use checkpoint sync).
- `prune_old_data` derives the tip slot from the head header and runs
signature pruning alongside state pruning.
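
As a rough illustration of the policy above, the cutoff decision can be written as a small pure function. This is a sketch under assumptions: the helper name `signature_prune_cutoff` is hypothetical, the value of `SIGNATURE_PRUNING_RANGE` is a placeholder (the actual constant is not stated here), and treating a tip inside the window as "prune nothing" via `checked_sub` is one possible reading of the `tip_within_window` test case.

```rust
/// Placeholder value; the real window size is not given in this PR text.
const SIGNATURE_PRUNING_RANGE: u64 = 1024;

/// Returns `Some(cutoff)` when signatures with `slot < cutoff` may be
/// deleted, or `None` when nothing should be pruned.
fn signature_prune_cutoff(finalized_slot: u64, tip_slot: u64) -> Option<u64> {
    // Tip still inside the window: no slot is old enough to prune.
    let cutoff = tip_slot.checked_sub(SIGNATURE_PRUNING_RANGE)?;
    if cutoff <= finalized_slot {
        // Healthy finality: everything below the cutoff is finalized.
        Some(cutoff)
    } else {
        // Deep non-finality: the cutoff would reach into non-finalized
        // history, so leave all signatures alone.
        None
    }
}
```

With the placeholder window of 1024, a tip at slot 2000 and finality at slot 1900 gives a cutoff of 976, while finality stuck at slot 500 gives `None`.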

## Key layout (slot-ordered pruning)

- `BlockSignatures` is now keyed by **`slot||root`** (big-endian slot),
reusing the shared `encode_slot_root_key` codec also used by
`LiveChain`.
- Pruning iterates in slot order and **stops at the first entry past the
cutoff** (`take_while`), and no longer does a per-row `BlockHeaders`
lookup to recover the slot. Every read site already has the slot:
`get_signed_block` loads the header first.
- `InMemoryBackend::prefix_iterator` now returns keys in sorted order to
match the RocksDB backend, which these slot-ordered range scans rely on.
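
The reason the big-endian `slot||root` layout allows an early stop can be shown with a sorted map standing in for the backend. This is a sketch only: the body of `encode_slot_root_key` below is a guess at the codec's shape from the description, and `decode_slot` and `prune_before` are hypothetical helpers. Big-endian encoding makes byte-wise key order agree with numeric slot order, so `take_while` can stop at the first key at or past the cutoff.

```rust
use std::collections::BTreeMap;

type Root = [u8; 32];

/// Assumed shape of the codec: 8-byte big-endian slot followed by the root.
fn encode_slot_root_key(slot: u64, root: &Root) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + 32);
    key.extend_from_slice(&slot.to_be_bytes());
    key.extend_from_slice(root);
    key
}

/// Hypothetical helper: recovers the slot from the key prefix.
fn decode_slot(key: &[u8]) -> Option<u64> {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(key.get(..8)?);
    Some(u64::from_be_bytes(bytes))
}

/// Hypothetical helper: deletes every entry with `slot < cutoff` and
/// returns how many were removed. Relies on sorted key iteration.
fn prune_before(table: &mut BTreeMap<Vec<u8>, Vec<u8>>, cutoff: u64) -> usize {
    let doomed: Vec<Vec<u8>> = table
        .keys()
        // Sorted order means the scan can stop at the first key past the cutoff.
        .take_while(|key| matches!(decode_slot(key), Some(slot) if slot < cutoff))
        .cloned()
        .collect();
    for key in &doomed {
        table.remove(key);
    }
    doomed.len()
}
```

A little-endian slot would break this: slot 256 would sort before slot 1, and the scan would stop too early or delete the wrong rows. The same reasoning explains why an unsorted in-memory iterator had to be fixed to match RocksDB.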

## Migration

Changes the `BlockSignatures` key format: existing root-keyed entries
are not read after upgrade (old finalized blocks read as pruned). Fine
for fresh / checkpoint-synced nodes; no backfill.

## Tests

- `prune_signatures_keeps_recent_window_when_finality_healthy`,
`prune_signatures_noop_when_non_finalized_range_exceeds_window`,
`prune_signatures_noop_when_tip_within_window`.
- Storage lib suite green (42 tests); clippy `-D warnings` clean.

## Context

Split out of #444 (diff-layer state storage), which depends on this:
reconstructing historical states needs block headers retained. #444
stacks on top of this PR.

Co-authored-by: Pablo Deymonnaz <pdeymon@fi.uba.ar>
Base automatically changed from feat/prune-block-signatures to main June 23, 2026 20:56
MegaRedHand force-pushed the feat/state-diff-layers branch 3 times, most recently from 608e97f to d1a76aa, June 24, 2026 17:56
Store every non-genesis state as a parent-linked StateDiff (StateDiffs,
never pruned) plus a full snapshot (States) written only at 1024-slot
anchors (and the bootstrap). Neither table is ever pruned, so the full
state history is preserved cheaply.

- get_state returns an anchor snapshot directly, else reconstructs by
  walking base_root back to the nearest anchor and replaying the appended
  historical_block_hashes tails; config/validators come from the snapshot
  and latest_block_header from BlockHeaders.
- Reconstructed and freshly imported states are memoized in an in-memory
  LRU (STATE_CACHE_CAPACITY), keyed by block root. States are immutable per
  root, so the cache never needs invalidation; it keeps recent reads (e.g. a
  child block's parent state) hot without reconstruction.
- DiffBase captures the parent (root, hbh_len, slot) before it is consumed
  into the post-state. StateDiff/DiffBase live in the storage crate.
- No snapshot eviction and no StateAnchors table: anchors are simply the
  snapshots in States, so the prune-states scan is gone entirely.
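
The memoization described in this commit message could look roughly like the following. It is a minimal sketch, not the PR's cache: `StateCache` and its methods are hypothetical, the eviction bookkeeping is a simple `VecDeque` (linear-time touch, fine for illustration), and the capacity passed to `new` stands in for `STATE_CACHE_CAPACITY`. Because a state is immutable per root, entries are only ever evicted, never invalidated.

```rust
use std::collections::{HashMap, VecDeque};

type Root = [u8; 32];

/// Hypothetical LRU keyed by block root.
struct StateCache<S> {
    capacity: usize,
    entries: HashMap<Root, S>,
    order: VecDeque<Root>, // front = least recently used
}

impl<S: Clone> StateCache<S> {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Marks `root` as most recently used.
    fn touch(&mut self, root: &Root) {
        self.order.retain(|r| r != root);
        self.order.push_back(*root);
    }

    fn get(&mut self, root: &Root) -> Option<S> {
        let state = self.entries.get(root)?.clone();
        self.touch(root);
        Some(state)
    }

    fn put(&mut self, root: Root, state: S) {
        if self.capacity == 0 {
            return;
        }
        self.entries.insert(root, state);
        self.touch(&root);
        // Evict least recently used entries until within capacity.
        while self.entries.len() > self.capacity {
            match self.order.pop_front() {
                Some(oldest) => self.entries.remove(&oldest),
                None => break,
            };
        }
    }

    /// Returns the cached state, or runs `reconstruct` and memoizes it.
    fn get_or_insert_with(
        &mut self,
        root: Root,
        reconstruct: impl FnOnce() -> Option<S>,
    ) -> Option<S> {
        if let Some(hit) = self.get(&root) {
            return Some(hit);
        }
        let state = reconstruct()?;
        self.put(root, state.clone());
        Some(state)
    }
}
```

In this shape, `get_state` would pass the diff-replay routine as `reconstruct`, and block import would call `put` with the freshly computed post-state so that the next child block finds its parent state without a walk.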
MegaRedHand force-pushed the feat/state-diff-layers branch from d1a76aa to 1a060f2, June 24, 2026 18:31
The migrated test built its DiffBase from the target post-state itself
(DiffBase::from_state(a, &head_state)), so base.slot/base.hbh_len came
from the target rather than the parent. That made the anchor-boundary
check always false (no snapshot written, contradicting the comment) and
left the diff self-referential, passing only via the cache memoization.

Diff against the genesis anchor already present in the store instead, so
the base correctly describes the parent state.
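
The bug described in this commit message can be reproduced in miniature. The sketch below uses hypothetical, reduced types: `State`, `StateDiff`, `build_diff`, and `apply_block` are stand-ins, and only the `DiffBase::from_state(root, &state)` call shape and the `(root, hbh_len, slot)` fields come from the text above. It shows why the base must be captured from the parent before the parent is consumed: a base taken from the post-state yields an empty tail and a diff that points at itself.

```rust
type Root = [u8; 32];

/// Hypothetical, reduced state.
#[derive(Clone, Debug, PartialEq)]
struct State {
    slot: u64,
    historical_block_hashes: Vec<Root>,
}

/// Parent facts captured before the parent is moved into the transition.
#[derive(Clone, Copy, Debug, PartialEq)]
struct DiffBase {
    root: Root,
    hbh_len: usize,
    slot: u64,
}

impl DiffBase {
    fn from_state(root: Root, state: &State) -> Self {
        Self {
            root,
            hbh_len: state.historical_block_hashes.len(),
            slot: state.slot,
        }
    }
}

/// Hypothetical, reduced diff.
#[derive(Debug, PartialEq)]
struct StateDiff {
    base_root: Root,
    slot: u64,
    appended_hashes: Vec<Root>,
}

/// The tail is whatever the post-state appended beyond the base length.
fn build_diff(base: &DiffBase, post: &State) -> StateDiff {
    StateDiff {
        base_root: base.root,
        slot: post.slot,
        appended_hashes: post
            .historical_block_hashes
            .get(base.hbh_len..)
            .map(|tail| tail.to_vec())
            .unwrap_or_default(),
    }
}

/// Stand-in for the state transition: consumes the parent by value.
fn apply_block(mut parent: State, parent_root: Root, slot: u64) -> State {
    parent.historical_block_hashes.push(parent_root);
    parent.slot = slot;
    parent
}
```

Calling `DiffBase::from_state(parent_root, &parent)` before `apply_block(parent, ...)` gives a diff whose tail holds the appended hash. Building the base from the post-state instead gives `hbh_len` equal to the final length, so the tail is empty and `base_root` is the target's own root, which matches the self-referential diff the fix removes.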