fix(flatkv): implement flatkv_only mode state-sync int testings #3545
blindchaser wants to merge 4 commits into
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## main #3545 +/- ##
==========================================
- Coverage 59.17% 58.54% -0.64%
==========================================
Files 2219 2161 -58
Lines 183185 177213 -5972
==========================================
- Hits 108395 103742 -4653
+ Misses 65029 64134 -895
+ Partials 9761 9337 -424
… SS import)
Two distinct bugs broke post-migration flatkv_only state-sync, surfaced by a
new end-to-end 4-validator state-sync integration test:
1. Empty-value WAL replay: a proto.KVPair{Value: []byte{}, Delete: false} is a
legitimate "set empty value" write, but it round-trips through the WAL as
Value=nil and ApplyChangeSets treated nil as a deletion, dropping the key on
replay/restore. This diverged the FlatKV LtHash (evm_lattice) and broke
AppHash verification during state sync. Fixed by normalizing non-delete
values via nonNilValue in classifyAndPrefix.
2. State Store empty after flatkv_only restore: the flatkv_only snapshot stream
omitted the keys.FlatKVStoreKey module header, so the restore-side SS
importer never ran convertFlatKVNodes and the State Store came up empty
(every EVM-RPC/historical query returned nil) even though SC/AppHash were
healthy. Fixed by making the FlatKV exporter self-describing: KVExporter now
emits its own keys.FlatKVStoreKey header ahead of its nodes, mirroring the
memiavl MultiTreeExporter. The composite exporter no longer injects the
header (dropped flatkvHeaderPending and the cosmos-transition injection), and
CompositeCommitStore.Exporter returns the bare flatkv exporter in
flatkv_only. The header is now correct whether the stream is consumed bare or
appended after the cosmos modules.
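
The first fix can be illustrated with a minimal, self-contained sketch. It is not the code from store_apply.go: kvPair is a hypothetical stand-in for proto.KVPair, and normalize and apply are illustrative names that only model the "nil value == deletion" convention described above. The body of nonNilValue is an assumption based on its name and the description.

```go
package main

import "fmt"

// kvPair is a hypothetical stand-in for proto.KVPair; only the fields
// mentioned in the description are modeled.
type kvPair struct {
	Key    []byte
	Value  []byte
	Delete bool
}

// nonNilValue returns v unchanged unless it is nil, in which case it returns
// an empty, non-nil slice. (Assumed behavior, inferred from the description.)
func nonNilValue(v []byte) []byte {
	if v == nil {
		return []byte{}
	}
	return v
}

// normalize converts a pair to the "nil value == deletion" convention:
// only the Delete flag produces nil; every other write stays non-nil.
func normalize(p kvPair) []byte {
	if p.Delete {
		return nil
	}
	return nonNilValue(p.Value)
}

// apply writes a normalized pair into an in-memory map that stands in
// for the store.
func apply(state map[string][]byte, p kvPair) {
	v := normalize(p)
	if v == nil {
		delete(state, string(p.Key))
		return
	}
	state[string(p.Key)] = v
}

func main() {
	state := map[string][]byte{}

	// A "set empty value" write as it looks after the WAL round trip:
	// Value is nil, Delete is false.
	apply(state, kvPair{Key: []byte("k"), Value: nil, Delete: false})
	_, ok := state["k"]
	fmt.Println("key present after replayed empty write:", ok)

	// A true delete is still signaled by the flag alone.
	apply(state, kvPair{Key: []byte("k"), Delete: true})
	_, ok = state["k"]
	fmt.Println("key present after delete:", ok)
}
```

Without the nonNilValue call in normalize, the first write would be indistinguishable from a delete and the key would be dropped, which is the divergence described in item 1.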
Adds end-to-end coverage (verify_flatkv_only_statesync.sh + CI matrix entry,
docker/Makefile/app.toml plumbing for GIGA_FLATKV_ONLY) and deterministic Go
regression tests (TestEmptyValueSurvivesWALReplay,
TestFlatKVOnlySnapshotRestorePopulatesSS).
Co-authored-by: Cursor <cursoragent@cursor.com>
a8ff4db to 0e18f23

Summary
Fixes two distinct bugs that broke post-migration flatkv_only state-sync, where a node boots directly in the terminal v3 steady state (all SC writes route to FlatKV, memiavl is never allocated). Both bugs only surface on the WAL-replay / snapshot path, so they were invisible to live execution and to the existing memiavl-backed state-sync coverage. A new end-to-end 4-validator integration test (GIGA_FLATKV_ONLY=true) exercises the full kill → wipe → state-sync → verify loop, backed by deterministic Go regression tests.

- sei-db/state_db/sc/flatkv/store_apply.go: classifyAndPrefix now normalizes non-delete changeset values through the new nonNilValue helper. A changeset pair is a deletion iff Delete == true; a zero-length value with Delete == false is a legitimate "set this key to an empty value" write. Protobuf cannot distinguish []byte{} from nil, so such a write round-trips through the WAL (catchup, read-only clone, snapshot export, state-sync restore) as Value == nil. Previously the downstream process*Changes helpers, which use the nil value == deletion convention, dropped the key on replay, diverging the per-DB LtHash, the evm_lattice store hash, and ultimately the consensus AppHash from the live chain. The helper guarantees empty-value writes survive replay; true deletes still arrive as nil via the Delete flag and are unaffected.
- sei-db/state_db/sc/composite/store.go: CompositeCommitStore.Exporter no longer returns the bare FlatKV exporter in flatkv_only mode. It wraps it in the composite SnapshotExporter (NewExporter(nil, flatkvExporter)) so the keys.FlatKVStoreKey module header is emitted ahead of the nodes. The restore-side State Store importer keys off that header to route the snapshot through convertFlatKVNodes; the bare exporter omitted it.
- sei-db/state_db/sc/composite/exporter.go: SnapshotExporter gains a flatkvHeaderPending flag, set when the stream starts directly in phaseFlatKV (cosmosExporter == nil). With no cosmos→flatkv transition to carry the module header, nextFromFlatKV now emits keys.FlatKVStoreKey once before the first node.

This guarantees that a flatkv_only snapshot populates the State Store on restore, so EVM-RPC and historical queries return real data instead of nil (previously they returned nil even though SC/AppHash were already healthy).

- docker/localnode/scripts/step4_config_override.sh: adds GIGA_FLATKV_ONLY, which writes sc-write-mode = "flatkv_only" and evm-ss-split = false to app.toml, booting the node in the v3 steady state. This mode is mutually exclusive with the GIGA_STORAGE dual-write override.
- docker/docker-compose.yml, Makefile: thread GIGA_FLATKV_ONLY through the cluster env vars so make docker-cluster-start can boot a flatkv_only cluster.
- .github/workflows/integration-test.yml: adds a FlatKV Only State Sync matrix entry (GIGA_FLATKV_ONLY=true) that deploys the EVM fixture and runs verify_flatkv_only_statesync.sh.
- integration_test/contracts/verify_flatkv_only_statesync.sh: end-to-end harness. Asserts the cluster booted FlatKV-only (no memiavl on disk), deploys an EVM fixture, kills and wipes one validator, configures it for state-sync from the surviving validators, waits for the restored node to catch up, and verifies that the recovered FlatKV digest matches the donors at a shared height and that EVM-RPC reads (balance, contract storage, code) return the expected values.

Test plan

- sei-db/state_db/sc/flatkv/empty_value_replay_test.go
  - TestEmptyValueSurvivesWALReplay: writes a key with an empty ([]byte{}, non-delete) value, rebuilds the store from the WAL, and asserts the replayed root hash matches the live store, directly guarding the empty-value-vs-deletion regression.
- sei-cosmos/storev2/rootmulti/flatkv_snapshot_test.go
  - TestFlatKVOnlySnapshotRestoreAppHashParity: takes a snapshot of a flatkv_only source store and restores into a fresh destination, asserting the restored LastBlockAppHash matches the donor (crash/restore AppHash parity).
  - TestFlatKVOnlySnapshotRestorePopulatesSS: populates acc/bank/EVM state in an SS-enabled flatkv_only source, snapshots, restores into a fresh SS-enabled destination, and reads those keys back through the State Store read path (Prove=false) at the snapshot height. Fails without the module-header fix, passes with it.
- sei-cosmos/storev2/rootmulti/flatkv_helpers_test.go: adds the flatkv_only config factory and shared fixtures/assertions used by the new tests.
- integration_test/contracts/verify_flatkv_only_statesync.sh (wired into CI as FlatKV Only State Sync): validates the live kill → wipe → state-sync → verify loop on a 4-validator cluster, including post-restore FlatKV digest parity against donors and EVM-RPC read-back. Verified locally against a GIGA_FLATKV_ONLY=true docker cluster.
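
The module-header behavior that TestFlatKVOnlySnapshotRestorePopulatesSS guards can be sketched roughly as follows. This is a simplified model, not the exporter or importer from the PR: item, headerFirstExporter, and restore are hypothetical types and functions, the real stream element types and Next signature are not shown in the text above, and flatKVStoreKey is a placeholder for the actual value of keys.FlatKVStoreKey.

```go
package main

import "fmt"

// flatKVStoreKey is a placeholder; the real value of keys.FlatKVStoreKey
// is not shown in the PR text.
const flatKVStoreKey = "flatkv-placeholder"

// item is a hypothetical stream element: a module header when Module is
// non-empty, otherwise a key/value node.
type item struct {
	Module string
	Key    string
	Value  string
}

// headerFirstExporter emits one module header before its first node,
// tracked by a pending flag (the idea behind flatkvHeaderPending).
type headerFirstExporter struct {
	nodes         []item
	pos           int
	headerPending bool
}

func newHeaderFirstExporter(nodes []item) *headerFirstExporter {
	return &headerFirstExporter{nodes: nodes, headerPending: true}
}

// Next returns the next item, or false once the stream is exhausted.
func (e *headerFirstExporter) Next() (item, bool) {
	if e.headerPending {
		e.headerPending = false
		return item{Module: flatKVStoreKey}, true
	}
	if e.pos >= len(e.nodes) {
		return item{}, false
	}
	n := e.nodes[e.pos]
	e.pos++
	return n, true
}

// restore models an importer that only converts nodes that follow the
// FlatKV header; nodes with no preceding header are skipped.
func restore(next func() (item, bool)) map[string]string {
	out := map[string]string{}
	inFlatKV := false
	for {
		it, ok := next()
		if !ok {
			return out
		}
		if it.Module != "" {
			inFlatKV = it.Module == flatKVStoreKey
			continue
		}
		if inFlatKV {
			out[it.Key] = it.Value
		}
	}
}

func main() {
	nodes := []item{{Key: "a", Value: "1"}, {Key: "b", Value: "2"}}

	withHeader := newHeaderFirstExporter(nodes)
	fmt.Println("restored with header:", len(restore(withHeader.Next)))

	// headerPending left false models the bare exporter that omitted the header.
	bare := &headerFirstExporter{nodes: nodes}
	fmt.Println("restored without header:", len(restore(bare.Next)))
}
```

In this model the bare stream restores nothing, which mirrors the empty State Store symptom, while the header-first stream restores every node.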