feat: full-node shunning — node implementation (ADR-0003)#161
Open
mickvandijke wants to merge 9 commits into
First slice of ADR-0003. replicate_fresh now retries each per-peer push up to FRESH_REPLICATION_DELIVERY_MAX_RETRIES on a transport failure, so a transient hiccup doesn't silently drop the offer. The encoded offer is shared via Arc so the common single-attempt path keeps one clone per peer. Delivery assurance only — possession scoring (the delayed 5-15 min check and AuditChallenge-severity penalty) is the next stage. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
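The retry behaviour described in this commit can be sketched roughly as follows. This is a synchronous, std-only illustration, not the node's code: `push_with_retries`, `TransportError`, and the closure-based `send` are hypothetical names, and it assumes the retry limit counts additional attempts after the first one. The encoded offer sits behind an `Arc`, so each attempt only bumps a reference count instead of copying the bytes.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a transport-level failure.
#[derive(Debug, PartialEq)]
pub struct TransportError;

/// Push `offer` through `send`, retrying on transport failure.
///
/// Assumption: `max_retries` counts extra attempts after the first, so at
/// most `max_retries + 1` sends happen. Returns the number of attempts used
/// on success, or the last error once the budget is exhausted.
pub fn push_with_retries<F>(
    offer: &Arc<Vec<u8>>,
    max_retries: usize,
    mut send: F,
) -> Result<usize, TransportError>
where
    F: FnMut(Arc<Vec<u8>>) -> Result<(), TransportError>,
{
    let mut attempt = 0;
    loop {
        attempt += 1;
        // Cloning the Arc shares the encoded bytes; nothing is re-encoded.
        match send(Arc::clone(offer)) {
            Ok(()) => return Ok(attempt),
            Err(e) if attempt > max_retries => return Err(e),
            Err(_) => continue,
        }
    }
}
```

On the common path the first send succeeds and exactly one `Arc` clone is made for that peer, which matches the "one clone per peer" note above.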
ant-protocol 2.2.0 pulled saorsa-core 0.26.0 from crates.io while the node used a local path of the same version, producing two copies that collided on shared types (e.g. saorsa_core::address::MultiAddr) and broke the ant-devnet binary. Point the direct dependency at WithAutonomi/saorsa-core PR #119 (trust quarantine thresholds) and add a matching [patch.crates-io] so the transitive copy unifies onto one source. Full workspace now builds. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Implements the ADR-0003 gate. A client PUT is now accepted only when this node is within its own local SELF_CLOSENESS_GATE_WIDTH (= K_BUCKET_SIZE) closest peers to the address, so the fresh replication it triggers is legitimate and cannot mis-penalise honest peers. The width equals the client's PUT fallback ceiling (ADR-0002), so a client routing past full close-group members onto further peers is still accepted; a genuinely far node is turned away. The P2P handle is bound out of the lock before the await, and the gate no-ops when no handle is attached (unit tests). Fresh-replication receives keep their own narrower admission check. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
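A minimal sketch of the self-closeness check, under stated assumptions: it uses 32-byte identifiers and Kademlia-style XOR distance, neither of which is confirmed by the commit message, and `within_self_closeness_gate`, `xor_distance`, and `id_from_byte` are hypothetical names. The idea is only that a node counts how many peers in its own local view are closer to the address than it is, and accepts the PUT when that count is below the gate width.

```rust
/// Assumed identifier width; the real address type may differ.
pub type Id = [u8; 32];

/// Byte-wise XOR distance (assumption: Kademlia-style metric).
fn xor_distance(a: &Id, b: &Id) -> Id {
    let mut out = [0u8; 32];
    for (o, (x, y)) in out.iter_mut().zip(a.iter().zip(b.iter())) {
        *o = x ^ y;
    }
    out
}

/// True when fewer than `width` locally known peers are strictly closer to
/// `address` than this node, i.e. the node ranks itself inside its own
/// `width` closest peers.
pub fn within_self_closeness_gate(
    self_id: &Id,
    local_peers: &[Id],
    address: &Id,
    width: usize,
) -> bool {
    let own = xor_distance(self_id, address);
    let closer = local_peers
        .iter()
        .filter(|peer| xor_distance(peer, address) < own)
        .count();
    closer < width
}

/// Helper for examples: an id that is zero except for its last byte.
pub fn id_from_byte(b: u8) -> Id {
    let mut id = [0u8; 32];
    id[31] = b;
    id
}
```

With three peers closer to the address than the local node, a width of 3 rejects and a width of 4 accepts. In the PR the width is `SELF_CLOSENESS_GATE_WIDTH` (= `K_BUCKET_SIZE`).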
Implements the detection+penalty core of ADR-0003. After fresh replication, replicate_fresh now returns the responsible close-group peers; the fresh-write drainer enqueues a possession-check event, and a new scheduler task waits a randomised 5-15 minute settle delay (POSSESSION_CHECK_DELAY_MIN/MAX) before probing every responsible peer for actual possession via a presence-only VerificationRequest. A peer confirmed absent is penalised at AuditChallenge severity (report_trust_event(ApplicationFailure(AUDIT_FAILURE_TRUST_WEIGHT))); a peer that holds it earns nothing; an unreachable peer yields no verdict and is re-probed under a bounded grace, never penalised. Every responsible peer is tested regardless of whether the original push reached it. New possession module, config tunables, and a delay-bounds unit test. 376 replication tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
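The delay and verdict rules above could look something like the sketch below. The constant names and the 5 to 15 minute bounds come from the commit message; everything else (`settle_delay`, `ProbeOutcome`, `Action`, `decide`, and the caller-supplied `random` value standing in for a real RNG) is hypothetical and only illustrates the three-way outcome: present earns nothing, absent is penalised, unreachable is re-probed within a bound and never penalised.

```rust
use std::time::Duration;

pub const POSSESSION_CHECK_DELAY_MIN: Duration = Duration::from_secs(5 * 60);
pub const POSSESSION_CHECK_DELAY_MAX: Duration = Duration::from_secs(15 * 60);

/// Map an arbitrary random value onto the inclusive [MIN, MAX] window.
/// Assumption: the caller supplies `random` from whatever RNG it uses.
pub fn settle_delay(random: u64) -> Duration {
    let span = POSSESSION_CHECK_DELAY_MAX.as_secs() - POSSESSION_CHECK_DELAY_MIN.as_secs();
    POSSESSION_CHECK_DELAY_MIN + Duration::from_secs(random % (span + 1))
}

/// Result of probing one responsible peer (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProbeOutcome {
    Present,
    Absent,
    Unreachable,
}

/// What the checker does with that result (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    NoChange,
    Penalise,
    Reprobe,
}

/// Present earns nothing, absent is penalised, unreachable is re-probed
/// while grace remains and is otherwise dropped without a penalty.
pub fn decide(outcome: ProbeOutcome, reprobes_left: u32) -> Action {
    match outcome {
        ProbeOutcome::Present => Action::NoChange,
        ProbeOutcome::Absent => Action::Penalise,
        ProbeOutcome::Unreachable if reprobes_left > 0 => Action::Reprobe,
        ProbeOutcome::Unreachable => Action::NoChange,
    }
}
```

In the PR the `Penalise` branch corresponds to `report_trust_event(ApplicationFailure(AUDIT_FAILURE_TRUST_WEIGHT))`; the sketch leaves that call out because its exact signature is not shown here.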
Adds a test-only ReplicationEngine::run_possession_check_now that drives the ADR-0003 possession check without the 5-15 min scheduler delay, plus an e2e test on a 10-node testnet: node A probes an absent peer B and a present peer C over real transport; B's trust score drops (penalised, the signal saorsa-core eviction acts on) while C's is untouched. Proves the detection+penalty core deterministically. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Proves the full replicate_fresh -> enqueue -> delayed scheduler -> penalty chain. Makes the 5-15 min possession delay a ReplicationConfig field (POSSESSION_CHECK_DELAY_MIN/MAX as defaults) so tests can shorten it, threads an optional replication_config override through TestNetworkConfig, and has the public replicate_fresh entry point schedule the possession check (the production PUT path already does via the drainer). The e2e test runs a 10-node testnet with a ~200-500ms delay, triggers fresh replication with no payment cache so close peers reject and are absent, and asserts the scheduled check penalises an absent close peer. 675 lib tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
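One way to picture the config change: the delay bounds become fields whose defaults are the existing constants, and a test builds a config with much shorter values. The field names `possession_check_delay_min` and `possession_check_delay_max`, the `is_valid` helper, and `short_delay_test_config` are assumptions for illustration; the real `ReplicationConfig` has other fields that are omitted here.

```rust
use std::time::Duration;

pub const POSSESSION_CHECK_DELAY_MIN: Duration = Duration::from_secs(5 * 60);
pub const POSSESSION_CHECK_DELAY_MAX: Duration = Duration::from_secs(15 * 60);

/// Hypothetical two-field slice of the replication config.
#[derive(Debug, Clone, PartialEq)]
pub struct ReplicationConfig {
    pub possession_check_delay_min: Duration,
    pub possession_check_delay_max: Duration,
}

impl Default for ReplicationConfig {
    /// Production defaults: the 5 to 15 minute window.
    fn default() -> Self {
        Self {
            possession_check_delay_min: POSSESSION_CHECK_DELAY_MIN,
            possession_check_delay_max: POSSESSION_CHECK_DELAY_MAX,
        }
    }
}

impl ReplicationConfig {
    /// A window is usable only if its lower bound does not exceed its upper.
    pub fn is_valid(&self) -> bool {
        self.possession_check_delay_min <= self.possession_check_delay_max
    }
}

/// Test-style override in the spirit of the ~200-500ms delay used by the e2e test.
pub fn short_delay_test_config() -> ReplicationConfig {
    ReplicationConfig {
        possession_check_delay_min: Duration::from_millis(200),
        possession_check_delay_max: Duration::from_millis(500),
    }
}
```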
Adds an e2e test on a 25-node testnet: the closest (responsible) node accepts a client PUT (gate must not break normal uploads), while the farthest node (rank >= K_BUCKET_SIZE, outside its own closest view) rejects before payment with a closeness error. Proves both the regression guard and the intended reject behaviour of the ADR-0003 self-closeness gate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nvergence Amends ADR-0003. A healthy replica whose routing table still lists closer full nodes it hasn't evicted yet ranks itself outside the narrow storage_admission_width window and would wrongly reject a fresh-replication offer it should accept — stalling convergence, and inconsistent with the wider K-window it already accepts client PUTs within. Widen the fresh-offer accept gate to config.paid_list_close_group_size (= K_BUCKET_SIZE = 20). Safe: the sender still fans out only to close_group_size, so this adds no stores; retention/pruning stays at storage_admission_width, so steady-state replication is unchanged and any transient over-coverage is reclaimed once the close group converges (the multi-day prune hysteresis spans the minutes-long eviction window). Sender fan-out deliberately not widened — the repair path heals that case. 675 lib tests + fresh-replication & full-close-group-shunning e2e tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
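The three widths in this amendment can be compared side by side in a small sketch. It assumes a node's position is expressed as a zero-based rank among the closest peers to the record in its own view; the `Widths` struct and the three predicate functions are hypothetical, and only the field names mirror the config names quoted above.

```rust
/// Hypothetical subset of the replication config: just the three widths.
#[derive(Debug, Clone, Copy)]
pub struct Widths {
    /// How many closest peers the sender fans a fresh offer out to.
    pub close_group_size: usize,
    /// Window inside which a node keeps (does not prune) a record.
    pub storage_admission_width: usize,
    /// Widened window inside which a fresh offer is accepted.
    pub paid_list_close_group_size: usize,
}

/// Sender side: unchanged, only the close group receives offers.
pub fn is_fan_out_target(rank: usize, w: &Widths) -> bool {
    rank < w.close_group_size
}

/// Receiver side: accept a fresh offer anywhere inside the wider window.
pub fn accepts_fresh_offer(own_rank: usize, w: &Widths) -> bool {
    own_rank < w.paid_list_close_group_size
}

/// Retention and pruning still use the narrow window.
pub fn retains_record(own_rank: usize, w: &Widths) -> bool {
    own_rank < w.storage_admission_width
}
```

For example, with illustrative values `close_group_size = 7` and `storage_admission_width = 9` (not taken from the PR) and the stated `paid_list_close_group_size = 20`, a replica that ranks itself 12th because of stale closer entries now accepts the offer. Retention is still decided by the narrow window, so once its routing table converges the node either ranks inside that window or the copy becomes eligible for pruning.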
Node-side implementation of ADR-0003 (full-node detection, penalisation, eviction). Stacked on the ADR PR #160 (base: docs/full-node-shunning-adr), so this diff is the implementation + tests only.

[patch.crates-io]: fixes the duplicate-saorsa_core version skew; the node now builds fully.
K_BUCKET_SIZE, coupled to the client fallback ceiling).
AuditChallenge severity, grace/no-penalty for unreachable; eviction is the existing saorsa-core trust system.

Tests: e2e proofs on a live multi-node testnet (possession detection+penalty, scheduler wiring, self-closeness gate accept/reject, and full close-group shunning); 675 lib tests pass; clippy -D warnings + fmt clean.

🤖 Generated with Claude Code