6 changes: 6 additions & 0 deletions .github/workflows/ci.yml
@@ -52,6 +52,12 @@ jobs:
run: cargo test --lib --features test-utils
- name: Run e2e tests
run: cargo test --test e2e --features test-utils -- --test-threads=1
- name: Run v12 storage-bound audit attack PoCs
run: cargo test --test poc_commitment_audit_attacks --features test-utils
- name: Run v12 live audit-handler tests
run: cargo test --test poc_audit_handler_live --features test-utils
- name: Run bootstrap-stall PoC regression marker
run: cargo test --test poc_bootstrap_stall --features test-utils

doc:
name: Documentation
24 changes: 24 additions & 0 deletions Cargo.toml
@@ -134,6 +134,30 @@ name = "e2e"
path = "tests/e2e/mod.rs"
required-features = ["test-utils"]

# v12 storage-bound audit attack PoCs. Uses the test-only one-shot
# commitment builder/verifier helpers, so it requires the test-utils
# feature. CI runs it via `cargo test --test poc_commitment_audit_attacks
# --features test-utils`.
[[test]]
name = "poc_commitment_audit_attacks"
path = "tests/poc_commitment_audit_attacks.rs"
required-features = ["test-utils"]

# Live responder-handler tests for the v12 audit. Use
# LmdbStorageConfig::test_default(), gated on test-utils.
[[test]]
name = "poc_audit_handler_live"
path = "tests/poc_audit_handler_live.rs"
required-features = ["test-utils"]

# Bootstrap-stall DoS regression marker (documents the unfixed attack; the
# eventual fix must land with a follow-up test asserting bounded drain).
# Declared like the other PoC suites so CI invokes it explicitly.
[[test]]
name = "poc_bootstrap_stall"
path = "tests/poc_bootstrap_stall.rs"
required-features = ["test-utils"]

[features]
default = ["logging"]
# Enable tracing/logging infrastructure.
259 changes: 259 additions & 0 deletions docs/adr/ADR-0002-gossip-triggered-contiguous-subtree-audit.md

Large diffs are not rendered by default.

293 changes: 293 additions & 0 deletions docs/adr/ADR-0003-commitment-bound-quote-pricing.md

Large diffs are not rendered by default.

109 changes: 109 additions & 0 deletions docs/adr/ADR-0003-implementation-slices.md
@@ -0,0 +1,109 @@
# ADR-0003 implementation slicing

This file tracks the slicing strategy used to ship ADR-0003 incrementally inside
ant-node alone, while the multi-repo evmlib breaking change ripens. The ADR
itself (`ADR-0003-commitment-bound-quote-pricing.md`) describes the end state;
this document describes the order in which the end state lands.

The constraint that drives the slicing: `PaymentQuote`, `ProofOfPayment`,
`bytes_for_signing`, and `quote.hash()` live in evmlib (crates.io `0.8.1`) and
flow into the on-chain `payForQuotes` interface. Adding signed fields to
`PaymentQuote` is therefore a coordinated four-repo release
(`evmlib` → `ant-protocol` → `ant-client` → `ant-node`). Until that lands,
every part of ADR-0003 that does NOT require new signed quote fields can —
and should — ship behind the rollout const the ADR's "Rollout" section
already specifies.

## Slice 1 — arithmetic re-check (shipped)

**What:** every storer checks, by exact recomputation, that
`price == calculate_price(n)` holds for some non-negative integer `n`. The
check covers every quote in every payment bundle (all 7 single-node quotes and
all 16 merkle candidates), in every `VerificationContext`. Reject-only when
enforced; no trust evidence. Rollout-gated by
`QUOTE_ARITHMETIC_RECHECK_ENABLED` (defaults to `false`, i.e. observe-only).
Telemetry runs only after ML-DSA-65 signature verification has passed, so
unauthenticated peers cannot poison rollout logs.

**Why first:** needs no evmlib change, no new state, no new wire types, no
new gossip; it is the ADR's "every storer re-runs the
price-equals-formula-of-count check on every quote in the bundle" rule in
its purest form. The price already encodes the count, so testing the price
alone for curve canonicality catches every off-curve lie (a strictly weaker
attack than on-curve count inflation, which Slice 2 addresses).

**Files touched:** `src/payment/verifier.rs` (new functions
`validate_quote_arithmetic`, `validate_merkle_candidate_arithmetic`,
`log_off_curve_single_node`, `log_off_curve_merkle`,
`price_off_curve_diagnostics`, `candidate_count_to_usize`,
`quote_price_is_on_curve`), `src/replication/config.rs` (new const
`QUOTE_ARITHMETIC_RECHECK_ENABLED`).

**Scope it does NOT cover:** an on-curve quote for a fake `n`. That requires
the signed `committed_key_count` and `commitment_pin` fields that only Slice 3
can add.
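
A minimal sketch of the Slice 1 check, to make the curve-canonicality idea
concrete. It is not the code in `src/payment/verifier.rs`: the pricing curve
below is a made-up stand-in for the real `calculate_price`, and
`on_curve_count`, `recheck`, `Outcome`, and the `max_count` bound are
hypothetical names introduced only for illustration. The `enforce` parameter
plays the role of `QUOTE_ARITHMETIC_RECHECK_ENABLED`.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the real pricing formula (an assumption: the
/// actual `calculate_price` is not shown here). Strictly increasing in `n`.
fn calculate_price(n: u64) -> u128 {
    let n = u128::from(n);
    1_000 + n * n
}

/// Returns the count `n` with `calculate_price(n) == price`, or `None` when
/// the price is off-curve. Binary search is valid only because the stand-in
/// curve is strictly increasing; `max_count` is an illustrative upper bound.
fn on_curve_count(price: u128, max_count: u64) -> Option<u64> {
    let (mut lo, mut hi) = (0u64, max_count);
    while lo <= hi {
        let mid = lo + (hi - lo) / 2;
        match calculate_price(mid).cmp(&price) {
            Ordering::Equal => return Some(mid),
            // Stepping past either end of the range means no match exists.
            Ordering::Less => lo = mid.checked_add(1)?,
            Ordering::Greater => hi = mid.checked_sub(1)?,
        }
    }
    None
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Accept,
    /// Off-curve, but enforcement is off: observe and log only.
    LogOnly,
    /// Off-curve with enforcement on: reject the quote.
    Reject,
}

/// Applies the rollout gate to the on-curve test for one quote price.
fn recheck(price: u128, max_count: u64, enforce: bool) -> Outcome {
    match (on_curve_count(price, max_count), enforce) {
        (Some(_), _) => Outcome::Accept,
        (None, true) => Outcome::Reject,
        (None, false) => Outcome::LogOnly,
    }
}
```

With this stand-in curve, a price of `1_049` is on-curve (it encodes `n = 7`),
while `1_050` is off-curve and is either logged or rejected depending on the
gate. An on-curve price for the wrong `n` still passes, which is the gap the
scope note above describes.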

## Slice 2 — commitment-binding sidecar (no evmlib change)

**What:** carry the issuing node's current signed `StorageCommitment` as a
sidecar inside the existing payment-proof envelope. Wire the storer-side
cross-check (claimed count from the quote vs. pinned commitment's
`key_count`) using the sidecar where present, the gossiped cache where the
sidecar pin matches, or a `GetCommitmentByPin` fetch otherwise. Adds the
`FailureEvidence::QuoteCommitmentMismatch` variant. Adds the
deterministic-first-audit queue keyed on monetized pins.

**Why second:** this is the ADR's "peers cross-check the original and route
monetized commitments into audit" paragraph. It lands the full audit funnel
end-to-end against real signed commitments without changing evmlib. The
sidecar's `claimed_count` is not yet covered by the on-chain quote hash, so
the binding is enforced at the gossip/audit layer rather than at the chain
layer — exactly the residual the ADR's rollout phase already names.

**Files touched (planned):** `src/payment/proof.rs` (sidecar serialization
envelope), `src/payment/verifier.rs` (cross-check rule),
`src/replication/protocol.rs` (`GetCommitmentByPin` request/response),
`src/replication/commitment_state.rs` (quote-issuance answerability refresh),
`src/replication/mod.rs` (first-audit queue alongside
`last_commitment_by_peer`), `src/replication/types.rs`
(`FailureEvidence::QuoteCommitmentMismatch`), `src/payment/quote.rs` (read
current pin from commitment state).
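
The resolution order in the Slice 2 cross-check can be sketched roughly as
follows. This is an illustration under stated assumptions, not the planned
implementation: `Commitment`, `CrossCheck`, and `cross_check` are hypothetical
names, the cache is keyed by pin here for simplicity (the text names a
per-peer cache, `last_commitment_by_peer`), and signature verification of the
commitment is omitted entirely.

```rust
use std::collections::HashMap;

type Pin = [u8; 32];

/// Hypothetical, trimmed-down view of a signed storage commitment.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Commitment {
    pin: Pin,
    key_count: u32,
}

#[derive(Debug, PartialEq, Eq)]
enum CrossCheck {
    /// Claimed count equals the pinned commitment's `key_count`.
    Consistent,
    /// Counts disagree: the case that would yield mismatch evidence.
    Mismatch { claimed: u32, committed: u32 },
    /// Neither sidecar nor cache resolves the pin: a fetch is needed.
    NeedsFetch(Pin),
}

/// Resolve the quoted pin from the sidecar first, then the gossip cache,
/// and compare the claimed count against the resolved commitment.
fn cross_check(
    claimed: u32,
    quoted_pin: Pin,
    sidecar: Option<&Commitment>,
    cache: &HashMap<Pin, Commitment>,
) -> CrossCheck {
    let resolved = sidecar
        .filter(|c| c.pin == quoted_pin)
        .or_else(|| cache.get(&quoted_pin));
    match resolved {
        Some(c) if c.key_count == claimed => CrossCheck::Consistent,
        Some(c) => CrossCheck::Mismatch {
            claimed,
            committed: c.key_count,
        },
        None => CrossCheck::NeedsFetch(quoted_pin),
    }
}
```

The `NeedsFetch` outcome corresponds to the `GetCommitmentByPin` path; what
happens when that fetch goes unanswered is the silence lane discussed under
"Rollout coupling".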

## Slice 3 — signed quote fields (multi-repo, breaking cutover) — LANDED

**What:** signed `committed_key_count: u32` and `commitment_pin:
Option<[u8; 32]>` added to `PaymentQuote` and `MerklePaymentCandidateNode`
in evmlib, included in `bytes_for_signing` (single-node) and `bytes_to_sign`
(merkle), with the quote types' fields placed at the struct **tail** so an
old-format value still rmp-decodes (as `(0, None)`). `ant-protocol` is
patched in lockstep to verify the 5-field merkle message, so the merkle
binding is genuine same-key-signed evidence too. Both `evmlib` and
`ant-protocol` are brought in via `[patch.crates-io]` against local
checkouts; the eventual upstream path is published `evmlib` →
`ant-protocol` → `ant-client` → `ant-node` releases.

**This is a HARD CUTOVER, not a mixed-fleet observe-only rollout.** Appending the
fields to the signed payload changes every quote's signature and
`quote.hash()`, so an old quote fails signature verification on a new node
regardless of any flag — there is no version in which old and new nodes
interoperate on the quote wire. The whole fleet and the clients upgrade
together. The earlier "Slice 3 deferred behind observe-only" framing was
wrong on this point (a round-2 review finding): only the **arithmetic
enforcement** (`QUOTE_ARITHMETIC_RECHECK_ENABLED`, reject vs log) is a
rollout dial; the signed-fields format is a one-shot breaking change. With
the fields signed, Slice 1's arithmetic gate strengthens from curve
canonicality to the exact `price == calculate_price(committed_key_count)`
binding rule (`binding_violation`), and pricing moves off the on-disk count
entirely (no-commitment → baseline).
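
As a rough illustration of the strengthened rule, the sketch below checks
`price == calculate_price(committed_key_count)` on a trimmed-down quote. It is
a sketch under assumptions, not the real `binding_violation`: the pricing
curve is a made-up stand-in, `QuoteFields`, `BindingError`, and
`check_binding` are hypothetical names, and the rule that a quote without a
pin must carry a zero count is an assumed reading of "no-commitment →
baseline".

```rust
/// Hypothetical stand-in for the real pricing formula (an assumption).
fn calculate_price(n: u32) -> u128 {
    let n = u128::from(n);
    1_000 + n * n
}

/// Illustrative subset of the signed quote fields. The field names
/// `committed_key_count` and `commitment_pin` come from the text above;
/// the struct itself is hypothetical.
struct QuoteFields {
    price: u128,
    committed_key_count: u32,
    commitment_pin: Option<[u8; 32]>,
}

#[derive(Debug, PartialEq, Eq)]
enum BindingError {
    /// Assumption: a quote without a pin must not claim a non-zero count.
    CountWithoutPin,
    /// The price is not the formula applied to the signed count.
    PriceMismatch { expected: u128, got: u128 },
}

/// Exact binding check: the price must equal the formula applied to the
/// signed count, and an unpinned quote must price at the baseline.
fn check_binding(q: &QuoteFields) -> Result<(), BindingError> {
    if q.commitment_pin.is_none() && q.committed_key_count != 0 {
        return Err(BindingError::CountWithoutPin);
    }
    let expected = calculate_price(q.committed_key_count);
    if q.price == expected {
        Ok(())
    } else {
        Err(BindingError::PriceMismatch {
            expected,
            got: q.price,
        })
    }
}
```

Unlike the curve-canonicality test of Slice 1, this rejects a price that is
on-curve for some other count: with the stand-in curve, `1_064` is on-curve
for `n = 8` but fails against a signed count of 7.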

## Rollout coupling

The ADR's "Rollout" section says full enforcement requires the fleet to be
upgraded **and** the ADR-0002 timeout-eviction gate to be enabled.
`QUOTE_ARITHMETIC_RECHECK_ENABLED` (reject vs observe-only-log) is
independent of timeout eviction: the arithmetic/binding gate is reject-only
on a confirmed off-curve or mis-shaped quote, with no silence lane. The
own-quote price-staleness gate is retired for commitment-bound quotes (it
compared against the on-disk count, which the committed responsible count
legitimately differs from). The Slice-2 cross-check's silence lane (an
unanswerable quoted pin) is what couples to timeout eviction, exactly as the
ADR specifies.
123 changes: 80 additions & 43 deletions src/node.rs
@@ -46,6 +46,36 @@ impl NodeBuilder {
Self { config }
}

/// Reject startup in production mode without a usable rewards address.
///
/// A node that cannot receive payment must not silently run on the
/// production network. The placeholder address shipped in the example
/// config and an empty string both count as "unconfigured".
///
/// # Errors
///
/// Returns [`Error::Config`] if `network_mode` is `Production` and
/// `payment.rewards_address` is unset, empty, or the example placeholder.
fn validate_production_rewards_address(config: &NodeConfig) -> Result<()> {
if config.network_mode != NetworkMode::Production {
return Ok(());
}
let configured = config
.payment
.rewards_address
.as_deref()
.is_some_and(|addr| !addr.is_empty() && addr != "0xYOUR_ARBITRUM_ADDRESS_HERE");
if configured {
Ok(())
} else {
Err(Error::Config(
"CRITICAL: Rewards address is not configured. \
Set payment.rewards_address in config to your Arbitrum wallet address."
.to_string(),
))
}
}

/// Build and start the node.
///
/// # Errors
@@ -54,26 +84,7 @@ impl NodeBuilder {
pub async fn build(mut self) -> Result<RunningNode> {
info!("Building ant-node with config: {:?}", self.config);

// Validate rewards address in production
if self.config.network_mode == NetworkMode::Production {
match self.config.payment.rewards_address {
None => {
return Err(Error::Config(
"CRITICAL: Rewards address is not configured. \
Set payment.rewards_address in config to your Arbitrum wallet address."
.to_string(),
));
}
Some(ref addr) if addr == "0xYOUR_ARBITRUM_ADDRESS_HERE" || addr.is_empty() => {
return Err(Error::Config(
"CRITICAL: Rewards address is not configured. \
Set payment.rewards_address in config to your Arbitrum wallet address."
.to_string(),
));
}
Some(_) => {}
}
}
Self::validate_production_rewards_address(&self.config)?;

// Resolve identity and root_dir (may update self.config.root_dir)
let identity = Arc::new(Self::resolve_identity(&mut self.config).await?);
@@ -140,31 +151,57 @@ impl NodeBuilder {
}

// Initialize replication engine (if storage is enabled)
let replication_engine =
if let (Some(ref protocol), Some(fresh_rx)) = (&ant_protocol, fresh_write_rx) {
let repl_config = ReplicationConfig::default();
let storage_arc = protocol.storage();
let payment_verifier_arc = protocol.payment_verifier_arc();
match ReplicationEngine::new(
repl_config,
Arc::clone(&p2p_arc),
storage_arc,
payment_verifier_arc,
&self.config.root_dir,
fresh_rx,
shutdown.clone(),
)
.await
{
Ok(engine) => Some(engine),
Err(e) => {
warn!("Failed to initialize replication engine: {e}");
None
let replication_engine = if let (Some(ref protocol), Some(fresh_rx)) =
(&ant_protocol, fresh_write_rx)
{
let repl_config = ReplicationConfig::default();
let storage_arc = protocol.storage();
let payment_verifier_arc = protocol.payment_verifier_arc();
match ReplicationEngine::new(
repl_config,
Arc::clone(&p2p_arc),
storage_arc,
payment_verifier_arc,
Arc::clone(&identity),
&self.config.root_dir,
fresh_rx,
shutdown.clone(),
)
.await
{
Ok(engine) => {
// ADR-0003: wire the engine's commitment state as the
// quote generator's commitment source so quotes force
// their price from the live storage commitment. Done
// here because the engine owns the commitment state and
// is built after the protocol.
if let Some(ref protocol) = ant_protocol {
let concrete = Arc::clone(engine.commitment_state());
let source: Arc<dyn crate::payment::quote::CommitmentSource> = concrete;
protocol.attach_commitment_source(source);
// ADR-0003: share the engine's gossip commitment
// cache with the verifier so the cross-check can
// resolve quote pins against neighbours' commitments.
protocol
.payment_verifier_arc()
.attach_commitment_cache(Arc::clone(engine.last_commitment_by_peer()));
// ADR-0003: give the verifier the monetized-pin sender so
// commitments that back a payment get a deterministic
// first audit from the engine's drainer.
protocol
.payment_verifier_arc()
.attach_monetized_pin_sender(engine.monetized_pin_sender());
}
Some(engine)
}
} else {
None
};
Err(e) => {
warn!("Failed to initialize replication engine: {e}");
None
}
}
} else {
None
};

let node = RunningNode {
config: self.config,
30 changes: 30 additions & 0 deletions src/payment/metrics.rs
@@ -33,6 +33,18 @@ impl QuotingMetricsTracker {
self.close_records_stored.fetch_add(1, Ordering::SeqCst);
}

/// Overwrite the counter with an authoritative count of held records.
///
/// This is the deletion-aware path and the SINGLE source of truth for the
/// priced record count: the handler calls it at quote time with the live
/// LMDB entry count (`current_chunks()`), so any record removed from
/// storage — by delete, prune, or otherwise — is reflected on the next
/// quote with no per-delete bookkeeping to keep in sync. `record_store`
/// remains only an optimistic between-quote hint; the resync overwrites it.
pub fn set_records(&self, count: usize) {
self.close_records_stored.store(count, Ordering::SeqCst);
}

/// Get the number of records stored.
#[must_use]
pub fn records_stored(&self) -> usize {
@@ -62,4 +74,22 @@ mod tests {
tracker.record_store();
assert_eq!(tracker.records_stored(), 3);
}

#[test]
fn test_set_records_resyncs_to_authoritative_count() {
let tracker = QuotingMetricsTracker::new(100);
assert_eq!(tracker.records_stored(), 100);

// Resync down (e.g. after deletions/pruning the store now holds fewer).
tracker.set_records(42);
assert_eq!(tracker.records_stored(), 42);

// Resync up (e.g. after new stores).
tracker.set_records(57);
assert_eq!(tracker.records_stored(), 57);

// Resync to zero (empty store).
tracker.set_records(0);
assert_eq!(tracker.records_stored(), 0);
}
}
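
The quote-time resync described in the `set_records` doc comment can be
sketched in isolation as below. This is a standalone illustration rather than
the tracker itself: `RecordCounter` and `count_for_quote` are hypothetical
names, and `live_count` stands in for whatever the storage layer reports (the
doc comment names `current_chunks()`).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Minimal stand-in for the tracker (hypothetical name `RecordCounter`).
struct RecordCounter {
    close_records_stored: AtomicUsize,
}

impl RecordCounter {
    fn new(initial: usize) -> Self {
        Self {
            close_records_stored: AtomicUsize::new(initial),
        }
    }

    /// Optimistic between-quote hint: bump on every store.
    fn record_store(&self) {
        self.close_records_stored.fetch_add(1, Ordering::SeqCst);
    }

    /// Authoritative overwrite with the count the storage layer reports.
    fn set_records(&self, count: usize) {
        self.close_records_stored.store(count, Ordering::SeqCst);
    }

    fn records_stored(&self) -> usize {
        self.close_records_stored.load(Ordering::SeqCst)
    }
}

/// Quote-time path: resync to the live count first, then read the counter,
/// so deletions are reflected without per-delete bookkeeping.
fn count_for_quote(counter: &RecordCounter, live_count: usize) -> usize {
    counter.set_records(live_count);
    counter.records_stored()
}
```

For example, after three `record_store` calls the hint reads 3; if one record
has since been pruned and storage reports 2, `count_for_quote(&counter, 2)`
overwrites the hint and the quote is priced on 2.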