fix: increase P2P pubsub max message size to match blob size limit#3344
Conversation
Move DefaultMaxBlobSize from block/internal/common to pkg/blobsize so the P2P layer can import it. Pass it to pubsub.WithMaxMessageSize when creating FloodSub, preventing fullnode desync when blocks exceed the default 1 MB libp2p limit.
📝 WalkthroughWalkthroughThis PR relocates blob-size configuration from an internal package ( ChangesBlob Size Configuration Migration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Claude finished @chatton's task in 2m 46s —— View job Code Review
SummaryThis PR fixes a real production bug: the libp2p pubsub layer was hard-capped at 1 MB (its default), while the actual blob size limit could be configured to 120 MB via ldflags. Any full block exceeding 1 MB would silently fail to propagate over P2P, causing full-node desync. The fix is correct and the extraction into Issues1. Unsafe c.ps, err = pubsub.NewFloodSub(ctx, c.host, pubsub.WithMaxMessageSize(int(blobsize.DefaultMaxBlobSize)))
if blobsize.DefaultMaxBlobSize > math.MaxInt {
return fmt.Errorf("configured blob size %d exceeds max int size", blobsize.DefaultMaxBlobSize)
}2. Empty After the migration the file contains only 3. Flaky // allow subscription to propagate
time.Sleep(200 * time.Millisecond)A fixed sleep is a CI flakiness risk — under load the subscription may not be ready in 200 ms. The rest of the test already uses a 5-second receive timeout; a simple poll loop is more robust: require.Eventually(t, func() bool {
_ = tp.Publish(ctx, []byte("ping"))
recvCtx, cancel := context.WithTimeout(ctx, 50*time.Millisecond)
defer cancel()
_, err := sub.Next(recvCtx)
return err == nil
}, 5*time.Second, 100*time.Millisecond, "subscription did not become ready")Or at minimum raise the sleep to 500 ms. 4. No unit tests for The Minor Observations
VerdictThe core fix is correct and the package extraction is the right design. The two issues worth addressing before merge are the unchecked |
|
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
|
| func (c *Client) setupGossiping(ctx context.Context) error { | ||
| var err error | ||
| c.ps, err = pubsub.NewFloodSub(ctx, c.host) | ||
| c.ps, err = pubsub.NewFloodSub(ctx, c.host, pubsub.WithMaxMessageSize(int(blobsize.DefaultMaxBlobSize))) |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3344 +/- ##
==========================================
+ Coverage 60.71% 60.73% +0.02%
==========================================
Files 127 127
Lines 13781 13781
==========================================
+ Hits 8367 8370 +3
Misses 4500 4500
+ Partials 914 911 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
pkg/p2p/client_test.go (2)
85-125: ⚖️ Poor tradeoffConsider testing the actual configured limit.
The test name suggests it validates the maximum message size, but it only tests that a 1.5 MB payload can be delivered. While this proves the limit exceeds the old 1 MB default, it doesn't verify:
- Messages at the configured
blobsize.DefaultMaxBlobSize(5 MB) are accepted- Messages exceeding the limit are rejected
This would provide stronger validation that the pubsub configuration matches the blob size limit as intended.
💡 Suggested enhancement
Consider adding test cases that:
- Publish a message at exactly
blobsize.DefaultMaxBlobSizeand verify delivery- Publish a message slightly exceeding the limit and verify it's rejected (if rejection behavior is predictable and observable)
Alternatively, rename the test to
TestPubSubLargeMessageto better reflect what it actually validates.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/p2p/client_test.go` around lines 85 - 125, The test TestPubSubMaxMessageSize currently only verifies a 1.5MB payload; extend it to assert the actual configured limit by using blobsize.DefaultMaxBlobSize: add a case that publishes a payload of exactly blobsize.DefaultMaxBlobSize and assert tp.Publish + sub.Next deliver successfully, and add a second case that publishes a payload slightly larger than blobsize.DefaultMaxBlobSize and assert tp.Publish or sub.Next fails (timeout or error) to verify rejection behavior; keep reuse of tp.Publish and sub.Next and the existing ctx/recvCtx setup and ensure payload sizes are computed from blobsize.DefaultMaxBlobSize rather than hard-coded values.
104-104: ⚡ Quick winReplace fixed sleep with deterministic wait.
The fixed
time.Sleep(200 * time.Millisecond)can cause test flakiness on slow CI systems or be unnecessarily slow on fast systems. Consider using thewaitForConditionhelper (already defined at line 298) to poll for subscription readiness instead.♻️ Proposed fix using waitForCondition
sub, err := subTopic.Subscribe() require.NoError(err) - // allow subscription to propagate - time.Sleep(200 * time.Millisecond) + // Wait for subscription to propagate + err = waitForCondition(2*time.Second, func() bool { + return len(subTopic.ListPeers()) > 0 + }) + require.NoError(err, "subscription did not propagate") tp, err := clients[0].PubSub().Join(topic)Note: You may need to verify the correct API for checking subscription state (e.g.,
ListPeers()or topic mesh membership).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/p2p/client_test.go` at line 104, Replace the brittle time.Sleep(200 * time.Millisecond) in the test with a deterministic poll using the existing waitForCondition helper (defined as waitForCondition at line ~298) to wait until the subscription/peer state is ready; call the subscription readiness check (e.g., use the client's ListPeers(), topic mesh membership, or the specific subscription object's method) inside the waitForCondition predicate and fail the test on timeout so the test proceeds as soon as readiness is observed and is robust on slow CI.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/blobsize/blobsize.go`:
- Around line 14-21: The init() in pkg/blobsize/blobsize.go must guard
DefaultMaxBlobSize against uint64→int overflow: after parsing
defaultMaxBlobSizeStr into v, verify v is non-zero and v <= uint64(math.MaxInt)
(or equivalent maxInt) before assigning DefaultMaxBlobSize; if it exceeds max
int, set DefaultMaxBlobSize to a safe fallback or clamp to int(maxInt) so that
later cast in pkg/p2p/client.go where
pubsub.WithMaxMessageSize(int(blobsize.DefaultMaxBlobSize)) cannot
wrap/truncate. Ensure you reference DefaultMaxBlobSize, init(), and
defaultMaxBlobSizeStr when making the change.
---
Nitpick comments:
In `@pkg/p2p/client_test.go`:
- Around line 85-125: The test TestPubSubMaxMessageSize currently only verifies
a 1.5MB payload; extend it to assert the actual configured limit by using
blobsize.DefaultMaxBlobSize: add a case that publishes a payload of exactly
blobsize.DefaultMaxBlobSize and assert tp.Publish + sub.Next deliver
successfully, and add a second case that publishes a payload slightly larger
than blobsize.DefaultMaxBlobSize and assert tp.Publish or sub.Next fails
(timeout or error) to verify rejection behavior; keep reuse of tp.Publish and
sub.Next and the existing ctx/recvCtx setup and ensure payload sizes are
computed from blobsize.DefaultMaxBlobSize rather than hard-coded values.
- Line 104: Replace the brittle time.Sleep(200 * time.Millisecond) in the test
with a deterministic poll using the existing waitForCondition helper (defined as
waitForCondition at line ~298) to wait until the subscription/peer state is
ready; call the subscription readiness check (e.g., use the client's
ListPeers(), topic mesh membership, or the specific subscription object's
method) inside the waitForCondition predicate and fail the test on timeout so
the test proceeds as soon as readiness is observed and is robust on slow CI.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 34b3bf94-3e44-4285-bcd5-15762c52c7bf
📒 Files selected for processing (13)
CHANGELOG.mdapps/testapp/Dockerfileblock/internal/common/consts.goblock/internal/da/client.goblock/internal/executing/executor.goblock/internal/submitting/batching_strategy_test.goblock/internal/submitting/da_submitter.goblock/internal/submitting/submitter.goblock/internal/syncing/syncer.goblock/internal/syncing/syncer_forced_inclusion_test.gopkg/blobsize/blobsize.gopkg/p2p/client.gopkg/p2p/client_test.go
💤 Files with no reviewable changes (1)
- block/internal/common/consts.go
| func init() { | ||
| v, err := strconv.ParseUint(defaultMaxBlobSizeStr, 10, 64) | ||
| if err != nil || v == 0 { | ||
| DefaultMaxBlobSize = 5 * 1024 * 1024 // 5 MB fallback | ||
| return | ||
| } | ||
| DefaultMaxBlobSize = v | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify where DefaultMaxBlobSize is cast to int and confirm no other guards exist.
rg -nP 'WithMaxMessageSize\s*\(\s*int\s*\(\s*blobsize\.DefaultMaxBlobSize\s*\)\s*\)' -C 2
rg -nP '\bDefaultMaxBlobSize\b' --type=go -C 2
rg -nP 'defaultMaxBlobSizeStr' -C 2Repository: evstack/ev-node
Length of output: 13381
Guard DefaultMaxBlobSize against uint64→int overflow before pubsub.WithMaxMessageSize.
pkg/blobsize/blobsize.go parses defaultMaxBlobSizeStr into DefaultMaxBlobSize and only checks err != nil || v == 0, while pkg/p2p/client.go later does pubsub.WithMaxMessageSize(int(blobsize.DefaultMaxBlobSize)). If defaultMaxBlobSizeStr is set to a value > maxInt, the uint64→int cast can wrap/truncate and produce an invalid pubsub max message size.
Suggested fix
package blobsize
import "strconv"
+const fallbackMaxBlobSize uint64 = 5 * 1024 * 1024 // 5 MB
+
// defaultMaxBlobSizeStr holds the string representation of the default blob
// size limit. Override at link time via:
//
// go build -ldflags "-X github.com/evstack/ev-node/pkg/blobsize.defaultMaxBlobSizeStr=125829120"
//
var defaultMaxBlobSizeStr = "5242880" // 5 MB
// DefaultMaxBlobSize is the max blob size limit used for blob submission.
var DefaultMaxBlobSize uint64
func init() {
v, err := strconv.ParseUint(defaultMaxBlobSizeStr, 10, 64)
if err != nil || v == 0 {
- DefaultMaxBlobSize = 5 * 1024 * 1024 // 5 MB fallback
+ DefaultMaxBlobSize = fallbackMaxBlobSize
return
}
+ maxInt := uint64(^uint(0) >> 1)
+ if v > maxInt {
+ DefaultMaxBlobSize = maxInt
+ return
+ }
DefaultMaxBlobSize = v
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/blobsize/blobsize.go` around lines 14 - 21, The init() in
pkg/blobsize/blobsize.go must guard DefaultMaxBlobSize against uint64→int
overflow: after parsing defaultMaxBlobSizeStr into v, verify v is non-zero and v
<= uint64(math.MaxInt) (or equivalent maxInt) before assigning
DefaultMaxBlobSize; if it exceeds max int, set DefaultMaxBlobSize to a safe
fallback or clamp to int(maxInt) so that later cast in pkg/p2p/client.go where
pubsub.WithMaxMessageSize(int(blobsize.DefaultMaxBlobSize)) cannot
wrap/truncate. Ensure you reference DefaultMaxBlobSize, init(), and
defaultMaxBlobSizeStr when making the change.
Source: Coding guidelines
Overview
This PR correctly initializes the p2p system to use the configured blob size (previously it defaulted to 1Mb and could cause failures for large blocks)
I extracted it out into a separate package so it is readable from both
pkg/p2pandblock/internalSummary by CodeRabbit
Release Notes
Bug Fixes
Tests