fix: enforce alloc limit in bytes decode; perf: exact-capacity readNGrow growth (#63)#75
fix: enforce alloc limit in bytes decode; perf: exact-capacity readNGrow growth (#63)#75xe-nvdk wants to merge 3 commits into
Conversation
The fork's readN/readNGrow split (upstream's readN was the chunked, limit-respecting variant) left bytes(), bytesPtr() calling the package-level readN directly — the unbounded variant that allocates the full declared length upfront. On a stream decode, a malicious bin32/ str32 header declaring ~4GB forced a ~4GB allocation before any payload was read, even with alloc limits enabled (the default). The string path (d.readN method) was unaffected — it already branches on the flag. Route bytes decoding through a new readNInto method that mirrors d.readN's branch: chunked readNGrow by default, unbounded readN only when DisableAllocLimit(true) is set. Trusted workloads with large binary payloads can keep the old single-allocation behavior via DisableAllocLimit; the follow-up commit reduces the cost of the chunked path.
Replace the append-based chunk growth in readNGrow with explicit exact-capacity sizing: grow to min(n, max(2*pos, pos+bytesAllocLimit)) per round. This preserves the prior security bound — allocation never runs more than max(received, bytesAllocLimit) ahead of data actually supplied — while eliminating append's capacity overshoot beyond n and reducing alloc+copy rounds. readN (DisableAllocLimit path) similarly allocates exactly n instead of append-growing, skipping a useless copy of old contents that ReadFull overwrites anyway. 4MB stream decode vs v6 (count=6, Apple M3 Max): DecodeStringStream4MB 549.7µs → 381.9µs (-30.5%) 15.4MiB → 11.0MiB (-28.5%) 7 → 6 allocs The bytes path is slower than v6 (+127%) only because v6 skipped the alloc limit entirely there (see previous commit); against the corrected chunked baseline this change is -20% time, -38% B/op. DisableAllocLimit restores single-allocation reads for trusted input. Closes #63
- readNInto: validate the declared length against remaining input on the byte-slice (Unmarshal) path before allocating — restores single exact-size allocation there (1 alloc for a 4MB payload, matching the pre-fix behavior) while also closing the same declared-length bomb on the byte-slice path. Error semantics match io.ReadFull (EOF vs ErrUnexpectedEOF), keeping decoderTests expectations intact. - Remove the pre-Go-1.21 local min helper; use builtin min/max. - Add Unmarshal-path tests: huge declared length fails fast; 2.5MB round-trip; result is caller-owned (no aliasing of the input). - Add CHANGELOG entries (Bug Fixes + Performance) for #63.
There was a problem hiding this comment.
Code Review
This pull request introduces changes to enforce allocation limits during msgpack bytes and string decoding, preventing malicious headers with huge declared lengths from triggering upfront out-of-memory allocations. It routes decoding through a new readNInto helper, optimizes slice growth in readNGrow to avoid capacity overshoot, and adds corresponding benchmarks and tests. The review feedback highlights critical integer overflow vulnerabilities on 32-bit architectures where large uint32 lengths can cast to negative int values, potentially bypassing length checks, causing panics during allocation, or being incorrectly interpreted as Nil values.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| // decoder's alloc limit: unless DisableAllocLimit is set, allocation for | ||
| // a declared length is chunked by bytesAllocLimit so a malicious header | ||
| // can't force a huge upfront allocation. | ||
| func (d *Decoder) readNInto(b []byte, n int) ([]byte, error) { |
There was a problem hiding this comment.
On 32-bit architectures, a large uint32 length (e.g., >= 2^31) decoded from bin32 or str32 headers can overflow when cast to int, resulting in a negative value for n. If n is negative, it will bypass the remaining check and cause a panic in make([]byte, n) or slice bounds out of range. We should add a check for n < 0 at the beginning of readNInto to prevent this.
| func (d *Decoder) readNInto(b []byte, n int) ([]byte, error) { | |
| func (d *Decoder) readNInto(b []byte, n int) ([]byte, error) { | |
| if n < 0 { | |
| return nil, fmt.Errorf("msgpack: invalid length %d", n) | |
| } |
| return readNGrow(d.r, b, n) | ||
| } | ||
|
|
||
| func readN(r io.Reader, b []byte, n int) ([]byte, error) { |
There was a problem hiding this comment.
To prevent panics or silent data corruption on 32-bit architectures when a large uint32 length overflows to a negative int, we should validate that n >= 0 at the beginning of readN.
func readN(r io.Reader, b []byte, n int) ([]byte, error) {
if n < 0 {
return nil, fmt.Errorf("msgpack: invalid length %d", n)
}| } | ||
| } | ||
| for pos < n { | ||
| newCap := min(max(2*pos, pos+bytesAllocLimit), n) |
There was a problem hiding this comment.
On 32-bit architectures, if pos is large (e.g., >= 2^30), 2*pos or pos+bytesAllocLimit can overflow to a negative value. This would cause max to return a negative value or a value smaller than expected, leading to a negative newCap and a panic in make([]byte, newCap). We should calculate newCap in an overflow-safe manner.
newCap := pos + bytesAllocLimit
if newCap < pos || newCap < 0 {
newCap = n
}
if double := 2 * pos; double > newCap && double > 0 {
newCap = double
}
if newCap > n || newCap < 0 {
newCap = n
}| if n == -1 { | ||
| return nil, nil | ||
| } | ||
| return readN(d.r, b, n) | ||
| return d.readNInto(b, n) |
There was a problem hiding this comment.
On 32-bit architectures, a large uint32 length (e.g., 0xffffffff) decoded from bin32 or str32 headers will overflow to -1 when cast to int. Since n == -1 is checked here, it will be incorrectly treated as msgpcode.Nil, returning nil, nil and silently succeeding instead of failing with an error. We should ensure that n == -1 only returns nil, nil if the code is actually msgpcode.Nil, and otherwise fail on negative lengths.
| if n == -1 { | |
| return nil, nil | |
| } | |
| return readN(d.r, b, n) | |
| return d.readNInto(b, n) | |
| if n == -1 && c == msgpcode.Nil { | |
| return nil, nil | |
| } | |
| if n < 0 { | |
| return nil, fmt.Errorf("msgpack: invalid length %d", n) | |
| } | |
| return d.readNInto(b, n) |
Summary
Closes #63. Two related changes (separate commits):
1. Security fix: bytes decode bypassed the alloc limit
The fork's readN/readNGrow split (upstream's
readNwas the chunked variant) leftbytes()/bytesPtr()calling the unbounded package-levelreadNdirectly. A maliciousbin32/str32header declaring ~4GB forced a ~4GB upfront allocation — on stream decode and onUnmarshal— even with alloc limits enabled (the default). The string path was unaffected (d.readNalready branches on the flag).Bytes decoding now routes through a new
readNIntomethod:Unmarshal) path: declared length is validated against the remaining input before allocating → still a single exact-size allocation (1 alloc for 4MB, same as before), bomb closed for free.readNGrowby default;DisableAllocLimit(true)restores unbounded single-allocation reads for trusted input.2. Perf (#63): exact-capacity chunked growth in readNGrow
Replaces append-based chunk growth with explicit
min(n, max(2*pos, pos+bytesAllocLimit))sizing — provably the same allocation-ahead-of-data bound as before (≤ max(received, 1MiB)), but no capacity overshoot beyondnand fewer alloc+copy rounds.readNsimilarly allocates exactlyn, dropping a useless prefix copy.Benchmarks (Apple M3 Max, benchstat, count=6)
DisableAllocLimitrestores old perfTesting
Decode(&[]byte),DecodeBytes, struct field) for stream, plus theUnmarshalpath — fail fast with ≤1MiB allocated.DisableAllocLimit); caller-ownership (no input aliasing) asserted.go test ./...,go test -race ./...pass; existingdecoderTestsEOF semantics preserved.Note for follow-up
The security review also found a pre-existing allocation bomb in
DecodeUntypedMap(decode_map.go:224 —make(map[interface{}]interface{}, n)without themaxMapSizeclamp its siblings have). Separate PR incoming.🤖 Generated with Claude Code