feat(alloc): support wait-free allocator#696
Conversation
Signed-off-by: Yuuki Takano <ytakanoster@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR adds a feature-selectable heap allocator backend to awkernel_lib, enabling wf_alloc on supported architectures while preserving the existing TLSF-based path as the fallback.
Changes:
- Introduces
heap-wf-alloc/heap-tlsffeature wiring to explicitly select the heap backend (kernel defaults updated per-arch). - Plumbs CPU-count-aware heap initialization (
init_*_with_num_cpu) into x86_64 and aarch64 kernel bring-up. - Refactors
awkernel_lib::heapto select backend via atype Allocator = ...alias, addswf_allocinitialization logic, and updates heap module documentation.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| kernel/src/arch/x86_64/kernel_main.rs | Detects CPU count earlier and passes num_cpu into backup/primary heap initialization. |
| kernel/src/arch/aarch64/kernel_main.rs | Passes the detected CPU count into primary/backup heap initialization. |
| kernel/Cargo.toml | Adds kernel-level feature flags to select allocator backend and sets per-target defaults. |
| awkernel_lib/src/heap.rs | Implements backend selection, adds wf_alloc backend, adds *_with_num_cpu init APIs, and updates docs. |
| awkernel_lib/Cargo.toml | Adds optional wf_alloc dependency (target-gated) and exposes heap-wf-alloc / heap-tlsf features. |
Comments suppressed due to low confidence (1)
awkernel_lib/src/heap.rs:42
- The documentation states that only 32/64 CPUs are supported, but the implementation uses NUM_MAX_CPU (currently 512) for the CPU flag bitmap and for
wf_alloc'sactive_threadsupper bound. This limitation note looks stale and can mislead callers configuringnum_cpu.
//! # Limitation
//!
//! Only 32 or 64 CPUs are supported for 32 or 64 bits CPU architectures.
//!
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Yuuki Takano <ytakanoster@gmail.com>
atsushi421
left a comment
There was a problem hiding this comment.
The mdbook chapter at mdbook/src/internal/memory_allocator.md still describes the old TLSF-only design (Allocator/BackUpAllocator structs, init_primary/init_backup only). It should be updated to cover the new HeapBackend trait, backend selection by feature, and the *_with_num_cpu APIs (and explain why an explicit num_cpu is needed for wf_alloc).
The new ~200 lines of layout math in WfAllocBackend::init (alignment, overflow guards, metadata sizing, active_threads validation) have no unit tests. align_up alone is a pure helper that is trivially testable. Consider adding #[cfg(test)] mod tests that at least covers align_up overflow/normal cases and the rejection paths of init (too small heap, active_threads == 0 or > NUM_MAX_CPU, misaligned heap_start).
is_primary_mem (around line 384 in awkernel_lib/src/heap.rs) reads primary_start and primary_size with Relaxed, and the matching stores in init_*_with_num_cpu are also Relaxed. Another CPU can observe "new primary_start, old primary_size = 0" between the two stores; in that window end == start, the range check is always false, and primary-owned deallocations are routed to the backup, which would treat them as foreign and corrupt its free list. Use Release for the size store (last) and Acquire for the load, or document that init must happen-before any allocation.
Address review comments on heap.rs: - Reword the module doc so the wf_alloc init-failure path is accurate: alloc returns null and Talloc panics (primary) or halts via delay::wait_forever() (non-primary); there is no OOM-handler fallback. - Document that init_primary/init_backup read cpu::num_cpu() internally, and the hazard that a 0 count makes WfAllocBackend::init bail out. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When cpu_id >= active_threads(), dealloc previously returned silently, permanently leaking the block. Print cpu_id and active_threads via the allocation-free unsafe_puts/unsafe_print_hex_u64 and halt in delay::wait_forever(). We avoid panic! here because the panic handler allocates, which would re-enter this same out-of-range guard on this CPU. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address review comment: init is `unsafe fn(&self, ...)` on a Sync type, so the initialized fast-path is not mutual exclusion and concurrent init is UB. Document the # Safety contract that callers must serialize init (kernel boot does so on a single thread before APs start). Also move the unsafe_print_hex_u64 import into the wf_alloc_backend module to avoid an unused-import warning on backends that exclude it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ntly
HeapBackend::init now returns Result<(), &'static str>. Each WfAllocBackend
init failure path (alignment/overflow guards, metadata sizing, active_threads
bounds, from_metadata_region) returns a descriptive reason instead of a silent
return, so a half-initialized heap can no longer go unnoticed until the first
allocation fails.
Talloc::init_{primary,backup}_with_num_cpu now print the reason via unsafe_puts
and halt in delay::wait_forever() on failure. We print and halt rather than
panic! because the panic handler allocates, which cannot work while the heap is
not yet up. TlsfBackend::init returns Ok(()).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The cfg gating silently selects WfAllocBackend when both features are on, ignoring heap-tlsf. Add a compile_error! so this feature-flag mistake (e.g. from an unrelated downstream crate) fails loudly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The offset was read twice (early in kernel_main to call create_acpi, and again in kernel_main2 as "step 7"), and the early read was unlabeled, making the boot step list misleading. Fetch it once in kernel_main (now labeled step 4, before ACPI which needs it) and pass it into kernel_main2, removing the duplicate read. Renumber the step list accordingly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The backup heap (early) and primary heap (late) were sized from two different CPU populations: detect_num_cpus counted MADT entries by the enabled bit, while non_primary_cpus additionally filters by WaitingForSipi and the xAPIC < 255 limit. The early count also degraded to 1 when the MADT was absent, masking the real ACPI fault. - Rename detect_num_cpus to count_usable_cpus and document that the enabled-bit MADT count is a deliberate, allocation-free upper bound of non_primary_cpus + 1 (non_primary only ever applies additional restrictive filters). It must stay allocation-free because it runs before any heap is initialized -- going through acpi.platform_info() allocates and would OOM here. - Make a missing MADT fatal instead of unwrap_or(1). - Add a boot-time check in kernel_main2 that non_primary_cpus + 1 does not exceed the backup heap's count, so any divergence fails loudly at boot rather than as an opaque allocation failure on a high cpu_id later. Verified: boots to the shell under QEMU with -smp 16 (15 APs + primary = 16, matching the backup heap count); no inconsistency or OOM. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The review noted the wf_alloc layout math had no unit tests, and that align_up is a pure, trivially testable helper. It lived inside the wf_alloc_backend module, which (like the whole heap module) is gated to no_std builds with the heap-wf-alloc feature, so the host `--features std` test build never compiled it. Extract align_up into a new always-declared `heap_util` module (its body gated to the wf backend or cfg(test)) so the std test harness compiles and runs it, mirroring how local_heap's tests run on the host. Add tests covering already-aligned, round-up, alignment-of-one, large power-of-two, and overflow (None) cases. The init rejection paths are not unit-tested: they depend on wf_alloc types and would require enabling heap-wf-alloc (and its global allocator) in the host test build. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Added in da52eab. While doing this I hit a structural snag worth noting: the So I extracted I did not unit-test the |
…ange
Document on init_{primary,backup}_with_num_cpu and is_primary_mem that the
Relaxed stores/loads of primary_start/primary_size are sound because heap init
must happen-before the allocator is shared with any other CPU (the same
single-init contract as HeapBackend::init). No other CPU can observe the window
between the two stores, so a primary-owned deallocation is never misrouted to
the backup allocator.
The kernel establishes this happens-before edge at boot:
- x86_64: the BSP's BSP_READY release store, paired with each AP's acquire fence
before its first allocation.
- aarch64: the BSP's PRIMARY_INITIALIZED SeqCst store, which each AP loads
(SeqCst) before proceeding.
Two separate atomics cannot be updated atomically by stronger orderings anyway,
so it is the init-before-sharing contract, not per-atomic ordering, that
provides safety; hence documenting rather than switching to Release/Acquire.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Took the "document that init must happen-before any allocation" option, in 471c5cd — and it turns out that's the technically accurate fix here rather than just the convenient one. The window you describe ("new
So no other CPU ever observes the intermediate state, and a primary-owned deallocation is never misrouted to the backup. I also realized that bumping the two stores/loads to Happy to additionally bump these to |
The chapter described the old TLSF-only design (Allocator/BackUpAllocator structs, init_primary/init_backup only). Update it to cover: - compile-time backend selection (heap-wf-alloc -> wf_alloc on x86_64/aarch64, else TLSF; mutually exclusive), - the HeapBackend trait and the `type Allocator = ...` alias, - the *_with_num_cpu init APIs and why wf_alloc needs an explicit CPU count (per-CPU tokens / active_threads sizing) while TLSF ignores it, - the x86_64 / aarch64 boot snippets now passing num_cpu. Also fix the "For x86_64" copy-paste in the AArch64 section and the Tallock typo. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Updated in 94bc5e2. The chapter now covers:
Also fixed the "For x86_64" copy-paste in the AArch64 section and the This was the last open item from the review — thanks for the thorough pass. |
Signed-off-by: Yuuki Takano <ytakanoster@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
This PR introduces
wf_allocas a feature-selectable heap allocator backend forawkernel_lib, while keeping TLSF as the fallback path on architectures wherewf_allocis not enabled/supported. It also updates heap allocator documentation to match current behavior.Description
heap-wf-allocselectswf_alloc.Tallocpolicy (primary/backup) unchanged:wf_allocplumbing for supported targets (x86_64,aarch64):awkernel_lib.heap-wf-allocon unsupported architectures.type Allocator = ...alias.init_*_with_num_cpu(...).kernelis aligned with this backend selection (heap-wf-allocon x86 by default, TLSF remains for non-CAS2 paths).awkernel_lib/src/heap.rsdoc comments:wf_allocinitialization flow, CPU-token mapping, interrupt-guarded alloc/dealloc, and fallback to existing OOM path when allocator init fails.AcpiTablebefore heap memory initialization for x86_64.Related links
Benchmarking:
How was this PR tested?
Tested on Qemu and physical machines
Notes for reviewers