Skip to content

[24.04-linux-nvidia-6.17] NVIDIA: SAUCE: arm_mpam: Fix mbm_total_bytes and MBM counter assignment#466

Closed
fyu1 wants to merge 2 commits into
NVIDIA:24.04_linux-nvidia-6.17-nextfrom
fyu1:24.04_linux-nvidia-6.17-next.mpam.extras.fixes4
Closed

[24.04-linux-nvidia-6.17] NVIDIA: SAUCE: arm_mpam: Fix mbm_total_bytes and MBM counter assignment#466
fyu1 wants to merge 2 commits into
NVIDIA:24.04_linux-nvidia-6.17-nextfrom
fyu1:24.04_linux-nvidia-6.17-next.mpam.extras.fixes4

Conversation

@fyu1

@fyu1 fyu1 commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

The PR fixes a MPAM issue on Gace: https://nvbugspro.nvidia.com/bug/6250699

The issue is because of recent upstream code changes and pset inheritance.

The issue doesn't exist on 7.0 bos and 7.0 lts. But 7.0 bos needs a PR (I will send it out later) to follow same mbm_total_bytes interface as this PR and 7.0 lts.

This patch fixes a few issues:

  1. To enable MBW total bytes, continue MBWU counter selection after CSU setup. This follows upstream 7.1.

  2. Use mbm_cntr_assignable for assignment enablement so occupancy and MBWU sharing a class do not break mbm_L3_assignments. This follows commit 4766d7e https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4

  3. Always bind picked MBWU to mbm_total_bytes. This follows commit b2bbf3b https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4

  4. Rebind mbm_total_bytes at monitor init to the resource that owns the MBWU class via resctrl_mon_event_set_resource(), so memory-level MSC platforms expose mbm_total_bytes under mon_MB_* and mbm_MB_assignments. This fix is necessary when only MBW total bytes (i.e. no MBW local bytes) is supported.


LP: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.17/+bug/2157922

@nirmoy nirmoy added the help wanted Extra attention is needed label Jun 15, 2026
@github-actions

github-actions Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

PR Validation Report

Patchscan ✅ No Missing Fixes

All cherry-picked commits checked — no missing upstream fixes found.

PR Lint ✅ All checks passed

Details
Checking 2 commits...

Cherry-pick digest:
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local        │ Referenced upstream / Patch subject                              │ Patch-ID   │ Subject │ SoB chain                 │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 334c7d71571c │ [SAUCE] arm_mpam: fix monitor capability and mbm assign for mb e │ N/A        │ N/A     │ fenghuay                  │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 50d8d0abc638 │ [SAUCE] arm_mpam: fix mbm_total_bytes and mbm counter assignment │ N/A        │ N/A     │ fenghuay                  │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘

Lint: all checks passed.

@nirmoy

nirmoy commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

BaseOS Kernel Review

Summary

No issues found across the reviewed commits.

Findings: no problems found

Latest watcher review: open review

Kernel deb build: failed (failure log, build artifacts)

Head: 334c7d71571c

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.

@nvmochs

nvmochs commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

@fyu1

Two findings with codex (may be due to lack of understanding about an underlying assumption with this MPAM code and L3 resources):

Can you clarify how the memory-level-only MBWU case keeps resctrl monitoring enabled?

With this patch, mpam_resctrl_monitor_init() binds QOS_L3_MBM_TOTAL_EVENT_ID to RDT_RESOURCE_MBA when the selected MBWU class is memory-level, and then sets r->mon_capable on MBA. But resctrl_arch_mon_capable() still returns only RDT_RESOURCE_L3.mon_capable, and that helper gates monitor domain setup and the resctrl mount monitor files.

So on a platform with mbm_total_bytes only from a memory-level MSC, and no L3 monitor event setting L3.mon_capable, it looks like mbm_total_bytes gets enabled on MBA but resctrl_arch_mon_capable() remains false, so mon_data/info files are not created.

If the intended topology always has an L3 monitor too, can you say that in the commit message? Otherwise I think this needs the missing global/exposed monitor-capable handling, or L3 needs to remain marked as the global monitor-capable resource even when mbm_total_bytes is exposed under MB.


Can you also check the auto-assign/unassign path for MB-backed mbm_total_bytes?

With this patch, mpam_resctrl_monitor_init_abmc() can set mbm_cntr_assignable and mbm_assign_on_mkdir on RDT_RESOURCE_MBA. But rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() first gate everything on RDT_RESOURCE_L3 assignment state before they process the individual MBM events.

That means an MB-backed mbm_total_bytes event may not be auto-assigned for new groups, and assigned MB counters may not be released on group removal, even though the lower rdtgroup_assign_cntr_event()/unassign path already uses mevt->rid.

Would it be better for assign/unassign to iterate enabled MBM events and check assignment state on each event's resource, instead of pre-gating on L3?

@nvidia-bfigg nvidia-bfigg force-pushed the 24.04_linux-nvidia-6.17-next branch 2 times, most recently from 7a62271 to 51267da Compare June 19, 2026 12:02
@nirmoy

nirmoy commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Boro watcher review skipped

The GitHub watcher skips automatic boro reviews for PRs with more than 50 commits. This PR currently has 100 commits.

To run the review anyway, ask BaseOS_Kernel_Bot in #baseos-kernel:

review https://github.com/NVIDIA/NV-Kernels/pull/466

Head: aee0108482c3

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher sees a newer PR head.

@fyu1

fyu1 commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator Author

@fyu1

Two findings with codex (may be due to lack of understanding about an underlying assumption with this MPAM code and L3 resources):

Can you clarify how the memory-level-only MBWU case keeps resctrl monitoring enabled?

With this patch, mpam_resctrl_monitor_init() binds QOS_L3_MBM_TOTAL_EVENT_ID to RDT_RESOURCE_MBA when the selected MBWU class is memory-level, and then sets r->mon_capable on MBA. But resctrl_arch_mon_capable() still returns only RDT_RESOURCE_L3.mon_capable, and that helper gates monitor domain setup and the resctrl mount monitor files.

So on a platform with mbm_total_bytes only from a memory-level MSC, and no L3 monitor event setting L3.mon_capable, it looks like mbm_total_bytes gets enabled on MBA but resctrl_arch_mon_capable() remains false, so mon_data/info files are not created.

If the intended topology always has an L3 monitor too, can you say that in the commit message? Otherwise I think this needs the missing global/exposed monitor-capable handling, or L3 needs to remain marked as the global monitor-capable resource even when mbm_total_bytes is exposed under MB.

Fixed. Updated resctrl_arch_mon_capable() to scan both MBA and L3 resources to check if the resource is monitor capable.

Can you also check the auto-assign/unassign path for MB-backed mbm_total_bytes?

With this patch, mpam_resctrl_monitor_init_abmc() can set mbm_cntr_assignable and mbm_assign_on_mkdir on RDT_RESOURCE_MBA. But rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() first gate everything on RDT_RESOURCE_L3 assignment state before they process the individual MBM events.

That means an MB-backed mbm_total_bytes event may not be auto-assigned for new groups, and assigned MB counters may not be released on group removal, even though the lower rdtgroup_assign_cntr_event()/unassign path already uses mevt->rid.

Would it be better for assign/unassign to iterate enabled MBM events and check assignment state on each event's resource, instead of pre-gating on L3?

Fixed. assign/unassign hardware counter for all assign enabled resources including MBA and L3.

@nvmochs

nvmochs commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

@fyu1 Please rebase this branch.

Also, two follow-up comments after re-reviewing the latest with codex...


I think the monitor-capable fix is still incomplete for the MB-only monitor case.

The new resctrl_arch_mon_capable() now checks MBA as well as L3, which fixes callers that go through that hook. But resctrl_mon_init() still directly checks RDT_RESOURCE_L3.mon_capable and returns before it initializes the MBA monitor resource:

      r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
      if (!r->mon_capable)
          return 0;

So on a platform where mbm_total_bytes is backed only by a memory-level MSC, MBA.mon_capable can be true while L3.mon_capable is false. In that case resctrl_mon_init() will skip resctrl_mon_resource_init(MBA), so the MB_MON monitor/assignment file flags may never be initialized.

Can you either update resctrl_mon_init() to use the new any-monitor-capable logic / initialize all mon_capable resources, or explain where L3.mon_capable is guaranteed to be set for the MB-only case?


Lower priority than the resctrl_mon_init() issue, but can you check CDP handling for MBA-backed ABMC? If cdp can be enabled with MB-backed mbm_total_bytes, MBA.mon.num_mbm_cntrs also needs to be resynced when CDP changes; otherwise the assignment code may expose/allocate too many counters.

@fyu1

fyu1 commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator Author

@fyu1 Please rebase this branch.

Also, two follow-up comments after re-reviewing the latest with codex...

I think the monitor-capable fix is still incomplete for the MB-only monitor case.

The new resctrl_arch_mon_capable() now checks MBA as well as L3, which fixes callers that go through that hook. But resctrl_mon_init() still directly checks RDT_RESOURCE_L3.mon_capable and returns before it initializes the MBA monitor resource:

      r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
      if (!r->mon_capable)
          return 0;

So on a platform where mbm_total_bytes is backed only by a memory-level MSC, MBA.mon_capable can be true while L3.mon_capable is false. In that case resctrl_mon_init() will skip resctrl_mon_resource_init(MBA), so the MB_MON monitor/assignment file flags may never be initialized.

Can you either update resctrl_mon_init() to use the new any-monitor-capable logic / initialize all mon_capable resources, or explain where L3.mon_capable is guaranteed to be set for the MB-only case?

Lower priority than the resctrl_mon_init() issue, but can you check CDP handling for MBA-backed ABMC? If cdp can be enabled with MB-backed mbm_total_bytes, MBA.mon.num_mbm_cntrs also needs to be resynced when CDP changes; otherwise the assignment code may expose/allocate too many counters.

Hi, Matt,

Every platform (x86 and ARM) has L3 allocation and monitoring. So in reality the r->mon_capable is always true.
But agree this code is not strictly correct.
Do we need to fix all of these kind of "coding" issues in 6.17? I'm checking every upstream patch to make sure there is no flag from claude/codex in LKML.

fyu1 added 2 commits June 22, 2026 19:41
This patch fixes a few issues:

1. To enable MBW total bytes, continue MBWU counter selection after
CSU setup. This follows upstream 7.1.

2. Use mbm_cntr_assignable for assignment enablement
so occupancy and MBWU sharing a class do not break mbm_L3_assignments.
This follows commit 4766d7e https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4

3. Always bind picked MBWU to mbm_total_bytes.
This follows commit b2bbf3b https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4

4. Rebind mbm_total_bytes at monitor init to the resource that owns the
MBWU class via resctrl_mon_event_set_resource(), so memory-level MSC
platforms expose mbm_total_bytes under mon_MB_* and mbm_MB_assignments.
This fix is necessary when only MBW total bytes (i.e. no MBW local
bytes) is supported.

Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
… events

When mbm_total_bytes is exposed on MBA for memory-level MSC monitors,
resctrl_arch_mon_capable() must reflect mon_capable on the backing
resource, not only L3. Also gate MBM counter auto-assign and unassign
on each enabled event's resource so MBA-backed mbm_total_bytes gets
ABMC setup on group create and cleanup on group delete.

Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
@nvmochs

nvmochs commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Hi, Matt,

Every platform (x86 and ARM) has L3 allocation and monitoring. So in reality the r->mon_capable is always true. But agree this code is not strictly correct.

Thanks, that answers the functional concern. If all 6.17 MPAM targets with MB-backed mbm_total_bytes also have L3 monitoring, then the resctrl_mon_init() early return should not break the supported platforms.

Do we need to fix all of these kind of "coding" issues in 6.17? I'm checking every upstream patch to make sure there is no flag from claude/codex in LKML.

Since 6.17 is going EOL in the near future, likely not. I think it comes down to if there is a functional impact or not. It sounds like there is not a functional concern for these two latest findings.

@fyu1 fyu1 force-pushed the 24.04_linux-nvidia-6.17-next.mpam.extras.fixes4 branch from aee0108 to 334c7d7 Compare June 22, 2026 19:54
@fyu1

fyu1 commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator Author

Hi, Matt,

Lower priority than the resctrl_mon_init() issue, but can you check CDP handling for MBA-backed ABMC? If cdp can be enabled with MB-backed mbm_total_bytes, MBA.mon.num_mbm_cntrs also needs to be resynced when CDP changes; otherwise the assignment code may expose/allocate too many counters.

Grace and Vera don't support CDP.

@fyu1

fyu1 commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator Author

Hi, Matt and Carol

@fyu1 Please rebase this branch.

Rebased.

@nvmochs

nvmochs commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

No further issues from me.

Acked-by: Matthew R. Ochs <mochs@nvidia.com>

@clsotog

clsotog commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Acked-by: Carol L Soto <csoto@nvidia.com>

@nirmoy nirmoy added has_2_acks and removed help wanted Extra attention is needed has_1_ack labels Jun 22, 2026
@nvmochs nvmochs changed the title Please MPAM fix for 6.17 hwe: NVIDIA: SAUCE: arm_mpam: Fix mbm_total_bytes and MBM counter assignment [24.04-linux-nvidia-6.17] NVIDIA: SAUCE: arm_mpam: Fix mbm_total_bytes and MBM counter assignment Jun 22, 2026
@nvmochs

nvmochs commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Merged, closing PR.

4a7f97ea70be (nnoble/nvidia-6.17-next) NVIDIA: SAUCE: arm_mpam: Fix monitor capability and MBM assign for MB events
545a2db4f054 NVIDIA: SAUCE: arm_mpam: Fix mbm_total_bytes and MBM counter assignment

@nvmochs nvmochs closed this Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants