Update gather reduce kernel aligning with tpu inference #4035

Draft: NuojCheng wants to merge 2 commits into main from chengnuojjin-update-gather-reduce

NuojCheng (Collaborator) commented Jun 1, 2026

Description

This PR:

  • Updates the ragged gather reduce kernel to match the current vLLM implementation.
  • Adds cost estimates to the kernels for potentially better XLA scheduling.
  • Temporarily disables the ragged_gather_reduce kernel for the ring of attention backward pass.
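
For reviewers unfamiliar with kernel cost estimates, here is a minimal sketch of the idea, not the code in this PR. It assumes the kernel gathers `top_k` rows per token and sums them into one output row. The helper name `estimate_gather_reduce_cost` and all of its parameters are hypothetical. The three returned fields mirror the `flops`, `transcendentals`, and `bytes_accessed` fields that, to my understanding, `jax.experimental.pallas.CostEstimate` takes.

```python
def estimate_gather_reduce_cost(num_tokens, top_k, hidden_size, itemsize, index_itemsize=4):
    """Rough cost of gathering top_k rows per token and summing them.

    Hypothetical helper for illustration; the actual kernel may count
    work differently (e.g. weighting, padding, or tiling overhead).
    """
    gathered_rows = num_tokens * top_k
    # Summing top_k rows into one output row takes (top_k - 1) adds per element.
    flops = num_tokens * (top_k - 1) * hidden_size
    # Reads: the gathered rows plus one index per gathered row.
    bytes_read = gathered_rows * hidden_size * itemsize + gathered_rows * index_itemsize
    # Writes: one reduced row per token.
    bytes_written = num_tokens * hidden_size * itemsize
    return {
        "flops": flops,
        "transcendentals": 0,  # a pure gather-and-add uses no exp/log/etc.
        "bytes_accessed": bytes_read + bytes_written,
    }


# Example: 4 tokens, top-2 routing, hidden size 8, 2-byte (bf16-sized) elements.
cost = estimate_gather_reduce_cost(num_tokens=4, top_k=2, hidden_size=8, itemsize=2)
```

In a Pallas kernel, values like these would typically be passed through the `cost_estimate` argument of `pl.pallas_call`, which gives the XLA scheduler a hint about how expensive the custom call is relative to surrounding ops.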

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov Bot commented Jun 1, 2026

Codecov Report

❌ Patch coverage is 7.40741% with 50 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/maxtext/kernels/ragged/ragged_gather_reduce.py | 8.51% | 43 Missing ⚠️ |
| src/maxtext/kernels/ragged/ragged_gather.py | 0.00% | 4 Missing ⚠️ |
| src/maxtext/kernels/ragged/ragged_sort.py | 0.00% | 3 Missing ⚠️ |

NuojCheng force-pushed the chengnuojjin-update-gather-reduce branch from 7f3aa8c to 1fbbdbb on June 2, 2026 03:33