Skip to content

Add support of Grain input pipeline for DPO.#4009

Open
igorts-git wants to merge 1 commit into
mainfrom
igorts/dpo-grain-pipeline
Open

Add support of Grain input pipeline for DPO.#4009
igorts-git wants to merge 1 commit into
mainfrom
igorts/dpo-grain-pipeline

Conversation

@igorts-git
Copy link
Copy Markdown
Collaborator

@igorts-git igorts-git commented May 28, 2026

Description

Add support of Grain input reading when using the new Tunix-based DPO/ORPO.
Modify dpo.yml to have some reasonable defaults that allow invoking DPO without extra config parameters.

Correct the case where config.dpo.max_prompt_length is not set. The default value is max_target_length // 2. It should be the same value that is passed to both the input pipeline and to the Tunix DPOTrainer class. Thus, I moved its computation to types.py.

In a follow up PR I will add a detailed logits comparison test for DPO.

BUGS: b/485626968

Tests

CI tests.
Ran DPO/ORPO while reading using Grain from parquet files.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 78.26087% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...rc/maxtext/input_pipeline/grain_data_processing.py 70.58% 2 Missing and 3 partials ⚠️

📢 Thoughts on this report? Let us know!

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

🤖 Hi @igorts-git, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

## 📋 Review Summary

This pull request adds Grain input pipeline support for DPO (Direct Preference Optimization). The implementation is consistent with the project's existing Grain-based SFT patterns and includes comprehensive unit tests for different DPO data formats.

🔍 General Feedback

  • Good Coverage: The addition of TestGrainDPOPipelineProcessing in dpo_data_processing_test.py covers key edge cases like 2-column (common prefix) and 3-column datasets.
  • Defensive Design: The validation check in src/maxtext/configs/types.py prevents unsupported configurations early.
  • Defaults: The updated defaults in dpo.yml provide a smoother "out-of-the-box" experience for Tunix-based DPO.
  • Batching Consistency: It is recommended to use the get_local_batch_size utility to ensure all global configuration flags (like real data expansion) are respected during batching.

Comment thread src/maxtext/input_pipeline/grain_data_processing.py
@igorts-git igorts-git force-pushed the igorts/dpo-grain-pipeline branch from 3b27bbf to 86f715f Compare June 2, 2026 16:39
@igorts-git igorts-git force-pushed the igorts/dpo-grain-pipeline branch from 86f715f to 89d9e3c Compare June 2, 2026 17:08
max_prompt_length: int | None = None

def __post_init__(self):
if self.max_prompt_length is None:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic that I added in types.py guarantees that this value is not None. However, for cases like unit tests it is still useful to have this default computed. Let' me know if you want it removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant