Skip Tool Normalization for GPT-OSS/Harmony Templates#1069

Open: sayanshaw24 wants to merge 5 commits into main from sayanshaw/harmony

@sayanshaw24
Collaborator

Skip tool normalization for GPT-OSS/Harmony chat templates

Summary

GPT-OSS (Harmony) models use chat templates that access tool.function directly, expecting the raw OpenAI tool format. The existing NormalizeTools() logic unwraps the function object from tools, which breaks these templates. This PR adds auto-detection to skip normalization when the template expects the original OpenAI structure.

Problem

When tools are passed to ApplyChatTemplate, they go through NormalizeTools(), which flattens the OpenAI {type: "function", function: {name, description, parameters}} format into a simpler structure. This works for Phi-4 and Qwen templates, which expect flattened tools, but breaks GPT-OSS/Harmony templates, which access tool.function.name, tool.function.description, etc. directly.
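
To make the mismatch concrete, here is a minimal sketch of the two tool shapes using plain structs instead of JSON. The type names, FlattenTools(), and the exact flattened layout are assumptions for illustration; the PR does not show what NormalizeTools() actually emits beyond saying that it unwraps the function object.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the JSON shapes described above.
struct FunctionDef {
  std::string name;
  std::string description;
  std::string parameters;  // JSON schema kept as a string for brevity
};

// OpenAI format: {type: "function", function: {name, description, parameters}}
struct OpenAITool {
  std::string type;
  FunctionDef function;
};

// Assumed flattened shape: the inner function object hoisted to the top level.
using FlatTool = FunctionDef;

// Illustrative flattening step (not the real NormalizeTools()).
std::vector<FlatTool> FlattenTools(const std::vector<OpenAITool>& tools) {
  std::vector<FlatTool> flat;
  flat.reserve(tools.size());
  for (const OpenAITool& tool : tools) {
    flat.push_back(tool.function);
  }
  return flat;
}

void DemoFlattenTools() {
  const std::vector<OpenAITool> raw = {
      {"function", {"get_weather", "Look up the weather", "{}"}}};
  const std::vector<FlatTool> flat = FlattenTools(raw);

  // A Harmony-style template reads raw[0].function.name.
  assert(raw[0].function.name == "get_weather");
  // After flattening, the name lives at the top level and there is no
  // "function" member left for such a template to access.
  assert(flat[0].name == "get_weather");
}
```

Under this assumption, a template written against the flattened form reads tool.name, while a Harmony-style template reads tool.function.name, so the two cannot share one normalized input.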

Solution

Added a template-content-based heuristic in chat_template.cc:

  • Scan the activated template string for "tool.function" references
  • Exclude false positives from "tool_call.function" (which is a different pattern)
  • If found, set skip_tool_normalization = true and pass tools as raw JSON without unwrapping

This is backward-compatible: existing Phi-4 and Qwen templates don't match the heuristic and continue to use NormalizeTools() as before.
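
The detection steps above can be sketched as a standalone function. This follows the same logic as the loop in chat_template.cc quoted in the review, but the function name TemplateExpectsRawTools is hypothetical; the PR inlines the check and sets skip_tool_normalization directly.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper name, used here only for illustration.
// Returns true when the template text contains "tool.function" and the
// five characters immediately before that match are not "call.".
bool TemplateExpectsRawTools(const std::string& tmpl_str) {
  static const std::string kNeedle = "tool.function";
  static const std::string kExcludedPrefix = "call.";

  size_t pos = 0;
  while ((pos = tmpl_str.find(kNeedle, pos)) != std::string::npos) {
    const bool has_excluded_prefix =
        pos >= kExcludedPrefix.size() &&
        tmpl_str.compare(pos - kExcludedPrefix.size(), kExcludedPrefix.size(),
                         kExcludedPrefix) == 0;
    if (!has_excluded_prefix) {
      return true;  // Template accesses tool.function directly.
    }
    pos += kNeedle.size();  // Skip this match and keep scanning.
  }
  return false;
}

void DemoTemplateDetection() {
  // Harmony-style access: raw OpenAI tools are expected.
  assert(TemplateExpectsRawTools(
      "{% for tool in tools %}{{ tool.function.name }}{% endfor %}"));
  // Only tool_call.function appears: normalization stays enabled.
  assert(!TemplateExpectsRawTools("{{ tool_call.function.name }}"));
  // Flattened access: normalization stays enabled.
  assert(!TemplateExpectsRawTools("{{ tool.name }}"));
}
```

Because the check is a plain substring scan, it depends on the template naming its loop variable tool; a template that iterates with a different variable name would not be detected by this sketch.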

Files Changed

| File | Change |
| --- | --- |
| shared/api/chat_template.cc | Skip NormalizeTools() for templates that access tool.function directly |
| test/pp_api_test/test_tokenizer_chat.cc | Comprehensive test cases for Harmony tool calling (template rendering, multi-tool, tool results) |

Testing

  • Unit tests: Added test cases in test_tokenizer_chat.cc covering single tool, multi-tool, and tool result round-trip scenarios with a Harmony-style chat template
  • E2E validation: Full tool-calling round-trip tested against gpt-oss-20b-generic-cpu in Foundry Local, together with the corresponding server-side ToolCallConfig changes (to be published in upcoming PRs):
    1. User sends chat completion request with tools → model generates Harmony-format tool call → server parses and returns OpenAI-format tool_calls response
    2. Tool result fed back → model generates natural language final answer
    • Both calls completed successfully, validating the full pipeline from tokenizer template rendering through GenAI inference to response parsing

@sayanshaw24 sayanshaw24 marked this pull request as ready for review June 4, 2026 23:02
@sayanshaw24 sayanshaw24 requested a review from a team as a code owner June 4, 2026 23:02
Copilot AI review requested due to automatic review settings June 4, 2026 23:02
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the chat-templating pipeline to support GPT-OSS/Harmony templates that expect the original OpenAI tools shape ({ type: "function", function: {...} }) by conditionally skipping the existing tool “flattening” (NormalizeTools()) when the activated template references tool.function.

Changes:

  • Add template-content detection in ApplyChatTemplate() to decide whether to skip tool normalization for Harmony/GPT-OSS templates.
  • Add GPT-OSS/Harmony-focused unit tests covering basic rendering, tool definitions, tool-call flow, and regression coverage for non-Harmony templates.
  • Add GPT-OSS test assets (tokenizer_config.json, chat_template.jinja) to drive the new tests.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| shared/api/chat_template.cc | Adds heuristic to bypass NormalizeTools() for templates that use tool.function. |
| test/pp_api_test/test_tokenizer_chat.cc | Adds GPT-OSS/Harmony template tests and a regression test ensuring Qwen normalization still applies. |
| test/data/gpt-oss/tokenizer_config.json | Introduces GPT-OSS tokenizer config used by the new tests. |
| test/data/gpt-oss/chat_template.jinja | Adds a Harmony-style chat template that accesses tool.function and renders tools in a TS namespace. |

Comment on lines +423 to +432
// Look for "tool.function" but exclude "tool_call.function" matches
size_t pos = 0;
while ((pos = tmpl_str.find("tool.function", pos)) != std::string::npos) {
  // Check that this isn't part of "tool_call.function"
  if (pos < 5 || tmpl_str.substr(pos - 5, 5) != "call.") {
    skip_tool_normalization = true;
    break;
  }
  pos += 13;
}
Comment on lines +443 to +445
// GPT-OSS/Harmony: parse tools as-is without normalization
json tools_json = json::parse(message_obj["tools"].get<std::string>().c_str());
message_obj["tools"] = tools_json;
Comment on lines +460 to +461
// GPT-OSS/Harmony: pass raw tools without normalization
tools_json = json::parse(tools_str.c_str());