
feat(openai-agents): stream tool-call argument deltas to the UI #378

Closed
danielmillerp wants to merge 1 commit into next from dm/oai-tool-arg-delta-streaming


@danielmillerp
Contributor

@danielmillerp danielmillerp commented May 29, 2026

What

Stream OpenAI Agents SDK tool-call arguments live as the model generates them, so the UI shows the tool name and arguments building incrementally instead of appearing all at once.

Why / finding

TemporalStreamingModel.get_response already parsed the Responses API response.function_call_arguments.delta events but only accumulated them, so tool calls surfaced later, all at once, via the hooks layer. This PR wires the deltas into the same agentex streaming path that the text deltas already use.

Change

models/temporal_streaming_model.py:

  • On a function_call item announce (ResponseOutputItemAddedEvent): open a streaming_task_message_context with an initial ToolRequestContent (name + call_id, empty args), stashed per output_index.
  • On each ResponseFunctionCallArgumentsDeltaEvent: push a StreamTaskMessageDelta(ToolRequestDelta(arguments_delta=...)) (name/call_id pulled from the per-call record — the delta event carries neither).
  • On item-done / end-of-loop / error: close the per-call contexts (guarded so a partial-JSON close can't crash the activity).
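The three steps above can be sketched in isolation. This is a minimal illustration, not the PR's code: `ItemAdded`, `ArgsDelta`, `ItemDone`, and `FakeContext` are hypothetical stand-ins for the Responses API events and for `streaming_task_message_context`, the update payload is a plain dict rather than a `StreamTaskMessageDelta`, and a single `finally` block stands in for the separate end-of-loop and error-path cleanup.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ItemAdded:  # stand-in for ResponseOutputItemAddedEvent (function_call)
    output_index: int
    name: str
    call_id: str


@dataclass
class ArgsDelta:  # stand-in for ResponseFunctionCallArgumentsDeltaEvent
    output_index: int
    delta: str


@dataclass
class ItemDone:  # stand-in for ResponseOutputItemDoneEvent
    output_index: int


@dataclass
class FakeContext:
    """Hypothetical stand-in for streaming_task_message_context."""
    name: str
    call_id: str
    sent: list = field(default_factory=list)
    closed: bool = False

    async def stream_update(self, update: dict) -> None:
        self.sent.append(update)

    async def close(self) -> None:
        self.closed = True


async def safe_close(ctx: FakeContext) -> None:
    # Guarded close: a failure while finalizing must not fail the caller.
    try:
        await ctx.close()
    except Exception:
        pass


async def stream_tool_calls(events) -> list:
    contexts: dict = {}  # open contexts keyed by output_index
    finished: list = []
    try:
        for event in events:
            if isinstance(event, ItemAdded):
                contexts[event.output_index] = FakeContext(event.name, event.call_id)
            elif isinstance(event, ArgsDelta):
                ctx = contexts.get(event.output_index)
                if ctx is not None:
                    # The delta carries no name/call_id; read them from the record.
                    await ctx.stream_update({
                        "name": ctx.name,
                        "tool_call_id": ctx.call_id,
                        "arguments_delta": event.delta,
                    })
            elif isinstance(event, ItemDone):
                ctx = contexts.pop(event.output_index, None)
                if ctx is not None:
                    await safe_close(ctx)
                    finished.append(ctx)
    finally:
        # End of loop or error: close anything still open.
        for ctx in contexts.values():
            await safe_close(ctx)
            finished.append(ctx)
        contexts.clear()
    return finished


def run_stream_demo() -> list:
    events = [
        ItemAdded(0, "get_weather", "call_1"),
        ArgsDelta(0, '{"city": '),
        ArgsDelta(0, '"Paris"}'),
        ItemDone(0),
        ItemAdded(1, "get_time", "call_2"),  # never receives ItemDone
        ArgsDelta(1, "{}"),
    ]
    return asyncio.run(stream_tool_calls(events))
```

Keying the records by `output_index` lets interleaved deltas for several tool calls land in the right context, and the final sweep covers calls whose item-done event never arrives.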

hooks/hooks.py:

  • on_tool_start no longer emits a duplicate ToolRequestContent (the model layer now streams it live — emitting again would double-render). on_tool_end's ToolResponseContent (the result) is unchanged.
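A rough sketch of the resulting division of labour, under simplifying assumptions: `SketchHooks` is a hypothetical, synchronous stand-in for the real (async) hooks class, `emit` is an invented callable that represents sending content to the UI, and the dict messages are placeholders for `ToolRequestContent` and `ToolResponseContent`.

```python
import logging

logger = logging.getLogger("hooks_sketch")


class SketchHooks:
    """Hypothetical stand-in for the hooks class; not the real API."""

    def __init__(self, emit):
        self._emit = emit  # callable that delivers one message to the UI

    def on_tool_start(self, tool_name: str) -> None:
        # The model layer already streamed the tool request, so only log here.
        logger.debug("tool started: %s", tool_name)

    def on_tool_end(self, tool_name: str, result: str) -> None:
        # The tool result is still emitted from the hooks layer.
        self._emit({"type": "tool_response", "name": tool_name, "content": result})


def run_hooks_demo() -> list:
    ui: list = []
    # One message represents the request the model layer streamed live.
    ui.append({"type": "tool_request", "name": "get_weather"})
    hooks = SketchHooks(ui.append)
    hooks.on_tool_start("get_weather")
    hooks.on_tool_end("get_weather", "sunny")
    return ui
```

With this split the UI receives exactly one tool request and one tool response per call; if `on_tool_start` also emitted a request, the list would contain two request entries for the same call.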

Invariant

The returned ModelResponse / output_items / usage / response_id assembly is untouched — streaming is a side-effect only.

Validation

Existing plugin tests pass (test_streaming_model.py + test_convert_tools.py: 43 passed); ruff is clean; the modules import without errors.

🤖 Generated with Claude Code

Greptile Summary

This PR wires tool-call argument deltas from the OpenAI Responses API stream directly to the UI, so the tool name and arguments build incrementally instead of appearing all at once. It also removes the now-redundant ToolRequestContent emission from the on_tool_start hook to avoid double-rendering.

  • temporal_streaming_model.py: On a function_call ResponseOutputItemAddedEvent, a streaming_task_message_context is opened with an initial ToolRequestContent(arguments={}) and stored in a new tool_call_contexts dict keyed by output_index. Each ResponseFunctionCallArgumentsDeltaEvent pushes a ToolRequestDelta through that context, and ResponseOutputItemDoneEvent (or post-loop/exception-path cleanup) closes it.
  • hooks/hooks.py: on_tool_start no longer calls stream_lifecycle_content with a ToolRequestContent; it now only emits a debug log. on_tool_end is unchanged.

Confidence Score: 4/5

The streaming side effect is well guarded and does not touch the ModelResponse assembly path, so the core output is unchanged and the risk of regression is low.

The new tool-call streaming follows the established text/reasoning context pattern closely, cleanup is handled in three places (item-done, post-loop, error path), and the on_tool_start hook removal is correct. The only rough edge is the locals() dance in the exception handler, which is functional but harder to maintain than initializing the dict before the try block.

temporal_streaming_model.py exception handler (lines 1123–1129) — the locals() pattern should be reviewed, but it does not affect correctness.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/agentex/lib/core/temporal/plugins/openai_agents/models/temporal_streaming_model.py | Adds live streaming of tool-call argument deltas by opening a streaming_task_message_context per function call and pushing ToolRequestDelta updates; contexts are closed in ResponseOutputItemDoneEvent, post-loop cleanup, and an exception handler that uses locals() to guard against early-throw scenarios. |
| src/agentex/lib/core/temporal/plugins/openai_agents/hooks/hooks.py | Removes the ToolRequestContent emission from on_tool_start (now handled by the model layer) and replaces the activity call with a debug log; on_tool_end's ToolResponseContent emission is untouched. |

Sequence Diagram

sequenceDiagram
    participant OAI as OpenAI Responses API
    participant TSM as TemporalStreamingModel
    participant Ctx as streaming_task_message_context
    participant UI as Agentex UI
    participant Hook as TemporalStreamingHooks

    OAI->>TSM: ResponseOutputItemAddedEvent (function_call)
    TSM->>Ctx: "__aenter__() with ToolRequestContent(arguments={})"
    Ctx->>UI: "initial ToolRequestContent streamed"

    loop For each argument delta
        OAI->>TSM: ResponseFunctionCallArgumentsDeltaEvent(delta)
        TSM->>Ctx: "stream_update(StreamTaskMessageDelta(ToolRequestDelta))"
        Ctx->>UI: incremental argument delta
    end

    OAI->>TSM: ResponseFunctionCallArgumentsDoneEvent
    OAI->>TSM: ResponseOutputItemDoneEvent (function_call)
    TSM->>Ctx: close()

    Note over Hook: on_tool_start() only logs now
    Note over Hook: no longer emits ToolRequestContent

    Hook->>UI: "on_tool_end: stream_lifecycle_content(ToolResponseContent)"


Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
src/agentex/lib/core/temporal/plugins/openai_agents/models/temporal_streaming_model.py:1123-1129
**`locals()` cleanup is fragile and redundant with the inline init**

`tool_call_contexts` is initialized unconditionally at line 676 inside the same `try` block, so by the time any exception is raised *during streaming* it is always in scope. The only scenario where it is absent is an exception thrown before line 676 (e.g., during `responses.create()`), in which case the dict is empty anyway and no cleanup is needed. Moving `tool_call_contexts: dict[int, Any] = {}` to just before the `try` at line 541 — alongside `streaming_context = None` / `reasoning_context = None` — would let the error handler use the variable directly (`for ctx in tool_call_contexts.values():` and `tool_call_contexts.clear()`) and removes the two `locals()` calls entirely.
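
The suggested shape can be illustrated with a small sketch. It is not the file's actual code: `Ctx` and `run` are hypothetical names, and the two `raise` statements merely simulate a failure before streaming starts and a failure during streaming.

```python
class Ctx:
    """Hypothetical stand-in for an open streaming context."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def run(fail_before_stream: bool, opened: list) -> str:
    # Initialized before the try, so the handler can use it unconditionally.
    tool_call_contexts: dict = {}
    try:
        if fail_before_stream:
            raise RuntimeError("create failed")  # simulated early failure
        ctx = Ctx()
        opened.append(ctx)
        tool_call_contexts[0] = ctx
        raise RuntimeError("stream failed")  # simulated mid-stream failure
    except RuntimeError as exc:
        # No locals() lookup needed: the dict always exists, possibly empty.
        for ctx in tool_call_contexts.values():
            ctx.close()
        tool_call_contexts.clear()
        return str(exc)
```

In the early-failure case the loop simply iterates over an empty dict, which is why the `locals()` guard adds nothing.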

Reviews (1): Last reviewed commit: "feat(openai-agents): stream tool-call ar..."

The TemporalStreamingModel already parsed the Responses API
function_call_arguments deltas but only accumulated them — tool calls reached
the UI all-at-once via the hooks layer. Now, as the model generates a tool
call, the model layer opens a ToolRequestContent streaming context per call
(keyed by output_index) and pushes ToolRequestDelta chunks as the arguments
arrive, closing on item-done (mirrors the existing text-delta path).

on_tool_start no longer emits a duplicate ToolRequestContent (the model layer
streams it live now); on_tool_end still emits the ToolResponseContent result.
The returned ModelResponse/output_items/usage assembly is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@danielmillerp
Contributor Author

Closing as a duplicate of #355 (@vkalmathscale, feat(streaming): stream tool call argument deltas in TemporalStreamingModel), which already implements tool-call argument delta streaming in the same file. Use #355. (One thing worth checking on #355: whether TemporalStreamingHooks.on_tool_start still also emits a full ToolRequestContent — if so the tool request could double-render now that the model streams it live; that hooks guard was the only extra bit here.)
