Serialize stored shell tool calls correctly on susbequent requests#41
Conversation
Greptile SummaryThis PR fixes a serialization bug where
Confidence Score: 5/5Safe to merge — the rewrite is well-scoped, only touches function-mode provider paths, and is guarded by an existing openai_keep_native_shell gate for native passthrough; a focused integration test covers the round-trip. The core logic is straightforward: iterate input items, swap two variant types, flatten array output to a string. The output text format matches the live executor for the common single-chunk case. The only discrepancy found is cosmetic — timed-out calls reconstruct exit_code 124 instead of -1 — which is unlikely to affect downstream model behavior. The render_shell_output_text timeout branch in src/services/shell_tool.rs is worth a second look given the exit-code mismatch with the live executor path. Important Files Changed
Sequence DiagramsequenceDiagram
participant Client
participant Hadrian
participant Provider
Note over Client,Provider: Turn 1 (function mode)
Client->>Hadrian: POST /responses (shell tool declared)
Hadrian->>Provider: "function_call { name: shell, arguments: {...} }"
Provider-->>Hadrian: shell executes
Hadrian-->>Client: shell_call + shell_call_output (array output, persisted)
Note over Client,Provider: Turn 2 - continuation (previous_response_id)
Client->>Hadrian: POST /responses (previous_response_id)
Hadrian->>Hadrian: output_item_to_input() replays shell_call/shell_call_output verbatim
Hadrian->>Hadrian: preprocess_shell_tools() calls rewrite_shell_history_to_function_calls()
Note right of Hadrian: ShellCall to OutputFunctionCall, ShellCallOutput to FunctionCallOutput, array output flattened to string
Hadrian->>Provider: function_call + function_call_output (string output)
Provider-->>Client: next model response
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
src/services/shell_tool.rs:838-843
**Timeout exit-code mismatch in reconstructed history**
The live executor writes `exit_code: {exit_for_report}` into the continuation text blob, where `exit_for_report = final_exit.unwrap_or(-1)`. A killed/timeout call has `final_exit = None`, so the model originally saw `exit_code: -1`. On continuation, `render_shell_output_text` maps `ShellCallOutcome::Timeout` to `124` — a `timeout(1)` sentinel — so the reconstructed history tells the model `exit_code: 124` rather than what it saw. The comment ("matches how the live loop reports a killed call") is therefore inaccurate. Most models tolerate any non-zero exit code for a failed call, so this is unlikely to change behavior, but the infidelity is worth noting for future multi-turn debugging.
Reviews (3): Last reviewed commit: "Review fixes" | Re-trigger Greptile |
No description provided.