fix(llm-client): sanitize non-ASCII dynamic request headers (PC-4776) #97

Open
cotovanu-cristian wants to merge 1 commit into main from fix/pc-4776-httpx-unicode-headers

cotovanu-cristian commented Jun 24, 2026

Collaborator

Summary

Fixes PC-4776. The LLM httpx wrapper crashed with UnicodeEncodeError: 'ascii' codec can't encode characters ... while injecting dynamic request headers with non-ASCII values into an outbound request — before any request left the process.

In UiPathHttpxClient.send() / UiPathHttpxAsyncClient.send(), dynamic headers were injected via request.headers.update(dynamic_headers). httpx re-encodes str header values as ASCII in _normalize_header_value and raises on any non-ASCII character, so the job faulted (miscategorized as Unknown) on its first LLM call. Observed in SRE-610701, job 77c6f180-9ac0-4c72-a997-05d61ce73963 (12 failures).
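
For reviewers who want to see the failure mode in isolation, here is a minimal stdlib-only sketch. `normalize_header_value` is a hypothetical stand-in written for illustration; it is assumed to mirror the behavior described above (str values encoded as ASCII when no encoding is set, bytes passed through) and is not httpx's actual code.

```python
def normalize_header_value(value, encoding=None):
    """Hypothetical stand-in: encode str header values, pass bytes through unchanged."""
    if isinstance(value, bytes):
        return value
    # With no explicit encoding, fall back to ASCII (the assumed default).
    return value.encode(encoding or "ascii")


def try_normalize(value):
    """Return the encoded bytes, or a short description of the encoding failure."""
    try:
        return normalize_header_value(value)
    except UnicodeEncodeError as exc:
        return f"{type(exc).__name__}: {exc.reason}"


if __name__ == "__main__":
    print(try_normalize("plain-ascii"))   # encodes without error
    print(try_normalize("José"))          # hits the ASCII encode failure
    print(try_normalize(b"Jos\xe9"))      # pre-encoded bytes are passed through
```

The third call is the property the fix relies on: a value that is already bytes is never re-encoded, so pre-encoding sidesteps the ASCII step entirely.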

Fix

Sanitize dynamic header values before injection (new encode_header_value / encode_header_items in utils/headers.py):

  • latin-1 (ISO-8859-1) bytes when the value is representable — this is the native HTTP/1.1 header-octet encoding, which httpx passes through verbatim and the gateway decodes losslessly.
  • ASCII percent-encoding as a fallback for values outside latin-1 (e.g. CJK/emoji), so the send path can never crash regardless of input.
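
The two-tier strategy above could be sketched roughly as follows. The function names are taken from this description, but the bodies are an assumption and not the code in utils/headers.py; in particular, percent-encoding the whole value as UTF-8 with `urllib.parse.quote(value, safe="")` is one plausible choice among several.

```python
from urllib.parse import quote


def encode_header_value(value: str) -> bytes:
    """Sketch: latin-1 bytes when representable, else ASCII percent-encoding."""
    try:
        return value.encode("latin-1")
    except UnicodeEncodeError:
        # Assumption: the fallback percent-encodes the UTF-8 bytes of the
        # entire value, so the result is pure ASCII.
        return quote(value, safe="").encode("ascii")


def encode_header_items(headers: dict[str, str]) -> list[tuple[str, bytes]]:
    """Sketch: encode every value in a header mapping, keeping names as-is."""
    return [(name, encode_header_value(value)) for name, value in headers.items()]


if __name__ == "__main__":
    # "X-User-Name" and "X-Title" are made-up header names for illustration.
    print(encode_header_items({"X-User-Name": "José", "X-Title": "日本"}))
```

Because both branches return bytes, neither path can reach the ASCII re-encoding step that raised.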

Also set request.headers.encoding = "latin-1" at the injection site so httpx's own read-back / cookie-extraction path (dict(request.headers)) stays consistent and does not re-raise on the bytes downstream.

Why this encoding

latin-1 is the standard wire encoding for HTTP/1.1 header octets and is what httpx itself emits when given raw bytes — so latin-1-representable values (the SRE-610701 case is an embedded accented name/title) are transmitted exactly as intended with no transformation. Percent-encoding is reserved for the rare non-latin-1 value, where there is no lossless single-octet representation; it is universally ASCII-safe.
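
To make the losslessness argument concrete, the sketch below round-trips one value through each encoding using only the standard library. The helper names are hypothetical, and the percent-encoding variant (UTF-8 via `urllib.parse.quote`) is an assumption about the fallback rather than a statement of what the PR implements.

```python
from urllib.parse import quote, unquote


def latin1_round_trip(value: str) -> tuple[bytes, str]:
    """Encode to latin-1 (one octet per character) and decode back."""
    wire = value.encode("latin-1")
    return wire, wire.decode("latin-1")


def percent_round_trip(value: str) -> tuple[str, str]:
    """Percent-encode as ASCII-safe text and decode back."""
    escaped = quote(value, safe="")
    return escaped, unquote(escaped)


if __name__ == "__main__":
    wire, decoded = latin1_round_trip("José")
    print(len(wire), decoded == "José")

    escaped, restored = percent_round_trip("日本語")
    print(escaped.isascii(), restored == "日本語")
```

Note that the percent-encoded form is only recoverable if the receiver knows to unquote it, which is why it is kept as a fallback rather than applied to every value.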

Tests

Extended tests/core/features/test_httpx_client.py:

  • test_non_ascii_dynamic_header_does_not_crash_send (sync) — latin-1 value reaches the wire as latin-1 bytes (was the failing reproduction).
  • test_non_latin1_dynamic_header_is_percent_encoded — CJK value falls back to ASCII percent-encoding.
  • TestUiPathHttpxAsyncClientSend::test_non_ascii_dynamic_header_does_not_crash_send (async).

All three failed with the exact UnicodeEncodeError before the fix and pass after.

Validation (in worktree)

  • ruff check — clean
  • ruff format --check — clean
  • pyright (this repo uses pyright, not mypy) — 0 errors
  • pytest tests/core/features/test_httpx_client.py — 27 passed
  • broader pytest -k header — 93 passed
  • end-to-end through a real MockTransport (exercising the cookie read-back path that also crashed): both latin-1 and CJK values reach the wire with status 200.

Note for reviewer

Confirm scope vs PC-4341, which is a different non-ASCII path (tool names, not headers). This PR only touches the httpx client's dynamic-header injection and its test.

🤖 Generated with Claude Code

The LLM httpx wrapper crashed with UnicodeEncodeError while injecting
dynamic request headers with non-ASCII values into the outbound request.
`send()` did `request.headers.update(dynamic_headers)`; httpx then
re-encodes str values as ASCII in `_normalize_header_value` and raises,
faulting the job before any request leaves the process.

Pre-encode dynamic header values before injection: latin-1 bytes (the
HTTP/1.1 wire encoding, passed through by httpx verbatim) when
representable, else ASCII percent-encoding for values outside latin-1
(e.g. CJK/emoji). Also set the request Headers encoding to latin-1 so
httpx's own read-back/cookie-extraction path stays consistent and does
not crash on the bytes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>