fix: always pass an explicit HTTP agent to avoid Node's 5s idle timeout (cut 1.1.110) #1344
Merged
Martin Torp (mtorp) merged 3 commits on May 29, 2026
…timeout Node >=19's global HTTP/HTTPS agent enables keepAlive with a 5s socket timeout, which Node applies as a per-socket inactivity timeout. setupSdk only supplied an explicit agent for the proxy and SSL_CERT_FILE cases, so the common path inherited the global agent's 5s timeout even when SOCKET_CLI_API_TIMEOUT is unset. This caused upload-manifest-files to fail intermittently: the SDK streams the multipart body with Transfer-Encoding: chunked, and when the server takes >5s to parse auth/multipart before sending any response byte, the socket goes idle, Node fires the 5s timeout, and the SDK destroys the request, so the client disconnects before receiving any response. Always pass a fresh Agent (no timeout) so a request is bounded only by an explicit SOCKET_CLI_API_TIMEOUT or until interrupted. Reproduced locally against a slow mock server with no load balancer in the path.
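The difference between the inherited global agent and a fresh one can be inspected directly. The sketch below is illustrative only and is not the CLI's `setupSdk`: `explicitAgentFor` and `configuredTimeout` are hypothetical helpers, the URL is a placeholder, and reading `options` off an agent relies on a Node implementation detail rather than a documented API.

```typescript
import { Agent as HttpAgent, globalAgent } from "node:http";
import { Agent as HttpsAgent } from "node:https";

// Hypothetical helper: choose a fresh agent by protocol, with no timeout option.
function explicitAgentFor(baseUrl: string): HttpAgent {
  return new URL(baseUrl).protocol === "https:"
    ? new HttpsAgent()
    : new HttpAgent();
}

// Hypothetical helper: read the timeout an agent was constructed with.
// `options` is an implementation detail of Node's Agent, hence the cast.
function configuredTimeout(agent: HttpAgent): number | undefined {
  return (agent as unknown as { options?: { timeout?: number } }).options
    ?.timeout;
}

console.log("global agent timeout:", configuredTimeout(globalAgent));
console.log(
  "fresh agent timeout:",
  configuredTimeout(explicitAgentFor("https://api.example.test")),
);
```

Going by the commit message above, the first line should report 5000 on Node >=19, while the fresh agent should report `undefined`.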
apiFetch's https.request used the default (global) agent when no CA cert was configured, inheriting Node >=19's keepAlive 5s socket timeout — the same issue just fixed for the SDK. getHttpsAgent now always returns an explicit HttpsAgent (no timeout), covering queryApiSafe*/sendApiRequest and the direct apiFetch download paths (streaming full-scan responses, binary and tarball downloads). Bumps the version to 1.1.110 and adds the changelog entry.
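As a rough illustration of "bounded only by an explicit timeout", the sketch below builds `https.request` options that always carry an explicit agent and only carry a `timeout` when one was configured. It is not the CLI's `apiFetch`: `parseTimeoutMs` and `buildRequestOptions` are hypothetical helpers, and treating `SOCKET_CLI_API_TIMEOUT` as a positive number of milliseconds is an assumption made for the example.

```typescript
import { Agent as HttpsAgent } from "node:https";
import type { RequestOptions } from "node:https";

// Assumption: the configured value is a positive number of milliseconds.
function parseTimeoutMs(raw: string | undefined): number | undefined {
  const ms = Number(raw);
  return raw !== undefined && Number.isFinite(ms) && ms > 0 ? ms : undefined;
}

// Hypothetical helper: always attach the explicit agent; attach a timeout
// only when the caller supplied one.
function buildRequestOptions(
  agent: HttpsAgent,
  timeoutMs?: number,
): RequestOptions {
  const options: RequestOptions = { method: "GET", agent };
  if (timeoutMs !== undefined) {
    options.timeout = timeoutMs;
  }
  return options;
}

const agent = new HttpsAgent(); // constructed without a `timeout` option
const options = buildRequestOptions(
  agent,
  parseTimeoutMs(process.env["SOCKET_CLI_API_TIMEOUT"]),
);
console.log("timeout configured:", options.timeout !== undefined);
```

Note that Node's request-level `timeout` only emits a `'timeout'` event; the caller still has to destroy the request in response to it.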
getHttpsAgent now always creates an agent on first call, so its return type is HttpsAgent (was HttpsAgent | undefined) and the _httpsRequestFetch agent parameter drops | undefined. The cached _httpsAgent keeps | undefined since it is the lazy-init sentinel (undefined only before the first call). The _httpsAgentResolved flag is removed: a set _httpsAgent is itself the "resolved" signal. Pure polish from review; no behavior change.
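The lazy-init pattern described here can be sketched as follows. This is a simplified illustration under assumed names (`cachedAgent`, `getAgent`), not the actual `getHttpsAgent`; in particular it omits the CA-certificate handling the real function performs.

```typescript
import { Agent as HttpsAgent } from "node:https";

// `undefined` is the lazy-init sentinel: it only holds before the first call,
// so no separate "resolved" flag is needed.
let cachedAgent: HttpsAgent | undefined;

// The return type is non-optional because an agent is always created.
function getAgent(): HttpsAgent {
  if (cachedAgent === undefined) {
    cachedAgent = new HttpsAgent();
  }
  return cachedAgent;
}

console.log("same instance on repeat calls:", getAgent() === getAgent());
```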
29f49ab to 1f13e3b
Jeppe Fredsgaard Blaabjerg (jfblaa) approved these changes on May 29, 2026
Jeppe Fredsgaard Blaabjerg (jfblaa), Contributor, left a comment:
LGTM 👍
Problem
`upload-manifest-files` (used by `socket scan reach` and `socket fix`) intermittently fails for some enterprise customers: the request is torn down at almost exactly 5 seconds with no response, which the API load balancer logs as `client_disconnected_before_any_response`. A server-side heartbeat shipped earlier only partially helped.

Root cause
Node ≥19's global HTTP/HTTPS agent ships `{ keepAlive: true, timeout: 5000 }`. Node applies that `timeout` as a per-socket inactivity timeout. The CLI made requests without an explicit `agent` on the common path, so they inherited the global agent and its 5s timeout, even when `SOCKET_CLI_API_TIMEOUT` is unset (the 5s comes entirely from Node, not from the CLI/SDK).
For the upload, the multipart body is sent with `Transfer-Encoding: chunked`. When the server takes >5s to handle auth + multipart parsing before sending any response byte, the socket sits idle, Node fires the 5s `'timeout'`, and the request is destroyed: the client disconnects before any response is received.
This explains every observed signal: ~5.0s latency, Node-CLI-only (Python `requests` sends `Content-Length`, not chunked, and has no analogous default idle timeout), and `before_any_response` (the timeout fires during the pre-handler stall, before the heartbeat can write a byte).

Reproduced locally
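A local repro of this kind can be sketched with Node's built-in `http` module. The code below is a hedged illustration, not the script used for this PR: `startSlowServer`, `upload`, and `demo` are hypothetical names, and the mock simply drains the request body and then stalls before writing any response byte.

```typescript
import { Agent, createServer, request } from "node:http";
import type { Server } from "node:http";
import type { AddressInfo } from "node:net";

// Hypothetical mock server: drain the upload, then stall before responding.
function startSlowServer(stallMs: number): Promise<Server> {
  return new Promise((resolve) => {
    const server = createServer((req, res) => {
      req.resume(); // discard the request body
      req.on("end", () => {
        setTimeout(() => {
          res.writeHead(200, { "content-type": "text/plain" });
          res.end("ok");
        }, stallMs);
      });
    });
    server.listen(0, "127.0.0.1", () => resolve(server));
  });
}

// Hypothetical client: resolves with the status code, rejects on teardown.
function upload(port: number, agent?: Agent): Promise<number> {
  return new Promise((resolve, reject) => {
    const req = request(
      { host: "127.0.0.1", port, method: "POST", path: "/upload", agent },
      (res) => {
        res.resume();
        res.on("end", () => resolve(res.statusCode ?? 0));
      },
    );
    // Mirror the described behavior: destroy the request on socket timeout.
    req.on("timeout", () => req.destroy(new Error("socket idle timeout")));
    req.on("error", reject);
    req.write("part-1"); // no Content-Length header, so Node sends chunked
    req.end("part-2");
  });
}

// Not invoked automatically; call demo() to run the stall locally.
async function demo(stallMs = 8000): Promise<void> {
  const server = await startSlowServer(stallMs);
  const { port } = server.address() as AddressInfo;
  try {
    console.log("explicit agent status:", await upload(port, new Agent()));
  } finally {
    server.close();
  }
}
```

Calling `demo()` uploads through a fresh `Agent`. Omitting the agent argument to `upload` falls back to Node's global agent, which is the configuration reported here as being torn down at 5s on Node ≥19.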
Driving the real SDK against a slow mock HTTP server with no load balancer in the path reproduced the teardown at exactly 5.0s. A trace of `Socket.prototype.setTimeout` showed `setTimeout(5000)` originating from Node's `Agent.createConnection` (the global agent), with the CLI/SDK passing `timeout=undefined`. An explicit agent with no timeout completes an 8s-stall upload with `200 OK`.

Fix
Both of the CLI's HTTP stacks now always pass an explicit `Agent` (which carries no timeout), so requests are bounded only by an explicit `SOCKET_CLI_API_TIMEOUT` or until interrupted, restoring the CLI's documented "no timeout unless configured" intent:
- `setupSdk` (`src/utils/sdk.mts`) supplies an explicit `Agent` (by protocol) on the no-proxy/no-CA path. Because the `SocketSdk` builds its request options once, every SDK call is covered (`uploadManifestFiles`, `getOrganizations`, `createOrgFullScan`, `getOrgFullScan`, `searchDependencies`, `batchPackageStream`, …), not just the upload.
- `apiFetch` path: `getHttpsAgent` (`src/utils/api.mts`) now always returns an explicit `HttpsAgent` instead of `undefined` when no CA cert is configured. This covers `queryApiSafeText`/`queryApiSafeJson`, `sendApiRequest`, and the direct `apiFetch` download paths (streaming full-scan responses, binary/tarball downloads), which had the identical bug and contradicted the file's own "no body timeout" comment.
Proxy and `SSL_CERT_FILE` paths are unchanged (both already used explicit agents with no such timeout).

Compatibility
- `new HttpsAgent()` defaults to `keepAlive: false` (pre-Node-19 behavior); fine for a short-lived CLI.
- `SOCKET_CLI_API_TIMEOUT`.

Tests
- `sdk.test.mts`: updated the obsolete "no agent by default" assertion and added a regression test asserting `setupSdk` always passes an explicit agent with no `timeout`.
- `api.test.mts`: updated the obsolete "no agent / `agent: undefined`" assertion and now asserts `apiFetch` uses an explicit agent with no `timeout`.
- `test:unit src/utils/api.test.mts src/utils/sdk.test.mts` → 36 passed; `check:tsc` and `check:lint` green.

Release
- Bumps `package.json` to 1.1.110.
- Adds the `## [1.1.110] - 2026-05-29` Fixed entry to `CHANGELOG.md`.

Durable follow-up (separate, not this PR)
The same Node global-agent default would bite any other consumer of `@socketsecurity/sdk`. Worth fixing the agent default in the SDK itself so non-CLI consumers are covered too.

Note
Medium Risk
Changes the default HTTP agent for all CLI API traffic (SDK and raw fetch), which affects connection lifecycle globally; behavior is intentional and covered by regression tests, but any subtle networking edge cases would surface across many commands.
Overview
Fixes intermittent ~5 second client disconnects on manifest uploads (`socket scan reach`, `socket fix`) and other slow or streaming API calls by ensuring every outbound HTTP path uses an explicit Node `Agent`, not Node ≥19's default global agent (which applies a 5s per-socket inactivity timeout even when `SOCKET_CLI_API_TIMEOUT` is unset).
- `setupSdk` (`sdk.mts`) now always passes an `agent` on the default no-proxy path (`HttpAgent`/`HttpsAgent` by base URL).
- `apiFetch` (`api.mts`) always gets an explicit `HttpsAgent` from `getHttpsAgent()` instead of `undefined`, so direct `https.request` traffic no longer inherits the global timeout.
Unit tests were updated and regression cases added to assert an explicit agent with no `timeout` option. Release 1.1.110 with changelog entry.
Reviewed by Cursor Bugbot for commit 1f13e3b.