Configurable CLI exit behavior for API errors #211

Merged: lelia merged 8 commits into main from lelia/cli-api-error-exit-handling on May 29, 2026

lelia commented May 29, 2026

Summary

Adds --exit-code-on-api-error <N> so CI pipelines can give API / infrastructure failures (timeouts, network errors, unexpected exceptions) a distinct exit code from blocking security findings (exit 1). Primary motivator: Buildkite CI soft_fail setups that want to tolerate a Socket outage without ignoring real findings.

Deliberately non-breaking. The CLI already exited 3 on these errors; this just makes that code configurable. Default behavior is unchanged — the exit code only moves when you pass the flag, and --disable-blocking keeps its existing precedence.

The breaking exit-code change (making infrastructure errors exit non-zero even under --disable-blocking, so that outages stop being silently swallowed) is intentionally deferred to a future 3.0 release. This PR is the non-breaking 2.3.0 minor release.

What's included

  • --exit-code-on-api-error <int> (default 3). Remap to a Buildkite soft_fail
    code, or 0 to swallow infra errors.
  • Buildkite-aware error logging: when BUILDKITE=true, infrastructure errors emit
    log section markers (^^^ +++ / --- :warning:) so the section auto-expands in the
    BK UI, plus a soft_fail hint. No effect on other CI platforms.
  • Commit-message auto-truncation at 200 chars — prevents HTTP 413s from oversized
    URL query parameters (AI-generated messages, $BUILDKITE_MESSAGE).
  • Fix: --timeout is now honored end-to-end (it was applied to the local client but
    not the SDK instance used for the diff comparison, which defaulted to 1200s).
  • Fix: --exclude-license-details now propagates to the full-scan diff request.
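As a rough illustration of the new flag's shape, here is a minimal argparse sketch. Whether the real CLI uses argparse, and the exact help text, are assumptions; only the flag names, the integer type, and the default of 3 come from this PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Only the two flags discussed in this PR; the real CLI defines many more.
    parser = argparse.ArgumentParser(prog="socketcli")
    parser.add_argument(
        "--exit-code-on-api-error",
        type=int,
        default=3,  # unchanged default: infrastructure errors already exited 3
        metavar="N",
        help="Exit code for API/infrastructure errors. "
             "Ignored when --disable-blocking is set, which always exits 0.",
    )
    parser.add_argument(
        "--disable-blocking",
        action="store_true",
        help="Always exit 0, regardless of findings or errors.",
    )
    return parser


args = build_parser().parse_args(["--exit-code-on-api-error", "100"])
print(args.exit_code_on_api_error, args.disable_blocking)  # prints: 100 False
```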

⚠️ CLI flag usage

--disable-blocking forces exit 0 for all outcomes and therefore overrides
--exit-code-on-api-error. For the common "tolerate outages, still block on findings"
goal, use the new flag without --disable-blocking:

steps:
  - label: ":lock: Socket Security Scan"
    command: "socketcli --exit-code-on-api-error 100 ..."   # NOT --disable-blocking
    soft_fail:
      - exit_status: 100

Combining the two would exit 0 on findings and outages — the soft_fail rule would
never match and real findings would stop blocking. The README "How these options
interact" section spells this out, and a regression test locks the precedence in.
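To make the interaction concrete, the following sketch models how a Buildkite step with soft_fail on exit status 100 would treat each scan outcome. The function names are hypothetical and the model is simplified: it ignores SIGINT and the diff APIFailure call site, which this PR leaves on exit 1.

```python
from typing import Iterable


def cli_exit_code(outcome: str, disable_blocking: bool = False,
                  api_error_code: int = 3) -> int:
    """Simplified model of the documented precedence (hypothetical helper)."""
    if disable_blocking:
        # --disable-blocking wins over everything, including the new flag.
        return 0
    return {"clean": 0, "findings": 1, "api_error": api_error_code}[outcome]


def buildkite_step_result(exit_code: int, soft_fail_codes: Iterable[int] = ()) -> str:
    """How a step with `soft_fail: [{exit_status: N}]` treats an exit code."""
    if exit_code == 0:
        return "passed"
    if exit_code in set(soft_fail_codes):
        return "soft_failed"
    return "failed"


for outcome in ("clean", "findings", "api_error"):
    recommended = buildkite_step_result(
        cli_exit_code(outcome, api_error_code=100), [100])
    combined = buildkite_step_result(
        cli_exit_code(outcome, disable_blocking=True, api_error_code=100), [100])
    print(f"{outcome:10} recommended={recommended:12} combined={combined}")
```

With the recommended setup, findings fail the build and outages soft-fail. With both flags combined, every outcome passes, which is the silent regression the README warning is meant to prevent.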

Exit codes

Code  Meaning
0     Clean scan (or --disable-blocking)
1     Blocking security finding(s)
2     Interrupted (SIGINT)
3     Infrastructure / API error; remappable via --exit-code-on-api-error
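The exit codes above could be produced by a top-level handler along these lines. This is a sketch only: `run_cli` and the `scan` callable are hypothetical, and leaving the SIGINT code untouched by --disable-blocking is an assumption (the PR only states that KeyboardInterrupt still exits 2).

```python
import sys
from typing import Callable


def run_cli(scan: Callable[[], bool], exit_code_on_api_error: int = 3,
            disable_blocking: bool = False) -> int:
    """Run `scan` (returns True when blocking findings exist) and pick an exit code."""
    try:
        has_blocking_findings = scan()
    except KeyboardInterrupt:
        return 2  # interrupted; assumed not to be affected by either flag
    except Exception as exc:
        # Timeouts, network errors, and other unexpected exceptions land here.
        print(f"infrastructure error: {exc}", file=sys.stderr)
        return 0 if disable_blocking else exit_code_on_api_error
    if has_blocking_findings and not disable_blocking:
        return 1
    return 0


def failing_scan() -> bool:
    raise TimeoutError("request timed out")


print(run_cli(failing_scan, exit_code_on_api_error=100))
```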

Test plan

  • Full unit + core suite — 247 passed, 2 pre-existing skips
  • --exit-code-on-api-error default 3 / custom / 0
  • --disable-blocking overrides the flag (→ 0) — regression-locked
  • Buildkite CI markers gated on BUILDKITE=true
  • commit-message truncation (passthrough / truncate / quote-then-truncate)
  • CI green on the PR

Fixes: CE-198
Refs: CE-196

lelia added 6 commits May 29, 2026 16:42
… logging

Adds a configurable exit code for API/infrastructure failures so CI pipelines
can distinguish them from blocking security findings (exit 1), without changing
any default behavior.

- New CliConfig field exit_code_on_api_error (default 3) + --exit-code-on-api-error
  flag. The CLI already exited 3 on unexpected errors; this just makes that code
  configurable (e.g. remap to a Buildkite soft_fail code, or 0 to swallow).
- New _emit_infrastructure_error helper + IS_BUILDKITE gate: emits Buildkite log
  section markers (^^^ +++ / --- ⚠️) and a soft_fail hint when running in
  Buildkite; plain log.error elsewhere so markers don't leak as literal text.
- Wire the top-level generic-exception handler in cli() through the helper and
  the configurable code.
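A minimal sketch of the Buildkite gating described above, not the actual _emit_infrastructure_error implementation. The group title, the hint wording, and the marker order are assumptions; the order follows Buildkite's convention that `^^^ +++` expands the group opened by the preceding `---` header.

```python
import logging
import os
import sys
from typing import List, Mapping, Optional

log = logging.getLogger("socketcli.sketch")


def is_buildkite(env: Optional[Mapping[str, str]] = None) -> bool:
    """Buildkite sets BUILDKITE=true in the job environment."""
    env = os.environ if env is None else env
    return env.get("BUILDKITE") == "true"


def buildkite_error_lines(message: str, exit_code: int) -> List[str]:
    return [
        "--- :warning: Socket infrastructure error",  # opens a log group
        "^^^ +++",                                    # expands that group
        message,
        f"Hint: add exit_status {exit_code} to this step's soft_fail list "
        "to tolerate outages.",
    ]


def emit_infrastructure_error(message: str, exit_code: int,
                              env: Optional[Mapping[str, str]] = None) -> None:
    if is_buildkite(env):
        # Markers must start the line, so bypass any logging prefix.
        for line in buildkite_error_lines(message, exit_code):
            print(line, file=sys.stderr)
    else:
        # Elsewhere, log plainly so the markers never leak as literal text.
        log.error(message)


emit_infrastructure_error("Socket API timed out", 100, env={"BUILDKITE": "true"})
```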

Deliberately NON-breaking for 2.3.x:
- --disable-blocking STILL forces exit 0 for all outcomes and takes precedence
  over --exit-code-on-api-error (documented in the flag help so the two aren't
  combined by mistake).
- Default exit codes are unchanged; the exit code only changes when the user
  explicitly passes the flag.

The breaking variant (infra errors bypassing --disable-blocking, distinct
RequestTimeoutExceeded handling, exit 1 -> 3 for diff API failures) is
intentionally deferred to a future 3.0 release.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

The --commit-message flag passes its value directly into the API request URL
as a query parameter with no length limit. AI-generated commit messages and
the common CI pattern of concatenating $BUILDKITE_BUILD_NUMBER + $BUILDKITE_MESSAGE
can easily exceed URL length limits, producing HTTP 413 errors.

The 413 originates from an infrastructure-layer URL length limit (nginx/Cloudflare),
not application-level validation -- confirmed via inspection of the Socket API route
handler, which has no constraint on commit_message (unlike committers, which enforces
<= 200 chars and returns a clean 400).

200 chars was chosen as a conservative defensive ceiling, given that URL encoding
can inflate the raw character count 2-3x. No customer should ever want a
2000-character commit message in their scan metadata.
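The truncation and the encoding growth can be sketched as follows. The helper name is hypothetical, and stripping one pair of surrounding quotes before truncating is an assumption inferred from the "quote-strip-before-truncate" test case; the real implementation may differ.

```python
from urllib.parse import quote

MAX_COMMIT_MESSAGE_LENGTH = 200  # ceiling chosen in this PR


def truncate_commit_message(message: str,
                            limit: int = MAX_COMMIT_MESSAGE_LENGTH) -> str:
    """Strip one pair of surrounding quotes, then cap the length (sketch)."""
    if len(message) >= 2 and message[0] == message[-1] and message[0] in "\"'":
        message = message[1:-1]
    return message[:limit]


# Percent-encoding growth: every space becomes "%20", so 200 raw
# characters can occupy 600 characters of the query string.
raw = " " * 200
print(len(raw), len(quote(raw)))  # prints: 200 600

print(len(truncate_commit_message("x" * 500)))  # prints: 200
```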

A backend-side validation (returning 400 instead of 413) is filed as a follow-on
for the depscan API team.

Motivated by customer incidents (Plaid).

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

The full-scan diff comparison ignored --exclude-license-details: the flag was
applied to full-scan params and report URLs but never forwarded to the
fullscans.stream_diff request, so diff comparisons always fetched license
details regardless of the flag.

Thread it through get_added_and_removed_packages -> stream_diff via a new
include_license_details param (defaulting True to preserve current behavior).
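The threading can be pictured with a stand-in SDK object. Everything here is illustrative: `FakeFullscans`, the argument names other than include_license_details, and the return shape are assumptions, not the real SDK signature.

```python
from typing import Any, Dict, List, Tuple


class FakeFullscans:
    """Stand-in for the SDK's fullscans API; records the arguments it receives."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def stream_diff(self, org_slug: str, before: str, after: str,
                    include_license_details: bool = True) -> Dict[str, list]:
        self.calls.append({"include_license_details": include_license_details})
        return {"added": [], "removed": []}


def get_added_and_removed_packages(fullscans: FakeFullscans, org_slug: str,
                                   before: str, after: str,
                                   include_license_details: bool = True
                                   ) -> Tuple[list, list]:
    # Forward the setting instead of relying on the SDK default.
    diff = fullscans.stream_diff(org_slug, before, after,
                                 include_license_details=include_license_details)
    return diff["added"], diff["removed"]


exclude_license_details = True  # value of the CLI flag
sdk = FakeFullscans()
get_added_and_removed_packages(
    sdk, "my-org", "scan-a", "scan-b",
    include_license_details=not exclude_license_details,
)
print(sdk.calls)  # prints: [{'include_license_details': False}]
```

Defaulting the new parameter to True keeps existing callers on the previous behavior, which matches the non-breaking intent of the change.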

Non-breaking: the APIFailure handling at this call site is deliberately left
as-is (exit 1, --disable-blocking -> 0). Re-routing diff APIFailures through
the top-level exit-3 path is part of the 3.0 exit-code change, not this one.

Originally from the unreleased PR #195 branch; the timeout-propagation half
already landed in the preceding commit.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

…tting

tests/unit/test_cli_config.py
- exit_code_on_api_error default 3 / custom / zero
- commit-message truncation: passthrough under 200, truncate over 200,
  quote-strip-before-truncate

tests/unit/test_socketcli.py
- unexpected error exits 3 by default
- --exit-code-on-api-error 100 remaps the failure exit code
- --disable-blocking OVERRIDES --exit-code-on-api-error (-> 0): locks in the
  documented precedence so the soft_fail guidance can't silently regress
- KeyboardInterrupt still exits 2
- _emit_infrastructure_error: BK markers + soft_fail hint only when
  IS_BUILDKITE; traceback gated on include_traceback

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

Minor bump for the new --exit-code-on-api-error flag and the supporting
non-breaking improvements (commit-message truncation, Buildkite-aware infra
error logging, --timeout / --exclude-license-details fixes).

This release is intentionally NON-breaking: default exit codes are unchanged,
the exit code only shifts when --exit-code-on-api-error is explicitly passed,
and --disable-blocking keeps its existing precedence. The breaking exit-code
behavior change (infra errors exiting non-zero even under --disable-blocking)
is deferred to a future 3.0.

CHANGELOG + README document the flag AND its interaction with --disable-blocking
(which overrides it) to reduce user error in the Buildkite soft_fail setup.

Version refs synced across pyproject.toml, socketsecurity/__init__.py, and
uv.lock (per the version-incrementation CI check).

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

lelia requested a review from a team as a code owner May 29, 2026 20:56

github-actions Bot commented May 29, 2026

🚀 Preview package published!

Install with:

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple socketsecurity==2.3.0.dev10

Docker image: socketdev/cli:pr-211

Resolve version-bump conflicts (pyproject.toml, socketsecurity/__init__.py,
uv.lock) in favor of 2.3.0, which supersedes main's 2.2.92 release (#208).
Auto-merged #208's alert-title fallback changes in core/__init__.py.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

…xit-handling

Resolve conflicts:
- pyproject.toml / __init__.py / uv.lock: keep 2.3.0 (supersedes main's 2.2.93)
- CHANGELOG.md: keep both — 2.3.0 on top, then 2.2.93/2.2.92/2.2.91 from main
Dependency bumps from #207 (idna 3.15, urllib3, etc.) carried through;
uv lock --check passes.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

lelia changed the title from "Configure exit behavior for API errors" to "Configurable CLI exit behavior for API errors" May 29, 2026
lelia merged commit cdd3bf6 into main May 29, 2026
22 checks passed