
feat(providers): add claude_cli and codex_cli agent-CLI providers #52

Open: AbhiramDwivedi wants to merge 1 commit into NVIDIA:main from AbhiramDwivedi:pr/b-agent-cli-provider



AbhiramDwivedi commented Jun 14, 2026


Closes #57

What

Two providers that route Stage-2 LLM analysis through a locally-installed, already-authenticated agent CLI (claude, codex) instead of a metered HTTP API. No API key is needed; the CLI's own login session is used. Activated via SKILLSPECTOR_PROVIDER=claude_cli (or codex_cli).

How

  • The LLM analyzers obtain their model from get_chat_model() and use .invoke() / .with_structured_output(schema).invoke(). For CLI providers, get_chat_model() returns a minimal ChatOpenAI-compatible adapter backed by provider.complete(); structured output is produced by prompting for JSON, then Pydantic-validating (fail-closed). llm_analyzer_base is unchanged; HTTP providers are untouched.
  • All subprocess calls go through one hardened chokepoint (providers/_agent_cli.py), which fails closed on every error path:
      • shell=False; the untrusted prompt is passed via stdin only, never argv.
      • Capability stripping verified against the real CLIs (claude: --allowed-tools "" deny-by-default + --permission-mode dontAsk + --strict-mcp-config + --disable-slash-commands; codex: --sandbox read-only + --ephemeral); --dangerously-skip-permissions is never used.
      • Scrubbed environment, temp CWD, and a per-call timeout.
      • Streamed stdout with a hard size cap (the process is killed on overflow, so there is no unbounded buffering).
      • Model label validated against argument injection.
  • is_available() does a real local auth probe (claude auth status / codex login status) so a report's llm_available doesn't claim availability when the CLI is logged out.
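
The auth probe described in the last bullet could look roughly like the sketch below. It is illustrative only: the function name `cli_is_available` is hypothetical, and it assumes that `claude auth status` and `codex login status` exit non-zero when the CLI is logged out, which this PR description does not state explicitly.

```python
import shutil
import subprocess
from typing import Sequence

# Probe commands quoted from the PR description.
CLAUDE_PROBE = ("claude", "auth", "status")
CODEX_PROBE = ("codex", "login", "status")


def cli_is_available(probe_argv: Sequence[str], timeout: float = 10.0) -> bool:
    """Return True only if the binary exists and the probe exits with 0.

    Assumption: a logged-out CLI makes the probe exit non-zero.
    Every failure mode (missing binary, timeout, OS error) maps to False.
    """
    binary = shutil.which(probe_argv[0])
    if binary is None:
        return False  # CLI not installed: fail closed
    try:
        result = subprocess.run(
            [binary, *probe_argv[1:]],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            shell=False,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

A provider's is_available() could then return `cli_is_available(CLAUDE_PROBE)`, so llm_available reflects the login state rather than just the presence of the binary.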

Test

Unit tests for the subprocess invariants, the bounded reader (real subprocesses: normal / overflow-kill / timeout), the adapter + structured-output parsing, and provider selection. Opt-in integration tests skip when the CLI is absent/unauthed. Verified end-to-end: a real claude_cli scan returns a parsed report with LLM-enriched findings.
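
For readers unfamiliar with the bounded-reader pattern these tests exercise, here is a minimal sketch of one way to stream stdout under a hard cap with a timeout. The names `run_bounded` and `AgentCLIError` are hypothetical, and the actual helper in providers/_agent_cli.py may be structured differently.

```python
import subprocess
import threading
from typing import List, Sequence


class AgentCLIError(RuntimeError):
    """Raised on timeout, overflow, or non-zero exit (fail-closed)."""


def _feed_stdin(proc: subprocess.Popen, data: bytes) -> None:
    # Runs in a thread so a large prompt cannot deadlock against stdout reads.
    try:
        proc.stdin.write(data)
    except OSError:
        pass  # child exited or was killed before reading everything
    finally:
        try:
            proc.stdin.close()
        except OSError:
            pass


def run_bounded(argv: Sequence[str], prompt: str, *, max_bytes: int, timeout: float) -> bytes:
    proc = subprocess.Popen(
        list(argv),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        shell=False,
    )
    timed_out = threading.Event()

    def _kill_on_timeout() -> None:
        timed_out.set()
        proc.kill()

    timer = threading.Timer(timeout, _kill_on_timeout)
    writer = threading.Thread(target=_feed_stdin, args=(proc, prompt.encode("utf-8")), daemon=True)
    timer.start()
    writer.start()

    chunks: List[bytes] = []
    total = 0
    overflow = False
    try:
        while True:
            chunk = proc.stdout.read(8192)  # never hold more than one chunk past the cap
            if not chunk:
                break  # EOF: child finished or was killed by the timer
            total += len(chunk)
            if total > max_bytes:
                overflow = True
                proc.kill()  # stop the producer instead of buffering further
                break
            chunks.append(chunk)
    finally:
        proc.stdout.close()
        proc.wait()
        timer.cancel()
        writer.join()

    if timed_out.is_set():
        raise AgentCLIError("agent CLI timed out")
    if overflow:
        raise AgentCLIError("agent CLI output exceeded the size cap")
    if proc.returncode != 0:
        raise AgentCLIError(f"agent CLI exited with status {proc.returncode}")
    return b"".join(chunks)
```

The three outcomes map onto the test cases named above: normal completion returns the bytes, while overflow and timeout both kill the child and raise.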

🤖 Generated with Claude Code

Route Stage-2 LLM analysis through a locally-installed, already-authenticated
agent CLI (claude, codex) instead of a metered HTTP API. Activated via
SKILLSPECTOR_PROVIDER=claude_cli (or codex_cli); no API key is needed — the
CLI's own login session is used.

Transport seam
--------------
The LLM analyzers (meta_analyzer, semantic_*) obtain their model from
get_chat_model() and call .invoke() / .with_structured_output(schema).invoke()
on it; they never use chat_completion(). So the CLI transport is wired at
get_chat_model(), which for CLI providers returns AgentCLIChatModel — a minimal
ChatOpenAI-compatible adapter (invoke / ainvoke / with_structured_output) backed
by the provider's complete(). Structured output appends the JSON schema to the
prompt, then parses + Pydantic-validates the reply (fail-closed). The base class
llm_analyzer_base is unchanged. chat_completion() now routes through
get_chat_model() too, so there is a single dispatch point.
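
To make the transport seam concrete, the following is a dependency-free sketch of an adapter with the same shape (invoke plus with_structured_output). It is not the PR's AgentCLIChatModel: the class name `CLIChatModelSketch`, the `validate` callable, and the prompt wording are assumptions, and the real adapter validates with Pydantic rather than a plain callable.

```python
import json
from typing import Any, Callable, Generic, TypeVar

T = TypeVar("T")


class StructuredOutputError(ValueError):
    """Raised when the reply has no usable JSON or fails validation."""


def extract_json_object(text: str) -> Any:
    """Return the first decodable JSON object embedded in a free-text reply."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; try the next one
        return value
    raise StructuredOutputError("no JSON object found in reply")


class _StructuredRunner(Generic[T]):
    def __init__(self, complete: Callable[[str], str], schema: dict,
                 validate: Callable[[Any], T]) -> None:
        self._complete = complete
        self._schema = schema
        self._validate = validate

    def invoke(self, prompt: str) -> T:
        # Append the JSON schema to the prompt, then parse and validate the reply.
        full_prompt = (
            f"{prompt}\n\nReply with a single JSON object matching this schema:\n"
            f"{json.dumps(self._schema)}"
        )
        payload = extract_json_object(self._complete(full_prompt))
        try:
            return self._validate(payload)
        except Exception as exc:  # fail closed on any validation problem
            raise StructuredOutputError(f"reply failed validation: {exc}") from exc


class CLIChatModelSketch:
    """Hypothetical stand-in for the adapter, backed by a complete() callable."""

    def __init__(self, complete: Callable[[str], str]) -> None:
        self._complete = complete

    def invoke(self, prompt: str) -> str:
        return self._complete(prompt)

    def with_structured_output(self, schema: dict,
                               validate: Callable[[Any], T]) -> "_StructuredRunner[T]":
        return _StructuredRunner(self._complete, schema, validate)
```

With Pydantic v2, the two extra arguments would typically be `Model.model_json_schema()` and `Model.model_validate`, which is closer to the single-argument with_structured_output(schema) call the analyzers use.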

Hardened subprocess helper (providers/_agent_cli.py)
----------------------------------------------------
Single security chokepoint for both CLI providers:
- shell=False, argv list only; untrusted prompt delivered via stdin, never argv.
- Capability stripping, verified end-to-end against the real CLIs:
  claude: -p --output-format json --allowed-tools "" (deny-by-default allow-list)
  --permission-mode dontAsk --strict-mcp-config --disable-slash-commands.
  codex: exec --json --sandbox read-only --ephemeral --ignore-user-config
  --ignore-rules. --dangerously-skip-permissions is never used; --bare is not
  used (it disables keychain reads and breaks auth).
- Environment scrubbed of API/SSH/cloud creds; temp CWD; per-call timeout;
  input/output caps; fail-closed on missing binary / nonzero exit / timeout /
  unparseable output; model label validated against argument injection.
- The prompt is passed through unchanged (parity with the HTTP path); content
  hardening is the meta_analyzer's responsibility.
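
The invariants listed above can be illustrated with a short sketch. All function names here are hypothetical, the credential-matching rules are a guess at what "scrubbed" means, and the `--model` flag is an assumption about how the validated label reaches the CLI. The claude flags themselves are copied from the list above. For brevity this version buffers output with subprocess.run instead of streaming it under a cap.

```python
import os
import re
import subprocess
import tempfile
from typing import Dict, List, Mapping, Optional

# Assumed label grammar: must start alphanumeric, so it can never look like a flag.
_MODEL_LABEL = re.compile(r"[A-Za-z0-9][A-Za-z0-9._:-]{0,63}")
# Assumed heuristics for credential-bearing variable names.
_SECRET_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL")
_SECRET_PREFIXES = ("AWS_", "GOOGLE_", "AZURE_", "SSH_")


def validate_model_label(label: str) -> str:
    if not _MODEL_LABEL.fullmatch(label):
        raise ValueError(f"unsafe model label: {label!r}")
    return label


def scrub_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Drop variables whose names suggest API, SSH, or cloud credentials."""
    return {
        name: value
        for name, value in env.items()
        if not name.upper().startswith(_SECRET_PREFIXES)
        and not any(marker in name.upper() for marker in _SECRET_MARKERS)
    }


def build_claude_argv(binary: str, model: Optional[str] = None) -> List[str]:
    argv = [
        binary, "-p", "--output-format", "json",
        "--allowed-tools", "",  # empty allow-list: deny by default
        "--permission-mode", "dontAsk",
        "--strict-mcp-config", "--disable-slash-commands",
    ]
    if model is not None:
        argv += ["--model", validate_model_label(model)]  # flag name assumed
    return argv


def run_cli(argv: List[str], prompt: str, *, timeout: float = 120.0,
            env: Optional[Mapping[str, str]] = None) -> str:
    """Run the CLI with the prompt on stdin; raise on any failure."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            result = subprocess.run(
                argv,
                input=prompt,  # untrusted text goes to stdin, never argv
                text=True,
                capture_output=True,
                cwd=workdir,  # throwaway working directory
                env=scrub_env(os.environ if env is None else env),
                timeout=timeout,
                shell=False,
                check=False,
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            raise RuntimeError(f"agent CLI call failed: {exc}") from exc
    if result.returncode != 0:
        raise RuntimeError(f"agent CLI exited with status {result.returncode}")
    return result.stdout
```

Because the argv list is built from constants plus one validated label, a prompt or model name containing something like `--dangerously-skip-permissions` cannot be interpreted as a flag.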

Providers / wiring
------------------
- providers/claude_cli, providers/codex_cli: AgentCLICapable providers
  (is_available + complete) with bundled model_registry.yaml.
- providers/base.py: AgentCLICapable protocol + has_cli_capability helper.
- providers/__init__.py: registers claude_cli/codex_cli; get_active_provider.
- llm_utils.py: get_chat_model returns the CLI adapter for CLI providers;
  is_llm_available delegates to provider.is_available(). HTTP path unchanged.
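
As an illustration of the wiring above, a capability protocol and dispatch helper might be shaped like this. The method signatures, the stub classes, and `get_chat_model_sketch` are assumptions made for the example; only the names AgentCLICapable, has_cli_capability, is_available and complete come from this PR.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class AgentCLICapable(Protocol):
    """Structural type: any provider exposing these two methods qualifies."""

    def is_available(self) -> bool: ...

    def complete(self, prompt: str) -> str: ...  # signature assumed


def has_cli_capability(provider: object) -> bool:
    # runtime_checkable isinstance() only checks that the methods exist.
    return isinstance(provider, AgentCLICapable)


class CLIAdapterStub:
    """Placeholder for the CLI-backed chat model adapter."""

    def __init__(self, provider: AgentCLICapable) -> None:
        self.provider = provider

    def invoke(self, prompt: str) -> str:
        return self.provider.complete(prompt)


class HTTPModelStub:
    """Placeholder for the unchanged HTTP chat model path."""

    def __init__(self, provider: object) -> None:
        self.provider = provider


def get_chat_model_sketch(provider: object):
    """Single dispatch point: CLI providers get the adapter, others the HTTP model."""
    if has_cli_capability(provider):
        return CLIAdapterStub(provider)
    return HTTPModelStub(provider)
```

Using a structural protocol means the HTTP providers need no changes: they simply lack complete() and fall through to the existing path.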

Tests
-----
- tests/unit/test_agent_cli.py: subprocess security invariants (shell=False,
  stdin-only, allow-list deny-by-default, no --bare / --dangerously-skip-
  permissions, scrubbed env, fail-closed, injection safety).
- tests/unit/test_llm_utils.py: get_chat_model CLI dispatch, adapter invoke,
  structured-output parse/validate + fail-closed, JSON extraction.
- tests/unit/test_providers.py: CLI provider selection + metadata.
- tests/integration/test_claude_cli_provider.py: opt-in, skipped when claude
  is absent/unauthed.

Verified end-to-end: a real SKILLSPECTOR_PROVIDER=claude_cli scan returns a
parsed report with LLM-enriched findings.

Docs: README + DEVELOPMENT provider/env tables updated for claude_cli/codex_cli.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>

Development

Successfully merging this pull request may close these issues.

feature: support local agent CLIs (claude/codex) as an LLM provider without an API key
