feat(trace): audit-trail observability — events, query, stats, OCSF export, replay, bounded memory (#175, #176, #177, #179, #182, #213)#233

Open
dgenio wants to merge 5 commits into main from claude/issue-triage-grouping-k70jmg

@dgenio dgenio commented Jun 18, 2026

Owner

What changed

One cohesive change set across the TraceStore / audit-trail observability area (everything anchored on TraceStore, ActionTrace, and the kernel recording path), closing the recommended triage group.

  • models.py — ActionTrace gains additive event_type (invoke/expand/deny) and reason_code (defaults preserve the original invoke meaning).
  • trace.py / new trace_query.py — TraceQuery + pure query_traces() (filter by principal, capability, event type, outcome, reason code, since-inclusive/until-exclusive window; deterministic (invoked_at, action_id) order + pagination). TraceStore is now bounded (max_entries, oldest-first eviction, loud first eviction, evicted_count).
  • stores/_protocols.py, sqlite.py, jsonl.py, memory.py, _trace_codec.py — query() added to TraceStoreProtocol and all trace backends; revocation backends track each token's expires_at and gain sweep_expired() (which never un-revokes a live token). Breaking: RevocationStoreProtocol.track() now takes expires_at.
  • tokens.py — threads expires_at into track; adds HMACTokenProvider.sweep_revocations().
  • new stats.py — KernelStats collector + immutable StatsSnapshot; wired at kernel choke points (grant, invoke, fallback, firewall warnings, downgrade, handle store/expand). Exposed as Kernel.stats / Kernel.reset_stats().
  • new ocsf.py — trace_to_ocsf() / traces_to_ocsf() map any record to OCSF API Activity (6003) events, AOS-enriched; pure, dependency-free.
  • new replay.py — DecisionRecord, record_decision(), replay() -> DecisionDiff (allow→deny / deny→allow / reason-code flips; rate-limit flips surfaced separately).
  • kernel/__init__.py + new kernel/_audit.py — record deny traces on PolicyDenied and expand traces on Kernel.expand(); Kernel.query_traces().
  • handles.py — expansion Frames now carry Provenance.principal_id.
  • Docs/examples — architecture.md, security.md, integrations.md (OCSF mapping table), capabilities.md (replay), trace_export.md, CHANGELOG.md; runnable offline examples/ocsf_export_demo.py and examples/trace_replay_demo.py (added to make example and CI).
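
To make the replay bullet above concrete, here is a minimal sketch of how a decision diff could be computed. It is not the code in replay.py: `RecordedDecision`, `ReplayDiff`, `replay_decisions`, and `stricter_policy` are hypothetical names chosen for illustration, and the policy is assumed to be a plain function of principal and capability. Rate-limit flips, which the PR surfaces separately, are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


# Hypothetical shapes; the real DecisionRecord / DecisionDiff may differ.
@dataclass(frozen=True)
class RecordedDecision:
    principal_id: str
    capability_id: str
    allowed: bool
    reason_code: str


@dataclass
class ReplayDiff:
    allow_to_deny: list[RecordedDecision] = field(default_factory=list)
    deny_to_allow: list[RecordedDecision] = field(default_factory=list)
    reason_flips: list[RecordedDecision] = field(default_factory=list)


# Assumed policy shape: (principal_id, capability_id) -> (allowed, reason_code).
Policy = Callable[[str, str], tuple[bool, str]]


def replay_decisions(records: Iterable[RecordedDecision], policy: Policy) -> ReplayDiff:
    """Re-evaluate recorded decisions under a candidate policy and bucket the changes."""
    diff = ReplayDiff()
    for rec in records:
        allowed, reason = policy(rec.principal_id, rec.capability_id)
        if rec.allowed and not allowed:
            diff.allow_to_deny.append(rec)
        elif not rec.allowed and allowed:
            diff.deny_to_allow.append(rec)
        elif rec.reason_code != reason:
            # Same outcome, different stated reason.
            diff.reason_flips.append(rec)
    return diff


def stricter_policy(principal_id: str, capability_id: str) -> tuple[bool, str]:
    if capability_id == "billing.refund":
        return (False, "capability_blocked")
    return (True, "ok")


history = [
    RecordedDecision("alice", "billing.refund", True, "ok"),
    RecordedDecision("bob", "billing.read", True, "ok"),
]
diff = replay_decisions(history, stricter_policy)
print([r.principal_id for r in diff.allow_to_deny])  # ['alice']
```

Bucketing by direction keeps the security-relevant case (a previously denied request becoming allowed) visible on its own instead of being buried in a flat list of differences.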

Closes #175, #176, #177, #179, #182, #213.

Why

These six issues share one code area, data model, and implementation path — what/how-much the audit subsystem records (#175, #182), how it is read (#177, #179), and how it is exported/replayed (#176, #213) — so they are cleanest as one change. Developed in Mode B per the requester: new latitude allowed, retro-compat not required (hence the track() signature change), one combined PR.

How verified

make ci (ruff format --check → ruff check → mypy src/ → pytest -q --cov → all examples):

  • ruff format --check — 107 files already formatted
  • ruff check src/ tests/ examples/ — All checks passed
  • mypy src/ — Success: no issues found in 57 source files
  • pytest -q — 715 passed, 1 skipped (new suites: test_trace_query.py, test_stats.py, test_ocsf.py, test_replay.py; extended test_trace.py, test_tokens.py, test_handles.py, test_kernel.py, test_stores_sqlite.py)
  • make ci exit code 0; both new examples run clean offline.

Tradeoffs / risks

  • Breaking change (intentional, Mode B): RevocationStoreProtocol.track() adds an expires_at parameter and the protocol gains sweep_expired() — custom revocation backends must update. In-tree backends and tests are updated.
  • OCSF schema validation is structural (field/type assertions + golden file) and offline, not against a vendored AOS schema (AOS is young; the mapping is versioned and isolated in weaver_kernel.ocsf).
  • Replay fidelity validates policy structure, not argument-dependent rules (trace args are redacted); rate-limit-dependent flips are surfaced separately because they are replay-order-sensitive.
  • A token revoked after its expiry entry was already swept leaves a benign lingering _revoked entry (the token is expired and fails the verifier's expiry check regardless); normal lifecycle (revoke while live → sweep after expiry) is fully bounded.
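
The revocation tradeoff in the last bullet is easier to see in code. The following is a minimal sketch under stated assumptions, not the in-tree memory backend: `SketchRevocationStore` and its attribute names are hypothetical, and only the `track(expires_at)` / `sweep_expired()` pairing is taken from the description above.

```python
from datetime import datetime, timedelta, timezone


class SketchRevocationStore:
    """Hypothetical in-memory store; not the weaver_kernel implementation."""

    def __init__(self) -> None:
        self._expires_at: dict[str, datetime] = {}
        self._revoked: set[str] = set()

    def track(self, token_id: str, expires_at: datetime) -> None:
        self._expires_at[token_id] = expires_at

    def revoke(self, token_id: str) -> None:
        self._revoked.add(token_id)

    def is_revoked(self, token_id: str) -> bool:
        return token_id in self._revoked

    def sweep_expired(self, now: datetime) -> int:
        """Drop bookkeeping for expired tokens only; live tokens stay revoked."""
        expired = [t for t, exp in self._expires_at.items() if exp <= now]
        for token_id in expired:
            del self._expires_at[token_id]
            self._revoked.discard(token_id)
        return len(expired)


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
store = SketchRevocationStore()
store.track("live", now + timedelta(hours=1))
store.track("old", now - timedelta(hours=1))
store.revoke("live")
store.revoke("old")

swept = store.sweep_expired(now)
print(swept, store.is_revoked("live"), store.is_revoked("old"))  # 1 True False

# Revoking after the expiry entry was swept leaves a lingering entry,
# because no later sweep has an expiry record to find it by.
store.revoke("old")
print(store.sweep_expired(now), store.is_revoked("old"))  # 0 True
```

The lingering entry is harmless in this model for the reason the bullet gives: an expired token is rejected by the expiry check whether or not it is also marked revoked.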

Scope notes

Limited to the recommended triage group. Adjacent observability items intentionally left as follow-ups: wiring KernelStats into otel.py as gauges (#179 notes this is acceptable as a follow-up), and #125 (OTel/LangWatch ActionTrace export), which partially overlaps the already-shipped instrument_kernel + export_action_traces.

🤖 Generated with Claude Code

https://claude.ai/code/session_019446VfpRWBaPqU4WX5KgTX



…replay, bounded memory

Implements the TraceStore / audit-trail observability group as one cohesive
change set, all anchored on TraceStore + ActionTrace + the kernel recording path:

- #175 Record handle expansions and policy denials as first-class audit events.
  ActionTrace gains additive `event_type` (invoke/expand/deny) + `reason_code`;
  Kernel.expand() records an `expand` event and fills Provenance.principal_id;
  a denied grant records a `deny` event with the stable reason code.
- #177 TraceQuery + pure query_traces() + TraceStore.query() across all backends
  (added to TraceStoreProtocol); filter by principal/capability/event/outcome/
  reason/time window with deterministic ordering and pagination.
- #179 KernelStats programmatic counters (Kernel.stats / reset_stats),
  dependency-free and lock-guarded.
- #176 OCSF/AOS SIEM export: trace_to_ocsf() / traces_to_ocsf(), pure mapping.
- #213 Policy-replay harness: DecisionRecord, record_decision(), replay() ->
  DecisionDiff, with rate-limit flips surfaced separately.
- #182 Bounded memory: TraceStore oldest-first eviction (max_entries, evicted_count,
  loud first eviction) + revocation expiry tracking and sweep that never
  un-revokes a live token (RevocationStoreProtocol.track() gains expires_at;
  adds sweep_expired()).
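
As an illustration of the #182 eviction behavior described above, here is a minimal sketch of a bounded buffer with oldest-first eviction, an eviction counter, and a single warning on the first eviction. `BoundedTraceBuffer` is a hypothetical stand-in; the real TraceStore constructor and method names may differ.

```python
import logging
from collections import deque

logger = logging.getLogger("trace_sketch")


class BoundedTraceBuffer:
    """Hypothetical stand-in for a TraceStore bounded by max_entries."""

    def __init__(self, max_entries: int) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be >= 1")
        self._max = max_entries
        self._entries: deque = deque()
        self.evicted_count = 0

    def record(self, trace) -> None:
        if len(self._entries) >= self._max:
            self._entries.popleft()  # oldest-first eviction
            if self.evicted_count == 0:
                # "Loud" only once, so a full store does not flood the log.
                logger.warning("trace buffer full (max_entries=%d); evicting oldest", self._max)
            self.evicted_count += 1
        self._entries.append(trace)

    def list_all(self) -> list:
        return list(self._entries)


buf = BoundedTraceBuffer(max_entries=2)
for action_id in ["a", "b", "c"]:
    buf.record(action_id)
print(buf.list_all(), buf.evicted_count)  # ['b', 'c'] 1
```

Keeping `evicted_count` as a plain counter lets an operator detect data loss after the one-time warning has scrolled away.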

Docs (architecture/security/integrations/capabilities/trace_export), CHANGELOG,
two runnable offline examples, and full test coverage added. `make ci` passes
(715 passed, 1 skipped).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019446VfpRWBaPqU4WX5KgTX
Copilot AI review requested due to automatic review settings June 18, 2026 09:31

Copilot AI left a comment


Pull request overview

This PR expands the audit-trail / TraceStore observability subsystem by adding new audited event kinds (deny/expand), a shared trace query surface, bounded in-memory retention + revocation sweeping, and new operator-focused tooling (KernelStats, OCSF export, policy replay), with docs/examples/tests updated accordingly.

Changes:

  • Extend ActionTrace with additive event_type and reason_code, and record deny/expand traces from kernel choke points.
  • Add TraceQuery + query_traces() and implement TraceStoreProtocol.query() across in-memory/SQLite/JSONL backends.
  • Add bounded memory mechanisms: TraceStore(max_entries, evicted_count) and revocation expires_at tracking + sweep_expired() (plus KernelStats, OCSF export, and replay harness).
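
The query semantics summarized in this list (filters, a since-inclusive/until-exclusive window, deterministic ordering, pagination) can be sketched as a pure function. This is an illustrative sketch, not the trace_query.py implementation: `TraceRow`, `QuerySketch`, `query_rows`, and the `offset`/`limit` pagination fields are assumptions, and only a subset of the filters is shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class TraceRow:  # hypothetical, reduced stand-in for ActionTrace
    action_id: str
    principal_id: str
    event_type: str
    invoked_at: datetime


@dataclass(frozen=True)
class QuerySketch:  # hypothetical stand-in for TraceQuery
    principal_id: Optional[str] = None
    event_type: Optional[str] = None
    since: Optional[datetime] = None  # inclusive lower bound
    until: Optional[datetime] = None  # exclusive upper bound
    offset: int = 0
    limit: Optional[int] = None


def query_rows(rows: list[TraceRow], q: QuerySketch) -> list[TraceRow]:
    """Pure filter + sort + slice; the input list is not modified."""
    matched = [
        r for r in rows
        if (q.principal_id is None or r.principal_id == q.principal_id)
        and (q.event_type is None or r.event_type == q.event_type)
        and (q.since is None or r.invoked_at >= q.since)
        and (q.until is None or r.invoked_at < q.until)
    ]
    # action_id breaks ties between traces that share a timestamp.
    matched.sort(key=lambda r: (r.invoked_at, r.action_id))
    end = None if q.limit is None else q.offset + q.limit
    return matched[q.offset:end]


t0 = datetime(2026, 6, 18, 9, 0, tzinfo=timezone.utc)
rows = [
    TraceRow("b", "alice", "invoke", t0),
    TraceRow("a", "alice", "deny", t0),
    TraceRow("c", "alice", "invoke", t0 + timedelta(minutes=1)),
    TraceRow("d", "bob", "invoke", t0),
]
window = query_rows(rows, QuerySketch("alice", since=t0, until=t0 + timedelta(minutes=1)))
page = query_rows(rows, QuerySketch("alice", offset=1, limit=1))
print([r.action_id for r in window], [r.action_id for r in page])  # ['a', 'b'] ['b']
```

Sorting on the (invoked_at, action_id) pair is what makes pagination stable: two pages requested back to back cannot reorder traces that share a timestamp.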

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 5 comments.

File Description
tests/test_trace.py Adds eviction/bounding tests for TraceStore and export field assertions.
tests/test_trace_query.py New unit/integration tests for trace filtering, ordering, and pagination.
tests/test_tokens.py Adds tests for revocation expiry sweeping and bounded growth.
tests/test_stores_sqlite.py Updates revocation tracking API and adds SQLite sweep-expired test.
tests/test_stats.py New tests for KernelStats collector and kernel integration counters.
tests/test_replay.py New tests for policy decision recording + deterministic replay diffs.
tests/test_ocsf.py New tests for OCSF/AOS mapping shape, determinism, and no-args leakage.
tests/test_kernel.py Adds kernel-level tests for deny/expand trace recording and querying.
tests/test_handles.py Verifies expansion frames now include Provenance.principal_id.
src/weaver_kernel/trace.py Adds bounded TraceStore, exports event_type/reason_code, and integrates query support.
src/weaver_kernel/trace_query.py Introduces TraceQuery and pure deterministic query_traces() implementation.
src/weaver_kernel/tokens.py Threads expires_at into revocation tracking; adds provider sweep_revocations().
src/weaver_kernel/stores/sqlite.py Implements trace query; adds token expiry table + sweep-expired revocation cleanup.
src/weaver_kernel/stores/memory.py Adds expiry tracking + lazy/explicit revocation sweeping to bound memory.
src/weaver_kernel/stores/jsonl.py Implements trace query using shared query_traces() semantics.
src/weaver_kernel/stores/_trace_codec.py Decodes persisted event_type/reason_code with backward-compatible defaults.
src/weaver_kernel/stores/_protocols.py Extends protocols: TraceStoreProtocol.query(), RevocationStoreProtocol.track(expires_at) + sweep_expired().
src/weaver_kernel/stats.py New KernelStats collector and immutable StatsSnapshot.
src/weaver_kernel/replay.py New policy decision record + replay/diff harness with rate-limit separation.
src/weaver_kernel/ocsf.py New pure mapping from ActionTrace → OCSF API Activity (6003), AOS-enriched.
src/weaver_kernel/models.py Adds TraceEventType and new ActionTrace fields with defaults.
src/weaver_kernel/kernel/_stream.py Updates streaming pipeline to bump stats for handles/invocations.
src/weaver_kernel/kernel/_invoke.py Threads fallback signal + adds invocation/handle stats increments.
src/weaver_kernel/kernel/_audit.py New helpers to build/store deny and expansion traces.
src/weaver_kernel/kernel/__init__.py Records deny/expand traces, exposes Kernel.query_traces(), and wires Kernel.stats.
src/weaver_kernel/handles.py Fills expansion Provenance.principal_id from the expanding principal.
src/weaver_kernel/__init__.py Re-exports new public APIs (query, stats, OCSF export, replay types/functions).
Makefile Adds new examples to make example.
examples/trace_replay_demo.py Offline runnable replay demo showing allow→deny flip behavior.
examples/ocsf_export_demo.py Offline runnable demo exporting traces to OCSF and printing stats snapshot.
docs/trace_export.md Documents new exported event_type/reason_code and related query/OCSF sections.
docs/security.md Documents new audited event types and bounded retention/sweeping behavior.
docs/integrations.md Adds SIEM export section and mapping table for OCSF/AOS.
docs/capabilities.md Documents replay harness usage and determinism/fidelity caveats.
docs/architecture.md Documents audited event types, query API semantics, retention bounding, and kernel counters.
CHANGELOG.md Adds changelog entries covering new audit/query/stats/OCSF/replay/bounding features.
.github/workflows/ci.yml Runs new examples in CI.

Comment thread src/weaver_kernel/trace.py
Comment thread src/weaver_kernel/trace_query.py
Comment thread src/weaver_kernel/kernel/_stream.py
Comment thread src/weaver_kernel/stores/memory.py
Comment thread src/weaver_kernel/stores/memory.py
claude and others added 4 commits June 18, 2026 13:06
…atetimes, fix stream redaction count, refresh export docstring

- trace_query: treat naive TraceQuery.since/until as UTC so filtering against
  the always-aware ActionTrace.invoked_at never raises TypeError.
- stores/memory: normalize naive expires_at (track) and now (sweep_expired) to
  UTC, matching SQLiteRevocationStore, preventing naive-vs-aware comparison errors.
- kernel/_stream: count a redaction event if *any* streamed frame carried a
  warning (apply_stream attaches warnings per chunk), not just the final frame.
- trace: export_action_trace docstring now describes deny/expand events instead
  of claiming denials never produce a trace.

Regression tests added for naive datetime handling (query + revocation) and the
streaming redaction-count fix. make ci passes.
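
The naive-datetime fix described in this commit can be sketched as a small normalization helper. `as_utc` is a hypothetical name, and the sketch assumes, as the commit message states, that a naive value is to be interpreted as UTC.

```python
from datetime import datetime, timezone


def as_utc(value: datetime) -> datetime:
    """Treat naive datetimes as UTC; convert aware ones to UTC."""
    if value.tzinfo is None or value.tzinfo.utcoffset(value) is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


naive_since = datetime(2026, 6, 18, 9, 0)
invoked_at = datetime(2026, 6, 18, 9, 30, tzinfo=timezone.utc)

try:
    naive_since <= invoked_at
except TypeError as exc:
    # Ordering comparisons between naive and aware datetimes raise TypeError.
    print("without normalization:", type(exc).__name__)

print(as_utc(naive_since) <= invoked_at)  # True
```

Normalizing at the boundary (when the query or the expiry is accepted) means the comparison code itself never has to branch on whether a value is aware.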

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019446VfpRWBaPqU4WX5KgTX
… grant-time

Addresses audit-pass findings on PR #233 (no functional/behavioral regressions):

- grant(): record the "deny" audit trace best-effort. A trace-store write
  failure inside the PolicyDenied handler previously could mask the denial
  with a storage error; the denial already fails closed (no token issued), so
  the write is now wrapped and logged, and PolicyDenied always propagates.
- _audit.py: tighten the module docstring so its auditability claim matches
  behavior — only grant-time policy denials are recorded as "deny" traces;
  expansion-time access failures remain exceptions/logs, not traces.

Docstrings updated to match. Deferred (documented tradeoffs): O(n) query() on
durable backends and single-warning trace eviction.
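
The best-effort recording described in this commit can be sketched as follows. This is an illustration of the pattern, not the kernel's grant(): `PolicyDenied` here is a local stand-in class, and `grant_sketch` and `FailingTraceStore` are hypothetical.

```python
import logging

logger = logging.getLogger("audit_sketch")


class PolicyDenied(Exception):
    """Local stand-in for the kernel's PolicyDenied exception."""


class FailingTraceStore:
    def record(self, trace: dict) -> None:
        raise OSError("disk full")


def grant_sketch(policy_allows: bool, trace_store) -> str:
    try:
        if not policy_allows:
            raise PolicyDenied("capability_not_allowed")
        return "token"
    except PolicyDenied as denied:
        try:
            trace_store.record({"event_type": "deny", "reason_code": str(denied)})
        except Exception:
            # Best-effort: a storage failure is logged, never raised in
            # place of the denial.
            logger.exception("failed to record deny trace")
        raise  # the original PolicyDenied always propagates


try:
    grant_sketch(False, FailingTraceStore())
except PolicyDenied as exc:
    print("denied:", exc)  # denied: capability_not_allowed
```

The denial already fails closed because no token is returned on that path, so swallowing the storage error costs an audit record but never grants access.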

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DcmCN68NkKQd9Ja9Awz7vb
…semantics

Audit follow-ups (no behavior change):
- Kernel.expand(): note the expansion trace is intentionally NOT best-effort
  (unlike the denial trace) so a served expansion is never left unaudited (I-02).
- execute_with_fallback(): document that only DriverError counts as a failed
  attempt; an unregistered driver is skipped and does not set fell_back.
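
The fallback semantics documented in this commit can be sketched like this. The signature is an assumption made for illustration (`execute_with_fallback_sketch`, a dict registry, and callables as drivers are all hypothetical); only the rule itself, that a DriverError counts as a failed attempt while an unregistered driver is skipped, comes from the commit message.

```python
class DriverError(Exception):
    """Local stand-in for the kernel's DriverError."""


def execute_with_fallback_sketch(driver_ids, registry, request):
    """Return (result, fell_back) from the first driver that succeeds."""
    fell_back = False
    last_error = None
    for driver_id in driver_ids:
        driver = registry.get(driver_id)
        if driver is None:
            continue  # unregistered: skipped, does not set fell_back
        try:
            return driver(request), fell_back
        except DriverError as exc:
            last_error = exc
            fell_back = True  # only a real failed attempt counts
    raise DriverError("no driver succeeded") from last_error


def failing_driver(request):
    raise DriverError("upstream unavailable")


def backup_driver(request):
    return f"ok:{request}"


registry = {"primary": failing_driver, "backup": backup_driver}
print(execute_with_fallback_sketch(["missing", "backup"], registry, "ping"))  # ('ok:ping', False)
print(execute_with_fallback_sketch(["primary", "backup"], registry, "ping"))  # ('ok:ping', True)
```

Separating "skipped" from "failed" keeps a fallback counter meaningful: it rises only when a registered driver was tried and raised.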

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwRZZvDVMaW5LpRJpGpBDa

Development

Successfully merging this pull request may close these issues.

Record handle expansions and policy denials in the audit trail

3 participants