
feat(sql): validate tables inside subqueries and derived tables (L2) #43

Merged: hyperpolymath merged 14 commits into main from claude/new-session-znxgm7 on Jun 28, 2026
@hyperpolymath (Owner)

Summary

Closes the last major scope gap in L2 schema-binding. The check only looked at the outermost FROM, so a missing table inside a subquery, derived table, CTE body, or set operation went undetected — a false negative:

SELECT id FROM users WHERE id IN (SELECT user_id FROM nonexistent_table)
-- ^ nonexistent_table was never validated

Changes

  • src/plugins/sql.rs: add a recursive walk (extract_all_table_sources + walk_query/walk_setexpr/walk_twj/walk_factor/walk_expr_subqueries) that collects every real table source across all scopes — top-level, JOINs, CTE bodies, derived tables, WHERE/HAVING/projection subqueries, and UNION/EXCEPT/INTERSECT — together with the non-schema sources to exclude (CTE names and derived-table aliases).
  • L2 table-existence now uses this recursive set (replacing the prior extract_cte_names + top-level-only check). Column resolution stays scoped to the top level to avoid correlated-reference false positives.
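The recursive collection described above can be sketched as follows. This is a minimal illustration over a hypothetical toy AST, not the code in src/plugins/sql.rs: the `walk_*` names echo the PR description, but the types, the `Sources` struct, and `missing_tables` are assumptions made for the example. It also treats CTE names and derived-table aliases as globally excluded, which is a simplification.

```rust
use std::collections::BTreeSet;

// Hypothetical toy AST: a stand-in for the real parser types.
#[allow(dead_code)]
enum TableFactor {
    Table(String),                                // FROM users
    Derived { query: Box<Query>, alias: String }, // FROM (SELECT ...) AS sub
}

#[allow(dead_code)]
enum Expr {
    Column(String),
    InSubquery(Box<Query>),    // x IN (SELECT ...)
    And(Box<Expr>, Box<Expr>),
}

#[allow(dead_code)]
enum SetExpr {
    Select { from: Vec<TableFactor>, selection: Option<Expr> },
    SetOp(Box<SetExpr>, Box<SetExpr>), // UNION / EXCEPT / INTERSECT
}

struct Query {
    ctes: Vec<(String, Query)>, // WITH name AS (query)
    body: SetExpr,
}

#[derive(Default)]
struct Sources {
    tables: BTreeSet<String>,     // names that must exist in the schema
    non_schema: BTreeSet<String>, // CTE names and derived-table aliases
}

fn walk_query(q: &Query, out: &mut Sources) {
    for (name, body) in &q.ctes {
        out.non_schema.insert(name.clone());
        walk_query(body, out); // tables inside the CTE body still count
    }
    walk_setexpr(&q.body, out);
}

fn walk_setexpr(s: &SetExpr, out: &mut Sources) {
    match s {
        SetExpr::Select { from, selection } => {
            for f in from {
                walk_factor(f, out);
            }
            if let Some(e) = selection {
                walk_expr_subqueries(e, out);
            }
        }
        SetExpr::SetOp(l, r) => {
            walk_setexpr(l, out);
            walk_setexpr(r, out);
        }
    }
}

fn walk_factor(f: &TableFactor, out: &mut Sources) {
    match f {
        TableFactor::Table(name) => {
            out.tables.insert(name.clone());
        }
        TableFactor::Derived { query, alias } => {
            out.non_schema.insert(alias.clone());
            walk_query(query, out);
        }
    }
}

fn walk_expr_subqueries(e: &Expr, out: &mut Sources) {
    match e {
        Expr::Column(_) => {}
        Expr::InSubquery(q) => walk_query(q, out),
        Expr::And(l, r) => {
            walk_expr_subqueries(l, out);
            walk_expr_subqueries(r, out);
        }
    }
}

/// Every collected table that is neither a CTE/alias nor present in `schema`.
fn missing_tables(q: &Query, schema: &[&str]) -> Vec<String> {
    let mut s = Sources::default();
    walk_query(q, &mut s);
    s.tables
        .iter()
        .filter(|t| !s.non_schema.contains(*t))
        .filter(|t| !schema.iter().any(|known| *known == t.as_str()))
        .cloned()
        .collect()
}
```

With this shape, the query from the Summary reports `nonexistent_table` because the walk descends into the `IN (SELECT ...)` subquery instead of stopping at the outermost FROM.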

Testing

  • l2_subquery_missing_table_flagged — missing table in WHERE … IN (SELECT …) is caught.
  • l2_derived_table_validated — missing table inside a derived table is caught; the derived alias sub is not falsely flagged.
  • l2_valid_subquery_no_false_positive — a valid subquery over real tables stays clean.
  • Existing CTE / missing-table / join tests still pass. Full suite green under the CI toolchain (1.96.0): cargo fmt --check, cargo clippy --locked --all-targets -- -D warnings, cargo test --locked --all-targets (123 tests).

RSR Quality Checklist

  • Tests pass (cargo test --locked --all-targets)
  • Formatted (cargo fmt --all -- --check)
  • Linter clean (cargo clippy --locked --all-targets -- -D warnings)
  • No banned language patterns
  • SPDX headers present on modified files
  • No secrets

🤖 Generated with Claude Code



claude and others added 14 commits June 27, 2026 19:00
…uarantee

Continues the flagship semantic-proof coverage (InjectionFree level 5,
SchemaBound level 2) with TypeCompat (level 3: "operand types compatible").
Adds `Typedqliser.ABI.TypeCompat`, held to the same quality bar:

  * a small SQL type universe (`SqlType`) and a typed column environment
    (`ColEnv`) with a total `lookupType` resolver, reusing the existing
    `Query`/`Pred`/`Value` AST;
  * `ValueCompat`/`PredTypeCompat`/`QueryTypeCompat` — the proposition that
    every WHERE comparison compares a column against a value of a matching
    type (a bound parameter adopts the column's type; a literal is TInt; a
    raw splice is TText). There is no constructor for a type clash, so a
    mismatched comparison is uninhabited;
  * `decQueryTypeCompat` — a sound + complete `Dec`, so a "Proven" TypeCompat
    certificate is backed by a constructive witness and a type clash can
    never be certified;
  * `certifyTypeCompatSound` (a `Proven` verdict provably entails the
    property); `typeCompatIsLevelThree : levelNat TypeCompat = 3`;
  * positive control (a well-typed query, with the certifier computing to
    `Proven`) and negative control (`name : Text` compared to an integer
    literal provably cannot be certified).

Verified with idris2 0.7.0: `idris2 --build typedqliser-abi.ipkg` exits 0 with
zero warnings (all 7 modules). Adversarially checked — three deliberately-false
proofs (wrong level ordinal, a TInt literal certified against a TText column,
and a type-compatible witness for the clash query) are all rejected by the
type checker.
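As a rough runtime analogue of the TypeCompat rule above (the commit itself proves it in Idris 2), the check can be sketched in Rust. The types and function names below are hypothetical, and rejecting a column that is missing from the environment is an assumption; the commit does not say how `lookupType` treats unknown columns.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SqlType {
    TInt,
    TText,
}

#[allow(dead_code)]
enum Value {
    Param,       // bound parameter: adopts the column's type
    Literal(i64), // literal: always TInt in this model
    Raw(String),  // raw splice: always TText in this model
}

struct Comparison {
    column: String,
    value: Value,
}

/// Look up a column's declared type in a simple association list.
fn lookup_type(env: &[(String, SqlType)], col: &str) -> Option<SqlType> {
    env.iter().find(|(name, _)| name == col).map(|(_, ty)| *ty)
}

/// Does the value's type match the column's type?
fn value_compat(col_ty: SqlType, v: &Value) -> bool {
    match v {
        Value::Param => true,
        Value::Literal(_) => col_ty == SqlType::TInt,
        Value::Raw(_) => col_ty == SqlType::TText,
    }
}

/// Every WHERE comparison must pair a known column with a matching value.
/// Assumption: an unknown column counts as incompatible.
fn query_type_compat(env: &[(String, SqlType)], wheres: &[Comparison]) -> bool {
    wheres.iter().all(|c| match lookup_type(env, &c.column) {
        Some(ty) => value_compat(ty, &c.value),
        None => false,
    })
}
```

Unlike the Idris 2 `Dec`, this returns only a boolean and carries no proof witness; it mirrors the decision, not the certificate.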

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A6PSzJWpRxtzGDjUCEh7Mx

Adds Typedqliser.ABI.Invariants, a second, deeper, distinct machine-checked
property over the existing Semantics query model (Query/Pred/Value reused
verbatim). Where the Layer-2 flagship (Semantics.InjectionFree, level 5) is a
purely structural property, NullSafe (level 4) is context-sensitive: a projected
nullable column is safe only if the WHERE predicate guards it, with guards
discovered by union under And and intersection under Or (disjunctive weakening).

Includes a sound + complete decision procedure (decQueryNullSafe : Dec ...),
a certifier proven sound (certifyNullSafeSound), the level-ordinal identity
plus a proof it differs from InjectionFree, three positive controls and three
non-vacuity controls (unguarded projection, And/union, Or/intersection). Builds
clean with zero warnings; the deliberately-false adversarial proof is rejected.
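The guard-discovery rule (union under And, intersection under Or) can be illustrated with a small Rust sketch. The `Pred` type here is hypothetical, and treating `IS NOT NULL` as the only guarding predicate is an assumption; the commit does not list which predicate forms count as guards.

```rust
use std::collections::BTreeSet;

#[allow(dead_code)]
enum Pred {
    IsNotNull(String), // assumed guard form: col IS NOT NULL
    Other,             // any predicate that guards nothing
    And(Box<Pred>, Box<Pred>),
    Or(Box<Pred>, Box<Pred>),
}

/// Columns guaranteed non-null whenever the predicate holds.
fn guards(p: &Pred) -> BTreeSet<String> {
    match p {
        Pred::IsNotNull(c) => BTreeSet::from([c.clone()]),
        Pred::Other => BTreeSet::new(),
        // Both sides hold, so either side's guards apply.
        Pred::And(l, r) => guards(l).union(&guards(r)).cloned().collect(),
        // Only one side is known to hold, so keep the guards common to both.
        Pred::Or(l, r) => guards(l).intersection(&guards(r)).cloned().collect(),
    }
}

/// A query is null-safe if every projected nullable column is guarded.
fn null_safe(projected_nullable: &[&str], where_pred: &Pred) -> bool {
    let g = guards(where_pred);
    projected_nullable.iter().all(|c| g.contains(*c))
}
```

The Or case is where the disjunctive weakening shows up: `a IS NOT NULL OR <anything else>` guards nothing, because the second branch may be the one that held.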

No believe_me/postulate/assert_total/%hint; %default total throughout.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A6PSzJWpRxtzGDjUCEh7Mx

Prove the FFI result-code encoding is SOUND: the C integer the Zig FFI
returns faithfully round-trips back to the ABI value, and distinct ABI
outcomes never collide on the wire.

- intToResult / intToStatus: total decoders (if x == n over boolean
  Bits32 ==, which reduces on concrete literals).
- resultRoundTrip / statusRoundTrip: lossless encoding, proved by Refl.
- resultToIntInjective / statusToIntInjective: injectivity DERIVED from
  the round-trip via a local justInj + cong.
- Positive controls (decodeOk/decodeNullPointer/decodeUnknown/decodeProven)
  and machine-checked non-vacuity controls (okNotError, schemaNotNull,
  provenNotRefuted) refuting collisions of distinct codes.

Genuine total proof: no believe_me / postulate / assert_total / sorry.
Builds clean with zero warnings; a false seam claim is rejected by --check.
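The round-trip and injectivity properties can be checked exhaustively in a Rust sketch like the one below. The `ResultCode` variants and their numeric codes are hypothetical placeholders, not the actual wire values; in the commit these properties are proved in Idris 2, whereas here they are only tested over the finite set of variants.

```rust
// Hypothetical result codes; the real ABI values are not shown in this PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ResultCode {
    Ok,
    Error,
    NullPointer,
}

const ALL: [ResultCode; 3] = [ResultCode::Ok, ResultCode::Error, ResultCode::NullPointer];

/// Encode an ABI result as the integer sent across the FFI boundary.
fn result_to_int(r: ResultCode) -> u32 {
    match r {
        ResultCode::Ok => 0,
        ResultCode::Error => 1,
        ResultCode::NullPointer => 2,
    }
}

/// Total decoder: integers outside the known range map to None.
fn int_to_result(x: u32) -> Option<ResultCode> {
    match x {
        0 => Some(ResultCode::Ok),
        1 => Some(ResultCode::Error),
        2 => Some(ResultCode::NullPointer),
        _ => None,
    }
}

/// Lossless encoding: decoding an encoded value returns the original.
fn round_trips() -> bool {
    ALL.iter().all(|&r| int_to_result(result_to_int(r)) == Some(r))
}

/// Injectivity: equal wire codes imply equal ABI values.
fn encoding_is_injective() -> bool {
    ALL.iter().all(|&a| {
        ALL.iter()
            .all(|&b| result_to_int(a) != result_to_int(b) || a == b)
    })
}
```

Injectivity follows from the round trip: if two values encode to the same integer, decoding that integer yields both of them, so they must be equal. That mirrors the derivation via `justInj` and `cong` described above.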

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A6PSzJWpRxtzGDjUCEh7Mx

Assemble the existing per-layer proofs into one inhabited record
`ABISound` and a single value `abiContractDischarged` built from the
already-exported witnesses:

- Layer-2 flagship: safeQueryInjectionFree (InjectionFree, level 5)
- Layer-2 companions: boundQuerySchemaBound (SchemaBound, level 2),
  goodQueryTypeCompat (TypeCompat, level 3)
- Layer-3 invariant: guardedQueryNullSafe (NullSafe, level 4)
- Layer-4 FFI seam: resultToIntInjective

The capstone proves no new domain theorem; its content is that the
whole chain holds simultaneously — if any prior layer were unsound the
value would not typecheck. Adversarial control: a false certificate
(deriving Ok = Error through the seam) is rejected by the typechecker.

%default total, SPDX MPL-2.0, zero warnings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A6PSzJWpRxtzGDjUCEh7Mx

…ble fix); port ABI-FFI gate Python->Bash (Python is estate-banned)

Resolves the standing baseline CI reds (rust-ci toolchain error, governance
Language/anti-pattern, governance workflow-lint) without altering the proven
ABI. The Bash gate reproduces the former Python gate's verdict verbatim
(validated across all -iser repos) and catches the same drift classes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01A6PSzJWpRxtzGDjUCEh7Mx

Aliased column references like `u.id` in `FROM users u` were not resolved
to their real table, so the schema-binding (L2), type-compatibility (L3),
and null-safety (L4) checks mishandled them: L2 raised false positives on
valid aliased queries, while L3/L4 silently skipped aliased columns (false
negatives). Build a qualifier->table map from the FROM/JOIN clauses and
resolve qualifiers through it across all three levels, including
alias-qualified projections in the null check.
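The qualifier-to-table map can be sketched as below. `TableRef`, `qualifier_map`, and `resolve_column` are hypothetical names for illustration, not the functions in the actual plugin. The sketch assumes an aliased table is addressable only through its alias, as in standard SQL.

```rust
use std::collections::HashMap;

// Hypothetical representation of one FROM/JOIN entry.
struct TableRef {
    table: String,
    alias: Option<String>,
}

/// Map each usable qualifier (alias if present, else table name) to its table.
fn qualifier_map(from: &[TableRef]) -> HashMap<String, String> {
    let mut map = HashMap::new();
    for t in from {
        let qualifier = t.alias.clone().unwrap_or_else(|| t.table.clone());
        map.insert(qualifier, t.table.clone());
    }
    map
}

/// Split `u.id` into (resolved table, column). An unqualified reference or
/// an unknown qualifier yields `None` for the table.
fn resolve_column<'a>(
    map: &'a HashMap<String, String>,
    reference: &'a str,
) -> (Option<&'a str>, &'a str) {
    match reference.split_once('.') {
        Some((qualifier, column)) => (map.get(qualifier).map(String::as_str), column),
        None => (None, reference),
    }
}
```

For `FROM users u JOIN orders o`, resolving `u.id` yields the real table `users`, so the L2, L3, and L4 checks can all look the column up in the schema rather than failing or skipping on the alias.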

Strengthens the previously no-op l2_valid_multi_table_join test and adds
L2/L3/L4 alias-resolution tests.

Two more soundness holes in the SQL safety levels:

- L4 (null-safety): `SELECT *` / `u.*` were not expanded, so nullable
  columns selected via a wildcard were silently not flagged. Expand a
  wildcard to the in-scope table columns (resolving the alias for a
  qualified `u.*`) and flag the nullable ones.
- L2 (schema-binding): a `WITH cte AS (...)` name referenced in FROM was
  reported as 'table not found', a false positive. Collect CTE names and
  exclude them from the table-existence check.
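The wildcard expansion for the L4 check might look like the following sketch. All types and the `nullable_from_wildcards` function are hypothetical; the scope is modelled as (qualifier, table) pairs, where the qualifier is the alias or, without one, the table name.

```rust
struct Column {
    name: String,
    nullable: bool,
}

struct Table {
    name: String,
    columns: Vec<Column>,
}

#[allow(dead_code)]
enum Projection {
    Star,                  // SELECT *
    QualifiedStar(String), // SELECT u.*
    Named(String),         // SELECT id (handled by the ordinary column check)
}

/// Expand each wildcard to the in-scope columns and report the nullable ones
/// as `table.column`.
fn nullable_from_wildcards(proj: &[Projection], scope: &[(String, Table)]) -> Vec<String> {
    let mut flagged = Vec::new();
    for p in proj {
        let tables: Vec<&Table> = match p {
            Projection::Star => scope.iter().map(|(_, t)| t).collect(),
            Projection::QualifiedStar(q) => scope
                .iter()
                .filter(|(qualifier, _)| qualifier == q)
                .map(|(_, t)| t)
                .collect(),
            Projection::Named(_) => Vec::new(),
        };
        for t in tables {
            for c in t.columns.iter().filter(|c| c.nullable) {
                flagged.push(format!("{}.{}", t.name, c.name));
            }
        }
    }
    flagged
}
```

A qualified `u.*` is resolved through the qualifier first, so the flagged name refers to the real table rather than the alias.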

Updates l4_select_star (was a no-op documenting the gap) to assert the
nullable columns are now flagged, and adds an L2 CTE test.

Schema-binding (L2) only checked the outermost FROM, so a missing table
in a WHERE ... IN (SELECT ...) subquery, a derived table, a CTE body, or
a set operation went undetected (false negative). Add a recursive walk
that collects every real table source across all scopes plus the
non-schema sources (CTE names, derived-table aliases) to exclude, and use
it for the L2 table-existence check. Column resolution stays scoped to the
top level to avoid correlated-reference false positives.

Adds tests for a missing table in a subquery and in a derived table, and
a no-false-positive test for a valid subquery.
hyperpolymath marked this pull request as ready for review June 28, 2026 10:55
hyperpolymath merged commit 5e92e58 into main Jun 28, 2026 (7 checks passed)
hyperpolymath deleted the claude/new-session-znxgm7 branch June 28, 2026 10:55