From de70fdf6a63109cfbc6d0c97cddd5b9d625ad5e2 Mon Sep 17 00:00:00 2001 From: Connor McDonald Date: Thu, 2 Jul 2026 12:42:21 +0200 Subject: [PATCH] docs: reposition around agents + OKF + freshness, strip em dashes Lead the README and docs landing with the agents-first story: OKF makes docs portable and accessible, Surface governs their freshness, and together they measurably improve how well agents perform. Promote the OKF section to a co-equal pillar (up near "how it works"), reframe "why it matters" as the positive, benchmark-backed payoff, and drop the remaining accusatory framing ("stale docs make agents fail", "completely false documentation"). Also replace em dashes with hyphens across the published docs, readme, and hubs (excludes CHANGELOG, dogfood-log, STRATEGY, surface-proposal as historical/internal records). Prose-only: surf check stays green, no anchor hashes touched. Co-Authored-By: Claude Opus 4.8 (1M context) --- AGENTS.md | 34 +++++----- CONTRIBUTING.md | 14 ++-- README.md | 96 +++++++++++++++------------- docs/examples.md | 6 +- docs/getting-started/install.md | 8 +-- docs/getting-started/quickstart.md | 8 +-- docs/guides/authoring-hubs.md | 74 ++++++++++----------- docs/guides/ci-integration.md | 12 ++-- docs/guides/okf.md | 32 +++++----- docs/guides/stats.md | 8 +-- docs/index.md | 61 +++++++++--------- docs/phases/00-toolchain-scaffold.md | 6 +- docs/phases/01-anchor-resolution.md | 12 ++-- docs/phases/02-canonical-hashing.md | 12 ++-- docs/phases/03-hub-format.md | 12 ++-- docs/phases/04-surf-lint.md | 10 +-- docs/phases/05-surf-check.md | 14 ++-- docs/phases/06-surf-verify.md | 12 ++-- docs/phases/07-distribution.md | 12 ++-- docs/phases/OVERVIEW.md | 14 ++-- docs/phases/README.md | 16 ++--- docs/reference/commands.md | 36 +++++------ docs/reference/configuration.md | 20 +++--- docs/reference/faq.md | 12 ++-- docs/reference/hash-recipes.md | 46 ++++++------- docs/reference/how-it-works.md | 14 ++-- hubs/anchor.md | 4 +- hubs/cli-check.md | 14 ++-- hubs/cli-for.md | 8 +-- hubs/cli-git.md | 6 +- hubs/cli-lint.md | 10 +-- hubs/cli-reference.md | 4 +- hubs/cli-scaffold.md | 4 +- hubs/cli-stats.md | 10 +-- hubs/cli-suggest.md | 6 +- hubs/cli-verify.md | 4 +- hubs/cli-workspace.md | 6 +- hubs/config.md | 2 +- hubs/hash.md | 12 ++-- hubs/hub-format.md | 12 ++-- hubs/rename.md | 2 +- hubs/resolve.md | 4 +- 42 files changed, 360 insertions(+), 349 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index ef57df0..f330ba4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -15,12 +15,12 @@ Guidance for AI coding agents working in this repo. (Humans: see [`CONTRIBUTING.md`](./CONTRIBUTING.md) and [`docs/`](./docs/index.md).) > This file is itself a hub (see `surf.toml`): the one sealed claim below anchors to the lint rule -> that polices this very file — a small bit of dogfooding. Everything else here is plain +> that polices this very file - a small bit of dogfooding. Everything else here is plain > instruction, deliberately not anchored. Surface is a deterministic gate that surfaces divergence between docs and code: you anchor a sentence to the code it describes, and `surf check` blocks when that code's logic changes out -from under the prose. **This repo dogfoods Surface on its own source** — the gate runs on +from under the prose. **This repo dogfoods Surface on its own source** - the gate runs on `surf-core`/`surf-cli`. ## Where the context lives: `hubs/` @@ -28,11 +28,11 @@ from under the prose. **This repo dogfoods Surface on its own source** — the g [`hubs/`](./hubs/) is the governed context for this codebase. Each hub is markdown prose describing an invariant, with frontmatter anchoring the claim to a specific symbol. Unlike code -comments, **hub prose is sealed by `surf check`** — if the anchored code changed since a human +comments, **hub prose is sealed by `surf check`** - if the anchored code changed since a human last confirmed the prose, the gate fails. So the hubs are trustworthy and current in a way comments are not, and they are the fastest accurate way to understand a part of the system. -**Read only the hubs you need — not the whole directory.** The hubs together describe the entire +**Read only the hubs you need - not the whole directory.** The hubs together describe the entire codebase; reading all of them is wasteful context. They are split per module (`hubs/cli-check.md`, `hubs/hash.md`, `hubs/resolve.md`, …), and each starts with a one-line `summary:` in its frontmatter. Scan the filenames and summaries, then open only the hub(s) @@ -44,41 +44,41 @@ directory exists; by convention it points at the directory rather than duplicati individual hubs (so agents search, not read everything). Caveat (the tool's own honest limit): a green gate means *the anchored code hasn't changed -since last verified* — not that every sentence is true, and nothing about code no hub anchored. +since last verified* - not that every sentence is true, and nothing about code no hub anchored. If you read a hub claim, sanity-check it against the code it points at before relying on it. -## When you add or change a feature — keep the hubs honest +## When you add or change a feature - keep the hubs honest Run the loop (binary builds to `target/debug/surf`; see `CONTRIBUTING.md` for build commands): 1. Make the change. -2. `surf lint` — every anchor must resolve. Consider the advisory granularity warnings +2. `surf lint` - every anchor must resolve. Consider the advisory granularity warnings (over/under-anchoring); they are nudges, not blocks. -3. `surf check` — if you touched code a hub anchors, it will report `DIVERGED`. Re-read the +3. `surf check` - if you touched code a hub anchors, it will report `DIVERGED`. Re-read the claim. If the prose **still holds**, `surf verify` re-seals it (writes the new hash); if the prose is **now false**, fix the prose first, then verify. 4. Added public behavior? First reach for an *existing* system claim: extend its prose, or add the new symbol as another site under its multi-site `at:` list. Write a brand-new claim only - when the behavior is genuinely its own. A hub is an onboarding doc, not a per-function log — + when the behavior is genuinely its own. A hub is an onboarding doc, not a per-function log - the under-coverage warning lists undocumented symbols, but consolidating them into one coarse, multi-anchor claim beats one claim per function (`surf lint` will nudge a claim-log the other way). When you update a hub, update its *prose* to stay accurate, not just the hash. 5. Record user-facing changes in [`CHANGELOG.md`](./CHANGELOG.md) under `[Unreleased]`. 6. Hit a *notable* dogfooding moment? Log it in [`docs/dogfood-log.md`](./docs/dogfood-log.md). - This is the repo eating its own dogfood, so it produces good material — capture it while it's + This is the repo eating its own dogfood, so it produces good material - capture it while it's fresh. The bar is "a reader would find this interesting," **not** every change: the gate catching a real contract drift, a lint false-positive, surprising friction, an invariant that was hard to express. Add a dated entry (newest first, template at the bottom of the file); skip routine changes that worked as expected. -Do not blindly `surf verify` to make the gate green — that is the rubber-stamping failure the +Do not blindly `surf verify` to make the gate green - that is the rubber-stamping failure the tool exists to prevent. Verify means "I read the prose and it is still true." ## Pointers -- [`hubs/`](./hubs/) — governed context, split per module; read only the hub(s) you need. -- [`CHANGELOG.md`](./CHANGELOG.md) — what changed; update `[Unreleased]`. -- [`docs/dogfood-log.md`](./docs/dogfood-log.md) — dated notes from using Surface on itself; add notable moments. -- [`docs/index.md`](./docs/index.md) — documentation map (guides, reference, concepts). -- [`CONTRIBUTING.md`](./CONTRIBUTING.md) — build, test, format, lint commands and layout. -- [`docs/surface-proposal.md`](./docs/surface-proposal.md) — the product spec (the `§` references in hubs). +- [`hubs/`](./hubs/) - governed context, split per module; read only the hub(s) you need. +- [`CHANGELOG.md`](./CHANGELOG.md) - what changed; update `[Unreleased]`. +- [`docs/dogfood-log.md`](./docs/dogfood-log.md) - dated notes from using Surface on itself; add notable moments. +- [`docs/index.md`](./docs/index.md) - documentation map (guides, reference, concepts). +- [`CONTRIBUTING.md`](./CONTRIBUTING.md) - build, test, format, lint commands and layout. +- [`docs/surface-proposal.md`](./docs/surface-proposal.md) - the product spec (the `§` references in hubs). diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index bed607c..2f4c75f 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -28,7 +28,7 @@ cargo run -q -p surf-cli -- check # anchored spans match their stored hashes ``` If you change a symbol that a hub anchors (see `hubs/`), `check` will block until you either -revert or — if the change is intended and the prose still holds — re-stamp it: +revert or - if the change is intended and the prose still holds - re-stamp it: ```sh cargo run -q -p surf-cli -- verify "surf-core/src/hash.rs > emit" @@ -36,21 +36,21 @@ cargo run -q -p surf-cli -- verify "surf-core/src/hash.rs > emit" ## Layout -- `surf-core/` — pure parse/resolve/hash logic, no I/O (also the future WASM target). -- `surf-cli/` — the `surf` binary: workspace discovery, the commands, all I/O. -- `docs/phases/` — how the MVP was built, one self-contained file per phase. Start with +- `surf-core/` - pure parse/resolve/hash logic, no I/O (also the future WASM target). +- `surf-cli/` - the `surf` binary: workspace discovery, the commands, all I/O. +- `docs/phases/` - how the MVP was built, one self-contained file per phase. Start with `docs/phases/OVERVIEW.md`. The product spec is `docs/surface-proposal.md`. -- `docs/index.md` — the documentation overview; `docs/getting-started/`, `docs/guides/`, and +- `docs/index.md` - the documentation overview; `docs/getting-started/`, `docs/guides/`, and `docs/reference/` hold the user-facing pages. `AGENTS.md` is the on-ramp for AI coding agents. Keep `surf-core` free of I/O so it stays reusable; put filesystem/git work in `surf-cli`. **Docs source of truth.** This repo's `docs/` is canonical. The Starlight docs site ([`Connorrmcd6/surface-site`](https://github.com/Connorrmcd6/surface-site), -surface.gradientdev.xyz) is generated *from* these pages — edit docs here, never only on the site. +surface.gradientdev.xyz) is generated *from* these pages - edit docs here, never only on the site. On every `v*` release tag, the release workflow dispatches to surface-site, which regenerates its docs from `docs/` and `CHANGELOG.md` and opens a sync PR (a human merges it to deploy). So a -release ships the docs that were merged before the tag — land doc edits with the code. +release ships the docs that were merged before the tag - land doc edits with the code. `docs/reference/commands.md` is governed by `hubs/cli-reference.md`, anchored to the clap `Command` enum in `surf-cli/src/main.rs`: change a command or flag and `surf check` blocks until diff --git a/README.md b/README.md index 6192ade..39e272e 100644 --- a/README.md +++ b/README.md @@ -3,13 +3,15 @@ NOTE TO THE BUILDING AGENT ========================== This README is the GitHub front door: a pitch + a compact quickstart, nothing more. The full, canonical docs live in docs/ (this repo) and are published to surface.gradientdev.xyz. When you -add reference detail, put it in docs/ and link to it — do NOT re-inline command/config/technical +add reference detail, put it in docs/ and link to it - do NOT re-inline command/config/technical reference here. -Positioning: Surface is "a new way to document and govern code for fast-moving codebases" — -documentation governed like code. Lead with the real story: a context file that's accurate the -day it's written and rots as the code moves, because nobody knows it exists or where to find it. -Do NOT use the old accusatory "your documentation is lying" framing. +Positioning: agents are now first-class readers of your docs. OKF (Google's open format) makes +those docs portable and accessible; Surface governs their freshness; together they measurably +improve how well agents perform. Lead with that. "Documentation governed like code" (anchor a +sentence to code, block the build when the code's logic drifts) is the mechanism, the how - state +it, don't headline it. Do NOT use the old accusatory "your documentation is lying" / "stale docs +make agents fail" framing; state the positive payoff and the honest limits instead. Two rules to preserve: 1. Keep the honesty. The "What Surface does NOT do" section is a feature, not a disclaimer. Do @@ -23,12 +25,14 @@ opinionated. Short lines. No "revolutionary," no "seamless." # Surface -**Documentation, governed like code.** +**Portable, always-fresh docs for agents and humans.** -You anchor a sentence to the code it describes. When that code's logic changes, Surface fails the -build until a human re-confirms the sentence still holds — the same way a broken test blocks a -merge. For fast-moving codebases where humans and agents both read the docs and neither can tell a -current doc from a rotted one. +Agents read your docs on every run, and they can't tell a current doc from a rotted one. Surface +closes that gap with two open layers: [OKF](docs/guides/okf.md) (Google's vendor-neutral format) +makes docs portable and accessible, and Surface governs their freshness - you anchor a sentence to +the code it describes, and the build fails until a human re-confirms it whenever that code's logic +changes. Documentation, governed like code. Portable *and* fresh docs measurably improve how well +agents perform. Deterministic. No model, no network, no API key in the core. @@ -38,17 +42,18 @@ Deterministic. No model, no network, no API key in the core. ## The problem -You write a context file for your codebase — an architecture note, an `AGENTS.md`, a hub for the +You write a context file for your codebase - an architecture note, an `AGENTS.md`, a hub for the auth flow. The day you write it, it's accurate. Then the code moves. Someone refactors the function you described; the behavior changes on purpose, -the tests get updated, CI goes green, the PR merges. Everything is correct — except the paragraph +the tests get updated, CI goes green, the PR merges. Everything is correct - except the paragraph that *described* that function. Nobody touched it, for two ordinary reasons: they didn't know it existed, and there was no standard place to look. It now says something untrue. -Nothing failed. Nothing fired. The only thing that broke is the explanation the next engineer — and -every agent on every run — will trust and reason from. A codebase can be fully green on tests and -full of confident, completely false documentation, and nothing in your toolchain catches it. +Nothing failed. Nothing fired. The only thing that broke is the explanation the next engineer - and +every agent on every run - will trust and reason from. A codebase can be fully green on tests while +its docs quietly describe code that no longer works the way they say, and nothing in your toolchain +catches the gap. Surface closes that gap two ways: **`hubs/`** give documentation a standard home so people and agents actually find it, and **`surf check`** governs the prose like a test so it can't silently @@ -59,7 +64,7 @@ rot. You anchor a sentence to the code it's about: ```yaml -# hubs/auth.md (a "hub" — frontmatter + prose, lives wherever you like) +# hubs/auth.md (a "hub" - frontmatter + prose, lives wherever you like) anchors: - claim: "refresh rotation is single-use; reuse triggers global logout" at: "src/auth/refresh.ts > rotateRefreshToken" @@ -78,29 +83,39 @@ Quiet on cosmetics, loud on logic. Reformatting, comments, and consistent rename flipped operator, a relaxed comparison, or a dropped `await` does. The full mechanism is in [How the gate works](docs/reference/how-it-works.md). +## Speaks OKF + +A hub is a conformant [Open Knowledge Format](docs/guides/okf.md) concept - Google's vendor-neutral +standard for knowledge as markdown + frontmatter. OKF standardizes how knowledge is written down but +deliberately omits freshness; that's exactly what Surface adds. **Surface = OKF + the freshness OKF +leaves out.** Your hubs drop into any OKF consumer (Knowledge Catalog, the OKF visualizer, Obsidian, +git-backed doc editors), which read the prose and ignore the `anchors:` Surface governs. See +[Surface and OKF](docs/guides/okf.md). + ## Why it matters -**Stale docs make AI agents fail. Surface finds them before they do.** +**Fresh, portable docs measurably improve how well agents perform.** -AI coding agents trust your docs. So when a doc is out of date, the agent confidently does the wrong -thing — even when the real code is right there. We measured it: across multiple models from three -providers, agents working from a stale doc got the task wrong far more often than agents given no doc -at all, and a more capable model was no more resistant. Accurate docs — or just surfacing the drift — -fixed it. +An agent reads your docs on every run and reasons from them as if they were true. We measured what +that's worth: across multiple models from three providers, agents given an accurate doc completed +the task far more often than agents working from a stale one - a stale doc was even worse than no +doc at all, and a more capable model was no more resistant. Just surfacing the drift recovered most +of the gap. OKF makes the docs portable enough to reach every agent; Surface keeps them fresh +enough to trust. That's from a [pre-registered, deterministically-graded benchmark](https://github.com/Connorrmcd6/surface-bench) (3,250 graded completions, multi-turn agents, no LLM judge). It measures drift of exactly the kind `surf check` catches; the author of the benchmark also authors Surface, and a null result on any -hypothesis was reportable — see the [write-up](https://github.com/Connorrmcd6/surface-bench/blob/main/PAPER.md) +hypothesis was reportable - see the [write-up](https://github.com/Connorrmcd6/surface-bench/blob/main/PAPER.md) for the full method, limitations, and data. ## Quickstart ```sh surf init # writes surf.toml + creates hubs/ -surf new auth # creates hubs/auth.md — add a claim and point at: at a symbol +surf new auth # creates hubs/auth.md - add a claim and point at: at a symbol surf lint # does every anchor resolve to exactly one symbol? -surf check # the gate — a new claim is "unverified" until you seal it +surf check # the gate - a new claim is "unverified" until you seal it surf verify # you read the prose and confirmed it; seal the hash ``` @@ -125,7 +140,7 @@ Read this part. It's the difference between a tool you trust and one that burns human should re-read the prose. A green check means "nothing drifted since the last sign-off," not "everything is correct." - **It only watches what you anchored.** A change in a file no hub points at can still invalidate a - documented invariant; Surface won't see it. That's security review and taint analysis — a + documented invariant; Surface won't see it. That's security review and taint analysis - a different discipline. - **It is not a retrieval system.** It doesn't search, embed, or serve context. It optimizes a different thing: *trust* in what you retrieve. @@ -135,22 +150,13 @@ Surface's JSON output. The core never depends on it. More in [What Surface does NOT do](docs/index.md#what-surface-does-not-do) and [Is Surface for you?](docs/index.md#is-surface-for-you). -## Speaks OKF - -A hub is a conformant [Open Knowledge Format](docs/guides/okf.md) concept — Google's vendor-neutral -standard for knowledge as markdown + frontmatter. OKF standardizes how knowledge is written down but -deliberately omits freshness; that's exactly what Surface adds. **Surface = OKF + the freshness OKF -leaves out.** Your hubs drop into any OKF consumer (Knowledge Catalog, the OKF visualizer, Obsidian, -git-backed doc editors), which read the prose and ignore the `anchors:` Surface governs. See -[Surface and OKF](docs/guides/okf.md). - ## Install -Most repos never install the binary — they run the GitHub Action: +Most repos never install the binary - they run the GitHub Action: ```yaml # .github/workflows/surface.yml -- uses: actions/checkout@v4 # plain checkout — do NOT set fetch-depth: 0 +- uses: actions/checkout@v4 # plain checkout - do NOT set fetch-depth: 0 - uses: Connorrmcd6/surface@v0.7.0 ``` @@ -162,7 +168,7 @@ curl --proto '=https' --tlsv1.2 -fsSL https://raw.githubusercontent.com/Connorrm Prebuilt binaries for macOS (Apple Silicon) and Linux (x86_64); build from source on other Unix arches. Windows is **not supported** (the anchor grammar requires forward-slash paths). -Full options — pre-commit hook, `cargo install`, the architecture matrix — in +Full options - pre-commit hook, `cargo install`, the architecture matrix - in [Install](docs/getting-started/install.md). ## Documentation @@ -171,10 +177,10 @@ Full docs at **[surface.gradientdev.xyz](https://surface.gradientdev.xyz)** (sou [`docs/`](docs/index.md)): - [Quickstart](docs/getting-started/quickstart.md) · [Install](docs/getting-started/install.md) -- [Authoring hubs](docs/guides/authoring-hubs.md) — claims, anchor grammar, granularity, the verify loop. -- [Surface and OKF](docs/guides/okf.md) — hubs as conformant Open Knowledge Format concepts. -- [CI integration](docs/guides/ci-integration.md) — the Action, the pre-commit hook, scoping a PR. -- [Examples](docs/examples.md) — a minimal hub in each supported language. +- [Authoring hubs](docs/guides/authoring-hubs.md) - claims, anchor grammar, granularity, the verify loop. +- [Surface and OKF](docs/guides/okf.md) - hubs as conformant Open Knowledge Format concepts. +- [CI integration](docs/guides/ci-integration.md) - the Action, the pre-commit hook, scoping a PR. +- [Examples](docs/examples.md) - a minimal hub in each supported language. - Reference: [Commands](docs/reference/commands.md) · [Configuration](docs/reference/configuration.md) · [How the gate works](docs/reference/how-it-works.md) · [FAQ](docs/reference/faq.md) Release history is in [`CHANGELOG.md`](CHANGELOG.md). AI agents working in this repo: see @@ -182,11 +188,11 @@ Release history is in [`CHANGELOG.md`](CHANGELOG.md). AI agents working in this ## Benchmark -The agent-impact benchmark — measuring how much documentation accuracy changes an agent's task -performance — lives in its own repo: **[Connorrmcd6/surface-bench](https://github.com/Connorrmcd6/surface-bench)**. +The agent-impact benchmark - measuring how much documentation accuracy changes an agent's task +performance - lives in its own repo: **[Connorrmcd6/surface-bench](https://github.com/Connorrmcd6/surface-bench)**. It consumes the `surf` binary's output but has no inbound dependency on this core. (It previously lived under `bench/` here.) --- -The naming isn't decoration: the *gradient* of a field is everywhere perpendicular to its level *surfaces* — the direction of change, and the thing the change is measured against. Surface reports **divergence** between what your docs claim and what your code does. +The naming isn't decoration: the *gradient* of a field is everywhere perpendicular to its level *surfaces* - the direction of change, and the thing the change is measured against. Surface reports **divergence** between what your docs claim and what your code does. diff --git a/docs/examples.md b/docs/examples.md index 4213392..6c618ef 100644 --- a/docs/examples.md +++ b/docs/examples.md @@ -53,13 +53,13 @@ anchors: at: api/client.py > Client > _request ``` -Decorators are transparent for *resolution* — `@retry` above `def _request` doesn't change which -symbol the anchor finds — but they are part of the hashed span, so a decorator swap +Decorators are transparent for *resolution* - `@retry` above `def _request` doesn't change which +symbol the anchor finds - but they are part of the hashed span, so a decorator swap (`@retry` → `@retry_with_jitter`) **fires**. - Change the backoff base or the cap → **fires.** -Non-callables anchor too — module constants, type aliases, and class attributes: +Non-callables anchor too - module constants, type aliases, and class attributes: ```yaml anchors: diff --git a/docs/getting-started/install.md b/docs/getting-started/install.md index 45ad6ef..914f1c0 100644 --- a/docs/getting-started/install.md +++ b/docs/getting-started/install.md @@ -4,7 +4,7 @@ description: Install Surface via npm, the GitHub Action, the pre-commit hook, th --- Surface is a single static binary. The quickest way to get it locally is **npm** (below). In CI, -most repos never install it directly — they consume the **GitHub Action** or the **pre-commit +most repos never install it directly - they consume the **GitHub Action** or the **pre-commit hook**, which fetch the binary for you. There's also a `curl | sh` installer and a from-source build. @@ -16,7 +16,7 @@ npx @gradient-tools/surface check ``` A thin shim package pulls in the prebuilt binary for your platform via `optionalDependencies` -(macOS Apple Silicon and Linux x86_64) — there is no `postinstall` download step. On an +(macOS Apple Silicon and Linux x86_64) - there is no `postinstall` download step. On an unsupported platform the shim errors and points you at the from-source build. ## GitHub Action @@ -30,7 +30,7 @@ jobs: surface: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 # plain checkout — do NOT set fetch-depth: 0 + - uses: actions/checkout@v4 # plain checkout - do NOT set fetch-depth: 0 - uses: Connorrmcd6/surface@v0.7.0 ``` @@ -57,7 +57,7 @@ Prebuilt binaries are published for **macOS (Apple Silicon)** and **Linux (x86_6 macOS or other Unix architectures, build from source. Each release publishes a `.sha256` alongside every binary, and the install script verifies -it before installing — a missing checksum or a mismatch aborts the install rather than running an +it before installing - a missing checksum or a mismatch aborts the install rather than running an unverified binary. ## Platform support diff --git a/docs/getting-started/quickstart.md b/docs/getting-started/quickstart.md index 2926ee1..9faf72e 100644 --- a/docs/getting-started/quickstart.md +++ b/docs/getting-started/quickstart.md @@ -3,7 +3,7 @@ title: Quickstart description: Set up a workspace, anchor a claim to code, and drive the init → new → lint → check → verify loop. --- -Set up the workspace, then scaffold a hub — a markdown file whose frontmatter anchors sentences to +Set up the workspace, then scaffold a hub - a markdown file whose frontmatter anchors sentences to code: ```sh @@ -30,7 +30,7 @@ Then drive the loop: ```sh surf lint # does every anchor resolve to exactly one symbol? -surf check # the gate — a brand-new claim is "unverified" until you seal it +surf check # the gate - a brand-new claim is "unverified" until you seal it ``` ``` @@ -38,7 +38,7 @@ UNVERIFIED hubs/auth.md :: src/auth/refresh.ts > rotateRefreshToken run `surf verify` ``` -You've read the prose and confirmed it's true, so seal it — this writes the hash back into the +You've read the prose and confirmed it's true, so seal it - this writes the hash back into the frontmatter (`verify` only touches that one line): ```sh @@ -58,7 +58,7 @@ surf check: 1 divergence(s). The merge is blocked (non-zero exit) until someone re-reads the sentence. If it still holds, `surf verify` re-seals; if it's now false, fix the prose first. Reformatting, comments, or renaming -a local variable do **not** trip it — only logic does. +a local variable do **not** trip it - only logic does. Machine-readable output for tooling and the optional reviewer plugin: diff --git a/docs/guides/authoring-hubs.md b/docs/guides/authoring-hubs.md index 644e348..a44a6b1 100644 --- a/docs/guides/authoring-hubs.md +++ b/docs/guides/authoring-hubs.md @@ -24,20 +24,20 @@ refs: [] Prose a human (or agent) reads to understand this domain. ``` -- **`claim`** — one sentence stating an invariant. Write what must stay true, not how the code +- **`claim`** - one sentence stating an invariant. Write what must stay true, not how the code is structured. A claim that restates the implementation rots as fast as a comment. -- **`at`** — the anchor: where the claim's logic lives (grammar below). -- **`hash`** — the seal. Absent until you `surf verify`; the gate treats a hashless claim as +- **`at`** - the anchor: where the claim's logic lives (grammar below). +- **`hash`** - the seal. Absent until you `surf verify`; the gate treats a hashless claim as *unverified*. -- **`refs`** — hub composition: paths to *other hubs* this one builds on, written relative to this +- **`refs`** - hub composition: paths to *other hubs* this one builds on, written relative to this hub (`./resolve.md`), optionally `> symbol` to point at one claim within the target (`./resolve.md > resolve_nodes`, matched against that claim's `at:` anchor). `surf lint` blocks a ref that doesn't resolve to a hub, points at this hub, or names a claim the target lacks. The `check` gate also **propagates staleness one hop**: when a hub you `ref` has an open divergence, - this hub fails too (a `referenced_stale` divergence) — review the dependency and re-verify. Only + this hub fails too (a `referenced_stale` divergence) - review the dependency and re-verify. Only *direct* refs propagate; a chain `A → B → C` stops at one hop. -- **`covers`** — advisory file-scope globs; parsed and lint-validated but never affects - `surf check`. Leave it empty unless you have a reason — the feature that consumes it isn't +- **`covers`** - advisory file-scope globs; parsed and lint-validated but never affects + `surf check`. Leave it empty unless you have a reason - the feature that consumes it isn't shipped. Where hubs live is configured by the `hubs` glob in `surf.toml` (default `hubs/*.md`); keep them @@ -47,7 +47,7 @@ central or co-locate them with code (`["**/_hub.md"]`). The most common failure mode is writing a hub like a **claim-log**: one claim per function, each restating what a single symbol does, with a thin heading and no real prose. That's a changelog of -symbols, not a briefing — and it makes the verify loop a rubber-stamp, because nothing connects the +symbols, not a briefing - and it makes the verify loop a rubber-stamp, because nothing connects the claims to a *system*. A good hub is the opposite: **prose first**, documenting a system, with a handful of **coarse @@ -64,15 +64,15 @@ Concretely, a good claim describes *a behavior of the system* and seals every sp depends on: ```yaml -- claim: commission is the only multi-level payout — it walks the referral graph up to three +- claim: commission is the only multi-level payout - it walks the referral graph up to three ancestors, pays REFERRAL_COMMISSION_RATES[tier][level], and skips self-edges at: - backend/referral-commission.service.ts > ReferralCommissionService > buildCommissionRecords - packages/constants/ReferralCommission.ts > REFERRAL_COMMISSION_RATES # one invariant, two sites ``` -Write the prose a reader needs to onboard — the single most important distinction, how the pieces -fit (`## sections`, tables), and a **Boundary** note on what the gate does *not* cover — then anchor +Write the prose a reader needs to onboard - the single most important distinction, how the pieces +fit (`## sections`, tables), and a **Boundary** note on what the gate does *not* cover - then anchor the invariants with as few claims as the behavior allows. `surf lint` nudges the other way when a hub drifts into claim-log shape (see below). @@ -86,10 +86,10 @@ starter hub: surf suggest "src/**/*.ts" # or --format json for tooling ``` -It only suggests — it never writes a file or stamps a hash. The output is a **list of undocumented +It only suggests - it never writes a file or stamps a hash. The output is a **list of undocumented symbols, not a list of claims to write**: it groups the symbols by file and emits a multi-site `at:` skeleton so the default shape steers you toward coarse, consolidated claims. Paste it into a -hub (or `surf new `), then **group related symbols into a few system-level claims** — write +hub (or `surf new `), then **group related symbols into a few system-level claims** - write real prose, list the sites each behavior spans under one `at:`, and delete what you don't need before `surf verify`. Treat it as a checklist of undocumented surface, not a mandate to write one claim per symbol (see [a hub is an onboarding doc](#a-hub-is-an-onboarding-doc) and granularity below). @@ -105,7 +105,7 @@ src/service.ts > TokenService > rotate - **One segment** points at a top-level symbol: `src/m.rs > parse_anchor`. - **Nested segments** walk into scopes: a type and its `impl`/methods share a name, so `Type` alone may be ambiguous while `Type > method` is unique. Methods are addressed with - `>` segments, **not** a dot — write `TokenService > rotate`, not `TokenService.rotate` + `>` segments, **not** a dot - write `TokenService > rotate`, not `TokenService.rotate` (the same applies to a TS/JS class method like `EffectiveTierService > getForUsers`). - **Non-callables** anchor too, not just functions: in Python, module constants, type aliases (`X = Literal[...]`, `type X = ...`), and class attributes (`Class > attr`); in Rust/Go, @@ -115,12 +115,12 @@ src/service.ts > TokenService > rotate `src/api.ts > handler@2`. Python `@overload` sets are the exception: consecutive stubs plus their implementation resolve as *one* symbol, so the bare name works and the hash covers every signature. -- **Multiple sites (the default for a system claim)** — a real invariant usually lives in more +- **Multiple sites (the default for a system claim)** - a real invariant usually lives in more than one place. An `at:` list combines its sites into one hash, so the claim is stale if *any* listed span changes. Reach for this **first**: one coarse claim sealing a behavior across the - 2–3 places it lives is the shape of a good hub — not one claim per symbol. + 2–3 places it lives is the shape of a good hub - not one claim per symbol. ```yaml - - claim: a refresh token is accepted at most once — rotation issues a new one and the old is + - claim: a refresh token is accepted at most once - rotation issues a new one and the old is rejected everywhere it's checked at: - src/auth/refresh.ts > rotateRefreshToken @@ -128,7 +128,7 @@ src/service.ts > TokenService > rotate ``` Run `surf lint` to confirm every anchor resolves to exactly one symbol. Ambiguous or vanished -anchors **block**; a symbol that was merely renamed — or a file that git reports has moved — only +anchors **block**; a symbol that was merely renamed - or a file that git reports has moved - only **warns** and points you at `surf verify --follow`. ## Choosing granularity @@ -137,25 +137,25 @@ This is the central tension (proposal §8): - **Under-anchor** → real drift slips through, because the changed logic wasn't anchored. - **Over-anchor** → every incidental edit re-triggers verification, and humans start - rubber-stamping `verify` without reading — which defeats the tool. + rubber-stamping `verify` without reading - which defeats the tool. `surf lint` emits advisory warnings (never blocking) to nudge you toward the middle: -- **Near-whole-file span** — the anchored symbol covers most of its file. Anchor a narrower +- **Near-whole-file span** - the anchored symbol covers most of its file. Anchor a narrower symbol so unrelated edits don't trip the claim. -- **Too many anchors in one hub** — split the hub; a long verify list invites rubber-stamping. -- **Uncovered public function** — a public function in a file the hub already anchors has no +- **Too many anchors in one hub** - split the hub; a long verify list invites rubber-stamping. +- **Uncovered public function** - a public function in a file the hub already anchors has no claim. Either add one, or accept it as intentionally undocumented. -- **Claim-log shape** — a hub with several claims that *never* use a multi-site `at:` reads as one +- **Claim-log shape** - a hub with several claims that *never* use a multi-site `at:` reads as one claim per symbol. Consolidate related claims into fewer coarse ones (see [a hub is an onboarding doc](#a-hub-is-an-onboarding-doc)). -- **Thin prose** — a multi-claim hub whose body is a stub. A hub is an onboarding doc; add prose +- **Thin prose** - a multi-claim hub whose body is a stub. A hub is an onboarding doc; add prose that frames the system, not just claims that anchor its symbols. Rule of thumb: anchor the **smallest symbol whose logic the sentence is actually about.** If a claim sits on a large symbol where user-facing copy changes often, set `ignore_literals: true` -on it — string-literal *content* is then excluded from its hash, so a copy tweak no longer +on it - string-literal *content* is then excluded from its hash, so a copy tweak no longer re-opens the claim while logic edits (operators, numbers, structure) still do. Prefer a narrower anchor first; reach for `ignore_literals` when the span genuinely must stay coarse. @@ -180,38 +180,38 @@ surf verify --follow # renamed symbol OR moved file: re-point the anc ``` Verifying without reading is the failure mode the whole tool exists to prevent. A green gate -promises only "nothing anchored changed since last sign-off" — never that the prose is true. +promises only "nothing anchored changed since last sign-off" - never that the prose is true. ## Where claims can live -A hub isn't a special file *type* — it's **any file the `hubs` glob matches that parses as a hub** +A hub isn't a special file *type* - it's **any file the `hubs` glob matches that parses as a hub** (a `---`-fenced `anchors:` frontmatter block + a markdown body). Claims don't have to live under `hubs/`: add any file to the glob in `surf.toml`, give it the frontmatter, and `surf` treats it like any other hub. The same is true for `AGENTS.md` or `CLAUDE.md`. -The common question is `AGENTS.md` — the *imperative* operating instructions for coding agents, +The common question is `AGENTS.md` - the *imperative* operating instructions for coding agents, versus hubs, which are *declarative* domain briefings. There are two approaches, and **a central `hubs/` directory is the recommended default.** -### Recommended — keep hubs and `AGENTS.md` separate +### Recommended - keep hubs and `AGENTS.md` separate Keep the two concerns apart: don't copy hub prose into `AGENTS.md`. Instead, give `AGENTS.md` a pointer block that sends agents to the hubs directory to search for what they need: ```markdown -Context lives in [`hubs/`](./hubs/) — read only the hub(s) you need. +Context lives in [`hubs/`](./hubs/) - read only the hub(s) you need. ``` When that block is present, `surf lint` checks it links the configured hubs directory and that the -directory exists. It deliberately does **not** enumerate individual hubs — that would push an agent +directory exists. It deliberately does **not** enumerate individual hubs - that would push an agent to read everything instead of the one hub it needs. This keeps `AGENTS.md` lean, stops the declarative and imperative content from drifting into each other, and keeps verification metadata out of the file agents read as instructions. -### Alternative — fold claims into `AGENTS.md` / `CLAUDE.md` +### Alternative - fold claims into `AGENTS.md` / `CLAUDE.md` If you'd rather keep the instructions and the verified claims about them in one file, add it to the glob and give it hub frontmatter: @@ -238,21 +238,21 @@ anchors: `AGENTS.md` as anchoring into it. Three things to weigh: - **The whole file must parse as a hub.** The `---` frontmatter has to be the top block and unknown - fields are rejected — you can't sprinkle a claim mid-document. + fields are rejected - you can't sprinkle a claim mid-document. - **The frontmatter is part of what the agent reads.** Most agent runners load `AGENTS.md` / `CLAUDE.md` as raw text and don't strip YAML frontmatter, so the `anchors:` block lands in the agent's context as a little extra noise. It's small and structured, but if you want `AGENTS.md` to stay purely instructions, prefer the recommended approach above. - **It couples the file to code structure.** Renaming an anchored symbol trips the gate on - `AGENTS.md` — that's the point of Surface, but it means an agent-docs file now participates in CI. + `AGENTS.md` - that's the point of Surface, but it means an agent-docs file now participates in CI. ### When *not* to anchor a file -"Any file can be a hub" doesn't mean every file should be. Pitch and marketing prose — a -`README.md` especially — is a poor fit: +"Any file can be a hub" doesn't mean every file should be. Pitch and marketing prose - a +`README.md` especially - is a poor fit: - **The claims are coarse.** A README describes behavior in broad strokes that span many symbols, - so an anchor either covers a near-whole-file span or trips on incidental edits — the + so an anchor either covers a near-whole-file span or trips on incidental edits - the over-anchoring trap from [Choosing granularity](#choosing-granularity). - **It usually duplicates hub prose.** The invariants a README restates should already be claimed in the hubs anchored to the real code; anchoring the README just gives you a second copy to keep diff --git a/docs/guides/ci-integration.md b/docs/guides/ci-integration.md index 1baa1d9..a39631f 100644 --- a/docs/guides/ci-integration.md +++ b/docs/guides/ci-integration.md @@ -4,7 +4,7 @@ description: Run the gate in CI via the GitHub Action or the pre-commit hook, th --- `surf check` is the gate: it exits non-zero when an anchored span diverged, so it blocks a merge -the same way a failing test does. Most repos never install the binary — they run the Action or +the same way a failing test does. Most repos never install the binary - they run the Action or the pre-commit hook. ## GitHub Action @@ -18,7 +18,7 @@ jobs: surface: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 # plain checkout — do NOT set fetch-depth: 0 + - uses: actions/checkout@v4 # plain checkout - do NOT set fetch-depth: 0 - uses: Connorrmcd6/surface@v0.7.0 ``` @@ -29,12 +29,12 @@ bot, set `args: check --format json`. ### Checkout depth The verdict hashes your **working tree** and compares it to the hash committed in the -frontmatter — it does not need git history, so a plain `actions/checkout@v4` is enough. **Do not +frontmatter - it does not need git history, so a plain `actions/checkout@v4` is enough. **Do not set `fetch-depth: 0`.** The advisory `old_code`/`magnitude` fields use a single `git show` of the base ref; with no history available the verdict is unchanged and those fields are simply omitted. The one exception: if you diff-scope with `--base ` (below), fetch enough history to reach -the merge base — a shallow `git fetch ` is plenty, still not `fetch-depth: 0`. +the merge base - a shallow `git fetch ` is plenty, still not `fetch-depth: 0`. ## pre-commit @@ -54,10 +54,10 @@ This runs the same gate locally at commit time, catching drift before it reaches By default `check` evaluates every claim in every hub. On large repos or big PRs you can narrow it: -- **`--base `** — evaluate only claims whose anchored files changed since the merge base +- **`--base `** - evaluate only claims whose anchored files changed since the merge base with `` (e.g. `surf check --base origin/main`). This also recovers the advisory `old_code`/`magnitude` from that ref. -- **`--files `** — evaluate only claims whose anchored file(s) match a comma-separated +- **`--files `** - evaluate only claims whose anchored file(s) match a comma-separated glob (e.g. `surf check --files "src/auth/**"`). Both filters intersect when combined. With neither flag, the full check runs (enrichment against diff --git a/docs/guides/okf.md b/docs/guides/okf.md index 551d213..9440405 100644 --- a/docs/guides/okf.md +++ b/docs/guides/okf.md @@ -11,11 +11,13 @@ be consumed by another without translation. OKF is deliberately minimal. Its spec requires exactly one frontmatter field (`type`), recommends a few more (`title`, `description`, `resource`, `tags`, `timestamp`), and mandates that consumers **preserve unknown keys** and **never reject** a document for extra fields, unknown types, or broken -links. And it explicitly leaves several things undefined — most importantly, **freshness**: OKF has +links. And it explicitly leaves several things undefined - most importantly, **freshness**: OKF has no notion of "has the thing this describes changed?", no verification status, no anchoring to code, no drift detection. -That gap is exactly what Surface fills. +That gap is exactly what Surface fills. The payoff lands hardest with agents, which read your docs +on every run: OKF makes the docs portable enough to reach every agent, Surface keeps them fresh +enough to trust, and portable-plus-fresh docs measurably improve how well those agents perform. ## Surface = OKF + freshness @@ -31,7 +33,7 @@ title: Orders # OKF: display name description: One row per order # OKF: preserved (Surface keeps it in `extra`) tags: [sales, revenue] # OKF timestamp: 2026-05-28T14:30:00Z # OKF: last *modified* -anchors: # Surface extension — OKF readers ignore it +anchors: # Surface extension - OKF readers ignore it - claim: an order is immutable once `status = shipped` at: src/orders/model.ts > Order > freeze hash: 2:9b1c33ade8f1 @@ -45,36 +47,36 @@ anchors: # Surface extension — OKF readers ignore it Prose a human or agent reads to understand this table… ``` -- **An OKF consumer** — Google's Knowledge Catalog, the OKF visualizer, [Nansidian](#doc-systems), - Obsidian — reads this as a normal concept and silently ignores `anchors`. +- **An OKF consumer** - Google's Knowledge Catalog, the OKF visualizer, [Nansidian](#doc-systems), + Obsidian - reads this as a normal concept and silently ignores `anchors`. - **Surface** reads `anchors` and governs freshness: when `src/orders/model.ts > Order > freeze` changes, [`surf check`](../reference/commands.md) fails until a human re-confirms the claim. - The two timestamps are different on purpose: OKF's `timestamp` is *last modified*; Surface's - per-claim `verified_at` is *last attested against the code* — the freshness OKF can't express. + per-claim `verified_at` is *last attested against the code* - the freshness OKF can't express. ## What conformance means here A hub is a conformant OKF concept when it carries a `type`. Surface makes this cheap: - `surf new` scaffolds hubs with `type: concept` already set. -- Hubs written before OKF (no `type`) still parse — Surface treats a missing `type` as `concept` +- Hubs written before OKF (no `type`) still parse - Surface treats a missing `type` as `concept` in memory. They are byte-unchanged on disk; run a future `surf migrate` (or add `type:` by hand) to make them OKF-conformant *on disk*. - Any extra frontmatter key (OKF's `description`/`resource`, a doc system's `author`/`created`/ - `pinned`) is preserved verbatim on round-trip — Surface never drops what it doesn't recognize. + `pinned`) is preserved verbatim on round-trip - Surface never drops what it doesn't recognize. The one boundary worth stating plainly: > **Surface only fact-checks concepts that describe code.** A concept anchored to a code symbol is > governed. A concept with no `anchors` (a BigQuery table's business meaning, an RFC, a playbook) -> is a valid, rendered, **ungoverned** OKF concept — it passes the gate untouched. Verifying +> is a valid, rendered, **ungoverned** OKF concept - it passes the gate untouched. Verifying > non-code resources (e.g. a table's schema against the live warehouse) is future work; today the > deterministic gate is scoped to code. ## Bundles OKF ships knowledge as a **bundle**: a directory tree where each file is a concept, the path is its -identity, and two filenames are reserved — `index.md` (a directory listing for progressive +identity, and two filenames are reserved - `index.md` (a directory listing for progressive disclosure) and `log.md` (a change history). Point Surface at a bundle with `bundles` in `surf.toml`: @@ -86,19 +88,19 @@ bundles = ["knowledge/sales"] # expands to knowledge/sales/**/*.md Reserved files are recognized and skipped for governance (they hold no claims), so a bundle's `index.md`/`log.md` never trip the gate. `surf lint` additionally checks OKF cross-links in a hub's -body and **warns** (never blocks — OKF tolerates broken links) on a dangling `.md` target. +body and **warns** (never blocks - OKF tolerates broken links) on a dangling `.md` target. ## Doc systems Because an OKF bundle is just markdown in a directory, it drops into any doc system that reads -markdown — [Obsidian](https://obsidian.md/) vaults, Notion imports, and git-backed editors. The +markdown - [Obsidian](https://obsidian.md/) vaults, Notion imports, and git-backed editors. The intended integration is a **CI gate on the git repos those systems sync**: run [`surf check`](../guides/ci-integration.md) as a GitHub Action over the doc repo, so the freshness gate guards the code-anchored subset of the knowledge base wherever the docs are edited. ## See also -- [Authoring hubs](./authoring-hubs.md) — a hub is an onboarding doc, not a claim-log (the same +- [Authoring hubs](./authoring-hubs.md) - a hub is an onboarding doc, not a claim-log (the same shape OKF's prose-first concepts encourage). -- [Configuration](../reference/configuration.md) — the `bundles` glob and frontmatter fields. -- [How the gate works](../reference/how-it-works.md) — what Surface hashes and why. +- [Configuration](../reference/configuration.md) - the `bundles` glob and frontmatter fields. +- [How the gate works](../reference/how-it-works.md) - what Surface hashes and why. diff --git a/docs/guides/stats.md b/docs/guides/stats.md index b2b4299..60d45b3 100644 --- a/docs/guides/stats.md +++ b/docs/guides/stats.md @@ -7,7 +7,7 @@ description: How surf stats computes the rubber-stamp and in-place update rates, `surf stats` answers the two falsifiable questions from the proposal's success/kill criteria (§9.2): *is the gate being routed around, and do docs travel with the code?* Both are computed -deterministically from git history — no model, no network. +deterministically from git history - no model, no network. ```sh surf stats # all history @@ -17,12 +17,12 @@ surf stats --since 2026-01-01 --until 2026-04-01 --format json ## The two rates -**Rubber-stamp rate** — of *re-stamp events* (a commit changed a claim's stored `hash:` to a new +**Rubber-stamp rate** - of *re-stamp events* (a commit changed a claim's stored `hash:` to a new value), the share where the claim's **prose was left untouched**. Re-sealing without re-reading is the signal that distinguishes a working gate from one being clicked through. A rising rate is a kill signal. -**In-place update rate** — of *claim-touch events* (a commit changed a file a claim anchors), the +**In-place update rate** - of *claim-touch events* (a commit changed a file a claim anchors), the share where the claim's stored hash was **updated in the same commit**. Docs that move with the code score high; drift scores low. @@ -48,7 +48,7 @@ These are heuristics, surfaced rather than hidden: a merge-commit workflow attributes work to the individual commits instead. - **A claim's identity is its `at:` site(s).** Re-pointing a claim to a different anchor reads as a new claim, not an update of the old one. -- **The in-place denominator counts any change to an anchored file** — including comment or +- **The in-place denominator counts any change to an anchored file** - including comment or formatting edits that wouldn't actually diverge the claim. So the *true* in-place rate is at least the reported one; the number is a floor, not a point estimate. - **History must be reachable.** On a shallow clone or outside a repo, `stats` errors (non-zero) diff --git a/docs/index.md b/docs/index.md index 82ad265..38fdb2d 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,34 +1,37 @@ --- title: What is Surface? -description: Surface governs documentation like code. Anchor a sentence to the code it describes; when that code's logic changes, the build fails until a human re-confirms the sentence. +description: Portable, always-fresh docs for agents and humans. OKF makes your docs portable and accessible; Surface governs their freshness by anchoring sentences to code and blocking the build when that code's logic drifts. --- -**Documentation, governed like code.** +**Portable, always-fresh docs for agents and humans.** -You anchor a sentence to the code it describes. When that code's logic changes, `surf check` -fails the build until a human re-confirms the sentence still holds — the same way a broken test -blocks a merge. Deterministic: no model, no network, no API key in the core. +Agents read your docs on every run, and they can't tell a current doc from a rotted one. Two open +layers fix that: [OKF](./guides/okf.md) (Google's vendor-neutral format) makes docs portable and +accessible, and Surface governs their freshness - you anchor a sentence to the code it describes, +and `surf check` fails the build until a human re-confirms it whenever that code's logic changes. +Documentation, governed like code. Portable *and* fresh docs measurably improve how well agents +perform. Deterministic: no model, no network, no API key in the core. > **Docs source of truth.** These pages (the repo's `docs/` tree) are canonical. The docs site at -> [surface.gradientdev.xyz](https://surface.gradientdev.xyz) is generated *from* them — edit docs +> [surface.gradientdev.xyz](https://surface.gradientdev.xyz) is generated *from* them - edit docs > here, not on the site. ## The problem -You write a context file for your codebase — an architecture note, an `AGENTS.md`, a hub for the +You write a context file for your codebase - an architecture note, an `AGENTS.md`, a hub for the auth flow. The day you write it, it's accurate. Then the code moves. Someone refactors the function you described; the behavior changes on -purpose, the tests get updated to match, CI goes green, the PR merges. Everything is correct — +purpose, the tests get updated to match, CI goes green, the PR merges. Everything is correct - except the paragraph that *described* that function. Nobody touched it, for two ordinary reasons: they didn't know it existed, and there was no standard place to look for it. It now says something that is no longer true. -Nothing failed. Nothing fired. The only thing that broke is the explanation the next engineer — -and every agent on every run — will trust and reason from. A codebase can be fully green on tests -and full of confident, well-written, completely false documentation. That second failure quietly -poisons everyone who reads it, and nothing in your toolchain catches it. +Nothing failed. Nothing fired. The only thing that broke is the explanation the next engineer - +and every agent on every run - will trust and reason from. A codebase can be fully green on tests +while its docs quietly describe code that no longer works the way they say, and nothing in your +toolchain catches the gap. Surface closes that gap two ways: **`hubs/`** gives documentation a standard home so people and agents actually find it, and **`surf check`** governs the prose like a test so it can't silently @@ -45,8 +48,8 @@ apart at exactly the moment someone updates one and forgets the other. | **code correct** | fine | **← nothing else catches this** | | **code broken** | your tests catch this | both might fire | -The bottom-left cell is what tests are for. The top-right cell — code that works fine but no -longer does what your docs claim — has no owner. You can't write a unit test for "the README still +The bottom-left cell is what tests are for. The top-right cell - code that works fine but no +longer does what your docs claim - has no owner. You can't write a unit test for "the README still describes this accurately," because the thing that drifted is human-language understanding, and tests don't speak that language. Surface owns that cell. @@ -55,7 +58,7 @@ tests don't speak that language. Surface owns that cell. You anchor a sentence to the code it's about: ```yaml -# auth/_hub.md (a "hub" — frontmatter + prose, lives next to the code it describes) +# auth/_hub.md (a "hub" - frontmatter + prose, lives next to the code it describes) anchors: - claim: "refresh rotation is single-use; reuse triggers global logout" at: "src/auth/refresh.ts > rotateRefreshToken" @@ -71,7 +74,17 @@ stored from the last time a human confirmed the sentence was true. precise report: which hub, which claim, old code vs. new code. It's a tamper-evident seal on the logic of exactly the code your docs claim to describe. Quiet on -cosmetics, loud on logic — see [How the gate works](./reference/how-it-works.md). +cosmetics, loud on logic - see [How the gate works](./reference/how-it-works.md). + +## Interoperable: Surface speaks OKF + +A hub is a **conformant [Open Knowledge Format](./guides/okf.md) concept** - Google's vendor-neutral +standard for knowledge as markdown + frontmatter. OKF standardizes *how knowledge is written down* +but deliberately omits **freshness**: it has no notion of whether the thing a document describes has +changed. That omission is precisely what Surface is. So the relationship is clean: **Surface = OKF + +the freshness OKF leaves out.** Your hubs drop into any OKF consumer (Google's Knowledge Catalog, +the OKF visualizer, Obsidian, git-backed doc editors), which read the knowledge and ignore the +`anchors:` Surface governs. See [Surface and OKF](./guides/okf.md). ## What Surface does NOT do @@ -83,7 +96,7 @@ Read this part. It's the difference between a tool you trust and one that burns meaning isn't mechanically decidable. - **It only watches what you anchored.** If a change in a file no hub points at quietly invalidates a documented invariant, Surface will not see it. Catching that is security review and taint - analysis — a different discipline. Surface guards the spans you chose to describe, nothing more. + analysis - a different discipline. Surface guards the spans you chose to describe, nothing more. - **It is not a retrieval system.** It doesn't search, embed, or serve context. There are good tools for that. Surface optimizes a different thing: *trust* in what you retrieve. @@ -91,28 +104,18 @@ If you want the fuzzy "is this claim still true" judgment, that lives in an **op plugin that reads Surface's JSON output. The core never depends on it. Pull every plugin out and the gate blocks and passes exactly the same. -## Interoperable: Surface speaks OKF - -A hub is a **conformant [Open Knowledge Format](./guides/okf.md) concept** — Google's vendor-neutral -standard for knowledge as markdown + frontmatter. OKF standardizes *how knowledge is written down* -but deliberately omits **freshness**: it has no notion of whether the thing a document describes has -changed. That omission is precisely what Surface is. So the relationship is clean: **Surface = OKF + -the freshness OKF leaves out.** Your hubs drop into any OKF consumer (Google's Knowledge Catalog, -the OKF visualizer, Obsidian, git-backed doc editors), which read the knowledge and ignore the -`anchors:` Surface governs. See [Surface and OKF](./guides/okf.md). - ## Is Surface for you? Honestly? Maybe not. Roughly, it earns its keep when > **codebase complexity × change velocity × (humans + agents) reading it** -is high. A small, slow, simple codebase doesn't need this — your team can just read the code, and +is high. A small, slow, simple codebase doesn't need this - your team can just read the code, and two well-kept markdown files beat the whole apparatus. Use Surface where rebuilding the mental model from source is genuinely expensive and the code moves fast enough to drift. One thing pushes the math toward "yes": **AI agents.** A human onboards onto a domain once and -amortizes the cost over months. An agent re-onboards every session and amortizes nothing — it's a +amortizes the cost over months. An agent re-onboards every session and amortizes nothing - it's a new hire on its first day, every day, paying the full cost of wrong context on every invocation. A bigger context window doesn't fix this; it lets an agent read every line and still confidently derive a *wrong* model, because it can't tell a deliberate invariant from incidental code. If your diff --git a/docs/phases/00-toolchain-scaffold.md b/docs/phases/00-toolchain-scaffold.md index f8ab9d2..b52c09b 100644 --- a/docs/phases/00-toolchain-scaffold.md +++ b/docs/phases/00-toolchain-scaffold.md @@ -1,10 +1,10 @@ -# Phase 0 — Toolchain & workspace scaffold +# Phase 0 - Toolchain & workspace scaffold **Goal:** a compiling, CI-checked empty workspace and a working Rust toolchain. **Proposal refs:** §10 (language/layout), §9.1.5 (config discovery marker). -**Depends on:** — (start here) +**Depends on:** - (start here) **Status:** done @@ -24,7 +24,7 @@ - `surf-core`: library crate, **no I/O deps** (tree-sitter added in Phase 1). - `surf-cli`: binary crate, binary name `surf`, `path` dep on `surf-core`. Deps: `clap` (derive), `anyhow`, `serde` + `serde_json` (JSON report), `serde_yaml` - (frontmatter — added when Phase 3 lands, fine to add now). + (frontmatter - added when Phase 3 lands, fine to add now). 4. **CLI skeleton:** `clap` parser with subcommands `lint`, `check`, `verify`, each stubbed to print "not implemented" and exit non-zero; `--version` wired to the crate version. The top-level `--help` should already carry the §7 scope disclaimer (gate checks named diff --git a/docs/phases/01-anchor-resolution.md b/docs/phases/01-anchor-resolution.md index f74db21..9e8bdbe 100644 --- a/docs/phases/01-anchor-resolution.md +++ b/docs/phases/01-anchor-resolution.md @@ -1,8 +1,8 @@ -# Phase 1 — Anchor resolution via tree-sitter (the load-bearing primitive) +# Phase 1 - Anchor resolution via tree-sitter (the load-bearing primitive) **Goal:** given source text + an `at:` anchor string, return the **exact node span** of the named symbol, deterministically, for TypeScript and Rust. Build it standalone against -fixtures — no markdown, no CLI surface yet. +fixtures - no markdown, no CLI surface yet. **Proposal refs:** §6.1 (AST-via-tree-sitter, why not ctags/LSP/formatter), §6.3 (anchor grammar). @@ -18,17 +18,17 @@ fixtures — no markdown, no CLI surface yet. ## Why this is the riskiest phase The entire value of the tool is firing on the *right* change (§6). Reliable polyglot spans must come from a parser, not ctags (line, not span) or an LSP (reintroduces a CI -dependency). Grammars are compiled **into the binary** and version-pinned — this is the +dependency). Grammars are compiled **into the binary** and version-pinned - this is the reproducibility root (§6.1). Get spans wrong and every downstream phase inherits it. ## Steps 1. **Bundle grammars** in `surf-core`: `tree-sitter`, `tree-sitter-typescript`, `tree-sitter-rust`. **Pin exact versions** and record them (a grammar bump re-hashes - everything — §11.4). No language server, no formatter, nothing but the binary at runtime. + everything - §11.4). No language server, no formatter, nothing but the binary at runtime. 2. **Language detection** by extension: `.ts`/`.tsx` → TypeScript, `.rs` → Rust. Model as an extensible enum so more grammars are additive. -3. **Anchor grammar parser** (§6.3) — pure string parsing, unit-tested in isolation: +3. **Anchor grammar parser** (§6.3) - pure string parsing, unit-tested in isolation: - Qualified path: `src/auth/refresh.ts > TokenService > rotate` (resolve through the symbol tree). - Positional fallback: `... > rotate @2` for the Nth same-named sibling when names genuinely collide. 4. **Symbol-tree resolver:** walk the parse tree, match the qualified path through nesting @@ -36,7 +36,7 @@ reproducibility root (§6.1). Get spans wrong and every downstream phase inherit byte + line span. Distinguish three outcomes the higher layers depend on: - exactly one match → `Ok(span)`. - zero → `NotFound`. - - multiple with no `@N` → `Ambiguous` (lint must reject — §6.3). + - multiple with no `@N` → `Ambiguous` (lint must reject - §6.3). ## Files touched - `surf-core/Cargo.toml` (grammar deps) diff --git a/docs/phases/02-canonical-hashing.md b/docs/phases/02-canonical-hashing.md index 95fedf1..915eb6b 100644 --- a/docs/phases/02-canonical-hashing.md +++ b/docs/phases/02-canonical-hashing.md @@ -1,7 +1,7 @@ -# Phase 2 — AST-canonical hashing + advisory magnitude +# Phase 2 - AST-canonical hashing + advisory magnitude **Goal:** given a resolved node (Phase 1), produce a canonical hash that is **quiet on -rename / reformat / comments** and **loud on a flipped operator** (§6.1, table row 4) — the +rename / reformat / comments** and **loud on a flipped operator** (§6.1, table row 4) - the exact sensitivity profile a correctness gate needs. **Proposal refs:** §6.1 (canonical AST hash), §6.2 (why not similarity; magnitude is advisory only). @@ -14,7 +14,7 @@ exact sensitivity profile a correctness gate needs. > (not in tree), identifiers **alpha-renamed** to positional placeholders (`#0`, `#1`, …) so > a consistent rename is quiet but a *swap* of two names is loud; operators, keywords, > punctuation, and literal values kept verbatim. Hash = SHA-256 truncated to 12 hex chars. -> `Magnitude` (Small/Medium/Large via token Levenshtein) is advisory only — there is no code +> `Magnitude` (Small/Medium/Large via token Levenshtein) is advisory only - there is no code > path from it to a verdict (§6.2). `resolve.rs` refactored to expose `parse_tree` / > `resolve_node` so the hasher reuses resolution. @@ -25,14 +25,14 @@ exact sensitivity profile a correctness gate needs. comments, and trivia are not in the tree, so they fall out for free. Hash the stream with a stable algorithm (SHA-256), surface a short hex (e.g. `9b1c33a`, matching the §6 example). -2. **Rename-quietness:** normalize identifier *positions*, not names (§6.1) — replace +2. **Rename-quietness:** normalize identifier *positions*, not names (§6.1) - replace identifiers with positional placeholders so a pure rename yields the **same** hash, while an operator or structural change does not. This is what keeps the gate from firing on the commonest refactor. 3. **Advisory tree-edit magnitude** (§6.2): a cheap structural diff between two subtrees → a small integer or category (`small` / `rename-shaped` / `large`). Ships in the JSON report only. **It never gates.** The forbidden rule, explicitly: never "fail only if hash - changed *and* magnitude > threshold" — that would hide the single-operator logic flip, + changed *and* magnitude > threshold" - that would hide the single-operator logic flip, the highest-value catch. ## Files touched @@ -44,4 +44,4 @@ Golden tests, both languages: - (a) reformat / whitespace / comment-only change → **same** hash. - (b) pure rename → **same** hash. - (c) `+`→`-`, `<`→`<=`, deleted `await` → **different** hash. -- magnitude is populated and plausible (rename → `rename-shaped`/small; large rewrite → `large`) but is asserted to be *reporting only* — no code path lets it affect pass/fail. +- magnitude is populated and plausible (rename → `rename-shaped`/small; large rewrite → `large`) but is asserted to be *reporting only* - no code path lets it affect pass/fail. diff --git a/docs/phases/03-hub-format.md b/docs/phases/03-hub-format.md index 9655232..dd9e879 100644 --- a/docs/phases/03-hub-format.md +++ b/docs/phases/03-hub-format.md @@ -1,11 +1,11 @@ -# Phase 3 — Hub format + frontmatter parser (the contract) +# Phase 3 - Hub format + frontmatter parser (the contract) -**Goal:** define and parse the hub document — frontmatter schema + prose body — plus +**Goal:** define and parse the hub document - frontmatter schema + prose body - plus `surf.toml` config discovery. This is the contract everything else binds to. **Proposal refs:** §6 (claim shape), §6.3 (`at:` as list), §9.1.1 (hub format), §9.1.5 (config marker), §9.3 (`refs` deferred), §9.1 (`covers` absent). -**Depends on:** Phase 0. (Independent of 1/2 — can proceed in parallel.) +**Depends on:** Phase 0. (Independent of 1/2 - can proceed in parallel.) **Status:** done @@ -22,14 +22,14 @@ 1. **Schema** (serde structs): - `summary`: string. - `anchors`: list of `{ claim: string, at: string | [string], hash: string }`. - `at:` accepts a scalar **or a list** — a claim is stale if *any* listed span changes (§6.3). - - `refs`: parsed but **inert** in the MVP — forward-declared only (§9.3). Do not resolve it. + `at:` accepts a scalar **or a list** - a claim is stale if *any* listed span changes (§6.3). + - `refs`: parsed but **inert** in the MVP - forward-declared only (§9.3). Do not resolve it. - **`covers` is absent** (§9.1): it is consumed only by the deferred reviewer plugin, so asking authors to write it now is ceremony. Forward-declared in the proposal, not shipped. - Body: the markdown prose after the frontmatter fence. 2. **Frontmatter split:** leading `---`-fenced YAML block + remaining prose body. Tolerant, clear errors on a missing/malformed fence. -3. **Typed validation errors** surfaced as results `lint` (Phase 4) can render — not panics. +3. **Typed validation errors** surfaced as results `lint` (Phase 4) can render - not panics. 4. **Config model + discovery:** `surf.toml` walked up from `cwd` to the nearest marker, like `git` / `ruff` (§9.1.5). Config holds the hub glob(s); default `hubs/*.md`. diff --git a/docs/phases/04-surf-lint.md b/docs/phases/04-surf-lint.md index ed3de57..85e10d5 100644 --- a/docs/phases/04-surf-lint.md +++ b/docs/phases/04-surf-lint.md @@ -1,9 +1,9 @@ -# Phase 4 — `surf lint` +# Phase 4 - `surf lint` **Goal:** frontmatter is well-formed; every `at:` resolves to **exactly one** node; renames are handled per §6.4. Composes Phases 1–3. -**Proposal refs:** §9.1.2 (lint), §6.3 (exactly-one resolution), §6.4 (renames are an MVP problem), §11.3 (granularity guidance — minimal here). +**Proposal refs:** §9.1.2 (lint), §6.3 (exactly-one resolution), §6.4 (renames are an MVP problem), §11.3 (granularity guidance - minimal here). **Depends on:** Phases 1, 3. @@ -13,7 +13,7 @@ renames are handled per §6.4. Composes Phases 1–3. > (Block/Warn); `run` prints them and sets the exit code (block → failure, warn-only → 0). > **Rename detection is hash-based, not git-based** (deviation from the doc's git+similarity > sketch): `surf-core/src/rename.rs::find_renamed` walks every definition and matches the -> claim's stored hash — because the canonical hash alpha-renames identifiers, a renamed-but- +> claim's stored hash - because the canonical hash alpha-renames identifiers, a renamed-but- > unchanged symbol still matches. Deterministic, no network, no git. A *file* rename makes > the path unreadable and surfaces as a Block ("cannot read … (file moved or removed?)"). > Needs `collect_all_defs` (resolve.rs) + `hash_node` (hash.rs). @@ -24,7 +24,7 @@ renames are handled per §6.4. Composes Phases 1–3. 2. For each anchor, resolve via Phase 1: - `Ambiguous` → error, tell the author to add `@N`. - `NotFound` → see rename handling below. -3. **Rename handling (§6.4)** — hits the MVP constantly, not deferrable: +3. **Rename handling (§6.4)** - hits the MVP constantly, not deferrable: - **Renamed but clearly present** (git rename detection + high AST similarity of the moved node via the Phase 2 magnitude) → **warn**, not block; suggest `surf verify --follow` to re-point and re-hash in one step. @@ -35,7 +35,7 @@ renames are handled per §6.4. Composes Phases 1–3. ## Files touched - `surf-cli/src/lint.rs` - `surf-cli/src/main.rs` (wire subcommand) -- git rename detection helper (shell out to `git` or a plumbing call) — likely `surf-core` or a cli helper +- git rename detection helper (shell out to `git` or a plumbing call) - likely `surf-core` or a cli helper ## Verify Fixture hub + fixture source covering each case: diff --git a/docs/phases/05-surf-check.md b/docs/phases/05-surf-check.md index 61dfa46..920de93 100644 --- a/docs/phases/05-surf-check.md +++ b/docs/phases/05-surf-check.md @@ -1,10 +1,10 @@ -# Phase 5 — `surf check` (the gate — the one load-bearing piece) +# Phase 5 - `surf check` (the gate - the one load-bearing piece) **Goal:** AST-canonical-hash each anchored span (Phase 2), compare to the stored per-anchor `hash` in frontmatter, and **block only on a documented span that has diverged** (§9.1.3). -Emit `--format json` — the seam every future plugin attaches to (§5). +Emit `--format json` - the seam every future plugin attaches to (§5). -**Proposal refs:** §5 (the boundary — JSON is the contract), §6 (per-symbol, not per-file), §6.3 (list `at:`), §9.1.3 (the gate), §9.1 (CI step everyone gets wrong). +**Proposal refs:** §5 (the boundary - JSON is the contract), §6 (per-symbol, not per-file), §6.3 (list `at:`), §9.1.3 (the gate), §9.1 (CI step everyone gets wrong). **Depends on:** Phases 2, 3. @@ -24,13 +24,13 @@ Emit `--format json` — the seam every future plugin attaches to (§5). 1. For each anchor: resolve (P1) → canonical hash (P2) → compare to the stored `hash`. A list `at:` is stale if **any** listed span diverges (§6.3). 2. **Exit codes:** `0` clean, non-zero on any divergence (this is what blocks CI). -3. **`--format json`** (§5) — the frozen contract. Per diverged claim emit: +3. **`--format json`** (§5) - the frozen contract. Per diverged claim emit: `{ hub, claim, at, old_hash, new_hash, old_code, new_code, prose, magnitude }`. Everything optional (reviewer plugin, etc.) plugs in here; the core never depends on it. Keep the human (default) format readable; JSON is opt-in. -4. **CI scoping — the step everyone gets wrong (§9.1):** the gate hashes the *working-tree* +4. **CI scoping - the step everyone gets wrong (§9.1):** the gate hashes the *working-tree* span and compares to the hash committed in frontmatter. It needs the checkout, **not** - full history — do **not** `fetch-depth: 0`. The only thing a base ref buys is scoping + full history - do **not** `fetch-depth: 0`. The only thing a base ref buys is scoping (re-check only anchors whose files changed in the PR); a shallow fetch of the merge base covers that. @@ -43,5 +43,5 @@ Emit `--format json` — the seam every future plugin attaches to (§5). - Seed `hubs/auth.md`; change a documented span → `check` exits non-zero and the JSON names the right claim with old/new code and a populated `magnitude`. - Change an **un-anchored** span in the *same file* → `check` stays green. (Per-symbol, not - per-file — the core promise of §6.) + per-file - the core promise of §6.) - `--format json` output validates against the documented contract. diff --git a/docs/phases/06-surf-verify.md b/docs/phases/06-surf-verify.md index f3af424..722b52a 100644 --- a/docs/phases/06-surf-verify.md +++ b/docs/phases/06-surf-verify.md @@ -1,6 +1,6 @@ -# Phase 6 — `surf verify` +# Phase 6 - `surf verify` -**Goal:** the human escape hatch (§8) — re-hash after a human confirms the prose still +**Goal:** the human escape hatch (§8) - re-hash after a human confirms the prose still holds. `--follow` re-points a renamed anchor and re-hashes in one step. **Proposal refs:** §8 (escape hatch, "I looked, still true"), §6.4 (`--follow` for renames), §9.1.4. @@ -11,24 +11,24 @@ holds. `--follow` re-points a renamed anchor and re-hashes in one step. > `surf-cli/src/verify.rs` (+ `main.rs`). `surf verify [] [--follow]` re-hashes anchors > and writes the hash back. Writes are **surgical** via `surf-core::set_anchor_hash` / -> `set_anchor_at` (minimal-diff line editor in `hub.rs`) — only the touched line changes, and +> `set_anchor_at` (minimal-diff line editor in `hub.rs`) - only the touched line changes, and > an unchanged hash is a no-op (byte-identical). `--follow` re-points a renamed single-site, > single-segment anchor (via `find_renamed`) then re-hashes. Unresolvable anchors are skipped > (exit non-zero) unless `--follow` recovers them. > > **Model fix carried in from Phase 5:** a list `at:` now produces **one combined hash per > claim** (`combine_site_hashes`, single-site = identity), so check/verify treat a multi-site -> claim as one unit (stale if any span changes) — previously check looped per-site against a +> claim as one unit (stale if any span changes) - previously check looped per-site against a > single stored hash, which was wrong. ## Steps 1. `surf verify [selector]` (hub / anchor selector): re-resolve (P1), re-hash (P2), and - **write the new `hash` back into the frontmatter** — the explicit "I looked, still true". + **write the new `hash` back into the frontmatter** - the explicit "I looked, still true". With symbol-scoped AST anchoring this should be needed only *occasionally*; running it on most PRs is a smell that anchors are too coarse (§8). 2. `--follow`: when the symbol was renamed, update the `at:` path to the new - name/location, then re-hash — clearing the Phase 4 rename warning in one step (§6.4). + name/location, then re-hash - clearing the Phase 4 rename warning in one step (§6.4). 3. Preserve frontmatter key order / formatting on write as much as practical (minimize diff noise; authors will review these writes). diff --git a/docs/phases/07-distribution.md b/docs/phases/07-distribution.md index b6eb3ce..7556029 100644 --- a/docs/phases/07-distribution.md +++ b/docs/phases/07-distribution.md @@ -1,4 +1,4 @@ -# Phase 7 — Distribution & CI integration +# Phase 7 - Distribution & CI integration **Goal:** make the gate actually run in a repo. Ship first the channels repos actually consume (most *consume the Action*, they don't "install"). @@ -18,7 +18,7 @@ consume (most *consume the Action*, they don't "install"). > > **Deferred (per §10, not yet needed):** aarch64-linux build + musl static target; npm/pip/ > brew channels. **Not locally verifiable:** the Action/release path needs a tag push + -> network — the CI self-check exercises the same `surf check` gate on our own PRs instead. +> network - the CI self-check exercises the same `surf check` gate on our own PRs instead. > To cut a release: bump the workspace version and push a `v*` tag. ## Steps @@ -26,13 +26,13 @@ consume (most *consume the Action*, they don't "install"). 1. **Static binary** per `(os, arch)`, built in release CI: - start with `aarch64-apple-darwin` (this dev machine); - add `x86_64-apple-darwin` and `x86_64-unknown-linux-gnu` / `-musl` (Linux is what CI runs). -2. **GitHub Action wrapper** (`action.yml`) — the primary distribution channel (§10). Runs - `surf check` on PRs. **Correct checkout: shallow, not `fetch-depth: 0`** (§9.1) — only a +2. **GitHub Action wrapper** (`action.yml`) - the primary distribution channel (§10). Runs + `surf check` on PRs. **Correct checkout: shallow, not `fetch-depth: 0`** (§9.1) - only a shallow merge-base fetch if PR-scoping is enabled. 3. **pre-commit hook** definition (`.pre-commit-hooks.yaml`) so repos can run `surf check`/`lint` locally. -4. **`curl | sh` installer** (`install.sh`) — detects `(os, arch)`, downloads the matching release binary. +4. **`curl | sh` installer** (`install.sh`) - detects `(os, arch)`, downloads the matching release binary. 5. **Defer** npm (shim + per-platform `optionalDependencies`, never a `postinstall` - downloader), pip (`maturin` wheels), and brew (§10) — don't ship channels nobody uses yet. + downloader), pip (`maturin` wheels), and brew (§10) - don't ship channels nobody uses yet. ## Files touched - `.github/workflows/release.yml` (build + upload per-target binaries) diff --git a/docs/phases/OVERVIEW.md b/docs/phases/OVERVIEW.md index 2fc8ef8..f11eaf8 100644 --- a/docs/phases/OVERVIEW.md +++ b/docs/phases/OVERVIEW.md @@ -1,4 +1,4 @@ -# Surface MVP — Build Plan (Overview) +# Surface MVP - Build Plan (Overview) > Full context for the phased build. Each `NN-*.md` file in this directory is a single, > self-contained phase; this file is the whole plan for when you need wider context while @@ -15,26 +15,26 @@ This is a **greenfield build**. The working directory contains only `docs/`; it git repo and **Rust is not installed**. We are on Apple Silicon macOS (`arm64`) with Homebrew, `git`, and `gh` available. -The MVP is the smallest thing that tests the hypothesis (§9.1): entirely deterministic — +The MVP is the smallest thing that tests the hypothesis (§9.1): entirely deterministic - **no LLM, no network, no API key**. We build exactly the five MVP pieces, then stop. The load-bearing risk is the AST-canonical hashing primitive (§6.1), so we de-risk that first with raw fixtures before any markdown or CLI surface exists. ## Locked decisions - **Grammars:** TypeScript + Rust. Rust lets Surface dogfood itself (its own source becomes hubs). -- **Layout:** Cargo workspace — `surf-core` (pure parse/resolve/hash, no I/O) + `surf-cli` (clap binary). Keeps the §10 WASM/IDE reuse path free and makes the core unit-testable in isolation. +- **Layout:** Cargo workspace - `surf-core` (pure parse/resolve/hash, no I/O) + `surf-cli` (clap binary). Keeps the §10 WASM/IDE reuse path free and makes the core unit-testable in isolation. - **Rust install:** official `rustup` via `curl | sh` (version-pinned, matches the "version-pinned grammar shipped *in* the binary" reproducibility story better than Homebrew's rolling `rust`). -## Scope guardrails (from the proposal — do NOT build these in MVP) +## Scope guardrails (from the proposal - do NOT build these in MVP) - No `refs` resolver, no `surf index` catalog, no MCP service, **no reviewer plugin**, no `covers` field (§9.1, §9.3). -- No similarity-score gating — the gate is a boolean AST hash; tree-edit magnitude is **advisory JSON only**, never adjudicates (§6.2). -- The gate promises only "the named span is unchanged since last verified" — **not** system-wide invariants (§7). This must be loud in `--help` / README, never oversold. +- No similarity-score gating - the gate is a boolean AST hash; tree-edit magnitude is **advisory JSON only**, never adjudicates (§6.2). +- The gate promises only "the named span is unchanged since last verified" - **not** system-wide invariants (§7). This must be loud in `--help` / README, never oversold. ## Phase map | Phase | Title | Depends on | |---|---|---| -| 0 | Toolchain & workspace scaffold | — | +| 0 | Toolchain & workspace scaffold | - | | 1 | Anchor resolution via tree-sitter | 0 | | 2 | AST-canonical hashing + advisory magnitude | 1 | | 3 | Hub format + frontmatter parser | 0 | diff --git a/docs/phases/README.md b/docs/phases/README.md index 332c3c5..396def8 100644 --- a/docs/phases/README.md +++ b/docs/phases/README.md @@ -1,6 +1,6 @@ -# Surface MVP — Phases +# Surface MVP - Phases -Phased build of the Surface MVP. Read [`OVERVIEW.md`](./OVERVIEW.md) first — it carries the +Phased build of the Surface MVP. Read [`OVERVIEW.md`](./OVERVIEW.md) first - it carries the context, locked decisions, and scope guardrails every phase assumes. The product spec is [`../surface-proposal.md`](../surface-proposal.md). @@ -23,26 +23,26 @@ line as you go. ## Language support TypeScript/TSX, **JavaScript/JSX**, Rust, **Python**, **Go**. JavaScript reuses the TS family -via the TSX grammar (which parses plain JS and JSX) — zero new resolver/hash code. Python rides +via the TSX grammar (which parses plain JS and JSX) - zero new resolver/hash code. Python rides the generic scope-set resolver; Go has a dedicated resolver (`resolve_go`) because its symbols are flat and methods attach by receiver (`Type > Method`). Adding a language is additive across `lang.rs` / `resolve.rs` / `hash.rs`. ## Locked decisions (see OVERVIEW for rationale) - **Grammars:** TypeScript + Rust + Python + Go (Surface dogfoods its own Rust source); JS/JSX via the TSX grammar. -- **Layout:** Cargo workspace — `surf-core` (pure, no I/O) + `surf-cli` (clap binary). +- **Layout:** Cargo workspace - `surf-core` (pure, no I/O) + `surf-cli` (clap binary). - **Rust install:** `rustup` via `curl | sh`, toolchain pinned in `rust-toolchain.toml`. ## Post-MVP additions -- **`surf init`** (`surf-cli/src/init.rs`) — bootstraps a workspace (writes `surf.toml` + +- **`surf init`** (`surf-cli/src/init.rs`) - bootstraps a workspace (writes `surf.toml` + `hubs/`); the one command that runs before discovery since it creates the marker. -- **`surf new `** (`surf-cli/src/new.rs`) — scaffolds a lint-clean hub under the +- **`surf new `** (`surf-cli/src/new.rs`) - scaffolds a lint-clean hub under the configured hubs dir. Both are small authoring-ergonomics extras beyond the 8 phases (lower the §8 claim-maintenance cost); not in the original proposal scope. -## Scope guardrails — NOT in the MVP +## Scope guardrails - NOT in the MVP - No `refs` resolver, `surf index`, MCP service, reviewer plugin, or `covers` field. - No similarity-score gating. Boolean AST hash only; tree-edit magnitude is advisory JSON. -- Gate promises only "named span unchanged since last verified" — never system-wide invariants. +- Gate promises only "named span unchanged since last verified" - never system-wide invariants. diff --git a/docs/reference/commands.md b/docs/reference/commands.md index e785cd9..0fe3491 100644 --- a/docs/reference/commands.md +++ b/docs/reference/commands.md @@ -1,54 +1,54 @@ --- title: Commands -description: The surf CLI — init, new, suggest, for, lint, check, and verify, with their flags and exit behavior. +description: The surf CLI - init, new, suggest, for, lint, check, and verify, with their flags and exit behavior. --- -- **`surf init`** — bootstrap a workspace: write `surf.toml` and create the hubs directory +- **`surf init`** - bootstrap a workspace: write `surf.toml` and create the hubs directory (idempotent). -- **`surf new `** — scaffold a new empty hub under your hubs directory. -- **`surf suggest [--all] [--format human|json]`** — scan source globs for public - *callables* no hub anchors yet — top-level functions plus **Python class methods and Go - methods** (as `file > Type > method` anchors) — and print a copy-pasteable starter hub. - Suggestions only — never writes or stamps. Coverage is keyed on the whole anchor path, so +- **`surf new `** - scaffold a new empty hub under your hubs directory. +- **`surf suggest [--all] [--format human|json]`** - scan source globs for public + *callables* no hub anchors yet - top-level functions plus **Python class methods and Go + methods** (as `file > Type > method` anchors) - and print a copy-pasteable starter hub. + Suggestions only - never writes or stamps. Coverage is keyed on the whole anchor path, so anchoring one method doesn't suppress its siblings. A glob that matches **no files** is reported on stderr (so a typo doesn't read as a clean "all anchored"); `suggest` exits non-zero only when *every* glob was empty. The default is callables-only to avoid over-anchoring fatigue; **`--all`** - additionally proposes the non-callable targets anchoring already supports — top-level classes, - module-level constants and type aliases, and class attributes (Python) — so they're discoverable + additionally proposes the non-callable targets anchoring already supports - top-level classes, + module-level constants and type aliases, and class attributes (Python) - so they're discoverable (see [Authoring hubs](../guides/authoring-hubs.md)). -- **`surf for [symbol] [--format human|json]`** — reverse lookup: list every hub + claim +- **`surf for [symbol] [--format human|json]`** - reverse lookup: list every hub + claim anchored into ``, so you can pull up the documentation governing a file *before* you edit it (the inverse of `suggest`). An optional trailing `symbol` narrows to anchors whose first - segment matches. Read-only and always exits 0 — a query, not a gate. `--format json` emits a + segment matches. Read-only and always exits 0 - a query, not a gate. `--format json` emits a versioned envelope (`{version, path, matches}`) for agents. -- **`surf lint [--format human|json]`** — validate frontmatter and that every `at:` resolves to +- **`surf lint [--format human|json]`** - validate frontmatter and that every `at:` resolves to exactly one symbol. Blocks on ambiguous or vanished anchors; **warns** (and suggests `verify --follow`) on a symbol that was merely renamed, or a file that git reports has moved. Also emits advisory granularity warnings (never blocking): a near-whole-file anchor span, a hub with too many anchors, and public functions in an anchored file that no claim covers. -- **`surf check [--format human|json] [--base ] [--files ]`** — the gate. +- **`surf check [--format human|json] [--base ] [--files ]`** - the gate. AST-canonical-hash each anchored span and compare to the stored hash; non-zero exit on any divergence. By default every claim is checked. `--base ` scopes to claims whose anchored files changed since the merge base **and** recovers the advisory `old_code` / `magnitude` fields from that ref (omit it for a full check with enrichment against `HEAD`). `--files ` scopes to claims whose anchored file(s) match a comma-separated glob (e.g. `surf-core/**`). `--format json` emits the [versioned report envelope](./how-it-works.md#the-json-seam). -- **`surf verify [] [--follow] [--format human|json]`** — re-seal after you've confirmed the +- **`surf verify [] [--follow] [--format human|json]`** - re-seal after you've confirmed the prose still holds; writes the hash into the frontmatter. `` limits to one anchor. `--follow` re-points a single-segment anchor whose **symbol** was renamed, or whose **file** was moved - (detected via git), and re-hashes in one step — only when the code is otherwise unchanged. -- **`surf stats [--since ] [--until ] [--format human|json]`** — adoption metrics from + (detected via git), and re-hashes in one step - only when the code is otherwise unchanged. +- **`surf stats [--since ] [--until ] [--format human|json]`** - adoption metrics from git history (advisory, never a gate): the **rubber-stamp rate** (re-stamps that changed a claim's stored hash but left its prose untouched) and the **in-place update rate** (commits touching an anchored file that re-sealed the claim in the same commit). One commit = one PR - (merges excluded). Heuristic by design — see the [stats guide](../guides/stats.md). Errors + (merges excluded). Heuristic by design - see the [stats guide](../guides/stats.md). Errors (non-zero) if git history is unavailable. ## Per-claim options A claim's frontmatter can carry options beyond `claim`/`at`/`hash`: -- **`ignore_literals: true`** — exclude string-literal *content* from this claim's hash, so a copy +- **`ignore_literals: true`** - exclude string-literal *content* from this claim's hash, so a copy edit inside the anchored span no longer re-opens the gate. Logic edits (operators, numbers, structure) are still caught. The stored hash is computed in this mode, so the option travels with the claim. See [Authoring hubs](../guides/authoring-hubs.md) and the [FAQ](./faq.md). diff --git a/docs/reference/configuration.md b/docs/reference/configuration.md index 82cf500..2f47981 100644 --- a/docs/reference/configuration.md +++ b/docs/reference/configuration.md @@ -3,8 +3,8 @@ title: Configuration description: surf.toml marks the workspace and globs your hubs; the supported languages; and what the gate needs from CI. --- -A `surf.toml` at the repo root marks the workspace — `surf` walks up from the current directory to -find it, like `git` or `ruff` — and globs your hubs: +A `surf.toml` at the repo root marks the workspace - `surf` walks up from the current directory to +find it, like `git` or `ruff` - and globs your hubs: ```toml hubs = ["hubs/*.md"] @@ -13,7 +13,7 @@ hubs = ["hubs/*.md"] Point the glob wherever your hubs live: keep them central (`hubs/*.md`) or co-locate them with code (e.g. `["**/_hub.md"]`). -`hubs` is a list, so you can combine locations — the matches are unioned, then sorted and +`hubs` is a list, so you can combine locations - the matches are unioned, then sorted and de-duplicated, so overlapping globs are safe: ```toml @@ -21,7 +21,7 @@ hubs = ["hubs/*.md", "docs/hubs/*.md", "**/_hub.md"] ``` Any file a glob matches is treated as a hub if it parses as one (frontmatter `anchors:` block + -markdown body), so claims can live in *any* file — not just files under `hubs/`. The list can pull +markdown body), so claims can live in *any* file - not just files under `hubs/`. The list can pull in files that aren't named like hubs, e.g. `AGENTS.md` or `CLAUDE.md` (see [Where claims can live](../guides/authoring-hubs.md#where-claims-can-live) for the recommended layout and the trade-offs). @@ -32,8 +32,8 @@ layout and the trade-offs). ## OKF bundles -To govern an [Open Knowledge Format](../guides/okf.md) bundle — a directory *tree* of concept files -rather than a flat folder — list its root(s) under `bundles`. Each root expands to `/**/*.md`: +To govern an [Open Knowledge Format](../guides/okf.md) bundle - a directory *tree* of concept files +rather than a flat folder - list its root(s) under `bundles`. Each root expands to `/**/*.md`: ```toml hubs = ["hubs/*.md"] # optional; flat layout @@ -41,7 +41,7 @@ bundles = ["knowledge/sales"] # an in-repo OKF bundle (recursively) ``` `hubs` and `bundles` are unioned. OKF reserved files (`index.md`, `log.md`) swept up by a bundle -glob are recognized and **skipped for governance** — they hold no claims, so a missing frontmatter +glob are recognized and **skipped for governance** - they hold no claims, so a missing frontmatter fence in them never blocks the gate. ## Frontmatter fields @@ -56,7 +56,7 @@ governance fields: | `summary` | Surface | The onboarding one-liner (optional). Distinct from OKF `description`. | | `tags` | OKF | Cross-cutting tags. | | `timestamp` | OKF | Last *modified* (ISO 8601). Distinct from a claim's `verified_at` (last *attested*). | -| `anchors` | Surface | The claims — the governed part. See [Authoring hubs](../guides/authoring-hubs.md). | +| `anchors` | Surface | The claims - the governed part. See [Authoring hubs](../guides/authoring-hubs.md). | | `refs` / `covers` | Surface | Composition edges and advisory coverage globs. | | *(any other key)* | OKF / tools | Preserved verbatim (e.g. OKF `description`/`resource`, a doc system's `author`/`created`). | @@ -72,13 +72,13 @@ TypeScript (`.ts`, `.tsx`, `.mts`, `.cts`), JavaScript/JSX (`.js`, `.jsx`, `.mjs (`.rs`), Python (`.py`, `.pyi`), and Go (`.go`). Grammars are compiled into the binary and version-pinned, so a hash computed on your laptop and in CI always agree. -In Python, `at:` resolves callables (functions, methods, classes) **and** non-callables — module +In Python, `at:` resolves callables (functions, methods, classes) **and** non-callables - module constants, type aliases (`X = Literal[...]`, `type X = ...`), and class attributes (`Class > attr`). ## CI The gate hashes your working tree and compares it to the hash committed in the frontmatter. It -needs the checkout, **not** the history — do **not** set `fetch-depth: 0`. (The advisory +needs the checkout, **not** the history - do **not** set `fetch-depth: 0`. (The advisory `old_code` / `magnitude` use a single `git show` of the base ref; with no git available the verdict is unchanged, those fields are just omitted.) See [CI integration](../guides/ci-integration.md). diff --git a/docs/reference/faq.md b/docs/reference/faq.md index 6965bb8..789cf56 100644 --- a/docs/reference/faq.md +++ b/docs/reference/faq.md @@ -1,22 +1,22 @@ --- title: FAQ -description: Common questions about Surface — how it differs from tests, string literals, CI cost, languages, and what a green check promises. +description: Common questions about Surface - how it differs from tests, string literals, CI cost, languages, and what a green check promises. --- -**Isn't this just tests?** No — see the 2×2 in [What is Surface?](../index.md#what-surface-does-that-tests-dont). +**Isn't this just tests?** No - see the 2×2 in [What is Surface?](../index.md#what-surface-does-that-tests-dont). A test asserts that *behavior* matches an expectation written in code; Surface asserts that *prose* still matches the code it describes. Different expectation, different failure mode, and they drift apart exactly when someone updates one and forgets the other. -**Why not just put doc comments next to the code?** Co-located comments still rot silently — +**Why not just put doc comments next to the code?** Co-located comments still rot silently - nothing gates them. Surface is the gate; your prose can live wherever you like, but the seal is what's enforced in CI. -**Does it slow CI down?** No. It parses and hashes a handful of spans — I/O-bound, not +**Does it slow CI down?** No. It parses and hashes a handful of spans - I/O-bound, not compute-bound. No model, no network, no API key. **Will editing a string literal trip the gate?** By default, yes. Literal *values* are part of the -hashed logic, so changing a string — even user-facing copy — inside an anchored span fires a +hashed logic, so changing a string - even user-facing copy - inside an anchored span fires a divergence. "Cosmetic" means whitespace, comments, and consistent renames, not "edits that feel unimportant." If copy churn re-opens a claim too often, you have two options: anchor a narrower symbol, or set `ignore_literals: true` on that claim to exclude string-literal content from its @@ -26,5 +26,5 @@ hash (logic edits are still caught). See [Authoring hubs](../guides/authoring-hu grammars. More are a build-time addition to the binary, never a runtime dependency. **What does a green check actually promise?** That nothing you anchored has changed since it was -last verified — *not* that your docs are true, and nothing at all about code you didn't anchor. See +last verified - *not* that your docs are true, and nothing at all about code you didn't anchor. See [What Surface does NOT do](../index.md#what-surface-does-not-do). diff --git a/docs/reference/hash-recipes.md b/docs/reference/hash-recipes.md index 83e2edf..a0a05a8 100644 --- a/docs/reference/hash-recipes.md +++ b/docs/reference/hash-recipes.md @@ -1,19 +1,19 @@ --- title: Hash recipes -description: The versioned canonicalization recipes behind every stored stamp — what each one does, how stamps are labelled, and how to migrate. +description: The versioned canonicalization recipes behind every stored stamp - what each one does, how stamps are labelled, and how to migrate. --- A **stamp** is the value `surf verify` writes into a claim's `hash:` field and `surf check` compares against. It is produced by a **recipe**: the exact rules for turning a resolved span into a canonical token stream (see [How the gate works](./how-it-works.md), step 2). Changing those rules changes the output for unchanged code, which would silently invalidate every stamp in the -wild — so each recipe has a number, and every stamp records the recipe that produced it. +wild - so each recipe has a number, and every stamp records the recipe that produced it. ## Stamp format ``` -hash: 2:f1075e760a17 # v2 stamp — explicit prefix -hash: f1075e760a17 # bare 12-hex — implicitly v1 (written before recipes were numbered) +hash: 2:f1075e760a17 # v2 stamp - explicit prefix +hash: f1075e760a17 # bare 12-hex - implicitly v1 (written before recipes were numbered) ``` `surf check` reads the prefix, verifies the span under that recipe, and: @@ -34,24 +34,24 @@ Upgrading surf does **not** mass-flag your repo. v1 stamps keep verifying in v1 surf verify ``` -once, which re-stamps every anchor under the current recipe — including v1 anchors whose hash +once, which re-stamps every anchor under the current recipe - including v1 anchors whose hash still matches (the one narrow case where `verify` rewrites an otherwise-unchanged stamp). After that single pass the whole repo is on v2. > Forced re-verify is deliberately *not* automatic on upgrade. `verify` stamps whatever the code > is *now*; if a repo already contains drift that v1 missed, a blind re-stamp would launder it -> green. v1-compat keeps the gate honest *through* the migration — `check` can still tell +> green. v1-compat keeps the gate honest *through* the migration - `check` can still tell > "unchanged under the old recipe" (pass) from "actually changed" (block). ## Recipes -### v1 — original (surf ≤ 0.6.x; bare-hex stamps) +### v1 - original (surf ≤ 0.6.x; bare-hex stamps) Walk the resolved span's syntax tree into tokens: - whitespace and comments are absent from the tree → ignored; - every **identifier** is alpha-renamed to a positional placeholder (`#0`, `#1`, …) in order of - first occurrence — a *consistent* rename hashes identically, swapping two names does not; + first occurrence - a *consistent* rename hashes identically, swapping two names does not; - operators, keywords, punctuation, and literal **values** are kept verbatim; - a Python **decorator name** is kept verbatim (`@cache` → `@lru_cache` is loud). @@ -59,10 +59,10 @@ SHA-256 of the token stream, truncated to 12 hex. **Known blind spot (#77, closed by v2):** because *every* identifier is alpha-renamed, re-pointing a span at a different single-occurrence external symbol (`PointsTier.TIER_1` → `TIER_2`, `b.Del` → -`b.Keep`) yields a byte-identical stream — the claim's prose silently becomes false while the gate +`b.Keep`) yields a byte-identical stream - the claim's prose silently becomes false while the gate stays green. This is exactly what the v2 bound/free split fixes. -### v2 — the bound/free split (surf ≥ 0.7.0; `2:` prefix) +### v2 - the bound/free split (surf ≥ 0.7.0; `2:` prefix) v1 alpha-renames *every* identifier, which is its blind spot: an identifier occurring once maps to the same placeholder no matter what it names, so re-pointing a span at a *different* single-occurrence @@ -70,23 +70,23 @@ external symbol is byte-identical and silently passes. v2 fixes this by splitting identifiers into **bound** and **free**: -- **Bound** — names *declared inside the hashed span*: the symbol's own name, parameters, locals, +- **Bound** - names *declared inside the hashed span*: the symbol's own name, parameters, locals, loop/range/comprehension variables, `with`/`catch` aliases, generic parameters, and destructuring binders. These are **alpha-renamed** exactly as in v1, so a consistent local rename still hashes - identically — rename tolerance (§6.1) is preserved. -- **Free** — everything else: external members, call targets, types, enum/constant references, + identically - rename tolerance (§6.1) is preserved. +- **Free** - everything else: external members, call targets, types, enum/constant references, object/destructuring keys, decorator names, JSX tags. These are emitted **verbatim** (`kind:text`), so re-pointing at a different symbol is loud *even when the name occurs once*. This closes the #77 class in general, not just for member accesses: `PointsTier.TIER_1` → `TIER_2`, `getHighest` → `getLowest`, a bare `helper(x)` → `other(x)`, a parameter type `Foo` → `Bar`, and an object key `{ alpha }` → `{ beta }` all now change the hash. It also **subsumes** the two special -cases the older design carried — a decorator name (#8) and a member-access name (the #140 first cut) +cases the older design carried - a decorator name (#8) and a member-access name (the #140 first cut) are simply free identifiers now; no dedicated branch is needed for either. (The member-access positions keep one dedicated check so they stay verbatim even when their text collides with a bound -local — `x` the parameter vs `obj.x` the field — since that position can never *be* the binding.) +local - `x` the parameter vs `obj.x` the field - since that position can never *be* the binding.) -Binding detection is tree-sitter-only — there is no scope analysis — so it is **fail-closed**: +Binding detection is tree-sitter-only - there is no scope analysis - so it is **fail-closed**: a position not positively recognized as a binding defaults to *free* (verbatim). The two error directions are not symmetric: misclassifying bound→free is a *visible* false positive (a benign rename trips the gate, a human sees it); free→bound is the *invisible* miss this whole recipe exists @@ -112,9 +112,9 @@ to prevent. So when in doubt, free wins. **Accepted approximation (the residue).** Without scope analysis, a match-arm / pattern identifier is indistinguishable from a unit-variant *reference* (`Some(x)` binds `x`; `None` references a -variant — same syntax). v2 leaves all such pattern identifiers **free**. Fail-closed cuts both +variant - same syntax). v2 leaves all such pattern identifiers **free**. Fail-closed cuts both ways: a unit-variant swap in a match arm is *caught* (the safe direction), but renaming a match-arm -catch-all *binding* is also loud — an accepted false positive, not a bug. This is the one benign +catch-all *binding* is also loud - an accepted false positive, not a bug. This is the one benign edit class v2 does not keep silent; a future scope-aware pass could reclaim it. The limit is pinned in `surf-core/tests/differential_hash.rs`. @@ -126,7 +126,7 @@ identifiable and every dropped recipe errors with a remedy rather than a generic | Recipe | Stamp form | Shipped | Status | Remedy if rejected | |---|---|---|---|---| | v1 | bare 12-hex | surf ≤ 0.6.x | **supported** (N-1) until 0.8.0 | run `surf verify` to upgrade to v2 | -| v2 | `2:` + 12-hex | surf ≥ 0.7.0 | **current** | — | +| v2 | `2:` + 12-hex | surf ≥ 0.7.0 | **current** | - | | `N:` for unknown N | `N:` + hex | a newer surf | rejected (fails closed) | upgrade surf to a build that knows recipe N | - **Identification never expires.** The prefix is plain data; any future surf can name the recipe of @@ -134,19 +134,19 @@ identifiable and every dropped recipe errors with a remedy rather than a generic will be, v1. - **N-1 support, at most one legacy mode.** surf verifies the current recipe and exactly one back. v1 compatibility ships in 0.7.0 and is **removed in 0.8.0**; after that a bare-hex stamp is a hard, - named error ("stamped by surf < 0.7 — re-stamp with `surf verify`, or check with surf 0.7.x + named error ("stamped by surf < 0.7 - re-stamp with `surf verify`, or check with surf 0.7.x first"), never a silent DIVERGED. A legacy recipe is retained *only* while it is expressible as a - mode of the current code (v1 ≡ v2 with "every identifier bound" — one flag, no frozen copy). If a + mode of the current code (v1 ≡ v2 with "every identifier bound" - one flag, no frozen copy). If a future recipe cannot express its predecessor that cheaply, that is the signal to drop compat and require stepping through an intermediate release. ## Policy (for maintainers) -- **Any** change to canonical output is a new recipe number — no exceptions. An innocent-looking +- **Any** change to canonical output is a new recipe number - no exceptions. An innocent-looking refactor of the tokenizer that changes one byte of output is silently a new recipe wearing an old number, which corrupts every stamp in the wild. Two layers make that break loud: - **Golden fixtures** (`surf-core/tests/golden_hash.rs`) pin each recipe's exact digest for - representative symbols per language — both v1 (frozen forever) and v2. + representative symbols per language - both v1 (frozen forever) and v2. - **Differential harness** (`surf-core/tests/differential_hash.rs`) re-runs the v1-vs-v2 A/B on every build: zero benign-rename regressions, 100% catch on the semantic (free-swap) corpus. Any future recipe change reruns the same gate. diff --git a/docs/reference/how-it-works.md b/docs/reference/how-it-works.md index 1245dd0..afae7e9 100644 --- a/docs/reference/how-it-works.md +++ b/docs/reference/how-it-works.md @@ -1,13 +1,13 @@ --- title: How the gate works -description: Locate, canonicalize, hash, compare — the four steps behind surf check, and the versioned JSON seam every plugin reads. +description: Locate, canonicalize, hash, compare - the four steps behind surf check, and the versioned JSON seam every plugin reads. --- The gate runs in four steps. 1. **Locate.** tree-sitter parses the file and resolves the `at:` path (a qualified `file > A > B` path, with `@N` for genuine name collisions) to the exact node span. A scope is treated as a - *set* of nodes, so a type and its `impl`/methods — which share a name — disambiguate by path: + *set* of nodes, so a type and its `impl`/methods - which share a name - disambiguate by path: `Type` alone is ambiguous, `Type > method` is unique. In Python the path also resolves non-callables: module constants, type aliases, and class attributes. 2. **Canonicalize.** Walk that span's syntax tree into a token stream. Whitespace and comments @@ -16,17 +16,17 @@ The gate runs in four steps. parameters, locals, loop/destructuring binders) is alpha-renamed to a positional placeholder, so a *consistent* local rename yields the same tokens; a **free** name (external members, call targets, types, enum/constant references, object keys, decorators) is kept verbatim, so - re-pointing a span at a *different* symbol — `PointsTier.TIER_1` → `TIER_2`, `getHighest` → - `getLowest`, `@cache` → `@lru_cache` — changes the hash even when the name occurs once. (This + re-pointing a span at a *different* symbol - `PointsTier.TIER_1` → `TIER_2`, `getHighest` → + `getLowest`, `@cache` → `@lru_cache` - changes the hash even when the name occurs once. (This bound/free split is the **v2** recipe; see [Hash recipes](./hash-recipes.md).) 3. **Hash.** SHA-256 of that stream, truncated to 12 hex. A list `at:` combines its sites into one hash, so the claim is stale if *any* listed span changes. 4. **Compare** against the stamp stored in the frontmatter (written by `surf verify`). The stamp - carries its recipe — a v2 stamp is prefixed `2:`, a bare hex stamp is an older v1 — and is + carries its recipe - a v2 stamp is prefixed `2:`, a bare hex stamp is an older v1 - and is verified under *its own* recipe, so existing v1 stamps keep passing until `surf verify` upgrades them. Equal → pass; different → block. -Quiet on cosmetics, loud on logic — and **reproducible**, because the parser ships *inside* the +Quiet on cosmetics, loud on logic - and **reproducible**, because the parser ships *inside* the binary and is version-pinned. There is no separate formatter or language server in CI to skew the result. @@ -60,7 +60,7 @@ envelope**: Per diverged claim: `hub`, `claim`, `at`, `kind` (`changed` | `unverified` | `unresolvable`), `old_hash`, `new_hash`, `old_code`, `new_code`, `prose`, `magnitude`, and a `detail` string on an -unresolvable claim. `magnitude` (`small` / `medium` / `large`) is advisory triage only — it helps a +unresolvable claim. `magnitude` (`small` / `medium` / `large`) is advisory triage only - it helps a human decide which blocked claim to read first, and it **never** affects pass/fail. **Stability.** `version` is the contract version. Within a major version the shape is diff --git a/hubs/anchor.md b/hubs/anchor.md index 8141959..2e9c48d 100644 --- a/hubs/anchor.md +++ b/hubs/anchor.md @@ -1,5 +1,5 @@ --- -summary: The `at:` anchor grammar parser — qualified paths plus the `@N` positional selector. +summary: The `at:` anchor grammar parser - qualified paths plus the `@N` positional selector. anchors: - claim: > An anchor is a file path followed by `>`-separated symbol segments; a segment may carry @@ -13,4 +13,4 @@ refs: [] # Anchor grammar `parse_anchor` turns an `at:` string into a `file` plus ordered `Segment`s. It is pure string -parsing — resolution against a real tree happens later in `resolve`. +parsing - resolution against a real tree happens later in `resolve`. diff --git a/hubs/cli-check.md b/hubs/cli-check.md index 1ed77f0..bf0e656 100644 --- a/hubs/cli-check.md +++ b/hubs/cli-check.md @@ -1,5 +1,5 @@ --- -summary: surf check — the gate. Hash each anchored span, compare to the stored hash, block on divergence. Optionally scope to changed files. +summary: surf check - the gate. Hash each anchored span, compare to the stored hash, block on divergence. Optionally scope to changed files. anchors: - claim: > Per claim: resolve and hash every site under the stored stamp's own recipe (v1/v2), @@ -14,7 +14,7 @@ anchors: verified_commit: 7c5aabe74da3b56ff680044aeb3b20747b606479 - claim: > Scoping is opt-in and intersective: with neither --base nor --files every claim is checked. - A claim is in scope when any of its anchored files matches each active filter — the --base + A claim is in scope when any of its anchored files matches each active filter - the --base changed-files set (merge-base..working-tree) and/or the --files globs. A bad ref or non-repo yields no changed set, falling back to a full check rather than checking nothing. Each glob records whether it ever matched an anchored file (tallied before the --base filter), so a @@ -25,8 +25,8 @@ anchors: The gate fails closed for concepts: a concept hub whose frontmatter won't parse yields an Unresolvable divergence (blocking the run) rather than being silently skipped, so a frontmatter typo can't pass as clean. OKF reserved files (index.md/log.md) are the exception - — they carry no claims, so they are skipped entirely and never block even without frontmatter. - After the per-claim walk it propagates refs one hop — a hub that directly references a stale + - they carry no claims, so they are skipped entirely and never block even without frontmatter. + After the per-claim walk it propagates refs one hop - a hub that directly references a stale hub (or a stale claim within one) inherits a ReferencedStale divergence, built only from base divergences so a chain stops at the first hop. Alongside the divergences it returns the --files patterns that matched no anchored file (run warns on stderr for each and exits @@ -45,15 +45,15 @@ refs: # surf check -`check` is the gate — the one command CI runs. **The distinction to hold onto:** the verdict is +`check` is the gate - the one command CI runs. **The distinction to hold onto:** the verdict is *purely a function of anchored code and stored hashes*. It reads no git, so the same tree always produces the same answer; the git helpers in [`cli-git.md`](./cli-git.md) only feed the advisory `old_code`/`magnitude` in the `--format json` report and never change pass/fail. `check_claim` is the per-claim verdict; `check_workspace` walks every hub, and `Scope` narrows -which claims it evaluates when `--base` or `--files` is given — opt-in and intersective, falling +which claims it evaluates when `--base` or `--files` is given - opt-in and intersective, falling back to a full check rather than checking nothing. Any divergence (including a *concept* hub whose -frontmatter won't parse — the gate fails closed) makes `run` exit non-zero. OKF reserved files +frontmatter won't parse - the gate fails closed) makes `run` exit non-zero. OKF reserved files (`index.md`/`log.md`) hold no claims, so they are skipped rather than governed. A hub also fails when a hub it [`refs`](./hub-format.md) is stale: composition propagates one hop (#4), so the gate that flags a dependency flags everything built on it. diff --git a/hubs/cli-for.md b/hubs/cli-for.md index 21da904..c8f5eeb 100644 --- a/hubs/cli-for.md +++ b/hubs/cli-for.md @@ -1,9 +1,9 @@ --- -summary: surf for — reverse lookup of hubs/claims anchored into a file; read-only query. +summary: surf for - reverse lookup of hubs/claims anchored into a file; read-only query. anchors: - claim: > run normalizes the queried path to workspace-root-relative form, then verifies it is a - regular file on disk — a nonexistent/mistyped path, a directory, or a trailing slash errors + regular file on disk - a nonexistent/mistyped path, a directory, or a trailing slash errors (exit 1) rather than reporting "no hubs anchor", so a typo can't read as safe-to-edit. For a real file it finds the matching claims and prints them grouped by hub (human) or as a versioned {version, path, matches} envelope (JSON), always exiting 0 whether or not anything @@ -11,7 +11,7 @@ anchors: at: surf-cli/src/for_path.rs > run hash: 2:991c3bcc234c - claim: > - find collects every claim whose anchored file equals the queried path (matched on path only — + find collects every claim whose anchored file equals the queried path (matched on path only - no source parse), optionally narrowed to anchors whose first segment is the given symbol. Malformed hubs are skipped rather than erroring, and results are sorted by hub then anchor. at: surf-cli/src/for_path.rs > find @@ -24,4 +24,4 @@ refs: [] Delivers the discoverability half of the thesis: a fast way to pull up the claims governing a file before touching its logic. `run` normalizes the queried path to workspace-root-relative form, calls `find`, and prints matches grouped by hub (human) or as a versioned `{version, path, -matches}` envelope (JSON). No model, no network, no source parse — purely a read over the hub set. +matches}` envelope (JSON). No model, no network, no source parse - purely a read over the hub set. diff --git a/hubs/cli-git.md b/hubs/cli-git.md index 4a054e2..f99b7f0 100644 --- a/hubs/cli-git.md +++ b/hubs/cli-git.md @@ -1,5 +1,5 @@ --- -summary: Best-effort git queries for scoping and rename-following — advisory only, the gate never depends on them. +summary: Best-effort git queries for scoping and rename-following - advisory only, the gate never depends on them. anchors: - claim: > Every query here is best-effort and advisory: each returns None/empty when git can't answer @@ -38,7 +38,7 @@ refs: # git helpers -A thin wrapper over `git` via `std::process::Command` — no `git2` dependency. +A thin wrapper over `git` via `std::process::Command` - no `git2` dependency. **The one distinction that matters:** these only *enrich* the gate; they never decide it. `check`'s verdict is computed from anchored code alone, so a missing or broken git environment degrades the @@ -50,5 +50,5 @@ The five helpers split by job: `changed_files` diff-scopes `surf check --base`; in `lint`/`verify` (symbol renames are [`rename.md`](./rename.md)). The first claim seals the contract they all share; the rest pin down the non-trivial mechanics. -**Boundary:** nothing here is part of the deterministic verdict, and none of these mutate the repo — +**Boundary:** nothing here is part of the deterministic verdict, and none of these mutate the repo - they only read git state. diff --git a/hubs/cli-lint.md b/hubs/cli-lint.md index 4f48dd5..39d70d7 100644 --- a/hubs/cli-lint.md +++ b/hubs/cli-lint.md @@ -1,16 +1,16 @@ --- -summary: surf lint — anchors must resolve to one symbol (renames warn); plus advisory granularity warnings. +summary: surf lint - anchors must resolve to one symbol (renames warn); plus advisory granularity warnings. anchors: - claim: > lint produces a Finding per anchor site: ambiguous or vanished anchors block, while a - renamed-but-present symbol (stored-hash match) only warns and points at verify --follow — + renamed-but-present symbol (stored-hash match) only warns and points at verify --follow - as does a file that git reports has moved. Block-level findings set a non-zero exit; warnings alone keep exit 0. at: surf-cli/src/lint.rs > lint_site hash: 2:97f0946e74b0 - claim: > Advisory granularity guidance (§8), never blocking: lint_under_coverage flags public - symbols — top-level functions and methods — in an already-anchored file that no claim + symbols - top-level functions and methods - in an already-anchored file that no claim covers. Coverage is workspace-wide: a symbol anchored by any hub is covered, so a second hub touching the same file is never nagged about symbols another hub owns, and each uncovered symbol is reported once against the file's first anchoring hub. It runs only on @@ -20,7 +20,7 @@ anchors: - claim: > AGENTS.md enforcement is opt-in (§11.6): only when the file carries a surf:hubs marker block does lint require it to link the configured hubs directory (which must exist), - blocking otherwise. It points agents at the directory to search — never enumerating + blocking otherwise. It points agents at the directory to search - never enumerating individual hubs, which would push an agent to read everything. at: surf-cli/src/lint.rs > lint_agents_pointer hash: 2:ac139b65f5f0 @@ -32,6 +32,6 @@ refs: [] `lint_workspace` loads every hub and runs `lint_site` over each anchor; `run` prints the findings and chooses the exit code. Beyond resolution, lint emits advisory warnings (§8) that nudge granularity: a near-whole-file span (`lint_coarse_span`), too many anchors in one hub, -and public symbols — functions and methods — with no covering claim anywhere in the workspace +and public symbols - functions and methods - with no covering claim anywhere in the workspace (`lint_under_coverage`). It also validates the `AGENTS.md` pointer block (`lint_agents_pointer`). diff --git a/hubs/cli-reference.md b/hubs/cli-reference.md index 4601772..7c731c4 100644 --- a/hubs/cli-reference.md +++ b/hubs/cli-reference.md @@ -1,12 +1,12 @@ --- -summary: The CLI command/flag surface — the clap `Command` enum that `docs/reference/commands.md` documents. +summary: The CLI command/flag surface - the clap `Command` enum that `docs/reference/commands.md` documents. anchors: - claim: > The CLI exposes exactly these subcommands with these flags: init; new ; lint [--format]; check [--format] [--base ] [--files ]; verify [] [--follow] [--format]; suggest [--all] [--format]; for [symbol] [--format]; stats [--since ] [--until ] [--format]. Adding, removing, or renaming a command or - flag, or changing a default, diverges this anchor — re-read docs/reference/commands.md + flag, or changing a default, diverges this anchor - re-read docs/reference/commands.md before sealing. at: surf-cli/src/main.rs > Command hash: 2:1af394872add diff --git a/hubs/cli-scaffold.md b/hubs/cli-scaffold.md index e32b78d..9b2e01e 100644 --- a/hubs/cli-scaffold.md +++ b/hubs/cli-scaffold.md @@ -1,8 +1,8 @@ --- -summary: surf init / surf new — bootstrap a workspace and scaffold lint-clean hubs. +summary: surf init / surf new - bootstrap a workspace and scaffold lint-clean hubs. anchors: - claim: > - init writes surf.toml + creates hubs/ in the cwd, and is idempotent — an existing + init writes surf.toml + creates hubs/ in the cwd, and is idempotent - an existing surf.toml is left untouched. at: surf-cli/src/init.rs > run hash: 2:640471b94678 diff --git a/hubs/cli-stats.md b/hubs/cli-stats.md index e71b52d..1444306 100644 --- a/hubs/cli-stats.md +++ b/hubs/cli-stats.md @@ -1,5 +1,5 @@ --- -summary: surf stats — git-history adoption metrics (rubber-stamp + in-place rates); advisory, never a gate. +summary: surf stats - git-history adoption metrics (rubber-stamp + in-place rates); advisory, never a gate. anchors: - claim: > run computes the two metrics and prints them human-readable or as a versioned envelope; it @@ -9,7 +9,7 @@ anchors: hash: 2:7bcce388adbb - claim: > compute reads the whole since/until window from one streamed git log and scores each - non-merge commit, propagating hub claim state incrementally — a commit inherits its first + non-merge commit, propagating hub claim state incrementally - a commit inherits its first parent's state unless it touched a hub path, and merges carry state but never count. A rubber-stamp event is an already-sealed claim whose stored hash value changed in a commit; it counts toward the rubber-stamp numerator only when the claim's prose was unchanged. A @@ -27,7 +27,7 @@ refs: [] The proposal's adopt/kill signals (§9.2), computed deterministically from git history. `compute` reads the whole window from a single streamed `git log` (#72) and propagates the hub claim set -incrementally — a hub is re-read (`git show`) only at commits that touched it, so the spawn count -scales with hub edits, not history length. Heuristics — one commit per PR, `at:`-site claim -identity, an in-place denominator that counts any anchored-file edit — are documented in +incrementally - a hub is re-read (`git show`) only at commits that touched it, so the spawn count +scales with hub edits, not history length. Heuristics - one commit per PR, `at:`-site claim +identity, an in-place denominator that counts any anchored-file edit - are documented in [the stats guide](../docs/guides/stats.md). diff --git a/hubs/cli-suggest.md b/hubs/cli-suggest.md index 03d5915..66c07b1 100644 --- a/hubs/cli-suggest.md +++ b/hubs/cli-suggest.md @@ -1,16 +1,16 @@ --- -summary: surf suggest — propose anchors for unanchored public symbols; read-only, never stamps. +summary: surf suggest - propose anchors for unanchored public symbols; read-only, never stamps. anchors: - claim: > surf suggest is read-only: run scans the given globs, lists each public symbol no hub already anchors, and prints them (a starter hub in human mode, or JSON). By default the surface is callables (top-level functions plus Python/Go/TypeScript methods); --all - additionally proposes the non-callable targets resolve accepts — Python top-level classes, + additionally proposes the non-callable targets resolve accepts - Python top-level classes, module-level constants and type aliases, and class attributes; Go exported const/var/type declarations; TypeScript exported classes and non-callable const/let/var. It warns on stderr for any glob that matched no files, notes when --all scanned Rust files it cannot affect, and exits non-zero only when every glob was empty. It - never writes a file and never computes or stamps a hash — the author edits the claims and + never writes a file and never computes or stamps a hash - the author edits the claims and verifies. at: surf-cli/src/suggest.rs > run hash: 2:e1710f880435 diff --git a/hubs/cli-verify.md b/hubs/cli-verify.md index 066f592..ec02e20 100644 --- a/hubs/cli-verify.md +++ b/hubs/cli-verify.md @@ -1,10 +1,10 @@ --- -summary: surf verify — re-seal a claim after a human confirms the prose, with optional --follow. +summary: surf verify - re-seal a claim after a human confirms the prose, with optional --follow. anchors: - claim: > For each claim, plan_claim re-hashes every site (combined) under the current recipe when all resolve, returning Unchanged only when the stored stamp already matches that recipe's - stamp, else Hash to re-stamp — so one pass also upgrades a still-matching v1 stamp to v2. + stamp, else Hash to re-stamp - so one pass also upgrades a still-matching v1 stamp to v2. Under --follow, a site that no longer resolves re-points a renamed single-segment anchor via find_renamed; a site whose file is unreadable asks git where it moved and re-points the path (only when the code is otherwise unchanged under the stored recipe). Otherwise it skips diff --git a/hubs/cli-workspace.md b/hubs/cli-workspace.md index e32b266..3323480 100644 --- a/hubs/cli-workspace.md +++ b/hubs/cli-workspace.md @@ -1,5 +1,5 @@ --- -summary: Workspace discovery and hub enumeration — the I/O layer over the pure config parser. +summary: Workspace discovery and hub enumeration - the I/O layer over the pure config parser. anchors: - claim: > discover walks up from a starting directory to the nearest surf.toml (like git/ruff), @@ -26,7 +26,7 @@ This is the I/O layer that sits over the pure config parser ([`config.md`](./con the project and turns the hub globs into concrete files, so every other command works in terms of a resolved root rather than the caller's current directory. -`discover` is what makes `surf` runnable from any subdirectory — it walks up to the nearest +`discover` is what makes `surf` runnable from any subdirectory - it walks up to the nearest `surf.toml` (the same root-finding git and ruff use) and errors if none is found, so a stray invocation outside a project fails loudly instead of silently governing nothing. The resolved root is the base every anchor path is joined against, and `hub_paths` enumerates the hubs by globbing the @@ -34,5 +34,5 @@ configured `hubs` patterns and expanding any OKF `bundles` roots (each as ` surf.toml parses into a Config whose hubs default to ["hubs/*.md"]; unknown keys are diff --git a/hubs/hash.md b/hubs/hash.md index 4eea9ff..7d5411a 100644 --- a/hubs/hash.md +++ b/hubs/hash.md @@ -1,5 +1,5 @@ --- -summary: AST-canonical hashing — quiet on cosmetics, loud on logic — and per-claim combination. +summary: AST-canonical hashing - quiet on cosmetics, loud on logic - and per-claim combination. anchors: - claim: > The canonical token stream drops comments and keeps operators, keywords, and literal @@ -13,7 +13,7 @@ anchors: at: surf-core/src/hash.rs > emit hash: 2:ac52f23c70c8 - claim: > - Under v2 only names bound inside the span are alpha-renamed — the symbol's own name, + Under v2 only names bound inside the span are alpha-renamed - the symbol's own name, parameters, locals, loop/range/comprehension variables, with/catch aliases, generic params, and destructuring binders. Detection is tree-sitter-only and fail-closed: a position not positively recognized as a binding defaults to free (verbatim). @@ -21,7 +21,7 @@ anchors: hash: 2:20fd6172cf43 - claim: > The property/field component of a member-access expression is kept verbatim even when its - text collides with a bound local, since that position can never be the binding — matched + text collides with a bound local, since that position can never be the binding - matched structurally per family (kind + parent kind + the parent's named field). at: surf-core/src/hash.rs > is_member_access_name hash: 2:de12739eeb09 @@ -31,7 +31,7 @@ anchors: at: surf-core/src/hash.rs > is_identifier hash: 2:25ca2f219009 - claim: > - A claim's hash is the combination of its per-site hashes — a single site is the identity, + A claim's hash is the combination of its per-site hashes - a single site is the identity, multiple sites combine order-sensitively, so the claim is stale if any listed span changes. at: surf-core/src/hash.rs > combine_site_hashes hash: 2:cbbbbc3b2237 @@ -49,12 +49,12 @@ gate compares; `Magnitude` alongside it is advisory and never gates. alpha-renamed to positional placeholders, so a consistent rename or a reflow doesn't trip a claim, while operators, keywords, and literal values stay verbatim, so a real logic edit does. Which identifiers get alpha-renamed is the recipe's job: v1 renames them all; v2 (the bound/free split, -#77) renames only **bound** names — params, locals, the symbol's own name — and emits every +#77) renames only **bound** names - params, locals, the symbol's own name - and emits every **free** identifier (external members, call targets, types, constants, decorators) verbatim, so re-pointing a span at a different symbol is loud even when the name occurs once. A claim's hash is the order-sensitive combination of its per-site hashes, which is what lets one multi-site claim go stale when any of its spans changes. See [hash recipes](../docs/reference/hash-recipes.md) for the versioned canonicalization and migration. -**Boundary:** hashing decides *that* something changed, never *whether the prose is still true* — +**Boundary:** hashing decides *that* something changed, never *whether the prose is still true* - that judgment is the human's at [`surf verify`](./cli-verify.md). diff --git a/hubs/hub-format.md b/hubs/hub-format.md index 849f797..5178c0e 100644 --- a/hubs/hub-format.md +++ b/hubs/hub-format.md @@ -6,10 +6,10 @@ anchors: frontmatter is a superset of an OKF concept: `type` (defaulted to `concept`, so pre-OKF hubs stay valid), `title`, `tags`, `timestamp` sit alongside Surface's `anchors`/`refs`/`covers`, and every other key (OKF `description`/`resource`, a doc system's `author`/`created`) is - preserved verbatim in `extra` — unknown *frontmatter* keys are kept, not rejected, per OKF. + preserved verbatim in `extra` - unknown *frontmatter* keys are kept, not rejected, per OKF. Inside an anchor item `at:` is a scalar or list, `hash` is optional until verified, and unknown keys there ARE still rejected (a per-anchor typo fails closed). parse_hub resolves - neither refs nor covers — acting on them is lint/check's job. + neither refs nor covers - acting on them is lint/check's job. at: - surf-core/src/hub.rs > parse_hub - surf-core/src/hub.rs > Frontmatter @@ -21,7 +21,7 @@ anchors: - claim: > verify writes fields back surgically: set_anchor_field (which set_anchor_hash wraps) locates the Nth anchor item and replaces/inserts only that one key's line, so an unchanged write is - byte-identical — the same primitive stamps hash, id, and verified_* provenance. + byte-identical - the same primitive stamps hash, id, and verified_* provenance. at: - surf-core/src/hub.rs > set_anchor_field - surf-core/src/hub.rs > set_anchor_hash @@ -45,7 +45,7 @@ human or agent reads). `parse_hub` is the contract everything else binds to. **A hub is an OKF concept, plus freshness.** The frontmatter is a *superset* of an [Open Knowledge Format](../docs/guides/okf.md) concept: it carries OKF's `type`/`title`/`tags`/ `timestamp` (and preserves any other key in `extra`, since OKF requires consumers to keep unknown -fields), so a hub is a conformant OKF concept that any OKF reader can consume — while Surface's +fields), so a hub is a conformant OKF concept that any OKF reader can consume - while Surface's `anchors` add the freshness OKF omits. That is why `deny_unknown_fields` is *off* for the frontmatter (a typo'd key is caught by a `surf lint` warning instead of a hard error) but stays *on* for each anchor item, where an unknown key is a genuine mistake that should fail closed. @@ -53,11 +53,11 @@ frontmatter (a typo'd key is caught by a `surf lint` warning instead of a hard e **The distinction that drives the design:** a human reviews every write, so edits must be *surgical*. Writes go through the line-level editor (`set_anchor_field`, which `set_anchor_hash` -wraps, plus `set_anchor_at`) rather than re-serializing the frontmatter — re-serializing would +wraps, plus `set_anchor_at`) rather than re-serializing the frontmatter - re-serializing would reorder keys, reflow scalars, and drop the preserved `extra` ordering, burying the one changed line in a noisy diff. An unchanged write is therefore byte-identical, which is what keeps a no-op `surf verify` from churning the file (and what lets it stamp `id`/`verified_*` provenance only when the hash actually changed). -**Boundary:** this module is pure parsing and text editing — it resolves no anchors and computes no +**Boundary:** this module is pure parsing and text editing - it resolves no anchors and computes no hashes; it only produces the structure [`lint`](./cli-lint.md)/[`check`](./cli-check.md) act on. diff --git a/hubs/rename.md b/hubs/rename.md index e6c6cc6..0ec9cf7 100644 --- a/hubs/rename.md +++ b/hubs/rename.md @@ -3,7 +3,7 @@ summary: Deterministic, git-free rename detection via stored-hash match. anchors: - claim: > When an anchor no longer resolves, find_renamed walks every current definition and - returns the one whose canonical hash equals the claim's stored hash — because the hash + returns the one whose canonical hash equals the claim's stored hash - because the hash alpha-renames identifiers, a renamed-but-unchanged symbol still matches. No git, no similarity threshold. at: surf-core/src/rename.rs > find_renamed diff --git a/hubs/resolve.md b/hubs/resolve.md index 6ca1783..6a263c5 100644 --- a/hubs/resolve.md +++ b/hubs/resolve.md @@ -3,7 +3,7 @@ summary: Resolving an anchor to the exact span of one symbol, across language fa anchors: - claim: > The generic resolver treats a scope as a *set* of nodes, so a type and its impl/methods - (which share a name) both get descended — `Type > method` is unique even when `Type` + (which share a name) both get descended - `Type > method` is unique even when `Type` alone is ambiguous. Resolves to exactly one *logical symbol* or returns NotFound/Ambiguous; usually one node, but a Python @overload group (consecutive same-name stubs plus their implementation, in the same scope) counts as one match, so @@ -28,5 +28,5 @@ refs: [] `resolve_nodes` is the load-bearing primitive: anchor + parsed tree → exact byte/line span. TypeScript/Rust/Python use the generic scope-set walk; Go uses `resolve_go`. Python -`@overload` groups resolve and hash as one unit — stubs and implementation share a single +`@overload` groups resolve and hash as one unit - stubs and implementation share a single token stream and span (#82).