diff --git a/docs/benchmarks/tor-vs-clearnet.md b/docs/benchmarks/tor-vs-clearnet.md
new file mode 100644
index 0000000..9df8242
--- /dev/null
+++ b/docs/benchmarks/tor-vs-clearnet.md
@@ -0,0 +1,128 @@
+# Benchmark: mining over Tor vs clearnet (#256)
+
+> **Status: methodology — results pending.** This documents *what* we measure, *from where*, and
+> *how we decide*, so the run is reproducible and the conclusion is honest. The results table at the
+> bottom is filled in after the multi-day run on the [gouda test bench](../../tests/integration/gouda-testbench-README.md).
+
+## The question
+
+While the stack is **actively mining at steady state**, does routing the **mining-path** networking
+over Tor instead of clearnet measurably cost us yield or reliability? This gates the Tor-by-default
+decision for two paths:
+
+- **p2pool outbound sidechain P2P** ([#165](https://github.com/p2pool-starter-stack/pithead/issues/165)) — `p2pool.clearnet` knob.
+- **XvB donation mining** ([#166](https://github.com/p2pool-starter-stack/pithead/issues/166)) — `xvb.tor` knob.
+
+This is **not** about *initial sync* over Tor — that's a separate, already-settled question
+([#183](https://github.com/p2pool-starter-stack/pithead/issues/183)/#234: Tor IBD is impractically
+slow, hence the opt-in clearnet-then-Tor initial sync). Here both monerod and Tari stay on Tor
+throughout; only the **p2pool / XvB** egress changes between arms.
+
+## What we measure, and from where
+
+Every source below was verified against the live p2pool container on gouda (`/stats/*` files written
+by p2pool's `--local-api` / `--data-api`, plus the dashboard `/api/state` and `docker stats`).
+
+### Primary — yield (the bottom line)
+
+The headline question is "does Tor cost us money," and that is captured **directly** by our share of
+the PPLNS reward window — no raw orphan counter required, because an orphaned/uncled share simply
+fails to earn its full weight, which shows up here.
+
+| Metric | Meaning | Source · field |
+|---|---|---|
+| **PPLNS reward share** | our % of the sidechain reward window — direct revenue | `/stats/local/stratum` · `block_reward_share_percent` |
+| **Yield efficiency** | reward-share ÷ hashrate-share, both as fractions; `< 1` ⇒ shares going stale | derived: ( `block_reward_share_percent` ÷ 100 ) ÷ ( `hashrate_1h` ÷ `/stats/pool/stats` · `pool_statistics.hashRate` ) |
+| **Effort** | hashes-per-share vs expected — rises when shares are found but not counted | `/stats/local/stratum` · `average_effort`, `current_effort` |
+| **Uncle / orphan events** (our shares) | raw stale-share count — a *mechanism* cross-check on the yield number | parse the p2pool **console log** (`SHARE FOUND` + `SideChain` uncle markers); **not** in `/api/state` |
+
+> **Why reward-share is primary, not a raw orphan count.** p2pool includes uncle shares at reduced
+> weight and drops orphans entirely; both effects land in `block_reward_share_percent`. So the net
+> revenue impact is measurable from the stratum stats today. The raw uncle count (log-parse) is kept
+> as a secondary cross-check, not the decision metric — it's noisier and harder to attribute to *our*
+> shares.
+
+### Secondary — the mechanism (latency / connectivity)
+
+These explain *why* the yield moved (or confirm it didn't) and are all available today.
+
+| Metric | Source · field |
+|---|---|
+| Share-found cadence / time-to-first-share | `/stats/local/stratum` · `shares_found`, `last_share_found_time` |
+| Sidechain peer count & churn | `/stats/local/p2p` · `connections`, `incoming_connections`, `peer_list_size` |
+| Stratum-level rejects | `/stats/local/stratum` · `shares_failed` |
+| Effective hashrate on pool | `/stats/local/stratum` · `hashrate_1h`, `hashrate_24h` |
+| Tor daemon overhead under sustained traffic | `docker stats tor` (CPU %, mem) |
+
+### XvB donation path (#166)
+
+Measured only during donation windows (when `algo_service` has routed the proxy to XvB).
+
+| Metric | Source · field |
+|---|---|
+| Accepted / rejected / invalid shares to XvB | xmrig-proxy `/summary` → `/api/state` · `proxy_summary.{accepted,rejected,invalid}` |
+| Credited raffle weight | XvB stats API (already fetched over Tor, #163) · `block_reward_share_percent` analogue at XvB |
+
+## Method
+
+- **Rig:** the gouda test bench + one RigForge rig (`miner-0`) at a **fixed hashrate**, driven by
+  [`tests/integration/e2e.sh`](../../tests/integration/e2e.sh)'s borrow flow.
+- **Design:** sequential **A/B over matched windows** on the *same* hashrate (one rig can't run two
+  arms at once). Each arm runs **several days** — sidechain shares are sparse, so reward-share and
+  uncle counts only converge over days, not hours.
+  - **Arm C (clearnet):** `p2pool.clearnet: true` (and, for #166, `xvb.tor: false`).
+  - **Arm T (Tor):** the defaults (`p2pool.clearnet: false` / `xvb.tor: true`).
+  - The **control variable is the #165 `p2pool.clearnet` knob itself** — the benchmark dogfoods the
+    very toggle it's validating. monerod + Tari stay on Tor in both arms.
+- **Collector:** [`tests/integration/benchmarks/bench-collect.sh`](../../tests/integration/benchmarks/bench-collect.sh)
+  — a read-only poller that snapshots the `/stats/*` fields + `docker stats tor` into one JSONL line
+  per interval (default 5 min), per arm. No new container; it reads what p2pool already writes.
+- **Confounders to control:** same pool/sidechain (`main` vs `mini`/`nano` measured separately — the
+  Tor penalty is expected to be worse on the faster sidechains), same monerod tip, same rig, and
+  windows long enough that pool-difficulty drift averages out. Network weather varies, so we report
+  the **clearnet arm's own run-to-run variance** as the noise floor.
+
+### Running an arm
+
+On the gouda box, with the arm's config deployed (`p2pool.clearnet: true` for clearnet, default for
+Tor) and `miner-0` pointed at it:
+
+```bash
+# detached, survives logout — one JSONL line every 5 min into ~/pithead-bench/<arm>.jsonl
+nohup tests/integration/benchmarks/bench-collect.sh clearnet --interval 300 >/dev/null 2>&1 &
+# …let it run for days, then stop it (pkill -f bench-collect.sh), switch the toggle, and run the other arm:
+nohup tests/integration/benchmarks/bench-collect.sh tor --interval 300 >/dev/null 2>&1 &
+```
+
+Summarise an arm (e.g. mean PPLNS reward share, mean effort, peer count):
+
+```bash
+jq -s '{reward: (map(.reward_share|numbers)|add/length),
+        effort: (map(.avg_effort|numbers)|add/length),
+        peers:  (map(.peers_out|numbers)|add/length)}' ~/pithead-bench/clearnet.jsonl
+```
+
+## Decision rule
+
+The clearnet arm establishes the **noise floor** (its own variance across sub-windows). Then:
+
+- **Tor yield-efficiency / reward-share delta within that noise floor** ⇒ keep **Tor as the default**
+  for that path (#165/#166 ship as-is); document the measured (negligible) trade-off.
+- **Tor materially below clearnet** (a reward-share loss clearly outside the noise, especially if
+  concentrated on `--mini`/`--nano`) ⇒ document the trade-off and reconsider: a **per-sidechain
+  default**, or **clearnet-default with a Tor opt-in**, rather than a blanket Tor default.
+
+Either way the conclusion lands in [`docs/privacy.md`](../privacy.md) so operators get an honest
+trade-off, and it sets the final default for #165/#166 **before** the v1.1 release.
+
+## Results
+
+_Pending the multi-day run. To be filled with: a per-arm table (reward share, yield efficiency,
+effort, uncle count, peers, rejects, Tor overhead), the noise floor, and the recommendation._
+
+| Arm | Pool | Days | Reward share | Yield eff. | Avg effort | Uncles | Peers | Rejects | Tor CPU/mem |
+|---|---|---|---|---|---|---|---|---|---|
+| _clearnet_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ |
+| _Tor_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ | _tbd_ |
+
+**Recommendation:** _pending._
diff --git a/tests/integration/benchmarks/bench-collect.sh b/tests/integration/benchmarks/bench-collect.sh
new file mode 100755
index 0000000..da47ede
--- /dev/null
+++ b/tests/integration/benchmarks/bench-collect.sh
@@ -0,0 +1,92 @@
+#!/usr/bin/env bash
+#
+# bench-collect.sh — Tor-vs-clearnet mining benchmark collector (#256).
+#
+# Snapshots p2pool's yield + latency/connectivity metrics and the Tor daemon's overhead into a JSONL
+# file, ONE line per interval, until killed. Read-only: it only reads the `/stats/*` files p2pool
+# already writes (its --local-api / --data-api) and `docker stats`. Runs ON the mining host (gouda),
+# typically detached for days per arm. Metrics + sources are documented in
+# docs/benchmarks/tor-vs-clearnet.md.
+#
+# Usage (on the box, in or near the stack dir):
+#   tests/integration/benchmarks/bench-collect.sh <arm-label> [options]
+#   nohup .../bench-collect.sh clearnet --interval 300 >/dev/null 2>&1 &   # detached, survives logout
+#
+# Options:
+#   --interval N   seconds between snapshots (default 300 = 5 min)
+#   --out FILE     JSONL output path (default ~/pithead-bench/<arm>.jsonl; appended to)
+#   --p2pool NAME  p2pool container name (default: p2pool — the pinned compose project)
+#   --tor NAME     tor container name (default: tor)
+#   --once         take a single snapshot to stdout and exit (smoke test)
+#
+# Analyse later, e.g. the per-arm mean reward share:
+#   jq -s 'map(.reward_share|numbers)|add/length' clearnet.jsonl
+
+set -uo pipefail
+
+ARM="${1:-}"; shift || true
+case "$ARM" in ""|-*) echo "usage: bench-collect.sh <arm-label> [--interval N] [--out FILE] [--p2pool NAME] [--tor NAME] [--once]" >&2; exit 2;; esac
+
+INTERVAL=300; OUT=""; P2POOL="p2pool"; TOR="tor"; ONCE=0
+while [ $# -gt 0 ]; do
+    case "$1" in
+        --interval) INTERVAL="$2"; shift 2 ;;
+        --out)      OUT="$2"; shift 2 ;;
+        --p2pool)   P2POOL="$2"; shift 2 ;;
+        --tor)      TOR="$2"; shift 2 ;;
+        --once)     ONCE=1; shift ;;
+        *) echo "unknown arg: $1" >&2; exit 2 ;;
+    esac
+done
+[ -n "$OUT" ] || OUT="$HOME/pithead-bench/${ARM}.jsonl"
+
+# Read a /stats file out of the p2pool container as compact JSON; prints nothing on error/garbage (callers substitute `null`).
+pstat() { docker exec "$P2POOL" cat "/stats/$1" 2>/dev/null | jq -c . 2>/dev/null || true; }
+
+snapshot() {
+    local now strat p2p pool tor_line tor_cpu tor_mem
+    now=$(date -u +%s)
+    strat=$(pstat local/stratum); [ -n "$strat" ] || strat=null
+    p2p=$(pstat local/p2p);       [ -n "$p2p" ]   || p2p=null
+    pool=$(pstat pool/stats);     [ -n "$pool" ]  || pool=null
+    # Tor daemon overhead under sustained mining traffic.
+    tor_line=$(docker stats --no-stream --format '{{.CPUPerc}}|{{.MemUsage}}' "$TOR" 2>/dev/null)
+    tor_cpu=$(printf '%s' "$tor_line" | awk -F'|' '{gsub(/%/,"",$1); print ($1==""?"null":$1+0)}')
+    tor_mem=$(printf '%s' "$tor_line" | awk -F'|' '{split($2,a," / "); print a[1]}')   # e.g. "12.3MiB"
+    [ -n "$tor_cpu" ] || tor_cpu=null
+
+    jq -cn --arg arm "$ARM" --argjson ts "$now" \
+        --argjson strat "$strat" --argjson p2p "$p2p" --argjson pool "$pool" \
+        --argjson tor_cpu "$tor_cpu" --arg tor_mem "${tor_mem:-}" '
+        {
+            ts: $ts, arm: $arm,
+            # --- primary: yield ---
+            reward_share:   ($strat.block_reward_share_percent),
+            avg_effort:     ($strat.average_effort),
+            cur_effort:     ($strat.current_effort),
+            # --- secondary: hashrate / share cadence ---
+            hashrate_1h:    ($strat.hashrate_1h),
+            hashrate_24h:   ($strat.hashrate_24h),
+            shares_found:   ($strat.shares_found),
+            shares_failed:  ($strat.shares_failed),
+            last_share:     ($strat.last_share_found_time),
+            # --- secondary: sidechain connectivity (the Tor-latency mechanism) ---
+            peers_out:      ($p2p.connections),
+            peers_in:       ($p2p.incoming_connections),
+            peer_list:      ($p2p.peer_list_size),
+            pool_hr:        ($pool.pool_statistics.hashRate),
+            sidechain_diff: ($pool.pool_statistics.sidechainDifficulty),
+            # --- Tor overhead ---
+            tor_cpu_pct:    $tor_cpu,
+            tor_mem:        $tor_mem
+        }'
+}
+
+if [ "$ONCE" = "1" ]; then snapshot; exit 0; fi
+
+mkdir -p "$(dirname "$OUT")"
+echo "[bench] arm=$ARM interval=${INTERVAL}s out=$OUT started=$(date -u +%FT%TZ)" >&2
+while :; do
+    line=$(snapshot 2>/dev/null) && [ -n "$line" ] && printf '%s\n' "$line" >> "$OUT"
+    sleep "$INTERVAL" || { echo "[bench] sleep failed (bad --interval '$INTERVAL'?); exiting" >&2; exit 1; }
+done
diff --git a/tests/integration/benchmarks/bench-verify-egress.sh b/tests/integration/benchmarks/bench-verify-egress.sh
new file mode 100755
index 0000000..95fc732
--- /dev/null
+++ b/tests/integration/benchmarks/bench-verify-egress.sh
@@ -0,0 +1,70 @@
+#!/usr/bin/env bash
+#
+# bench-verify-egress.sh — prove each container's ACTUAL egress posture for the #256 benchmark
+# (and as a live privacy-leak check generally). It reads `/proc/net/tcp` from inside each container
+# (root there, so no host sudo) and reports every ESTABLISHED connection to a **public** IP — i.e. one
+# that bypasses the Tor SOCKS at <bridge>.25:9050 (a private 172.x address). Run ON the mining host.
+#
+#   tests/integration/benchmarks/bench-verify-egress.sh <tor|clearnet> [--dir STACK_DIR] [--prefix 172.28.0]
+#
+# Interpretation:
+#   - `tor` arm  → EVERY app container must show 0 public connections (all egress via Tor). Only the
+#                  `tor` container should reach public IPs (Tor relays). A non-zero app count = LEAK.
+#   - `clearnet` → the mining-path containers (p2pool, xmrig-proxy while donating) SHOULD show direct
+#                  public connections; monerod/tari staying at 0 confirms node-sync is still Tor
+#                  (the benchmark holds those constant — see docs/benchmarks/tor-vs-clearnet.md).
+#
+# Ground-truth backstop (needs root, so run by hand): a WAN-interface capture should show NO mining
+# traffic to non-Tor IPs in the tor arm —
+#   sudo tcpdump -ni <wan> 'tcp and (port 18080 or portrange 37888-37890 or port 4247) and not host <tor-relays>'
+
+set -uo pipefail
+
+ARM="${1:-}"; case "$ARM" in tor|clearnet) shift ;; *) echo "usage: bench-verify-egress.sh <tor|clearnet> [--dir DIR] [--prefix P]" >&2; exit 2 ;; esac
+DIR="/srv/code/pithead"; PREFIX="172.28.0"
+while [ $# -gt 0 ]; do
+    case "$1" in
+        --dir)    DIR="$2"; shift 2 ;;
+        --prefix) PREFIX="$2"; shift 2 ;;
+        *) echo "unknown arg: $1" >&2; exit 2 ;;
+    esac
+done
+APPS="monerod p2pool tari xmrig-proxy"
+
+# Established (st=01) foreign IPv4s for a container that are PUBLIC (skip loopback/private/bridge/
+# link-local; the Tor SOCKS lives in the private 172.16/12 range, so SOCKS-routed traffic is skipped).
+# IPv4 TCP only: /proc/net/tcp6 and UDP are not inspected. `rem_address` is little-endian hex
+# "IIIIIIII:PPPP"; decode with bash arithmetic to avoid gawk/strtonum (only `cat` runs in the container).
+public_conns() {  # <container-id>
+    docker exec "$1" sh -c 'cat /proc/net/tcp 2>/dev/null' | while read -r _sl _local rem st _rest; do
+        [ "$st" = "01" ] || continue
+        local hip="${rem%:*}" hport="${rem#*:}" o1 o2 o3 o4
+        o1=$((16#${hip:6:2})); o2=$((16#${hip:4:2})); o3=$((16#${hip:2:2})); o4=$((16#${hip:0:2}))
+        case "$o1.$o2" in 10.*|127.*|0.*|169.254|192.168) continue ;; esac
+        { [ "$o1" = 172 ] && [ "$o2" -ge 16 ] && [ "$o2" -le 31 ]; } && continue
+        printf '%d.%d.%d.%d:%d\n' "$o1" "$o2" "$o3" "$o4" "$((16#$hport))"
+    done
+}
+
+cid_of() { ( cd "$DIR" && docker compose ps -q "$1" 2>/dev/null | head -n1 ); }
+
+echo "[verify-egress] arm=$ARM  stack=$DIR  tor-socks=${PREFIX}.25:9050"
+fail=0
+for c in $APPS; do
+    cid=$(cid_of "$c"); [ -n "$cid" ] || { echo "  - $c: not running (skip)"; continue; }
+    pub=$(public_conns "$cid" | sort -u); n=$(printf '%s' "$pub" | grep -c . || true)
+    if [ "$ARM" = "tor" ]; then
+        if [ "$n" -eq 0 ]; then echo "  ✓ $c: 0 direct public connections — all egress via Tor"
+        else echo "  ✗ $c: $n DIRECT PUBLIC connection(s) — CLEARNET LEAK:"; printf '%s\n' "$pub" | sed 's/^/        /'; fail=1; fi
+    else
+        if [ "$n" -gt 0 ]; then echo "  ✓ $c: $n direct public connection(s) — clearnet, as expected for this arm"
+        else echo "  · $c: 0 direct public connections (still Tor / idle — expected for monerod & tari)"; fi
+    fi
+done
+tcid=$(cid_of tor); tn=0; [ -n "$tcid" ] && tn=$(public_conns "$tcid" | sort -u | grep -c . || true)
+echo "  · tor: $tn external relay connection(s) (expected > 0 — this is the only container that should reach the internet)"
+
+if [ "$ARM" = "tor" ] && [ "$fail" -ne 0 ]; then
+    echo "[verify-egress] FAIL — clearnet leak(s) above; the 'all-Tor' arm is not clean." >&2; exit 1
+fi
+echo "[verify-egress] OK"