Multi-checkout daemon collision: two repos share port 8765, wrong-version daemon serves both (stale flags, dead-endpoint deploys) #940

Description

@zackees

Symptom chain (observed live, 2026-07-02)

Two FastLED checkouts on one machine (fastled3 and fastled6), each running its own agent session with its own venv-pinned fbuild. Both daemons contend for port 8765:

  1. fastled3's session (fbuild 2.3.24 from source) killed/restarted its own daemon during testing; fastled6's daemon (older fbuild, --spawner-cwd=C:\Users\niteris\dev\fastled6) took the port.
  2. Every subsequent fbuild build / fbuild deploy from fastled3 was silently served by fastled6's older daemon.
  3. Result A: builds came back vanilla-sized. The build_flags from the synthesised platformio.ini (-DFASTLED_LPC_SPI_DMA=1) were not applied, so a bench harness was silently compiled out. The same build run minutes later against the correct daemon produced the correct 56.42 KB image.
  4. Result B: intermittent daemon error: request failed (http://127.0.0.1:8765/api/deploy) as the daemons churned.
  5. Diagnosis took about an hour because nothing in the output indicates which checkout's daemon served the request.

Workaround that unblocked us

Setting FBUILD_DAEMON_PORT=8770 in the fastled3 session gave a fresh, isolated daemon with the correct version and correct behavior on the first try. (Priority 1 in fbuild-paths::get_daemon_port; works as documented.)
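
For anyone scripting the workaround, here is a minimal sketch of pinning the port for a single session without touching the global environment. The helper name env_with_daemon_port is hypothetical; only the FBUILD_DAEMON_PORT variable and the fbuild build command come from this report.

```python
import os


def env_with_daemon_port(port, base_env=None):
    """Return a copy of the environment with FBUILD_DAEMON_PORT set.

    Hypothetical helper for illustration; not part of fbuild.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Copy so the caller's (or the process-wide) environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["FBUILD_DAEMON_PORT"] = str(port)
    return env


# Usage (not executed here, needs fbuild on PATH):
#   import subprocess
#   subprocess.run(["fbuild", "build"], env=env_with_daemon_port(8770), check=True)
```

Each checkout's session would pick a different port, so this only helps if whoever launches the session remembers to do it.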

Asks

  1. Per-checkout daemon identity by default. Derive the default port (or a Unix-socket/named-pipe path) from a hash of the project root / spawner cwd, so concurrent checkouts never share a daemon accidentally. The port file mechanism already exists; make it per-project rather than per-mode-global.
  2. Version + spawner handshake. The CLI knows its own version and project root; the daemon knows its own. On connect, if cli.version != daemon.version or the daemon's spawner-cwd is a different checkout, print a one-line warning (or refuse and spawn a dedicated daemon): warning: daemon at :8765 is fbuild 2.3.16 spawned by C:\...\fastled6 — this CLI is 2.3.24 from C:\...\fastled3.
  3. Echo the serving daemon's identity in build/deploy headers (version + spawner-cwd) so this failure mode is diagnosable from the log alone.
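
To make asks 1 and 2 concrete, here is a rough sketch of what they could look like. Everything in it is an assumption for illustration: the port range, the Identity type, and the function names are hypothetical and do not reflect how fbuild is actually implemented.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Optional

# Hypothetical range for derived ports; chosen only for this sketch.
PORT_BASE = 20000
PORT_SPAN = 20000


def normalize_root(path: str) -> str:
    """Normalize a project root so equivalent spellings hash identically."""
    return os.path.normcase(os.path.normpath(path))


def derive_port(project_root: str) -> int:
    """Map a project root to a stable port in [PORT_BASE, PORT_BASE + PORT_SPAN)."""
    digest = hashlib.sha256(normalize_root(project_root).encode("utf-8")).digest()
    return PORT_BASE + int.from_bytes(digest[:4], "big") % PORT_SPAN


@dataclass(frozen=True)
class Identity:
    """What each side of the connection knows about itself (hypothetical)."""
    version: str
    root: str


def mismatch_warning(cli: Identity, daemon: Identity, port: int) -> Optional[str]:
    """Return a one-line warning if the daemon is not this checkout's daemon."""
    same_version = cli.version == daemon.version
    same_root = normalize_root(cli.root) == normalize_root(daemon.root)
    if same_version and same_root:
        return None
    return (
        f"warning: daemon at :{port} is fbuild {daemon.version} "
        f"spawned by {daemon.root}; this CLI is {cli.version} from {cli.root}"
    )
```

Two different roots can still hash to the same port, so the derived port alone is not enough. The handshake check is what turns an accidental share into a visible warning instead of a silent wrong build.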

Severity argument

The failure is silent wrong output, not an error: a build "succeeds" with flags dropped. In our case that meant a firmware bench harness quietly compiled out and the on-silicon validation chased a phantom firmware bug. Multi-checkout, multi-agent machines are increasingly the normal dev setup.

Environment: Windows 10, fbuild 2.3.24 (fastled3, source-installed) vs older venv fbuild (fastled6), both prod mode.
