Skip to content

Handle Unity editor offline reconnects#1183

Open
liuzqk wants to merge 1 commit into
CoplayDev:betafrom
liuzqk:fix/editor-lifecycle-recovery
Open

Handle Unity editor offline reconnects#1183
liuzqk wants to merge 1 commit into
CoplayDev:betafrom
liuzqk:fix/editor-lifecycle-recovery

Conversation

@liuzqk
Copy link
Copy Markdown

@liuzqk liuzqk commented May 31, 2026

Summary

Handle the common window where the Unity Editor is closed or still starting while MCP tools are called.

  • detect editor-offline connection failures before dispatching tool commands
  • retry connection acquisition briefly to cover editor startup and bridge registration races
  • return a structured editor_offline response with hint=retry instead of an unhandled connection failure
  • add focused regression coverage for offline, reconnect, and non-offline error behavior

Why

When the Editor is closed or not ready, agents currently see transport failures and often retry the wrong layer. A retryable MCP response gives clients a clear signal to back off and try again after Unity is ready.

Validation

  • uv run --extra dev python -m pytest tests/test_editor_lifecycle_recovery.py tests/integration/test_domain_reload_resilience.py -q
  • git diff --check

Summary by CodeRabbit

  • New Features

    • Added editor offline detection with automatic reconnection attempts within a configurable maximum wait time.
    • Returns structured offline responses with retry guidance to help clients handle editor unavailability gracefully.
  • Documentation

    • Added Editor Lifecycle Recovery specification detailing offline response behavior and reconnection handling.
  • Tests

    • Added comprehensive test coverage for offline detection, structured response generation, and reconnection retry logic.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 31, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR implements editor lifecycle recovery by adding offline detection and bounded reconnection waiting to the legacy Unity transport layer. When the Unity Editor is unavailable, the MCP server now detects this condition, returns a structured offline response with retry metadata, and retries connection acquisition within a configurable window before giving up.

Changes

Editor Lifecycle Recovery

Layer / File(s) Summary
Editor-offline detection and reconnect helpers
Server/src/transport/legacy/unity_connection.py, docs/plans/2026-05-31-editor-lifecycle-recovery-spec.md
Three private helpers detect offline connection failures by exception text, build structured offline MCPResponse objects, and repeatedly attempt connection with exponential backoff within a capped timeout. A new spec documents the target offline-response behavior and reconnect wait configuration.
Integration with send_command_with_retry
Server/src/transport/legacy/unity_connection.py
Main command handler now routes connection acquisition through the reconnect helper. When the helper returns an offline MCPResponse, the command flow returns it immediately instead of attempting retry logic.
Test coverage for editor offline scenarios
Server/tests/test_editor_lifecycle_recovery.py
Three tests verify that offline errors are detected and classified, that offline responses contain the correct structure and retry metadata, and that reconnection is retried within the window with correct sleep/backoff behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • CoplayDev/unity-mcp#595: Also updates connection/retry-loop logic in send_command_with_retry within the same module, handling bounded retry behavior with different termination reasons.
  • CoplayDev/unity-mcp#510: Related editor recovery improvements in the same transport module, adding reason-aware connection retry behavior for Unity reload/session scenarios.
  • CoplayDev/unity-mcp#796: Companion recovery mechanism in the same file that adds stale-socket detection and cleanup before sending, complementing this PR's offline-reconnect logic.

Poem

🐰 Offline, online—the editor drifts away,
But wait we shall, with exponential grace,
A structured sigh: "retry, come back today,"
Nine seconds kind, then rest in quiet place.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the main change: adding editor offline detection and reconnection handling to the Unity MCP server transport layer.
Description check ✅ Passed The description is well-structured and covers the key aspects: summary of changes, motivation, and validation steps. However, it lacks some template sections like Type of Change, Compatibility details, and Testing checkboxes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@liuzqk liuzqk force-pushed the fix/editor-lifecycle-recovery branch from b7da729 to 2f50043 Compare May 31, 2026 06:22
@liuzqk liuzqk marked this pull request as ready for review May 31, 2026 06:53
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
Server/tests/test_editor_lifecycle_recovery.py (1)

26-66: 💤 Low value

Coverage note: this test patches get_unity_connection directly, so it bypasses discover_all_instances' 5s cache. It validates the retry mechanics but won't catch the no-instances reconnect gap flagged in unity_connection.py. Worth adding a case that exercises the real discovery path once that gap is addressed.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Server/tests/test_editor_lifecycle_recovery.py` around lines 26 - 66, Add a
new test that exercises the real discovery path instead of patching
get_unity_connection directly: simulate discover_all_instances returning an
empty list initially and then returning the instance after the discovery cache
expires so send_command_with_retry goes through the discovery -> reconnect gap;
patch transport.legacy.unity_connection.discover_all_instances,
transport.legacy.unity_connection.time.monotonic and time.sleep to control
timing, set UNITY_MCP_EDITOR_RECONNECT_MAX_WAIT_S via monkeypatch, and assert
send_command_with_retry (the send_command_with_retry function) eventually calls
get_unity_connection and succeeds, verifying retries across the discovery cache
gap.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Server/src/transport/legacy/unity_connection.py`:
- Around line 824-862: The reconnect loop in
_get_connection_with_editor_reconnect retries get_unity_connection() but
discovery is cached, so add a force-refresh plumbing: extend
get_unity_connection(...)/UnityConnectionPool.get_connection(...) to accept a
force_refresh: bool flag and pass it to
discover_all_instances(force_refresh=True) inside UnityConnectionPool, and then
modify _get_connection_with_editor_reconnect to set force_refresh=True for
subsequent reconnect attempts (e.g., when _resolve_instance_id raises the “No
Unity Editor instances found” error or after the first failed attempt) so
instance discovery is re-scanned within the
UNITY_MCP_EDITOR_RECONNECT_MAX_WAIT_S window.

---

Nitpick comments:
In `@Server/tests/test_editor_lifecycle_recovery.py`:
- Around line 26-66: Add a new test that exercises the real discovery path
instead of patching get_unity_connection directly: simulate
discover_all_instances returning an empty list initially and then returning the
instance after the discovery cache expires so send_command_with_retry goes
through the discovery -> reconnect gap; patch
transport.legacy.unity_connection.discover_all_instances,
transport.legacy.unity_connection.time.monotonic and time.sleep to control
timing, set UNITY_MCP_EDITOR_RECONNECT_MAX_WAIT_S via monkeypatch, and assert
send_command_with_retry (the send_command_with_retry function) eventually calls
get_unity_connection and succeeds, verifying retries across the discovery cache
gap.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 03fa78e4-798c-4909-9591-0477d80e498b

📥 Commits

Reviewing files that changed from the base of the PR and between 2d2bdc5 and 2f50043.

📒 Files selected for processing (3)
  • Server/src/transport/legacy/unity_connection.py
  • Server/tests/test_editor_lifecycle_recovery.py
  • docs/plans/2026-05-31-editor-lifecycle-recovery-spec.md

Comment on lines +824 to +862
def _get_connection_with_editor_reconnect(
instance_id: str | None,
command_type: str,
) -> UnityConnection | MCPResponse:
"""Resolve a Unity connection, waiting briefly for editor restart recovery."""
max_wait_s = _get_editor_reconnect_max_wait_s()
deadline = time.monotonic() + max_wait_s
last_error: BaseException | None = None
attempt = 0

while True:
try:
return get_unity_connection(instance_id)
except Exception as exc:
if not _is_editor_offline_error(exc):
raise

last_error = exc
now = time.monotonic()
if max_wait_s <= 0 or now >= deadline:
logger.info(
"Unity editor offline for command=%s instance=%s: %s",
command_type,
instance_id or "default",
exc,
)
return _editor_offline_response(str(last_error))

attempt += 1
remaining_s = max(0.0, deadline - now)
sleep_s = min(remaining_s, 0.25 * (2 ** min(attempt - 1, 2)))
logger.debug(
"Unity editor offline; waiting %.3fs before reconnect attempt %d for command=%s instance=%s",
sleep_s,
attempt + 1,
command_type,
instance_id or "default",
)
time.sleep(sleep_s)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP -C3 'def (get_connection|discover_all_instances|get_unity_connection)\b' Server/src/transport/legacy/unity_connection.py
rg -nP 'force_refresh|_scan_interval|_last_full_scan' Server/src/transport/legacy/unity_connection.py

Repository: CoplayDev/unity-mcp

Length of output: 1384


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="Server/src/transport/legacy/unity_connection.py"

# Find _resolve_instance_id and its error path
rg -n "_resolve_instance_id" "$file"
rg -n "No Unity Editor instances found" "$file"

# Show get_connection implementation and calls to discover_all_instances / force_refresh
sed -n '430,760p' "$file" | sed -n '1,330p'  # adjust later if too large

Repository: CoplayDev/unity-mcp

Length of output: 13229


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file="Server/src/transport/legacy/unity_connection.py"

echo "=== _resolve_instance_id context ==="
rg -n -C3 "_resolve_instance_id" "$file" || true

echo "=== error string context ==="
rg -n -C4 "No Unity Editor instances found" "$file" || true

echo "=== discover_all_instances cache logic ==="
sed -n '470,560p' "$file"

echo "=== get_connection + get_unity_connection ==="
sed -n '620,760p' "$file"

Repository: CoplayDev/unity-mcp

Length of output: 10086


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file="Server/src/transport/legacy/unity_connection.py"

rg -n "_get_editor_reconnect_max_wait_s" "$file"
rg -n "_is_editor_offline_error" "$file"

# Print surrounding code for both
python3 - <<'PY'
import re,sys,os,subprocess, textwrap, json, pathlib
file="Server/src/transport/legacy/unity_connection.py"
import itertools
PY
sed -n '760,920p' "$file"

Repository: CoplayDev/unity-mcp

Length of output: 6233


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file="Server/src/transport/legacy/unity_connection.py"

echo "=== _get_editor_reconnect_max_wait_s ==="
rg -n "_get_editor_reconnect_max_wait_s" "$file"
# guess definition near earlier lines; show around first occurrence
line=$(rg -n "_get_editor_reconnect_max_wait_s" "$file" | head -n1 | cut -d: -f1)
start=$((line-60))
end=$((line+60))
sed -n "${start},${end}p" "$file"

echo "=== _is_editor_offline_error ==="
rg -n "_is_editor_offline_error" "$file"
line=$(rg -n "_is_editor_offline_error" "$file" | head -n1 | cut -d: -f1)
start=$((line-80))
end=$((line+80))
sed -n "${start},${end}p" "$file"

echo "=== reconnect helper snippet area (800-900) ==="
sed -n '780,900p' "$file"

Repository: CoplayDev/unity-mcp

Length of output: 13989


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file="Server/src/transport/legacy/unity_connection.py"

echo "=== _get_editor_reconnect_max_wait_s ==="
rg -n "_get_editor_reconnect_max_wait_s" "$file"
line=$(rg -n "_get_editor_reconnect_max_wait_s" "$file" | head -n1 | cut -d: -f1)
start=$((line-60))
end=$((line+60))
sed -n "${start},${end}p" "$file"

echo "=== _is_editor_offline_error ==="
rg -n "_is_editor_offline_error" "$file"
line=$(rg -n "_is_editor_offline_error" "$file" | head -n1 | cut -d: -f1)
start=$((line-80))
end=$((line+80))
sed -n "${start},${end}p" "$file"

echo "=== reconnect helper snippet area (800-900) ==="
sed -n '800,900p' "$file"

Repository: CoplayDev/unity-mcp

Length of output: 13328


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file="Server/src/transport/legacy/unity_connection.py"

echo "=== _get_editor_reconnect_max_wait_s ==="
rg -n "_get_editor_reconnect_max_wait_s" "$file"
line=$(rg -n "_get_editor_reconnect_max_wait_s" "$file" | head -n1 | cut -d: -f1)
start=$((line-80))
end=$((line+120))
sed -n "${start},${end}p" "$file"

echo "=== _is_editor_offline_error ==="
rg -n "_is_editor_offline_error" "$file"
line=$(rg -n "_is_editor_offline_error" "$file" | head -n1 | cut -d: -f1)
start=$((line-120))
end=$((line+140))
sed -n "${start},${end}p" "$file"

echo "=== reconnect helper snippet area (810-880) ==="
sed -n '810,880p' "$file"

Repository: CoplayDev/unity-mcp

Length of output: 20004


Reconnect retries don’t re-scan instance discovery, so “no editor yet” can’t recover within the reconnect window

_get_connection_with_editor_reconnect() retries get_unity_connection() until UNITY_MCP_EDITOR_RECONNECT_MAX_WAIT_S (default 2s), but get_unity_connection() goes through UnityConnectionPool.get_connection(), which calls discover_all_instances() without force_refresh. When Unity isn’t running yet, discover_all_instances() caches the empty result for _scan_interval=5s, so _resolve_instance_id() keeps raising "No Unity Editor instances found..." on every retry during the 2s window, even if the editor comes back. The retry only meaningfully helps when an instance was already discovered and the failure is in UnityConnection.connect().

Consider plumbing a force_refresh flag through get_unity_connection()/UnityConnectionPool.get_connection() and enabling it on subsequent reconnect attempts when instance resolution fails due to empty discovery.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Server/src/transport/legacy/unity_connection.py` around lines 824 - 862, The
reconnect loop in _get_connection_with_editor_reconnect retries
get_unity_connection() but discovery is cached, so add a force-refresh plumbing:
extend get_unity_connection(...)/UnityConnectionPool.get_connection(...) to
accept a force_refresh: bool flag and pass it to
discover_all_instances(force_refresh=True) inside UnityConnectionPool, and then
modify _get_connection_with_editor_reconnect to set force_refresh=True for
subsequent reconnect attempts (e.g., when _resolve_instance_id raises the “No
Unity Editor instances found” error or after the first failed attempt) so
instance discovery is re-scanned within the
UNITY_MCP_EDITOR_RECONNECT_MAX_WAIT_S window.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant