Scenario yaml source-of-truth + on_simulation_end verdict override #5992
…5557) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…onfigurable timeout param (#5529) Co-authored-by: Long Chen <longch1024@gmail.com>
Co-authored-by: nightcityblade <nightcityblade@gmail.com>
Signed-off-by: James Liounis <james.liounis@perplexity.ai> Co-authored-by: Tina Nguyen <72938484+tinalenguyen@users.noreply.github.com> Co-authored-by: james-pplx <james-pplx@users.noreply.github.com>
Co-authored-by: nightcityblade <nightcityblade@users.noreply.github.com>
Co-authored-by: Tina Nguyen <72938484+tinalenguyen@users.noreply.github.com> Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: nightcityblade <nightcityblade@users.noreply.github.com> Co-authored-by: nightcityblade <nightcityblade@gmail.com>
Co-authored-by: detail-app[bot] <180357370+detail-app[bot]@users.noreply.github.com>
Co-authored-by: detail-app[bot] <180357370+detail-app[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: nightcityblade <nightcityblade@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: nightcityblade <nightcityblade@gmail.com>
- simulation.py: proto-backed Scenario/ScenarioGroup, load_scenarios, SimulationContext
- rtc_session on_simulation_end; JobContext.simulation_context() from dispatch metadata
- thread simulation_end_fnc through ipc; SessionHost SimulationFinalize handler
- hotel example scenarios.yaml + on_simulation_end (db vs target_state)
# Conflicts:
#	livekit-agents/livekit/agents/cli/cli.py
#	livekit-agents/livekit/agents/voice/__init__.py
#	livekit-agents/livekit/agents/voice/agent_activity.py
#	livekit-agents/livekit/agents/voice/audio_recognition.py
#	livekit-agents/livekit/agents/voice/remote_session.py
#	livekit-agents/livekit/agents/worker.py
#	livekit-agents/pyproject.toml
#	uv.lock
STT Test Results
⚠ Could not parse test results: [Errno 2] No such file or directory: 'test-results.xml'
Triggered by workflow run #2299
except aiohttp.ClientError as e:
    logger.error(f"WebSocket error while receiving: {e}")
except Exception as e:
    logger.error(f"Unexpected error while receiving messages: {e}")
# Request reconnect if STT silently dies on WS drop.
if not self._reconnect_event.is_set():
    logger.warning("Soniox STT WebSocket closed; requesting reconnect")
    self._reconnect_event.set()
🔴 Soniox STT enters infinite reconnect loop after normal stream completion
The new reconnect-on-WS-drop logic at the end of _recv_messages_task unconditionally sets self._reconnect_event whenever the async for msg in self._ws loop terminates — including after a normal finished message from the server. This causes the _run() method's while True loop to reconnect instead of breaking. On the second connection, self._input_ch is already closed (user called end_input()), so send_task immediately exhausts, sends close, and recv_task finishes, setting the reconnect event again — creating an infinite reconnect loop that prevents _run() from ever returning.
How the loop forms
1. Normal completion: server sends finished, closes WS
2. async for msg in self._ws ends → falls through to the new self._reconnect_event.set()
3. In _run(), both send_recv_group and wait_reconnect are done
4. wait_reconnect in done → loop doesn't break, clears event, reconnects
5. New send_task sees closed _input_ch → immediately finishes → sends close
6. New recv_task sees close ack → finishes → sets reconnect event again
7. Back to step 3: infinite loop
The stream's _main_task hangs until aclose() cancels it, leaking the connection and preventing the base class retry loop from completing.
Prompt for agents
The reconnect-on-drop logic in _recv_messages_task fires unconditionally when the async-for loop exits, including after a normal server-sent `finished` message. This creates an infinite reconnect loop because _input_ch is already closed on the second connection.
The fix should guard the reconnect request so it only fires on unexpected WebSocket drops (not after a clean `finished` completion). One approach: track whether a `finished` message was received (e.g. set a `self._finished = True` flag inside the `if content.get('finished')` block), and only set the reconnect event when the WS closes WITHOUT having received `finished`. Alternatively, return early from _recv_messages_task after processing `finished` to avoid reaching the reconnect code.
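The first approach suggested above (a finished flag) could look roughly like the sketch below. It is a toy model, not the Soniox plugin code: FakeStream and the list of messages standing in for the WebSocket are assumptions made for illustration, and only the _reconnect_event and _finished names come from the comment.

```python
import asyncio


class FakeStream:
    """Toy stand-in for the STT stream; not the plugin's real class."""

    def __init__(self, messages: list[dict]) -> None:
        self._messages = messages
        self._reconnect_event = asyncio.Event()
        self._finished = False  # flag proposed in the review comment

    async def _recv_messages_task(self) -> None:
        # The list iteration stands in for `async for msg in self._ws`.
        for content in self._messages:
            await asyncio.sleep(0)
            if content.get("finished"):
                self._finished = True
                break
        # Request a reconnect only when the socket closed WITHOUT `finished`.
        if not self._finished and not self._reconnect_event.is_set():
            self._reconnect_event.set()


async def reconnect_requested(messages: list[dict]) -> bool:
    stream = FakeStream(messages)
    await stream._recv_messages_task()
    return stream._reconnect_event.is_set()


# Clean completion: the server sent `finished` before closing.
clean = asyncio.run(reconnect_requested([{"text": "hi"}, {"finished": True}]))
# Unexpected drop: the message stream ended without `finished`.
dropped = asyncio.run(reconnect_requested([{"text": "hi"}]))
```

With this guard, a clean completion leaves the event unset so the outer loop can break, while an unexpected drop still triggers a reconnect.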
# TODO: default all languages to ink-2 once they are supported
if utils.is_given(model):
    resolved_model = model
elif language_code is None or language_code.language == "en":
    resolved_model = "ink-2"
else:
    resolved_model = "ink-whisper"

is_whisper = _is_whisper_model(resolved_model)

resolved_final_transcript_mode: _ResolvedFinalTranscriptMode
if is_whisper:
    resolved_final_transcript_mode = "legacy"
else:
    resolved_final_transcript_mode = "auto"

super().__init__(
🚩 Cartesia STT default model changed from ink-whisper to ink-2 with turn detection
The Cartesia STT plugin now defaults to ink-2 (for English) instead of ink-whisper. ink-2 supports server-side turn detection (AutoFinalizeRecognizeStream using the /stt/turns/websocket endpoint), interim results, and preflight transcripts — but NOT word-level aligned transcripts. ink-whisper is preserved as a fallback for non-English languages and remains available via the legacy LegacyRecognizeStream. The capabilities are now set dynamically based on the resolved model (interim_results=True for ink-2, aligned_transcript='word' for ink-whisper). Users who relied on cartesia.STT() producing word timestamps will silently lose them unless they explicitly pass model='ink-whisper'.
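To make the upgrade impact concrete, here is a minimal standalone sketch of the resolution rules described above. resolve_stt and ResolvedSTT are hypothetical names, the substring check is only a stand-in for _is_whisper_model, and nothing here calls the real plugin.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ResolvedSTT:
    """Hypothetical summary of what a given configuration resolves to."""

    model: str
    final_transcript_mode: str
    word_timestamps: bool


def resolve_stt(model: str | None = None, language: str | None = None) -> ResolvedSTT:
    if model is not None:
        resolved = model  # an explicit model always wins
    elif language is None or language == "en":
        resolved = "ink-2"  # new default for English
    else:
        resolved = "ink-whisper"  # fallback for other languages
    # Assumption: a simple substring test stands in for _is_whisper_model.
    is_whisper = "whisper" in resolved
    return ResolvedSTT(
        model=resolved,
        final_transcript_mode="legacy" if is_whisper else "auto",
        word_timestamps=is_whisper,  # per the comment, ink-2 has no word-level alignment
    )


default = resolve_stt()
french = resolve_stt(language="fr")
pinned = resolve_stt(model="ink-whisper")
```

In this model, the default configuration resolves to ink-2 without word timestamps, and pinning model="ink-whisper" is what restores them.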
@@ -84,7 +85,7 @@ class STTOptions:
     domain: str | None = None

     # Endpointing mode
-    turn_detection_mode: TurnDetectionMode = TurnDetectionMode.ADAPTIVE
+    turn_detection_mode: TurnDetectionMode = TurnDetectionMode.EXTERNAL
🚩 Speechmatics default turn_detection_mode changed from ADAPTIVE to EXTERNAL requires silero
The default turn_detection_mode changed from TurnDetectionMode.ADAPTIVE to TurnDetectionMode.EXTERNAL at livekit-plugins-speechmatics/stt.py:88. In EXTERNAL mode, the constructor auto-loads Silero VAD (line 278-286), which requires livekit-plugins-silero to be installed. This package is not a declared dependency of the speechmatics plugin (pyproject.toml only depends on livekit-agents>=1.5.18 and speechmatics-voice[smart]>=0.2.8). Users upgrading who relied on the old ADAPTIVE default will get an ImportError at runtime if silero is not installed. The error message is descriptive and tells users how to fix it (install silero, or pass vad=None, or use turn_detection_mode=ADAPTIVE). This pattern is consistent with inference.STT for Speechmatics models. Still, it's a breaking change in default behavior that could surprise users on upgrade.
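For readers unfamiliar with the pattern, the sketch below illustrates the general shape of an optional-dependency load with a descriptive error. It is an assumption-laden illustration, not the plugin's constructor: Mode, resolve_vad, and the module_name parameter are hypothetical, and the real code would build a VAD instance rather than return the imported module.

```python
import importlib
from enum import Enum
from typing import Any


class Mode(Enum):
    """Hypothetical stand-in for the plugin's TurnDetectionMode."""

    ADAPTIVE = "adaptive"
    EXTERNAL = "external"


_NOT_GIVEN: Any = object()  # sentinel meaning "caller did not pass vad"


def resolve_vad(
    mode: Mode,
    vad: Any = _NOT_GIVEN,
    module_name: str = "livekit.plugins.silero",
) -> Any:
    if vad is not _NOT_GIVEN:
        return vad  # an explicit value (including None) always wins
    if mode is not Mode.EXTERNAL:
        return None  # in this sketch, other modes need no external VAD
    try:
        # Real code would construct a VAD from the module; returning the
        # module itself keeps the sketch self-contained.
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"{module_name} is needed for EXTERNAL turn detection: install it, "
            "pass vad=None, or select Mode.ADAPTIVE"
        ) from e
```

The three escape hatches named in the error message match the ones the comment lists: installing the package, passing vad=None, or choosing the ADAPTIVE mode explicitly.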
@asynccontextmanager
async def _wait_for_idle_and_hold(self) -> AsyncIterator[AgentActivity]:
    """Wait for idle, then block other ``wait_for_idle`` callers until exit."""
    from .agent_activity import _IdleHoldContextVar

    activity = await self.wait_for_idle()
    self._idle_holds += 1
    self._idle_released.clear()
    token = _IdleHoldContextVar.set(True)
    try:
        yield activity
    finally:
        _IdleHoldContextVar.reset(token)
        self._idle_holds -= 1
        if self._idle_holds == 0:
            self._idle_released.set()
🚩 wait_for_idle / _wait_for_idle_and_hold idle-hold mechanism may not block other callers
The _wait_for_idle_and_hold context manager at agent_session.py:1376-1391 increments _idle_holds, clears _idle_released, and sets _IdleHoldContextVar. However, the public wait_for_idle method at line 1356-1374 delegates to activity.wait_for_idle() and never checks _idle_holds or waits on _idle_released. This means the "hold" may only work if AgentActivity.wait_for_idle() checks _IdleHoldContextVar (which I couldn't fully verify in this review). If the AgentActivity side doesn't participate, other callers of session.wait_for_idle() would not be blocked during the hold window, which could cause tool reply coalescing races. The _idle_released event and _idle_holds counter would be dead code in that case.
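The concern can be illustrated with a small standalone model in which wait_for_idle does participate in the hold. The Session class and its hold() method are hypothetical simplifications: the context variable that lets the holder itself bypass the wait is omitted, and nothing here reflects how AgentActivity actually behaves.

```python
import asyncio
from contextlib import asynccontextmanager


class Session:
    """Toy model of the idle-hold bookkeeping; not the real AgentSession."""

    def __init__(self) -> None:
        self._idle_holds = 0
        self._idle_released = asyncio.Event()
        self._idle_released.set()  # no hold is active at start

    async def wait_for_idle(self) -> None:
        # The participation the review could not verify: block while held.
        await self._idle_released.wait()

    @asynccontextmanager
    async def hold(self):
        await self.wait_for_idle()
        self._idle_holds += 1
        self._idle_released.clear()
        try:
            yield
        finally:
            self._idle_holds -= 1
            if self._idle_holds == 0:
                self._idle_released.set()


async def demo() -> list[str]:
    session = Session()
    order: list[str] = []

    async def other_caller() -> None:
        await session.wait_for_idle()
        order.append("other")

    async with session.hold():
        task = asyncio.create_task(other_caller())
        await asyncio.sleep(0.01)  # give the other caller a chance to run
        order.append("holder")
    await task
    return order


order = asyncio.run(demo())
```

If wait_for_idle skipped the wait on _idle_released, the other caller would proceed during the hold window, which is the race described above.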
32bb9b4 to dd5098b
- cli.run_app delegates to the deprecated rich CLI (cli/_legacy.py)
- new thin start/console(TCP) interface lives in livekit.agents.__main__
- restore worker.py unregistered/agent-name/aclose handling the thin CLI needs
- keep rich-CLI deps (typer/sounddevice/watchfiles) required until it's removed
- room_io: remove duplicate register_text_input left by an earlier main merge (keep main's _on_chat_text_stream path; the branch dup was shadowing it)
- type-check now green: cast tcp_console __anext__ return, drop unused ignores
- tcp_console: raise StopAsyncIteration from None (B904)
- test_ipc: pass simulation_end_fnc to ProcPool/ProcJobExecutor (branch arg)
- add unit category markers to branch-only test modules (unblocks --unit collection)
- test_remote_session: run_input -> run (method was renamed)
- test_remote_session: fix mock AgentSessionOptions (preemptive_generation is dict-like, not bool) and give FakeRunResult an event so handlers succeed
- test_run_input_errors: wrap conn_options in SessionConnectOptions(llm_conn_options=...) and call RemoteSession.run (renamed from run_input)
- xfail the two direct session.run() error-propagation tests: LLM errors aren't surfaced to RunResult yet (the e2e SessionHost path does propagate them)
When the LLM task fails (and was not interrupted/cancelled), record the exception on the SpeechHandle (_maybe_run_final_output) so RunResult raises it on await, instead of swallowing it in the speech-generation path. Also retrieves the task exception (no 'never retrieved' warning). Removes the xfail on the two error-propagation tests and asserts APIStatusError.
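The propagation pattern described in this commit can be sketched in isolation as follows. FakeHandle, LLMError, generate, and run are hypothetical names for illustration only; the real change lives on SpeechHandle and RunResult and may be structured differently.

```python
from __future__ import annotations

import asyncio


class LLMError(Exception):
    """Hypothetical stand-in for an LLM API failure."""


class FakeHandle:
    """Toy handle that remembers a failure for whoever awaits the result."""

    def __init__(self) -> None:
        self.error: BaseException | None = None


async def failing_llm() -> str:
    raise LLMError("status 500")


async def generate(handle: FakeHandle) -> None:
    task = asyncio.create_task(failing_llm())
    await asyncio.wait([task])  # waits for completion without raising
    if not task.cancelled():
        # Retrieving the exception also avoids asyncio's
        # "Task exception was never retrieved" warning.
        exc = task.exception()
        if exc is not None:
            handle.error = exc  # record instead of swallowing


async def run() -> str:
    handle = FakeHandle()
    await generate(handle)
    if handle.error is not None:
        raise handle.error  # surfaced to the caller on await
    return "ok"


try:
    asyncio.run(run())
    outcome = "no error"
except LLMError as e:
    outcome = f"raised: {e}"
```

The cancelled check mirrors the commit's condition that interrupted or cancelled tasks are not treated as failures.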
No description provided.