feat(eot): Audio EOT#4722

Open
chenghao-mou wants to merge 100 commits into main from feat/AGT-2520-multimodal-EOU

chenghao-mou (Member) commented Feb 5, 2026

Adds streaming audio end-of-turn (EOT) detection through a single user-facing AudioTurnDetector that selects between two backends:

  • cloud
  • local

On cloud transport error or predict_end_of_turn timeout, the session swaps to local for the rest of the stream (sticky per session, one warning per failure mode). Local failures emit the default 1.0 prediction and retry on the next turn.
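
For readers following along, the sticky fallback described above could look roughly like the sketch below. Everything here is hypothetical and not taken from the diff: the `FallbackSession` class, the 0.5 s timeout, the use of `ConnectionError` for transport failures, and keeping the "warned" set on the session are all assumptions made for illustration. Only the behaviors come from the description: swap to local for the rest of the stream, warn once per failure mode, and emit the default 1.0 prediction when local inference fails.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("eot-fallback-sketch")

DEFAULT_PREDICTION = 1.0  # default emitted when the local backend fails

# Hypothetical backend signature: audio bytes in, end-of-turn probability out.
Backend = Callable[[bytes], Awaitable[float]]


class FallbackSession:
    """Hypothetical per-stream wrapper: cloud first, sticky swap to local."""

    def __init__(self, cloud: Backend, local: Backend, timeout: float = 0.5) -> None:
        self._cloud = cloud
        self._local = local
        self._timeout = timeout  # assumed value, not from the PR
        self._use_local = False  # once True, stays True for this session
        self._warned = set()     # failure modes already logged

    def _fall_back(self, mode: str) -> None:
        self._use_local = True
        if mode not in self._warned:
            self._warned.add(mode)
            logger.warning("cloud EOT failed (%s); using local backend", mode)

    async def predict_end_of_turn(self, audio: bytes) -> float:
        if not self._use_local:
            try:
                return await asyncio.wait_for(self._cloud(audio), self._timeout)
            except asyncio.TimeoutError:
                self._fall_back("timeout")
            except ConnectionError:
                self._fall_back("transport")
        try:
            return await self._local(audio)
        except Exception:
            # Local failures are not sticky: the next turn simply tries again.
            return DEFAULT_PREDICTION
```

In this sketch the cloud backend is never retried within the same session, which is one reading of "sticky per session".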

A user-set unlikely_threshold is scaled multiplicatively against the cloud default so the operating point survives a fallback.
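
One way to read "scaled multiplicatively" is that the ratio between the user's threshold and the cloud default is carried over to the local backend's default. The sketch below illustrates that reading only; the function name, the clamping to [0, 1], and the default values in the usage note are assumptions and do not come from the diff.

```python
import math


def scale_threshold(user_threshold: float, cloud_default: float, local_default: float) -> float:
    """Map a user threshold tuned against the cloud default onto the local backend.

    Hypothetical helper: preserves the ratio user_threshold / cloud_default.
    """
    if cloud_default <= 0.0:
        raise ValueError("cloud_default must be positive")
    ratio = user_threshold / cloud_default
    # Clamp so the result stays a valid probability threshold (an assumption).
    return min(1.0, max(0.0, local_default * ratio))


# Example with made-up defaults: a user who doubled the cloud default
# ends up with double the local default after a fallback.
assert math.isclose(scale_threshold(0.5, 0.25, 0.125), 0.25)
```

With a ratio of 1.0 (threshold left at the cloud default) the local default is returned unchanged.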

Wired into AudioRecognition: VAD INFERENCE_DONE triggers warmup, END_OF_SPEECH activates the stream, predictions flow back through _run_eou_detection and arbitrate against the endpointing delay. A speaking guard cancels an in-flight bounce if VAD START_OF_SPEECH fires mid-window.
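
The speaking guard can be pictured with a small asyncio sketch. This is not the AudioRecognition code: `BounceGuard`, `pick_delay`, and the callback names are hypothetical, and the rule that a below-threshold probability selects the longer delay is an assumption about how the prediction "arbitrates" against the endpointing delay.

```python
import asyncio
from typing import Callable, Optional


def pick_delay(probability: float, threshold: float, min_delay: float, max_delay: float) -> float:
    """Assumed arbitration: an unlikely end of turn waits for the longer delay."""
    return max_delay if probability < threshold else min_delay


class BounceGuard:
    """Hypothetical guard: commit the turn after a delay unless speech resumes."""

    def __init__(self, on_commit: Callable[[], None]) -> None:
        self._on_commit = on_commit
        self._speaking = False
        self._task: Optional[asyncio.Task] = None

    def _cancel_pending(self) -> None:
        if self._task is not None and not self._task.done():
            self._task.cancel()

    def on_start_of_speech(self) -> None:
        # VAD START_OF_SPEECH mid-window: drop the in-flight bounce.
        self._speaking = True
        self._cancel_pending()

    def on_end_of_speech(self, delay: float) -> None:
        # Must be called from inside a running event loop.
        self._speaking = False
        self._cancel_pending()
        self._task = asyncio.get_running_loop().create_task(self._bounce(delay))

    async def _bounce(self, delay: float) -> None:
        await asyncio.sleep(delay)
        if not self._speaking:
            self._on_commit()
```

A caller would pass `pick_delay(...)` as the `delay` argument of `on_end_of_speech` once a prediction arrives.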

hsjun99 commented Feb 25, 2026

@chenghao-mou Excited to see this! A couple of questions:

  1. Will the multimodal EOT model be publicly accessible via model weights or agent-gateway.livekit.cloud, or in some other way?
  2. Any rough timeline for when MultiModalTurnDetector gets fully wired up?

@chenghao-mou
Member Author

Thanks for your patience! We don't have an official decision or timeline yet, but hopefully I can get it ready within a month or two.

@chenghao-mou chenghao-mou marked this pull request as ready for review April 22, 2026 07:38
@chenghao-mou chenghao-mou requested a review from a team April 22, 2026 07:38
@chenghao-mou chenghao-mou changed the title Audio EOT feat(eot): Audio EOT May 27, 2026
github-actions Bot and others added 2 commits May 27, 2026 20:49
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
chenghao-mou and others added 3 commits May 28, 2026 14:08
…dal-EOU

# Conflicts:
#	livekit-agents/livekit/agents/version.py
#	livekit-agents/pyproject.toml
#	livekit-plugins/livekit-plugins-anam/livekit/plugins/anam/version.py
#	livekit-plugins/livekit-plugins-anam/pyproject.toml
#	livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/version.py
#	livekit-plugins/livekit-plugins-anthropic/pyproject.toml
#	livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/version.py
#	livekit-plugins/livekit-plugins-assemblyai/pyproject.toml
#	livekit-plugins/livekit-plugins-asyncai/livekit/plugins/asyncai/version.py
#	livekit-plugins/livekit-plugins-asyncai/pyproject.toml
#	livekit-plugins/livekit-plugins-avatario/livekit/plugins/avatario/version.py
#	livekit-plugins/livekit-plugins-avatario/pyproject.toml
#	livekit-plugins/livekit-plugins-avatartalk/livekit/plugins/avatartalk/version.py
#	livekit-plugins/livekit-plugins-avatartalk/pyproject.toml
#	livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/version.py
#	livekit-plugins/livekit-plugins-aws/pyproject.toml
#	livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/version.py
#	livekit-plugins/livekit-plugins-azure/pyproject.toml
#	livekit-plugins/livekit-plugins-baseten/livekit/plugins/baseten/version.py
#	livekit-plugins/livekit-plugins-baseten/pyproject.toml
#	livekit-plugins/livekit-plugins-bey/livekit/plugins/bey/version.py
#	livekit-plugins/livekit-plugins-bey/pyproject.toml
#	livekit-plugins/livekit-plugins-bithuman/livekit/plugins/bithuman/version.py
#	livekit-plugins/livekit-plugins-bithuman/pyproject.toml
#	livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/version.py
#	livekit-plugins/livekit-plugins-browser/pyproject.toml
#	livekit-plugins/livekit-plugins-cambai/livekit/plugins/cambai/version.py
#	livekit-plugins/livekit-plugins-cambai/pyproject.toml
#	livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/version.py
#	livekit-plugins/livekit-plugins-cartesia/pyproject.toml
#	livekit-plugins/livekit-plugins-cerebras/livekit/plugins/cerebras/version.py
#	livekit-plugins/livekit-plugins-cerebras/pyproject.toml
#	livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/version.py
#	livekit-plugins/livekit-plugins-clova/pyproject.toml
#	livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/version.py
#	livekit-plugins/livekit-plugins-deepgram/pyproject.toml
#	livekit-plugins/livekit-plugins-did/livekit/plugins/did/version.py
#	livekit-plugins/livekit-plugins-did/pyproject.toml
#	livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/version.py
#	livekit-plugins/livekit-plugins-elevenlabs/pyproject.toml
#	livekit-plugins/livekit-plugins-fal/livekit/plugins/fal/version.py
#	livekit-plugins/livekit-plugins-fal/pyproject.toml
#	livekit-plugins/livekit-plugins-fireworksai/livekit/plugins/fireworksai/version.py
#	livekit-plugins/livekit-plugins-fireworksai/pyproject.toml
#	livekit-plugins/livekit-plugins-fishaudio/livekit/plugins/fishaudio/version.py
#	livekit-plugins/livekit-plugins-fishaudio/pyproject.toml
#	livekit-plugins/livekit-plugins-gladia/livekit/plugins/gladia/version.py
#	livekit-plugins/livekit-plugins-gladia/pyproject.toml
#	livekit-plugins/livekit-plugins-gnani/livekit/plugins/gnani/version.py
#	livekit-plugins/livekit-plugins-gnani/pyproject.toml
#	livekit-plugins/livekit-plugins-google/livekit/plugins/google/version.py
#	livekit-plugins/livekit-plugins-google/pyproject.toml
#	livekit-plugins/livekit-plugins-gradium/livekit/plugins/gradium/version.py
#	livekit-plugins/livekit-plugins-gradium/pyproject.toml
#	livekit-plugins/livekit-plugins-groq/livekit/plugins/groq/version.py
#	livekit-plugins/livekit-plugins-groq/pyproject.toml
#	livekit-plugins/livekit-plugins-hamming/livekit/plugins/hamming/version.py
#	livekit-plugins/livekit-plugins-hamming/pyproject.toml
#	livekit-plugins/livekit-plugins-hedra/livekit/plugins/hedra/version.py
#	livekit-plugins/livekit-plugins-hedra/pyproject.toml
#	livekit-plugins/livekit-plugins-hume/livekit/plugins/hume/version.py
#	livekit-plugins/livekit-plugins-hume/pyproject.toml
#	livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/version.py
#	livekit-plugins/livekit-plugins-inworld/pyproject.toml
#	livekit-plugins/livekit-plugins-keyframe/livekit/plugins/keyframe/version.py
#	livekit-plugins/livekit-plugins-keyframe/pyproject.toml
#	livekit-plugins/livekit-plugins-krisp/livekit/plugins/krisp/version.py
#	livekit-plugins/livekit-plugins-krisp/pyproject.toml
#	livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/version.py
#	livekit-plugins/livekit-plugins-langchain/pyproject.toml
#	livekit-plugins/livekit-plugins-lemonslice/livekit/plugins/lemonslice/version.py
#	livekit-plugins/livekit-plugins-lemonslice/pyproject.toml
#	livekit-plugins/livekit-plugins-liveavatar/livekit/plugins/liveavatar/version.py
#	livekit-plugins/livekit-plugins-liveavatar/pyproject.toml
#	livekit-plugins/livekit-plugins-lmnt/livekit/plugins/lmnt/version.py
#	livekit-plugins/livekit-plugins-lmnt/pyproject.toml
#	livekit-plugins/livekit-plugins-minimal/livekit/plugins/minimal/version.py
#	livekit-plugins/livekit-plugins-minimal/pyproject.toml
#	livekit-plugins/livekit-plugins-minimax/livekit/plugins/minimax/version.py
#	livekit-plugins/livekit-plugins-minimax/pyproject.toml
#	livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/version.py
#	livekit-plugins/livekit-plugins-mistralai/pyproject.toml
#	livekit-plugins/livekit-plugins-murf/livekit/plugins/murf/version.py
#	livekit-plugins/livekit-plugins-murf/pyproject.toml
#	livekit-plugins/livekit-plugins-neuphonic/livekit/plugins/neuphonic/version.py
#	livekit-plugins/livekit-plugins-neuphonic/pyproject.toml
#	livekit-plugins/livekit-plugins-nltk/livekit/plugins/nltk/version.py
#	livekit-plugins/livekit-plugins-nltk/pyproject.toml
#	livekit-plugins/livekit-plugins-nvidia/livekit/plugins/nvidia/version.py
#	livekit-plugins/livekit-plugins-nvidia/pyproject.toml
#	livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/version.py
#	livekit-plugins/livekit-plugins-openai/pyproject.toml
#	livekit-plugins/livekit-plugins-perplexity/livekit/plugins/perplexity/version.py
#	livekit-plugins/livekit-plugins-perplexity/pyproject.toml
#	livekit-plugins/livekit-plugins-phonic/livekit/plugins/phonic/version.py
#	livekit-plugins/livekit-plugins-phonic/pyproject.toml
#	livekit-plugins/livekit-plugins-resemble/livekit/plugins/resemble/version.py
#	livekit-plugins/livekit-plugins-resemble/pyproject.toml
#	livekit-plugins/livekit-plugins-rime/livekit/plugins/rime/version.py
#	livekit-plugins/livekit-plugins-rime/pyproject.toml
#	livekit-plugins/livekit-plugins-rtzr/livekit/plugins/rtzr/version.py
#	livekit-plugins/livekit-plugins-rtzr/pyproject.toml
#	livekit-plugins/livekit-plugins-runway/livekit/plugins/runway/version.py
#	livekit-plugins/livekit-plugins-runway/pyproject.toml
#	livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/version.py
#	livekit-plugins/livekit-plugins-sarvam/pyproject.toml
#	livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/version.py
#	livekit-plugins/livekit-plugins-silero/pyproject.toml
#	livekit-plugins/livekit-plugins-simli/livekit/plugins/simli/version.py
#	livekit-plugins/livekit-plugins-simli/pyproject.toml
#	livekit-plugins/livekit-plugins-simplismart/livekit/plugins/simplismart/version.py
#	livekit-plugins/livekit-plugins-simplismart/pyproject.toml
#	livekit-plugins/livekit-plugins-slng/livekit/plugins/slng/version.py
#	livekit-plugins/livekit-plugins-slng/pyproject.toml
#	livekit-plugins/livekit-plugins-smallestai/livekit/plugins/smallestai/version.py
#	livekit-plugins/livekit-plugins-smallestai/pyproject.toml
#	livekit-plugins/livekit-plugins-soniox/livekit/plugins/soniox/version.py
#	livekit-plugins/livekit-plugins-soniox/pyproject.toml
#	livekit-plugins/livekit-plugins-speechify/livekit/plugins/speechify/version.py
#	livekit-plugins/livekit-plugins-speechify/pyproject.toml
#	livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/speechmatics/version.py
#	livekit-plugins/livekit-plugins-speechmatics/pyproject.toml
#	livekit-plugins/livekit-plugins-spitch/livekit/plugins/spitch/version.py
#	livekit-plugins/livekit-plugins-spitch/pyproject.toml
#	livekit-plugins/livekit-plugins-tavus/livekit/plugins/tavus/version.py
#	livekit-plugins/livekit-plugins-tavus/pyproject.toml
#	livekit-plugins/livekit-plugins-telnyx/livekit/plugins/telnyx/version.py
#	livekit-plugins/livekit-plugins-telnyx/pyproject.toml
#	livekit-plugins/livekit-plugins-trugen/livekit/plugins/trugen/version.py
#	livekit-plugins/livekit-plugins-trugen/pyproject.toml
#	livekit-plugins/livekit-plugins-turn-detector/livekit/plugins/turn_detector/version.py
#	livekit-plugins/livekit-plugins-turn-detector/pyproject.toml
#	livekit-plugins/livekit-plugins-ultravox/livekit/plugins/ultravox/version.py
#	livekit-plugins/livekit-plugins-ultravox/pyproject.toml
#	livekit-plugins/livekit-plugins-upliftai/livekit/plugins/upliftai/version.py
#	livekit-plugins/livekit-plugins-upliftai/pyproject.toml
#	livekit-plugins/livekit-plugins-xai/livekit/plugins/xai/version.py
#	livekit-plugins/livekit-plugins-xai/pyproject.toml
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

View 49 additional findings in Devin Review.

Comment on lines +1322 to +1346
if (
    prediction_event is None
    or prediction_event is not self._last_emitted_prediction
):
    self._last_emitted_prediction = prediction_event
    inference_duration = (
        prediction_event.inference_duration
        if prediction_event is not None
        and prediction_event.inference_duration is not None
        else 0.0
    )
    # end of speech -> prediction receive time
    delay = (
        time.time() - last_speaking_time
        if last_speaking_time is not None
        else 0.0
    )
    self._hooks.on_eot_prediction(
        EotPredictionEvent(
            probability=end_of_turn_probability or 0.0,
            threshold=unlikely_threshold or 0.0,
            inference_duration=inference_duration,
            delay=delay,
        )
    )
Contributor

@devin-ai-integration devin-ai-integration Bot Jun 3, 2026

🟡 EotPredictionEvent emitted with probability=0.0 when turn detector's language is unsupported

When turn_detector.supports_language() returns False, the code skips the entire detection else block (line 1251), leaving end_of_turn_probability as None and unlikely_threshold as None. However, the on_eot_prediction dedup guard at line 1322 always fires when prediction_event is None (which it is here since we never entered the detection block). This emits an EotPredictionEvent with probability=0.0 and threshold=0.0, a meaningless prediction that downstream consumers (the remote session protocol, metrics) cannot distinguish from a genuine zero-probability result. A similar false emission occurs when the supports_language check logs an info message and falls through.
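
One possible shape for a guard that avoids the spurious event is sketched below. It is not the PR's fix: `maybe_emit` is a hypothetical helper, and the dataclass is a stand-in that only mirrors the field names visible in the snippet above. The idea is to emit only when a probability and a threshold were actually produced, so a skipped detection never turns into a 0.0 prediction.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EotPredictionEvent:
    """Stand-in mirroring the fields used in the reviewed snippet."""
    probability: float
    threshold: float
    inference_duration: float
    delay: float


def maybe_emit(
    end_of_turn_probability: Optional[float],
    unlikely_threshold: Optional[float],
    inference_duration: Optional[float],
    delay: float,
    emit: Callable[[EotPredictionEvent], None],
) -> bool:
    """Emit only when detection ran; return whether an event was emitted."""
    if end_of_turn_probability is None or unlikely_threshold is None:
        # Detection was skipped (e.g. unsupported language): emit nothing.
        return False
    emit(
        EotPredictionEvent(
            probability=end_of_turn_probability,
            threshold=unlikely_threshold,
            inference_duration=inference_duration or 0.0,
            delay=delay,
        )
    )
    return True
```

Comparing against None rather than relying on `or 0.0` keeps a genuine zero-probability result distinguishable from "no prediction".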

# Conflicts:
#	examples/voice_agents/async_tool_agent.py
#	examples/voice_agents/basic_agent.py
#	livekit-agents/livekit/agents/voice/agent_activity.py
#	tests/test_agent_session.py