feat(realtime): connection quality, preflight, and glass-to-glass latency #18

Merged
AdirAmsalem merged 10 commits into main from plan-pr-156-support on Jun 15, 2026

@AdirAmsalem

Contributor

Summary

Android realtime sessions can now report whether the connection is healthy enough to feel responsive, and apps can probe the network before opening a session. checkConnectivity() runs a fast STUN reachability check (or an opt-in deep probe that briefly opens a real session), while connectionQuality / onConnectionQuality emit a debounced in-session verdict with the limiting factor (bandwidth, latency, loss, stalls). Opt-in debugQuality measures true camera→display latency via a pixel marker — startup (ttffMs), steady-state (g2gMs), and end-to-end drops — so latency scoring reflects what users actually feel, not just network RTT.

// Preflight — gate the integration before connecting
val preflight = realtime.checkConnectivity()
if (preflight.quality == ConnectionQuality.CRITICAL) showFallbackUi(preflight.reasons)

// In-session — react to live quality while connected
realtime.connect(
    ConnectOptions(
        model = RealtimeModels.LUCY_2_1,
        onConnectionQuality = { report -> updateBadge(report.quality, report.limitingFactor) },
        onRemoteStream = { /* ... */ },
    ),
)

// Diagnostic glass-to-glass (visible marker — not for production)
val stream = realtime.createLocalVideoStream(model, debugQuality = true)
realtime.connect(ConnectOptions(model = model, debugQuality = true), localStream = stream)
realtime.getGlassToGlass() // ttffMs, medianMs, dropRatio

Test plan

  • Unit tests: connection quality scoring/evaluator, pixel marker, G2G tracker, preflight classification
  • On-device: STUN preflight on Wi‑Fi / cellular / restrictive network
  • On-device: in-session quality badge updates while throttling uplink
  • On-device: debugQuality round-trip — marker visible, g2gMs / ttffMs populate
  • On-device: deep probe completes and returns a sensible verdict
  • Sample app: connect, disconnect, toggle G2G measurement, run both connectivity checks

Made with Cursor

…s JS SDK #156, #158)

Pre-connect probe:
- checkConnectivity(): STUN-only WebRTC reachability + latency probe via a
  throwaway PeerConnection (no session); classifies udp/relay/failed + RTT
  bands into good/fair/poor/critical with reasons.
- Deep probe (opt-in, CheckConnectivityOptions(deep = true, model)): brief
  real session on a synthetic capturer measuring true glass-to-glass latency;
  hard-capped at durationMs + connect budget.

In-session quality:
- ConnectionQualityEvaluator: smoothed verdict from WebRTC stats (latency,
  loss, upstream BWE, fps/freezes) with warm-up + asymmetric hysteresis.
  Surfaced via connectionQuality StateFlow, onConnectionQuality callback,
  getConnectionQuality(), and DiagnosticEvent.ConnectionQualitySample
  (per stats tick; level debounced, metrics live).
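
As an illustration of the warm-up and asymmetric hysteresis described above, here is a minimal sketch. It is not the SDK's ConnectionQualityEvaluator: the class name `DebouncedLevel`, its parameters, and the simplification of taking the first post-warm-up raw level as the initial verdict (rather than a smoothed window) are all assumptions made for the example.

```kotlin
// Hypothetical sketch, not the SDK's evaluator. Ordinal rises as quality falls.
enum class Level { GOOD, FAIR, POOR, CRITICAL }

class DebouncedLevel(
    private val warmupSamples: Int,   // samples to observe before any verdict
    private val downgradeAfter: Int,  // consecutive worse samples needed to downgrade
    private val upgradeAfter: Int,    // consecutive better samples needed to upgrade
) {
    private var seen = 0
    private var current: Level? = null
    private var candidate: Level? = null
    private var streak = 0

    /** Feed one raw per-tick level; returns the debounced verdict (null during warm-up). */
    fun onSample(raw: Level): Level? {
        seen++
        if (seen < warmupSamples) return null
        val cur = current
        if (cur == null) {
            current = raw              // first verdict once warm-up completes
            return raw
        }
        if (raw == cur) {              // agreement cancels any pending change
            candidate = null
            streak = 0
            return cur
        }
        if (raw == candidate) streak++ else { candidate = raw; streak = 1 }
        val needed = if (raw.ordinal > cur.ordinal) downgradeAfter else upgradeAfter
        if (streak >= needed) {
            current = raw
            candidate = null
            streak = 0
        }
        return current
    }

    /** Drop all state, e.g. when the transport reconnects, so warm-up runs again. */
    fun reset() {
        seen = 0
        current = null
        candidate = null
        streak = 0
    }
}
```

With `DebouncedLevel(warmupSamples = 3, downgradeAfter = 2, upgradeAfter = 4)`, two consecutive POOR samples are enough to downgrade from GOOD, while four consecutive GOOD samples are needed to recover. That asymmetry keeps the badge from flapping upward on a brief improvement.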

Glass-to-glass (opt-in via ConnectOptions(debugQuality = true)):
- Pixel-marker protocol port (luma stamp/read, server pixel_latency mode):
  StampingVideoProcessor stamps outgoing I420 Y-planes, MarkerReaderSink
  reads rendered frames, SeqTracker derives ttff/g2g/drop-ratio; measured
  g2g drives the latency verdict instead of RTT.
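
To make the ttff/g2g/drop-ratio derivation concrete, here is a rough sketch of a sequence tracker. It is not the SDK's SeqTracker: every name except `markStart` is hypothetical, the upper-median tie-break is an assumption, and stamps that are still in flight count as drops here, where a real tracker would presumably age them out.

```kotlin
// Hypothetical sketch: pairs stamped sequence numbers with the frames read back.
class SeqTrackerSketch {
    private var startMs: Long? = null
    private val pending = HashMap<Int, Long>()  // seq -> time the frame was stamped
    private val samples = ArrayList<Long>()     // per-frame glass-to-glass latencies
    private var stamped = 0
    private var matched = 0

    /** Time from markStart() to the first marker read back; null until then. */
    var ttffMs: Long? = null
        private set

    /** Re-arm the TTFF clock and discard all earlier measurements. */
    fun markStart(nowMs: Long) {
        startMs = nowMs
        pending.clear()
        samples.clear()
        stamped = 0
        matched = 0
        ttffMs = null
    }

    /** Called when an outgoing frame is stamped with `seq`. */
    fun onStamped(seq: Int, nowMs: Long) {
        pending[seq] = nowMs
        stamped++
    }

    /** Called when a rendered frame carrying `seq` is read; unknown seqs are ignored. */
    fun onRead(seq: Int, nowMs: Long) {
        val sentAt = pending.remove(seq) ?: return
        matched++
        samples.add(nowMs - sentAt)
        if (ttffMs == null) ttffMs = startMs?.let { nowMs - it }
    }

    /** Median latency; picks the upper middle element on an even count (assumption). */
    fun medianG2gMs(): Long? =
        if (samples.isEmpty()) null else samples.sorted()[samples.size / 2]

    /** Fraction of stamped frames never read back (in-flight frames included). */
    fun dropRatio(): Double =
        if (stamped == 0) 0.0 else 1.0 - matched.toDouble() / stamped
}
```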

Emulator-verified: STUN preflight (fair/udp/171ms), deep-probe failure path
(invalid key -> critical, no retry hang), stamping pipeline on live camera
frames. Marker placement vs server rotation still needs physical-device QA.

Sample app demos all flows; README documents the new API. 177 unit tests.
…ass marker

The pixel-marker protocol operates in display space, but camera buffers arrive
in sensor orientation with rotation metadata. Stamping the raw I420 buffer put
the marker where the server's pixel_latency reader never looks, so the camera
path produced zero marker matches (verified live on an emulator: synthetic
rotation-0 frames round-tripped, camera frames did not).

StampingVideoProcessor now rotates each frame upright (libyuv I420Rotate) before
the mirror flip + stamp and emits rotation-0 frames; the marker reader uprights
rotated remote frames the same way (server output is rotation-0 in practice).

Verified against the live server on an emulator:
- deep probe: transport=udp rtt=41ms ttff=11.4s g2g=515ms samples=8 (early exit)
- camera session with debugQuality: ttff 5.8s, g2g + drop ratio populating
…(ports JS SDK #161)

The loss bands were too lenient for a real-time v2v pipeline — up to 2% loss
read as "good" and only >10% as "critical", under-reporting genuinely degraded
sessions. New bands (shared by packetLoss and g2gDrop, which measure delivery
failures on the same scale): good <0.1%, fair 0.1-1%, poor 1-5%, critical >5%.

Also fixes the deep-probe reason strings to render fractional percentages
(0.1%) instead of truncating to 0%.
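
For reference, the new bands and the fractional-percentage rendering can be sketched as follows. The function names and the handling of exact boundary values (which side 1% and 5% fall on) are assumptions; only the band limits come from the text above.

```kotlin
import java.util.Locale

enum class Quality { GOOD, FAIR, POOR, CRITICAL }

// Bands: good <0.1%, fair 0.1-1%, poor 1-5%, critical >5%.
// `ratio` is a fraction in [0, 1], e.g. 0.02 for 2% loss.
// Assumption: exactly 1% counts as poor and exactly 5% still counts as poor.
fun classifyLoss(ratio: Double): Quality = when {
    ratio < 0.001 -> Quality.GOOD
    ratio < 0.01 -> Quality.FAIR
    ratio <= 0.05 -> Quality.POOR
    else -> Quality.CRITICAL
}

// Render one decimal place so that 0.1% is not truncated to 0%.
fun formatLossPercent(ratio: Double): String =
    String.format(Locale.US, "%.1f%%", ratio * 100)
```

Under these bands, 2% loss (`classifyLoss(0.02)`) is POOR, where the old bands reported it as good.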
Comment thread sdk/src/main/java/ai/decart/sdk/realtime/RealTimeClient.kt
- Freeze-delta overcount: lastFreezeCount is now null until the first inbound
  sample baselines it, so the first interval reports a 0 delta instead of the
  whole cumulative freezeCount (which wrongly pulled the stall dimension to FAIR
  right after connect / on stats-loop restart).
- Stale quality after failed connect: reset _connectionQuality to null when
  connect() throws, so getConnectionQuality() no longer returns a verdict from an
  aborted attempt (previously it lingered until the next disconnect()).
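
The freeze-delta fix can be pictured with a small sketch. The class and method names are hypothetical; only the nullable `lastFreezeCount` baseline mirrors the description above.

```kotlin
// Hypothetical sketch of baselining a cumulative counter on the first sample.
class FreezeDeltaTracker {
    // null until the first inbound sample establishes the baseline
    private var lastFreezeCount: Long? = null

    /** Returns freezes since the previous sample; 0 for the first sample. */
    fun onSample(cumulativeFreezeCount: Long): Long {
        val previous = lastFreezeCount
        lastFreezeCount = cumulativeFreezeCount
        if (previous == null) return 0L  // first sample only sets the baseline
        // Guard against a counter that restarts lower than the stored baseline.
        return (cumulativeFreezeCount - previous).coerceAtLeast(0L)
    }

    /** Forget the baseline, e.g. when the stats loop restarts. */
    fun reset() {
        lastFreezeCount = null
    }
}
```

If the first sample after connect reports a cumulative count of 7, the tracker returns 0 rather than 7, so the stall dimension is not penalized for freezes that predate the measurement window.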
Comment thread sdk/src/main/java/ai/decart/sdk/realtime/ConnectivityProbe.kt
- Deslop: compress 7 verbose comments/KDocs to keep the non-obvious WHY
  (relay headroom, BWE-vs-encoder-target, median tie-break, marker block sizes,
  probe budget, display-space rotation) while cutting narration/marketing/anecdote.
- Reuse: extract the identical 11-param makeSignals() builder duplicated across
  ConnectionQualityScoringTest and ConnectionQualityEvaluatorTest into one shared
  QualitySignalsFixtures.kt.
Comment thread sdk/src/main/java/ai/decart/sdk/realtime/RealTimeClient.kt Outdated
getConnectionQuality()/connectionQuality fell back to the cached _connectionQuality
whenever the session manager returned null (e.g. mid-session auto-reconnect or
connection loss, when the media channel is torn down), so they could keep returning
the previous session's verdict until the next explicit disconnect(). Clear the cache
when onConnectionStateChange moves to RECONNECTING or DISCONNECTED, mirroring the JS
SDK's evaluator reset on reconnect. (Addresses Cursor Bugbot on PR #18.)
Comment thread sdk/src/main/java/ai/decart/sdk/realtime/RealTimeClient.kt Outdated
Comment thread sdk/src/main/java/ai/decart/sdk/realtime/ConnectionQuality.kt Outdated
…ot, High)

getConnectionQuality() preferred sessionManager.getConnectionQuality() (the live
media channel's evaluator) over the _connectionQuality StateFlow, so during a
reconnect/disconnect window — where the flow is cleared but the channel/manager
still briefly exist — the getter and the connectionQuality flow could disagree
(getter returns the stale prior-session verdict while the flow already shows null).

Collapse to a single source of truth: getConnectionQuality() now returns
_connectionQuality.value, which is updated on every quality sample and cleared on
disconnect / failed connect / reconnecting. Removes the now-dead
RealtimeSessionManager.getConnectionQuality() and LiveKitMediaChannel.currentConnectionQuality().
… cadence (PR #18 Bugbot)

The warmup/window/hysteresis thresholds were ported from the JS SDK as raw sample
counts, but the JS evaluator samples at 1s while Android drives it every
STATS_INTERVAL_MS (3s) — so warm-up and debounced level changes took ~3x as long as
the JS values imply (warm-up ~24s, downgrades ~15s). Scale the counts to ~1/3 so
they land on the same wall-clock: window 5->3 (~9s), warmup 8->3 (~9s),
downgrade/upgrade 5->2 (~6s). Bands (rtt/g2g/ttff/loss/upstream/stall) are unchanged.
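
The wall-clock arithmetic behind the rescaling, as a short worked sketch. The helper name `wallClockMs` and the constant `JS_INTERVAL_MS` are made up for the example; the counts and intervals are the ones quoted above.

```kotlin
const val JS_INTERVAL_MS = 1_000L     // JS evaluator cadence (1s)
const val STATS_INTERVAL_MS = 3_000L  // Android stats cadence (3s)

// Wall-clock time covered by `samples` ticks at the given cadence.
fun wallClockMs(samples: Int, intervalMs: Long): Long = samples * intervalMs

// Unscaled JS counts driven at the Android cadence (the bug):
val warmupBeforeMs = wallClockMs(8, STATS_INTERVAL_MS)    // 24000 ms
val debounceBeforeMs = wallClockMs(5, STATS_INTERVAL_MS)  // 15000 ms

// Scaled counts at the Android cadence (the fix):
val windowAfterMs = wallClockMs(3, STATS_INTERVAL_MS)     // 9000 ms
val warmupAfterMs = wallClockMs(3, STATS_INTERVAL_MS)     // 9000 ms
val debounceAfterMs = wallClockMs(2, STATS_INTERVAL_MS)   // 6000 ms

// What the original JS counts mean at the JS cadence, for comparison:
val warmupJsMs = wallClockMs(8, JS_INTERVAL_MS)           // 8000 ms
val debounceJsMs = wallClockMs(5, JS_INTERVAL_MS)         // 5000 ms
```

The scaled values land within about a second of the JS timings, the closest a 3s tick can get with whole sample counts.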
…(PR #18 Bugbot)

On RoomEvent.Reconnecting LiveKit reconnects in place — the media channel and its
publish-stats loop survive — so unlike the SDK-level reconnect (which recreates the
channel + evaluator) the ConnectionQualityEvaluator was never reset. The running
stats loop then re-emitted the stale pre-reconnect verdict (repopulating the
_connectionQuality that onConnectionStateChange had just cleared) and skipped a fresh
warm-up after Reconnected. Reset the evaluator + freeze baseline on Reconnecting,
matching the JS SDK's reset-on-reconnect.
…Bugbot)

Follow-on to the evaluator reset: on RoomEvent.Reconnecting the opt-in SeqTracker
was left intact, so pre-reconnect g2gMs/ttffMs and pending stamps kept feeding the
stats loop and produced stale glass-to-glass verdicts after reconnect when
debugQuality is on. markStart() it alongside the evaluator/freeze reset (the same
call the SDK-level reconnect makes), re-arming its TTFF clock and warm-up.

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit e175f1d.

@AdirAmsalem AdirAmsalem merged commit 9bcecf2 into main Jun 15, 2026
1 check passed
@AdirAmsalem AdirAmsalem deleted the plan-pr-156-support branch June 15, 2026 08:18