Record replay audio#252
Conversation
|
Firetiger deploy monitoring skipped This PR didn't match the auto-monitor filter configured on your GitHub connection:
Reason: PR modifies chromium launcher, wrapper, recorder, and config packages rather than the kernel API endpoints (packages/api/cmd/api/) or Temporal workflows (packages/api/lib/temporal) specified in the filter. To monitor this PR anyway, reply with |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Recording cap shorter than script
- I made the recording duration configurable per test and increased the Zombocom archive scenario cap to 70 seconds so recording no longer ends before the Playwright flow completes.
Or push these changes by commenting:
@cursor push 2f6a9b26ef
Preview (2f6a9b26ef)
diff --git a/server/e2e/e2e_recording_audio_test.go b/server/e2e/e2e_recording_audio_test.go
--- a/server/e2e/e2e_recording_audio_test.go
+++ b/server/e2e/e2e_recording_audio_test.go
@@ -52,7 +52,7 @@
return await page.title();
`, audioSite.ContainerURL())
- recordReplayAudio(t, ctx, c, playwrightCode, os.Getenv("RECORDING_AUDIO_OUTPUT_PATH"), 0.1)
+ recordReplayAudio(t, ctx, c, playwrightCode, os.Getenv("RECORDING_AUDIO_OUTPUT_PATH"), 35, 0.1)
}
func TestReplayRecordingZombocomArchiveAudio(t *testing.T) {
@@ -132,16 +132,15 @@
return playback;
`
- recordReplayAudio(t, ctx, c, playwrightCode, outputPath, 0.01)
+ recordReplayAudio(t, ctx, c, playwrightCode, outputPath, 70, 0.01)
}
-func recordReplayAudio(t *testing.T, ctx context.Context, c *TestContainer, playwrightCode string, outputPath string, minPeakLevel float64) {
+func recordReplayAudio(t *testing.T, ctx context.Context, c *TestContainer, playwrightCode string, outputPath string, maxDuration int, minPeakLevel float64) {
t.Helper()
client, err := c.APIClient()
require.NoError(t, err, "failed to create API client")
- maxDuration := 35
maxFileSize := 100
startResp, err := client.StartRecordingWithResponse(ctx, instanceoapi.StartRecordingJSONRequestBody{
MaxDurationInSeconds: &maxDuration,You can send follow-ups to the cloud agent here.
| client, err := c.APIClient() | ||
| require.NoError(t, err, "failed to create API client") | ||
|
|
||
| maxDuration := 35 |
There was a problem hiding this comment.
Recording cap shorter than script
Medium Severity
recordReplayAudio caps ffmpeg at 35 seconds while the Zombocom Playwright script can run ~50+ seconds (30s selector timeout plus 20s of fixed waits). Recording stops mid-script, so the MP4 may omit most playback and audio checks can fail or pass on noise despite a successful browser run.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit ee7fb17. Configure here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Pulse socket before sink ready
- I load
module-null-sinkbeforemodule-native-protocol-unixso the Pulse socket appears only afterKernelOutputis created.
- I load
- ✅ Fixed: Keepalive exit kills PulseAudio
- I replaced
wait -nwith a keepalive retry loop and wait only on PulseAudio so apacatexit no longer terminates the daemon.
- I replaced
Or push these changes by commenting:
@cursor push 8197b93a61
Preview (8197b93a61)
diff --git a/shared/start-pulseaudio.sh b/shared/start-pulseaudio.sh
--- a/shared/start-pulseaudio.sh
+++ b/shared/start-pulseaudio.sh
@@ -22,16 +22,12 @@
--daemonize=no \
--log-target=stderr \
--exit-idle-time=-1 \
- --load="module-native-protocol-unix socket=/tmp/pulse/native auth-anonymous=1" \
- --load="module-null-sink sink_name=KernelOutput rate=48000 channels=2 sink_properties=device.description=KernelOutput" &
+ --load="module-null-sink sink_name=KernelOutput rate=48000 channels=2 sink_properties=device.description=KernelOutput" \
+ --load="module-native-protocol-unix socket=/tmp/pulse/native auth-anonymous=1" &
pulse_pid=$!
- keepalive_pid=""
cleanup() {
- if [ -n "$keepalive_pid" ]; then
- kill "$keepalive_pid" 2>/dev/null || true
- fi
kill "$pulse_pid" 2>/dev/null || true
wait 2>/dev/null || true
}
@@ -44,10 +40,12 @@
sleep 0.1
done
- (
- pacat --raw --rate=48000 --channels=2 --format=s16le --device=KernelOutput /dev/zero
- ) &
- keepalive_pid=$!
+ while kill -0 "$pulse_pid" 2>/dev/null; do
+ pacat --raw --rate=48000 --channels=2 --format=s16le --device=KernelOutput /dev/zero || true
+ if kill -0 "$pulse_pid" 2>/dev/null; then
+ sleep 0.1
+ fi
+ done
- wait -n "$pulse_pid" "$keepalive_pid"
+ wait "$pulse_pid"
'You can send follow-ups to the cloud agent here.
| ) & | ||
| keepalive_pid=$! | ||
|
|
||
| wait -n "$pulse_pid" "$keepalive_pid" |
There was a problem hiding this comment.
Pulse socket before sink ready
Medium Severity
Backgrounding pulseaudio exposes the Unix socket as soon as the protocol module loads, while KernelOutput may still be registering. The container wrapper only waits for the socket file, then starts Chromium with PULSE_SINK=KernelOutput, so the browser can launch before that sink exists and miss early replay audio.
Reviewed by Cursor Bugbot for commit f416d9d. Configure here.
| ) & | ||
| keepalive_pid=$! | ||
|
|
||
| wait -n "$pulse_pid" "$keepalive_pid" |
There was a problem hiding this comment.
Keepalive exit kills PulseAudio
Medium Severity
The script uses wait -n on both pulseaudio and the pacat keepalive, and the EXIT trap always kills pulse_pid. If pacat exits while pulseaudio is still healthy, cleanup terminates the daemon and the whole supervisord program exits, causing an unnecessary audio stack restart.
Reviewed by Cursor Bugbot for commit f416d9d. Configure here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 4 total unresolved issues (including 3 from previous reviews).
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Darwin audio path uses incompatible PulseAudio source name
- Darwin now validates audio sources as numeric AVFoundation device indices or "none" in both parameter validation and ffmpeg argument construction, preventing invalid PulseAudio names from reaching ffmpeg.
Or push these changes by commenting:
@cursor push 159fdd82c8
Preview (159fdd82c8)
diff --git a/server/lib/recorder/ffmpeg.go b/server/lib/recorder/ffmpeg.go
--- a/server/lib/recorder/ffmpeg.go
+++ b/server/lib/recorder/ffmpeg.go
@@ -89,8 +89,16 @@
if p.MaxDurationInSeconds != nil && *p.MaxDurationInSeconds <= 0 {
return fmt.Errorf("max duration must be greater than 0 seconds")
}
- if p.recordAudio() && strings.TrimSpace(p.audioSource()) == "" {
- return fmt.Errorf("audio source is required when recording audio")
+ if p.recordAudio() {
+ audioSource := strings.TrimSpace(p.audioSource())
+ if audioSource == "" {
+ return fmt.Errorf("audio source is required when recording audio")
+ }
+ if runtime.GOOS == "darwin" && audioSource != "none" {
+ if _, err := strconv.Atoi(audioSource); err != nil {
+ return fmt.Errorf("audio source must be a numeric device index or \"none\" on darwin")
+ }
+ }
}
return nil
@@ -511,10 +519,15 @@
case "darwin":
audioDevice := "none"
if recordAudio {
- audioDevice = params.audioSource()
- if strings.TrimSpace(audioDevice) == "" {
+ audioDevice = strings.TrimSpace(params.audioSource())
+ if audioDevice == "" {
return nil, fmt.Errorf("audio source is required when recording audio")
}
+ if audioDevice != "none" {
+ if _, err := strconv.Atoi(audioDevice); err != nil {
+ return nil, fmt.Errorf("audio source must be a numeric device index or \"none\" on darwin")
+ }
+ }
}
args = []string{
// Input options for AVFoundationYou can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit ce8e4a2. Configure here.
| "-pixel_format", "nv12", | ||
| // Input file | ||
| "-i", fmt.Sprintf("%d:none", *params.DisplayNum), // Screen capture, no audio | ||
| "-i", fmt.Sprintf("%d:%s", *params.DisplayNum, audioDevice), |
There was a problem hiding this comment.
Darwin audio path uses incompatible PulseAudio source name
Low Severity
The Darwin/macOS path in ffmpegArgs passes params.audioSource() (which defaults to "KernelOutput.monitor") as the AVFoundation audio device identifier. AVFoundation expects a numeric device index (e.g., "0", "1") or "none", not a PulseAudio source name. When recordAudio is true on macOS, ffmpeg will fail at runtime with an invalid device error despite Validate() passing. The validation only checks for an empty string, not platform compatibility.
Reviewed by Cursor Bugbot for commit ce8e4a2. Configure here.



Summary
Tests
go test ./lib/recorder -count=1DOCKER_BUILDKIT=1 docker build -f images/chromium-headful/Dockerfile -t onkernel/chromium-headful-test:latest .TESTCONTAINERS_RYUK_DISABLED=true TESTCONTAINERS_HOST_OVERRIDE=127.0.0.1 RECORDING_ZOMBO_OUTPUT_PATH=/home/agent/repos/kernel-images/server/e2e/testdata/replay-audio-zombocom.mp4 go test ./e2e -run 'TestReplayRecording(IncludesAudioTrack|ZombocomArchiveAudio)' -count=1 -vNote
Medium Risk
Touches container boot order, ffmpeg encoding, and recording defaults (supervisor env defaults
RECORD_AUDIOto true while the Go default remains false); misconfiguration could affect startup timing or produce silent/misaligned audio.Overview
Adds optional audio to browser replay recordings by capturing from a PulseAudio null sink (
KernelOutput/KernelOutput.monitor) and muxing it into MP4s via ffmpeg whenRECORD_AUDIOis enabled.Runtime & images: A shared
start-pulseaudio.shreplaces the headful-only script; headless gainspulseaudioand supervisord wiring. Pulse starts before Chromium (with socket readiness), autorestarts, and passesPULSE_*/AUDIO_SOURCEintokernel-images-apiand the chromium launcher. Headless default flags drop--mute-audioso page audio can reach the sink.Recorder:
RECORD_AUDIOandAUDIO_SOURCEconfig, validation, and Linux ffmpeg args add a Pulse input, AAC encode, explicit stream maps,veryfast/zerolatencyvideo, and timestamp tweaks (no wallclock timestamps / no async aresample).Tests & tooling: Docker e2e tests assert downloaded replays have a non-silent audio track and duration aligned with the container; a large
verify_replay_pipeline.mjsscript exercises full replay vs golden MP4 (optional SSIM/audio correlation).Reviewed by Cursor Bugbot for commit ce8e4a2. Bugbot is set up for automated code reviews on this repo. Configure here.