
feat(recording): add AudioOnly capture target and recording pipeline …#1881

Open: ManthanNimodiya wants to merge 4 commits into CapSoftware:main from ManthanNimodiya:feat/audio-only-recording


ManthanNimodiya (Contributor) commented Jun 1, 2026

Summary

  • Adds an AudioOnly variant to ScreenCaptureTarget, the source of truth for what gets captured, and propagates it through every pipeline match.
  • Wires AudioOnly through the instant and studio recording pipelines, skipping screen/camera capture entirely and building audio-only output.
  • Makes display an Option<VideoMeta> in SingleSegment and MultipleSegments. This is backwards-compatible via serde default, so existing recordings are unaffected.
  • Makes screen an Option<OutputPipeline> in the studio Pipeline, since audio-only recordings have no display track.
  • Adds audio_only: bool to RecordingMeta so the editor and share page can read it from recording-meta.json.
  • Tauri command layer: skips the screen capture permission check and shareable content acquisition for AudioOnly.
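As a rough illustration of what "propagated through every pipeline match" means, here is a minimal sketch. It assumes a heavily simplified stand-in for ScreenCaptureTarget: the variant payloads, the `captures_screen` helper, and the telemetry label strings are all hypothetical and are not taken from the PR.

```rust
/// Simplified stand-in for the real `ScreenCaptureTarget` (assumption:
/// the actual enum has more variants and richer payloads).
#[derive(Debug, Clone, PartialEq)]
pub enum ScreenCaptureTarget {
    Display { id: u32 },
    Window { id: u32 },
    AudioOnly,
}

impl ScreenCaptureTarget {
    /// Hypothetical helper: whether this target needs a screen capture source.
    /// The match is exhaustive, so adding a variant forces every site to decide.
    pub fn captures_screen(&self) -> bool {
        match self {
            Self::Display { .. } | Self::Window { .. } => true,
            Self::AudioOnly => false,
        }
    }

    /// Hypothetical telemetry label; the real strings are not shown in the PR.
    pub fn telemetry_label(&self) -> &'static str {
        match self {
            Self::Display { .. } => "display",
            Self::Window { .. } => "window",
            Self::AudioOnly => "audio_only",
        }
    }
}
```

Because neither match uses a wildcard arm, the compiler flags every match site that has not yet handled a newly added variant, which is presumably how the remaining sites were found.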

What's not in this PR

Desktop UI (mode selector), desktop editor (waveform view), and share page (AudioPlayer fallback) are follow-up PRs (Phase 6-7) that build on this foundation.

Test plan

  • cargo clippy --workspace --all-targets -- -D warnings passes clean
  • Old recording-meta.json files without display or audio_only deserialize correctly via serde defaults
  • Existing screen/camera/camera-only recordings behave unchanged (manual verification on desktop app)

Greptile Summary

This PR adds the AudioOnly variant to ScreenCaptureTarget and wires it through the instant and studio recording pipelines, skipping screen/camera capture and building audio-only output. It also makes display: Option<VideoMeta> in segment structs with serde defaults for backward compatibility, adds audio_only: bool to RecordingMeta, and adds CurrentRecordingTarget::Audio for the Tauri state layer.

  • ScreenCaptureTarget::AudioOnly is added and propagated through all match sites (telemetry, screenshot, capture pipeline, shareable content acquisition).
  • display is made Option<VideoMeta> in SingleSegment and MultipleSegments with backward-compatible serde defaults; all callers updated with as_ref().map(...).unwrap_or_default() fallbacks.
  • Pipeline::screen is made Option<OutputPipeline> in the studio pipeline, but the cancel-guard task in spawn_watcher immediately cancels audio tracks when screen is absent.

Confidence Score: 3/5

Not safe to merge without addressing the cancel-guard regression in the studio pipeline and the several audio-only finalization failures flagged across review rounds.

The cancel-guard task in Pipeline::spawn_watcher immediately cancels the microphone pipeline for every audio-only studio recording, making studio audio-only recordings non-functional. Combined with still-unresolved issues from earlier rounds — ProjectRecordingsMeta::new returning Err for audio-only (blocking finalization), audio_only: false hardcoded in persist_final_recording_meta, and AudioOnly incorrectly opening the camera window — the audio-only path is not end-to-end functional.
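One possible shape for the `audio_only` fix mentioned above, sketched under assumptions: `CaptureTarget`, the `pretty_name` field, and both function names are hypothetical stand-ins, and the real `RecordingMeta` has more fields. The idea is only that the flag is derived from the capture target or carried over from existing meta, never written as a literal `false`.

```rust
/// Hypothetical, reduced capture target used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CaptureTarget {
    Screen,
    AudioOnly,
}

/// Reduced stand-in for `RecordingMeta` (assumption: real struct is larger).
#[derive(Debug, Clone, PartialEq)]
pub struct RecordingMeta {
    pub pretty_name: String,
    pub audio_only: bool,
}

/// Option A: derive the flag from the capture target passed in by the caller.
pub fn final_recording_meta(pretty_name: &str, target: CaptureTarget) -> RecordingMeta {
    RecordingMeta {
        pretty_name: pretty_name.to_string(),
        audio_only: matches!(target, CaptureTarget::AudioOnly),
    }
}

/// Option B: preserve whatever an earlier write stored, defaulting to `false`
/// only when no earlier meta exists.
pub fn final_meta_preserving(existing: Option<&RecordingMeta>, pretty_name: &str) -> RecordingMeta {
    RecordingMeta {
        pretty_name: pretty_name.to_string(),
        audio_only: existing.map_or(false, |meta| meta.audio_only),
    }
}
```

Option A avoids an extra read from disk; Option B keeps the writer signatures closer to what they are today.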

crates/recording/src/studio_recording.rs (cancel-guard in spawn_watcher), crates/rendering/src/project_recordings.rs (SegmentRecordings.display still non-optional), apps/desktop/src-tauri/src/recording.rs (camera window triggered for AudioOnly, audio_only: false hardcoded in meta writers)

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/recording/src/studio_recording.rs | Makes screen optional throughout the studio pipeline to support audio-only mode; introduces a P1 bug where the cancel-guard task in spawn_watcher immediately cancels mic/audio pipelines when screen is None. |
| crates/rendering/src/project_recordings.rs | Converts display panics to Err returns for audio-only, but SegmentRecordings.display is still Video (non-optional), so ProjectRecordingsMeta::new propagated errors still prevent audio-only finalization. |
| apps/desktop/src-tauri/src/recording.rs | Propagates AudioOnly through recording start/finish/finalize paths; unwrap_or_default() on missing display paths is safe but fires create_screenshot against the recording directory for audio-only. |
| crates/recording/src/sources/screen_capture/mod.rs | Adds AudioOnly variant to ScreenCaptureTarget with correct None returns for display, area, rect, and name lookups. |
| crates/recording/src/instant_recording.rs | Adds AudioOnly pipeline branch using DashSegmentedAudioMuxer; video_info made optional and handled correctly at stop time. |
| crates/project/src/meta.rs | Makes display optional in SingleSegment and MultipleSegment with backward-compatible serde defaults; min_fps/max_fps updated with unwrap_or(0) fallbacks. |
| apps/desktop/src-tauri/src/lib.rs | Adds CurrentRecordingTarget::Audio variant and maps ScreenCaptureTarget::AudioOnly to it correctly. |
| apps/desktop/src-tauri/src/import.rs | Updates all display field accesses to use Option; adds audio_only: false to all imported recording metas. |
| crates/recording/src/recovery.rs | Updates recovery to use Option display fields; correctly wraps display VideoMeta in Some for recovered segments. |
| crates/rendering/src/lib.rs | Updates screen_fps and display_start_offset accesses to use Option; falls back to 0 for audio-only. |

Comments Outside Diff (4)

  1. crates/recording/src/studio_recording.rs, line 1615-1636 (link)

    P1 audio_only flag always written as false by the studio pipeline

    persist_final_recording_meta hardcodes audio_only: false in the RecordingMeta it writes to disk. For audio-only studio recordings, start_recording correctly writes audio_only: true initially, but this function overwrites the file at the end of the recording, so downstream consumers (editor, share page) will always read audio_only: false. The same problem exists in write_in_progress_meta (line 1656), which runs before recording even begins and overwrites the initial value. Both functions need to either accept the capture target as a parameter or read and preserve the existing audio_only value from disk.

  2. apps/desktop/src-tauri/src/recording.rs, line 853-869 (link)

    P2 Audio-only mode incorrectly triggers the camera window

    The AudioOnly target enters the same branch as CameraOnly here, which calls ShowCapWindow::Camera { centered: true } and sets was_camera_only_recording = true. For an audio-only recording there is no camera feed, so this opens a camera preview window with nothing to show and attaches incorrect state metadata. If the camera permission has not been granted, this could also produce an unexpected permission prompt. The AudioOnly case should likely skip this block entirely.

  3. apps/desktop/src-tauri/src/recording.rs, line 2749-2750 (link)

    P1 Audio-only studio recordings cannot complete finalization

    SegmentRecordings.display is a non-optional Video, so ProjectRecordingsMeta::new returns Err("SingleSegment/MultipleSegment missing display") whenever display is None. At both this call site (line 2749) and handle_recording_finish (line 2506), the result is propagated with ?. For audio-only studio recordings this means neither config.write nor any downstream steps run — the recording's project config is never written and the recording appears incomplete from the editor's perspective.

  4. crates/recording/src/studio_recording.rs, line 595-609 (link)

    P1 Cancel-guard immediately terminates audio pipelines for audio-only recordings

    When screen is None (audio-only), screen_done is None, so the if let Some(done) = screen_done guard is skipped and the spawned task proceeds directly to calling mic_cancel.cancel() (and cam_cancel, sys_cancel). Because the task is spawned but not immediately polled, it fires at the very next async yield point after recording starts — effectively cancelling the microphone pipeline before meaningful audio is captured. Every audio-only studio recording would produce an empty or near-empty output.

    The cancellation task should only be spawned when there is an actual screen pipeline to act as the trigger. When screen is absent, the audio pipelines should run until Pipeline::stop is called explicitly.
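The cancel-guard fix described in the last comment could look roughly like the sketch below. It is an assumption-laden stand-in: the real code is async and uses cancellation tokens, whereas this version uses a std thread, an mpsc channel as the "screen done" signal, and an AtomicBool as the cancel flag. `spawn_cancel_guard` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::Receiver;
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Spawns a guard that cancels the audio pipelines once the screen pipeline
/// finishes. When there is no screen pipeline, nothing is spawned and the
/// audio pipelines keep running until they are stopped explicitly.
pub fn spawn_cancel_guard(
    screen_done: Option<Receiver<()>>,
    audio_cancel: Arc<AtomicBool>,
) -> Option<JoinHandle<()>> {
    // Early return for audio-only: no trigger, so no guard task at all.
    let done = screen_done?;
    Some(thread::spawn(move || {
        // Blocks until the screen pipeline signals completion or its sender
        // is dropped; either way the screen is gone, so cancel the audio.
        let _ = done.recv();
        audio_cancel.store(true, Ordering::SeqCst);
    }))
}
```

The key difference from the flagged code is that the `None` case returns before any task exists, instead of skipping the wait and falling through to the cancel calls.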


Reviews (4): Last reviewed commit: "fix(recording): skip audio-only segments..." | Re-trigger Greptile

Comment thread crates/rendering/src/project_recordings.rs Outdated
Comment thread apps/desktop/src-tauri/src/lib.rs Outdated
@ManthanNimodiya
Contributor Author

@greptileai please re-review

Comment on lines +963 to +972
let raw_display_start = s
    .pipeline
    .screen
    .as_ref()
    .map(|sc| to_start_time(sc.first_timestamp));
let display_start_time = raw_display_start.unwrap_or_else(|| {
    mic_start_time
        .or(camera_start_time)
        .unwrap_or(s.start)
});

P1 Cross-track AV sync dropped for all screen recordings

The refactor that made screen optional accidentally removes the CROSS_TRACK_SNAP_SECS snapping that used to align display_start_time to camera or mic. When screen is Some (every non–audio-only recording), raw_display_start.unwrap_or_else(...) just returns the raw display timestamp and the snapping branch is never reached. Any screen+camera or screen+mic studio recording where the two clocks drift by more than CROSS_TRACK_SNAP_SECS will now produce an unsynchronised segment, whereas before it would have been corrected.

The fallback block should only fire when screen is None (audio-only). The snapping logic needs to be preserved for the Some case.
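A minimal sketch of the structure this comment asks for, with loud assumptions: the original snapping code is not shown in this thread, so the rule below (snap to the mic or camera start when the drift exceeds the threshold) only follows the wording of the comment, the threshold value is made up, and `resolve_display_start` is a hypothetical helper working on plain seconds.

```rust
/// Hypothetical value; the real `CROSS_TRACK_SNAP_SECS` is not shown here.
const CROSS_TRACK_SNAP_SECS: f64 = 0.05;

/// Resolves the display start time for a segment.
/// - Screen present: keep the snapping path (assumed rule, see lead-in).
/// - Screen absent (audio-only): fall back to mic, then camera, then the
///   segment start, mirroring the fallback in the diff above.
pub fn resolve_display_start(
    raw_display_start: Option<f64>,
    mic_start: Option<f64>,
    camera_start: Option<f64>,
    segment_start: f64,
) -> f64 {
    let reference = mic_start.or(camera_start);
    match (raw_display_start, reference) {
        // Drift beyond the threshold: align the display to the other track.
        (Some(raw), Some(r)) if (raw - r).abs() > CROSS_TRACK_SNAP_SECS => r,
        // Within the threshold, or no other track to compare against.
        (Some(raw), _) => raw,
        // Audio-only: there is no display timestamp to snap.
        (None, r) => r.unwrap_or(segment_start),
    }
}
```

Matching on both values keeps the `Some` and `None` cases visibly separate, so the fallback cannot swallow the snapping branch again.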


@ManthanNimodiya
Contributor Author

@greptileai please re-review

Comment on lines 3051 to 3055
 for segment in &inner.segments {
-    let display_path = segment.display.path.to_path(recording_dir);
+    let display_path = segment.display.as_ref().map(|d| d.path.clone()).unwrap_or_default().to_path(recording_dir);
     if display_path.is_dir() {
         return true;
     }

P1 needs_fragment_remux always returns true for audio-only recordings

When display is None, unwrap_or_default() produces an empty RelativePathBuf, and .to_path(recording_dir) resolves to recording_dir itself. Since recording_dir.is_dir() is always true, every audio-only MultipleSegments recording is incorrectly flagged as fragmented. This causes the editor to open in a "finalizing" state for audio-only recordings and spawns finalize_studio_recording through a fragmented-remux code path that wasn't intended for them.
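The pitfall can be reproduced with std paths alone. This sketch uses `std::path::PathBuf` as a stand-in for `RelativePathBuf` (an assumption; the real type comes from a separate crate), and both function names are hypothetical. Joining an empty path onto a directory still points at that same directory, so `is_dir()` succeeds.

```rust
use std::path::{Path, PathBuf};

/// Mirrors the flagged pattern: a missing display becomes an empty path,
/// which resolves to the recording directory itself.
pub fn needs_remux_buggy(recording_dir: &Path, display: Option<&PathBuf>) -> bool {
    let rel = display.cloned().unwrap_or_default();
    recording_dir.join(rel).is_dir()
}

/// One possible fix: a segment without a display can never need a
/// fragment remux, so the path is only checked when it exists.
pub fn needs_remux_fixed(recording_dir: &Path, display: Option<&PathBuf>) -> bool {
    display.map_or(false, |rel| recording_dir.join(rel).is_dir())
}
```

With any existing directory as `recording_dir` and `None` as the display, the first function reports a fragmented recording and the second does not.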


@ManthanNimodiya
Contributor Author

@greptileai please re-review
