Skip to content

Samuel100/sdkv2 rust#836

Open
samuel100 wants to merge 8 commits into
mainfrom
samuel100/sdkv2-rust
Open

Samuel100/sdkv2 rust#836
samuel100 wants to merge 8 commits into
mainfrom
samuel100/sdkv2-rust

Conversation

@samuel100

Copy link
Copy Markdown
Contributor

Rust SDK updated to new C++ Foundry Local Core ABI.

samuel100 and others added 8 commits June 22, 2026 17:18
Replace the owned-Self FoundryLocalManager::create with an Arc<Self> that
shares one process-wide instance via a static OnceLock<Mutex<Weak<...>>>.
While any handle is alive, callers share the same instance; when the last
Arc drops, NativeManager::Drop runs teardown (Manager_Shutdown +
Manager_Release) while the ORT runtime is still alive and before the
library's C++ static destructors -- restoring singleton semantics without
the atexit hook that caused the WebGPU ReleaseEpFactory double-unregister
(ORT #29206).

Use OnceLock to wrap the Mutex (const Weak::new() needs Rust 1.73, above
the crate's 1.70 MSRV). Update stale atexit/singleton doc comments in
manager.rs, foundry_local_manager.rs, and docs/api.md. Keep the test
helper's &'static signature by holding a process-lifetime OnceLock<Arc<...>>,
so all existing call sites compile unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…elease

Fix a use-after-free reachable from safe code: Model, the OpenAI clients, and
sessions held only a raw flModel*/flSession* plus Arc<Api> (which keeps just the
shared library loaded), so dropping the last Arc<FoundryLocalManager> released
the native catalog and model handles while those derived objects were still
alive. A natural factory pattern (create manager, get a client, return it) would
dangle. This was latent under the old leaked &'static singleton and became
reachable once teardown moved onto last-Arc Drop.

Thread a strong Arc<NativeManager> keep-alive through NativeModel, NativeCatalog,
and NativeSession so every derived handle keeps the native manager (and thus the
catalog/model handles it owns) alive until the handle itself is dropped. The
keep-alive targets NativeManager (a leaf that owns no Rust wrappers), so there is
no reference cycle.

Also add NATIVE_LIFECYCLE, a Mutex<()> held across both Manager_Create and
Manager_Release, to close the create/teardown race: an Arc's strong count reaches
zero slightly before Drop finishes Manager_Release, so a concurrent create() could
otherwise observe no instance yet be rejected by the native single-instance guard.

Validated: cargo fmt/check/clippy/doc clean (lib, tests, examples).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
origin/main #746 (Add speech result types) inserted GetSpeechSegment and
GetSpeechResult into flItemApi between GetToolResult and GetMetadata. Because
ffi.rs mirrors the vtable positionally via #[repr(C)], the old layout left
GetMetadata/GetMutableMetadata/GetQueue and all ItemQueue_* slots shifted by two
pointers — so calls like ItemQueue_Push/TryPop (used by streaming) would
misdispatch to the wrong native function against a core built from the new header.

Add the two function-pointer slots in the matching position, plus the supporting
flSpeechWord/flSpeechSegmentData/flSpeechResultData structs, flSpeechSegmentKind,
the FOUNDRY_LOCAL_ITEM_SPEECH_SEGMENT/RESULT item-type constants, and the
DURATION/CONFIDENCE_UNSET sentinels. API version is unchanged (still 1).

Validated: cargo fmt/check/clippy clean (lib, tests, examples). Vtable order now
matches foundry_local_c.h flItemApi field-for-field (31 entries).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
origin/main #746 changed the native audio path to emit SPEECH_SEGMENT items
(streaming) and a SPEECH_RESULT item (final), replacing the TextItem outputs.
LiveAudioTranscriptionSession uses that native path but read results only via
read_text_item / item_text (ITEM_TEXT only), so against the new core it silently
produced empty streaming results and an empty final transcript.

Add read_speech_segment / read_speech_result_text helpers (via the new
GetSpeechSegment / GetSpeechResult accessors) and wire them into the streaming
trampoline and final-transcript aggregation, with a TEXT fallback so the
OpenAI-JSON path and older cores keep working. Segment timing (ms→s) and
PARTIAL/FINAL state are mapped onto the existing response envelope.

AudioClient::transcribe / transcribe_streaming are unaffected — they use the
OpenAI-JSON path, which still returns OPENAI_JSON-tagged TEXT items.

Validated: cargo fmt/check/clippy clean (lib, tests, examples).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 24, 2026 12:58
@vercel

vercel Bot commented Jun 24, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
foundry-local Ready Ready Preview, Comment Jun 24, 2026 12:58pm

Request Review

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR ports the Foundry Local Rust SDK (sdk_v2/rust/) onto the new C++ Core ABI exposed through foundry_local_c.h. It rebuilds the public surface (FoundryLocalManager, Catalog, Model, OpenAI-style chat/embedding/audio clients, and a live-audio streaming session) on top of thin extern "system" FFI wrappers, with native-handle lifetimes anchored to a shared Arc<NativeManager>, and adds a single-binary integration test suite plus generated API docs.

Changes:

  • New FFI-backed core wrappers (detail/{api,native,manager,session,items,model,info,...}) that mirror the C ABI's ownership model (Manager owns Catalog/Models; handles keep the Manager alive; item-queue/request ownership transfer rules).
  • New OpenAI-compatible clients (openai/{chat_client,embedding_client,audio_client,live_audio_session}) and public types, re-exported from lib.rs.
  • A consolidated integration test binary (tests/integration/*), examples, Cargo.toml/build.rs updates, and API reference docs.

Reviewed changes

Copilot reviewed 43 out of 43 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
sdk_v2/rust/src/openai/live_audio_session.rs Live audio streaming session; exposes push_queue_capacity/bits_per_sample options that aren't wired to the native layer (commented).
sdk_v2/rust/src/detail/{api,native,manager,session,items,model,info}.rs Core FFI wrappers and native lifetime/ownership handling; verified against the C ABI and stored conventions.
sdk_v2/rust/src/openai/{chat_client,embedding_client,audio_client,mod}.rs OpenAI-compatible clients and response/streaming handling.
sdk_v2/rust/src/{lib,catalog,configuration,foundry_local_manager,types,error}.rs Public API surface and re-exports (Model, DownloadBuilder, etc.).
sdk_v2/rust/tests/integration/common/mod.rs Shared test config; logsDir still points at the legacy sdk/rust tree (commented).
sdk_v2/rust/tests/integration/{main,manager,model,web_service,live_audio,embedding_client,chat_client}_test.rs New consolidated integration tests; minor duplicate/mislabeled log line in chat test (commented).
sdk_v2/rust/GENERATE-DOCS.md, sdk_v2/rust/docs/api.md Doc generation guide and API reference; reference the legacy sdk/rust path and a nonexistent ModelVariant type (commented).
sdk_v2/rust/Cargo.toml, sdk_v2/rust/build.rs Package metadata and native-artifact build script (verified).

Comment on lines +67 to +71
/// Maximum number of audio chunks buffered in the internal push queue.
/// If the queue is full, [`LiveAudioTranscriptionSession::append`] will
/// wait asynchronously.
/// Default: 100 (~3 seconds of audio at typical chunk sizes).
pub push_queue_capacity: usize,
Comment on lines +81 to +85
/// * `logsDir` → `<repo-root>/sdk/rust/logs`
/// * `logLevel` → `Warn`
pub fn test_config() -> FoundryLocalConfig {
let repo_root = get_git_repo_root();
let logs_dir = repo_root.join("sdk").join("rust").join("logs");
To generate and open the API docs in your browser:

```bash
cd sdk/rust
| `FoundryLocalConfig` | Configuration (app name, log level, service endpoint) |
| `Catalog` | Model discovery and lookup |
| `Model` | Grouped model (alias → best variant) |
| `ModelVariant` | Single variant — download, load, unload |
Comment thread sdk_v2/rust/docs/api.md
@@ -0,0 +1,552 @@
# Foundry Local Rust SDK — Public API Reference

> Auto-generated from `sdk/rust/src` source files.
Comment on lines +55 to +57
println!("Response: {content}");

println!("REST response: {content}");
/// final results are delivered through the transcription stream before it
/// completes. The native stop always completes to avoid session leaks,
/// even if the provided [`CancellationToken`] fires.
pub async fn stop(&self, _ct: Option<CancellationToken>) -> Result<()> {

@nenad1002 nenad1002 Jun 24, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to have Drop trait for the object (LiveAudioTranscriptionStream) as well, I think it can just call stop() otherwise you have a leak


/// Provide an application logger.
///
/// *Stub* — the logger is stored but not yet wired into the native core.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not wired yet?

Comment thread sdk_v2/rust/build.rs
let mut archive = zip::ZipArchive::new(io::Cursor::new(&bytes))
.map_err(|e| format!("open nupkg {}: {e}", pkg.name))?;

let mut extracted = 0usize;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the downloaded .nupkg bytes are extracted directly without any integrity check, maybe add checksum check


/// Create a [`ChatClient`](crate::openai::ChatClient) bound to the (selected) variant.
pub fn create_chat_client(&self) -> crate::openai::ChatClient {
crate::openai::ChatClient::new(self.id(), self.selected_native().clone())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub fn create_chat_client(&self) -> ChatClient {
    let v = self.selected_variant();
    ChatClient::new(&v.info.id, v.native.clone())
}

Prefer this structure, equally fast, but does not cause race condition if 2 threads do this.

}

/// Unload the (selected) variant from memory.
pub async fn unload(&self) -> Result<String> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not return String

}

/// Remove the (selected) variant from the local cache.
pub async fn remove_from_cache(&self) -> Result<String> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sam here, no String

///
/// The native catalog manages its own caching and refresh, so this is a
/// no-op retained for API compatibility.
pub async fn update_models(&self) -> Result<()> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing, maybe make it more clear it's NO-OP

match &self.inner {
ModelKind::Variant(v) => {
vec![Arc::new(Model {
inner: ModelKind::Variant(v.clone()),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to store Vec<Arc<VariantData>> instead of Vec<VariantData>, to avoid clones

// before
enum ModelKind {
    Variant(VariantData),
    Group { alias: String, variants: Vec<VariantData>, selected: AtomicUsize },
}

// after
enum ModelKind {
    Variant(Arc<VariantData>),
    Group { alias: String, variants: Vec<Arc<VariantData>>, selected: AtomicUsize },
}
``

match &self.inner {
ModelKind::Variant(v) => Err(FoundryLocalError::ModelOperation {
reason: format!(
"select_variant is not supported on a single variant. \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

select_variant_by_id is not supported...

Should we check all error messages are structured properly?

@nenad1002

Copy link
Copy Markdown
Contributor

All Copilot comments look actionable as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants