Samuel100/sdkv2 rust#836
Conversation
Replace the owned-Self FoundryLocalManager::create with an Arc<Self> that shares one process-wide instance via a static OnceLock<Mutex<Weak<...>>>. While any handle is alive, callers share the same instance; when the last Arc drops, NativeManager::Drop runs teardown (Manager_Shutdown + Manager_Release) while the ORT runtime is still alive and before the library's C++ static destructors -- restoring singleton semantics without the atexit hook that caused the WebGPU ReleaseEpFactory double-unregister (ORT #29206). Use OnceLock to wrap the Mutex (const Weak::new() needs Rust 1.73, above the crate's 1.70 MSRV). Update stale atexit/singleton doc comments in manager.rs, foundry_local_manager.rs, and docs/api.md. Keep the test helper's &'static signature by holding a process-lifetime OnceLock<Arc<...>>, so all existing call sites compile unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…elease Fix a use-after-free reachable from safe code: Model, the OpenAI clients, and sessions held only a raw flModel*/flSession* plus Arc<Api> (which keeps just the shared library loaded), so dropping the last Arc<FoundryLocalManager> released the native catalog and model handles while those derived objects were still alive. A natural factory pattern (create manager, get a client, return it) would dangle. This was latent under the old leaked &'static singleton and became reachable once teardown moved onto last-Arc Drop. Thread a strong Arc<NativeManager> keep-alive through NativeModel, NativeCatalog, and NativeSession so every derived handle keeps the native manager (and thus the catalog/model handles it owns) alive until the handle itself is dropped. The keep-alive targets NativeManager (a leaf that owns no Rust wrappers), so there is no reference cycle. Also add NATIVE_LIFECYCLE, a Mutex<()> held across both Manager_Create and Manager_Release, to close the create/teardown race: an Arc's strong count reaches zero slightly before Drop finishes Manager_Release, so a concurrent create() could otherwise observe no instance yet be rejected by the native single-instance guard. Validated: cargo fmt/check/clippy/doc clean (lib, tests, examples). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
origin/main #746 (Add speech result types) inserted GetSpeechSegment and GetSpeechResult into flItemApi between GetToolResult and GetMetadata. Because ffi.rs mirrors the vtable positionally via #[repr(C)], the old layout left GetMetadata/GetMutableMetadata/GetQueue and all ItemQueue_* slots shifted by two pointers — so calls like ItemQueue_Push/TryPop (used by streaming) would misdispatch to the wrong native function against a core built from the new header. Add the two function-pointer slots in the matching position, plus the supporting flSpeechWord/flSpeechSegmentData/flSpeechResultData structs, flSpeechSegmentKind, the FOUNDRY_LOCAL_ITEM_SPEECH_SEGMENT/RESULT item-type constants, and the DURATION/CONFIDENCE_UNSET sentinels. API version is unchanged (still 1). Validated: cargo fmt/check/clippy clean (lib, tests, examples). Vtable order now matches foundry_local_c.h flItemApi field-for-field (31 entries). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
origin/main #746 changed the native audio path to emit SPEECH_SEGMENT items (streaming) and a SPEECH_RESULT item (final), replacing the TextItem outputs. LiveAudioTranscriptionSession uses that native path but read results only via read_text_item / item_text (ITEM_TEXT only), so against the new core it silently produced empty streaming results and an empty final transcript. Add read_speech_segment / read_speech_result_text helpers (via the new GetSpeechSegment / GetSpeechResult accessors) and wire them into the streaming trampoline and final-transcript aggregation, with a TEXT fallback so the OpenAI-JSON path and older cores keep working. Segment timing (ms→s) and PARTIAL/FINAL state are mapped onto the existing response envelope. AudioClient::transcribe / transcribe_streaming are unaffected — they use the OpenAI-JSON path, which still returns OPENAI_JSON-tagged TEXT items. Validated: cargo fmt/check/clippy clean (lib, tests, examples). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Pull request overview
This PR ports the Foundry Local Rust SDK (sdk_v2/rust/) onto the new C++ Core ABI exposed through foundry_local_c.h. It rebuilds the public surface (FoundryLocalManager, Catalog, Model, OpenAI-style chat/embedding/audio clients, and a live-audio streaming session) on top of thin extern "system" FFI wrappers, with native-handle lifetimes anchored to a shared Arc<NativeManager>, and adds a single-binary integration test suite plus generated API docs.
Changes:
- New FFI-backed core wrappers (
detail/{api,native,manager,session,items,model,info,...}) that mirror the C ABI's ownership model (Manager owns Catalog/Models; handles keep the Manager alive; item-queue/request ownership transfer rules). - New OpenAI-compatible clients (
openai/{chat_client,embedding_client,audio_client,live_audio_session}) and public types, re-exported fromlib.rs. - A consolidated integration test binary (
tests/integration/*), examples,Cargo.toml/build.rsupdates, and API reference docs.
Reviewed changes
Copilot reviewed 43 out of 43 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
sdk_v2/rust/src/openai/live_audio_session.rs |
Live audio streaming session; exposes push_queue_capacity/bits_per_sample options that aren't wired to the native layer (commented). |
sdk_v2/rust/src/detail/{api,native,manager,session,items,model,info}.rs |
Core FFI wrappers and native lifetime/ownership handling; verified against the C ABI and stored conventions. |
sdk_v2/rust/src/openai/{chat_client,embedding_client,audio_client,mod}.rs |
OpenAI-compatible clients and response/streaming handling. |
sdk_v2/rust/src/{lib,catalog,configuration,foundry_local_manager,types,error}.rs |
Public API surface and re-exports (Model, DownloadBuilder, etc.). |
sdk_v2/rust/tests/integration/common/mod.rs |
Shared test config; logsDir still points at the legacy sdk/rust tree (commented). |
sdk_v2/rust/tests/integration/{main,manager,model,web_service,live_audio,embedding_client,chat_client}_test.rs |
New consolidated integration tests; minor duplicate/mislabeled log line in chat test (commented). |
sdk_v2/rust/GENERATE-DOCS.md, sdk_v2/rust/docs/api.md |
Doc generation guide and API reference; reference the legacy sdk/rust path and a nonexistent ModelVariant type (commented). |
sdk_v2/rust/Cargo.toml, sdk_v2/rust/build.rs |
Package metadata and native-artifact build script (verified). |
| /// Maximum number of audio chunks buffered in the internal push queue. | ||
| /// If the queue is full, [`LiveAudioTranscriptionSession::append`] will | ||
| /// wait asynchronously. | ||
| /// Default: 100 (~3 seconds of audio at typical chunk sizes). | ||
| pub push_queue_capacity: usize, |
| /// * `logsDir` → `<repo-root>/sdk/rust/logs` | ||
| /// * `logLevel` → `Warn` | ||
| pub fn test_config() -> FoundryLocalConfig { | ||
| let repo_root = get_git_repo_root(); | ||
| let logs_dir = repo_root.join("sdk").join("rust").join("logs"); |
| To generate and open the API docs in your browser: | ||
|
|
||
| ```bash | ||
| cd sdk/rust |
| | `FoundryLocalConfig` | Configuration (app name, log level, service endpoint) | | ||
| | `Catalog` | Model discovery and lookup | | ||
| | `Model` | Grouped model (alias → best variant) | | ||
| | `ModelVariant` | Single variant — download, load, unload | |
| @@ -0,0 +1,552 @@ | |||
| # Foundry Local Rust SDK — Public API Reference | |||
|
|
|||
| > Auto-generated from `sdk/rust/src` source files. | |||
| println!("Response: {content}"); | ||
|
|
||
| println!("REST response: {content}"); |
| /// final results are delivered through the transcription stream before it | ||
| /// completes. The native stop always completes to avoid session leaks, | ||
| /// even if the provided [`CancellationToken`] fires. | ||
| pub async fn stop(&self, _ct: Option<CancellationToken>) -> Result<()> { |
There was a problem hiding this comment.
You need to have Drop trait for the object (LiveAudioTranscriptionStream) as well, I think it can just call stop() otherwise you have a leak
|
|
||
| /// Provide an application logger. | ||
| /// | ||
| /// *Stub* — the logger is stored but not yet wired into the native core. |
| let mut archive = zip::ZipArchive::new(io::Cursor::new(&bytes)) | ||
| .map_err(|e| format!("open nupkg {}: {e}", pkg.name))?; | ||
|
|
||
| let mut extracted = 0usize; |
There was a problem hiding this comment.
nit: the downloaded .nupkg bytes are extracted directly without any integrity check, maybe add checksum check
|
|
||
| /// Create a [`ChatClient`](crate::openai::ChatClient) bound to the (selected) variant. | ||
| pub fn create_chat_client(&self) -> crate::openai::ChatClient { | ||
| crate::openai::ChatClient::new(self.id(), self.selected_native().clone()) |
There was a problem hiding this comment.
pub fn create_chat_client(&self) -> ChatClient {
let v = self.selected_variant();
ChatClient::new(&v.info.id, v.native.clone())
}
Prefer this structure, equally fast, but does not cause race condition if 2 threads do this.
| } | ||
|
|
||
| /// Unload the (selected) variant from memory. | ||
| pub async fn unload(&self) -> Result<String> { |
There was a problem hiding this comment.
This should not return String
| } | ||
|
|
||
| /// Remove the (selected) variant from the local cache. | ||
| pub async fn remove_from_cache(&self) -> Result<String> { |
| /// | ||
| /// The native catalog manages its own caching and refresh, so this is a | ||
| /// no-op retained for API compatibility. | ||
| pub async fn update_models(&self) -> Result<()> { |
There was a problem hiding this comment.
This is confusing, maybe make it more clear it's NO-OP
| match &self.inner { | ||
| ModelKind::Variant(v) => { | ||
| vec![Arc::new(Model { | ||
| inner: ModelKind::Variant(v.clone()), |
There was a problem hiding this comment.
Let's try to store Vec<Arc<VariantData>> instead of Vec<VariantData>, to avoid clones
// before
enum ModelKind {
Variant(VariantData),
Group { alias: String, variants: Vec<VariantData>, selected: AtomicUsize },
}
// after
enum ModelKind {
Variant(Arc<VariantData>),
Group { alias: String, variants: Vec<Arc<VariantData>>, selected: AtomicUsize },
}
``
| match &self.inner { | ||
| ModelKind::Variant(v) => Err(FoundryLocalError::ModelOperation { | ||
| reason: format!( | ||
| "select_variant is not supported on a single variant. \ |
There was a problem hiding this comment.
select_variant_by_id is not supported...
Should we check all error messages are structured properly?
|
All Copilot comments look actionable as well |
Rust SDK updated to new C++ Foundry Local Core ABI.