Integrate "get model versions" and "download specific model version" into cpp Core with max_versions by selenayang888 · Pull Request #816 · microsoft/Foundry-Local

selenayang888 · 2026-06-18T01:25:48Z

This PR ports the "get model versions" and "download specific model version" changes from C# into the C++ Core. With them, callers can:

List all available versions of a model (e.g., to see what versions of phi-4 are published).
Download and use a specific older version (e.g., for reproducibility, compatibility, or regression testing).

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

skottmckay · 2026-06-24T04:30:48Z

+  /// is owned by the list and remains valid until the list is released. NULL
+  /// when no continuation token was set (e.g. the list came from GetModels(),
+  /// or GetModelVersions has walked the underlying source to exhaustion).
+  const char* FL_API_T(ModelList_GetContinuationToken, _In_ const flModelList* models);


Can this be removed along with next_continuation_token in flModelList?

Yes, I removed this API, since next_continuation_token had already been removed.

skottmckay · 2026-06-24T04:37:46Z

+  /// Get all versions of a model alias, optionally narrowed to a specific model name.
+  /// @param model_alias Alias of the model (e.g. "phi-4-mini"). Must be non-NULL and non-empty.
+  /// @param model_name Optional model name (ModelInfo.Name, e.g. "Phi-4-generic-gpu"). NULL returns
+  ///        every model name.
+  /// @param max_versions Select latest X versions per model name. Pass 0 (or any
+  ///        negative value) for no per-model-name cap.
+  /// Returned list contains existing flModel handles owned by the catalog; releasing
+  /// the list does not invalidate the underlying model handles.
+  FL_API_STATUS(GetModelVersions, _In_ const flCatalog* catalog, _In_ const char* model_alias,
+                _In_opt_ const char* model_name, int32_t max_versions, _Outptr_ flModelList** out_models);


Should we note that (IIUC) these models will be returned by catalog operations in the current instance (e.g. list models) but are not saved in the model cache on disk so won't be present on restart? I think that behavior is fine but it could be a little surprising.

I think this is how the design doc specifies the caching strategy. get_model_versions results are not cached in BaseModelCatalog; each call to get_model_versions does a fresh catalog query. However, to support downloading an older version, the resolved ModelInfo for that version is temporarily added to the catalog's modelIdToInfo index.

The link below is the part in the design doc for Caching Strategy for All-Versions Data:

model-versions-design.md

Should we explain that behavior to the user in the comments here, as it's a little opaque?

I have added an explanation of the behavior to the comments.
In addition, to follow the design doc's caching strategy, I updated GetModelVersions in base_model_catalog.cc:475 so that it no longer calls IntegrateVariants. It now stores the fetched models in transient per-catalog query storage and returns those handles without adding them to the main alias/id indices.
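The max_versions semantics from the header comment above (keep the latest X versions per model name, with 0 or a negative value meaning no cap) could be sketched roughly as follows. This is only an illustration, not the PR's implementation: `VersionEntry` and `SelectLatestVersions` are hypothetical names, and versions are assumed to be plain integers where a higher number is newer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the ModelInfo fields this sketch needs.
struct VersionEntry {
  std::string name;  // model name, e.g. "Phi-4-generic-gpu"
  int version = 0;   // assumption: integer version, higher means newer
};

// Keeps at most max_versions of the newest entries for each model name.
// max_versions <= 0 means "no per-model-name cap".
std::vector<VersionEntry> SelectLatestVersions(std::vector<VersionEntry> entries,
                                               int32_t max_versions) {
  // Sort by name, then newest version first, so each name's entries are contiguous.
  std::sort(entries.begin(), entries.end(),
            [](const VersionEntry& a, const VersionEntry& b) {
              if (a.name != b.name) return a.name < b.name;
              return a.version > b.version;
            });
  if (max_versions <= 0) return entries;

  std::vector<VersionEntry> kept;
  std::map<std::string, int32_t> seen_per_name;
  for (auto& entry : entries) {
    // Count entries per name and keep only the first max_versions of each.
    if (seen_per_name[entry.name]++ < max_versions) {
      kept.push_back(std::move(entry));
    }
  }
  return kept;
}
```

Applying the cap per model name rather than globally means a name with many published versions cannot crowd other names out of the result.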

skottmckay · 2026-06-24T04:40:51Z

+struct ModelVersionsPage {
+  std::vector<Model*> models;
+};


Do we need this struct anymore?

Removed the ModelVersionsPage struct.

skottmckay · 2026-06-24T04:50:24Z

  while (true) {
-    const std::string body = BuildRequestBody(filters, skip, continuation_token);
+    int requested_page_size = kPageSize;
+    if (max_count > 0) {


Does anything set max_count? If not can we remove the code related to it?

Since FetchFilterSetWithState has been reverted, there is no max_count_ anymore.

skottmckay · 2026-06-24T04:52:56Z

-      // Page 1: run through region fallback starting from the sticky region (last known-good) or the active region.
-      // Exhaustion means every candidate had a retryable region-health failure, so fail just this filter set.
-      const std::string start =
-          region_fallback_.StickyRegion().value_or(region_);
+      // Page 1 fresh start: run region fallback starting from the sticky/active region.
+      const std::string start = region_fallback_.StickyRegion().value_or(region_);


nit: removed comments that seemed like they were more descriptive of the processing here and lines 296, 302-303.

I reverted FetchFilterSetWithState back to FetchFilterSet, since continuation_token has already been removed from getModelVersions.

skottmckay · 2026-06-24T05:07:33Z

+  // Scan local models so any version already on disk is reported as cached.
+  auto local_models = ScanLocalModels(cache_dir_, logger_);


Do we need to scan local models here and in FetchModelsByIds? That seems unrelated to fetching the latest versions/ids. We want to do it when we do the general fetch, but I don't think it's required in these targeted fetches.

skottmckay · 2026-06-24T05:17:58Z

+  struct FetchedModelVersions {
+    std::vector<Model> models;
+  };
+
+  virtual FetchedModelVersions FetchModelVersions(
+      const std::string& /*model_alias*/,
+      const std::string& /*model_name*/ = "") const {
+    return {};
+  }


Do we need the FetchedModelVersions struct?

It is only a thin wrapper around a vector, so I removed it and replaced it with std::vector<Model> in the base virtual method.
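For illustration, the simplified base method might look roughly like the sketch below. The class names `CatalogClientBase` and `FakeCatalogClient` and the one-field `Model` are hypothetical stand-ins; only the `FetchModelVersions` signature shape is taken from the diff above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, minimal Model; the real class carries much more state.
struct Model {
  std::string id;
};

// Hypothetical base class standing in for the catalog client interface.
class CatalogClientBase {
 public:
  virtual ~CatalogClientBase() = default;

  // Default implementation: sources without version support return nothing.
  virtual std::vector<Model> FetchModelVersions(
      const std::string& /*model_alias*/,
      const std::string& /*model_name*/ = "") const {
    return {};
  }
};

// Hypothetical override that returns two versions for any alias.
class FakeCatalogClient : public CatalogClientBase {
 public:
  std::vector<Model> FetchModelVersions(
      const std::string& model_alias,
      const std::string& /*model_name*/ = "") const override {
    return {Model{model_alias + ":1"}, Model{model_alias + ":2"}};
  }
};
```

One C++ detail worth keeping in mind with this shape: default arguments on virtual functions are bound by the static type of the call, so overrides should repeat the same default to avoid surprises.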

skottmckay · 2026-06-24T05:27:43Z

+  // Capture fetch order before moving so we can return models in the order the
+  // underlying source produced them — important for stable pagination.
+  std::vector<std::string> fetched_ids;
+  fetched_ids.reserve(fetched.models.size());
+  for (const auto& m : fetched.models) {
+    fetched_ids.push_back(m.Info().model_id);
+  }


Is this still needed?

skottmckay · 2026-06-24T05:50:26Z

+// Deterministic API output order: alias alpha, then name alpha, then version asc.
+bool CompareModelPointersForVersionList(const Model* lhs, const Model* rhs) {


Should this have the same ordering as CompareModelsForSort? Currently the device is ignored so things like 'cuda' and 'generic-gpu' will come before 'npu' which would differ from the model list output.

When we add to the model variants in IntegrateVariantsLocked do we end up with a weird ordering given AddVariant appends to the original list, which was sorted using CompareModelsForSort?

This branch off yours has an alternative approach where the variant sorting is pushed down to Model given we're now adding new variants in multiple places: skottmckay/get-model-versions-with-variant-sort

It also uses that ordering when returning values from GetModelVersions for consistency.

Removing the local scan from FetchModelVersions and adopting the transient version_query_models_ approach are both solid and fully address the first, main concern in the comments.
Beyond that, there are three major changes:

Added container re-sorting support in model.h and model.cc.

Applied that re-sort in integration flow in base_model_catalog.cc

Fixed max_versions selection robustness in base_model_catalog.cc

Thanks for sharing the branch. I've merged it in (after the latest origin/main merge). Everything you suggested in the Teams chat is done:

Make Model own ordering invariant

Move comparator/device-priority logic from BaseModelCatalog to Model

Add default-selection step in Model

Remove duplicated ordering work in GetModelVersions and integration paths.
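The idea of letting the model container own its ordering invariant, with device priority included in the comparator, could be sketched as below. This is an assumption-laden illustration rather than the merged code: `VariantInfo`, `VariantContainer`, `DevicePriority`, and the NPU, GPU, CPU priority order are all hypothetical, since the actual CompareModelsForSort ordering is not shown in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

enum class Device { kNpu, kGpu, kCpu };

// Hypothetical subset of the fields used for ordering.
struct VariantInfo {
  std::string alias;
  std::string name;
  Device device;
  int version;
};

// Assumed priority: lower value sorts first. The real priorities may differ.
int DevicePriority(Device d) {
  switch (d) {
    case Device::kNpu: return 0;
    case Device::kGpu: return 1;
    case Device::kCpu: return 2;
  }
  assert(false && "unknown device");
  return 3;
}

// Orders by alias, then device priority, then name, then version ascending.
bool CompareVariants(const VariantInfo& lhs, const VariantInfo& rhs) {
  const int lp = DevicePriority(lhs.device);
  const int rp = DevicePriority(rhs.device);
  return std::tie(lhs.alias, lp, lhs.name, lhs.version) <
         std::tie(rhs.alias, rp, rhs.name, rhs.version);
}

// The container keeps its variants sorted on every insert, so callers that
// add variants from different code paths never need to re-sort.
class VariantContainer {
 public:
  void AddVariant(VariantInfo v) {
    auto pos = std::upper_bound(variants_.begin(), variants_.end(), v,
                                CompareVariants);
    variants_.insert(pos, std::move(v));
  }

  const std::vector<VariantInfo>& Variants() const { return variants_; }

 private:
  std::vector<VariantInfo> variants_;
};
```

Because the sorted insert lives in one place, a plain append can no longer leave the list in an order that differs from the default model list.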

skottmckay · 2026-06-24T07:30:09Z

+  // Optional Windows App SDK bootstrap. When the caller enables Bootstrap in
+  // additional_options we initialize the WinAppSDK framework package for this process. This
+  // must run before the Manager constructor so that WinML EP discovery (inside
+  // Manager::Manager) can resolve Microsoft.Windows.AI.MachineLearning.dll. We use a
+  // temporary stderr logger here because the Manager-owned logger doesn't exist yet;
+  // bootstrap output is low-volume (one line on success, one warning on failure). Mirrors
+  // the C# FoundryLocalCore IS_WINML path. Only meaningful in WinML builds; outside that
+  // configuration TryInitializeWindowsAppSdk is a no-op stub.
+#if defined(FOUNDRY_LOCAL_USE_WINML) && FOUNDRY_LOCAL_USE_WINML
+  {
+    if (IsAdditionalOptionEnabled(config, "Bootstrap")) {
+      StderrLogger bootstrap_logger;
+      TryInitializeWindowsAppSdk(bootstrap_logger);
+    }
+  }
+#endif


Assuming this is unintentional

It was unintentional; I have reverted the changes.

skottmckay · 2026-06-26T08:17:16Z


+  /// Transient storage for the most recent GetModelVersions query. Replaced on each call.
+  /// These models are intentionally not integrated into the main lookup indices.
+  mutable std::vector<std::unique_ptr<Model>> version_query_models_;


Replacing on each call creates potential issues on the client side if they call this for multiple models as it would invalidate all info from the previous fetch and they have no way to keep that info alive.

Could we store in an unordered_map<string_view, unique_ptr> that uses the model id as the key?
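A rough sketch of that suggestion, assuming C++17 for std::string_view: results are kept in a map keyed by model id, so a second query adds entries instead of replacing the first query's models. `VersionQueryStore`, `Upsert`, the two-field `Model`, and the id format used in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical, minimal Model. The id must not change while the model is stored,
// because the map key is a view into it.
struct Model {
  std::string id;
  int version = 0;
};

class VersionQueryStore {
 public:
  // Stores the fetched model unless one with the same id already exists.
  // Either way, returns a pointer that stays valid across later calls.
  Model* Upsert(Model fetched) {
    assert(!fetched.id.empty());
    auto it = models_.find(fetched.id);
    if (it != models_.end()) return it->second.get();

    auto owned = std::make_unique<Model>(std::move(fetched));
    Model* raw = owned.get();
    // The key views the id inside the heap-allocated Model, whose address
    // does not change when the map rehashes.
    models_.emplace(std::string_view(raw->id), std::move(owned));
    return raw;
  }

  std::size_t Size() const { return models_.size(); }

 private:
  std::unordered_map<std::string_view, std::unique_ptr<Model>> models_;
};
```

With this shape, a client that calls `Upsert({"phi-4-mini:1", 1})` and later `Upsert({"phi-4:2", 2})` can still use the first pointer. The trade-off is that the store grows until the catalog is destroyed, unless an explicit eviction policy is added.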

skottmckay · 2026-06-26T08:33:48Z

+      }
+
+      result.push_back(model.get());
+    }


My concern with this is that we're using a different structure from the Models returned by the default list, which have a parent (a.k.a. 'container') Model instance for the alias, with the variants accessed from that. Here we're returning individual variants, and no parent Model instance is created.

That means usage could differ in the user code, and ideally we can avoid that.

Given we're fetching a single alias, could we create the parent Model and instead use AddVariant to add each individual result and return the parent? That would mean the API returns a single Model though.

Actually we could do both. Internally create the same structure (parent container Model with leaf variants), and return a flModelList of the variants at the C API level like we do in Model_GetVariantsImpl.

That creates a consistent setup internally whilst returning the expected list of individual variant Model instances.
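The "do both" idea above could look roughly like this: build one parent container for the alias internally, then expose a flat, non-owning list of its variants. The simplified `Model` class and `BuildVersionContainer` are hypothetical; the real Model and the C API list type (flModelList) are not reproduced here.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified Model: a container holds variants, a leaf holds none.
class Model {
 public:
  explicit Model(std::string id) : id_(std::move(id)) {}

  const std::string& Id() const { return id_; }

  void AddVariant(std::unique_ptr<Model> variant) {
    assert(variant != nullptr);
    variants_.push_back(std::move(variant));
  }

  // Non-owning view of the variants; the parent keeps ownership.
  std::vector<Model*> Variants() const {
    std::vector<Model*> out;
    out.reserve(variants_.size());
    for (const auto& v : variants_) out.push_back(v.get());
    return out;
  }

 private:
  std::string id_;
  std::vector<std::unique_ptr<Model>> variants_;
};

// Builds the parent container for one alias from the ids of the fetched versions.
std::unique_ptr<Model> BuildVersionContainer(
    const std::string& alias, const std::vector<std::string>& fetched_ids) {
  auto parent = std::make_unique<Model>(alias);
  for (const auto& id : fetched_ids) {
    parent->AddVariant(std::make_unique<Model>(id));
  }
  return parent;
}
```

The caller still receives a list of individual variants, but internally they hang off a parent just as they do in the default model list, so the returned pointers remain valid only while that parent is alive.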

baijumeswani and others added 14 commits June 11, 2026 21:20

Enable live catalog with region-aware fallback engine

3a6155b

Add AzureAiStudio User-Agent

cc1e22f

Address pr review comments

a827ec8

integrate get model versions into cpp core

a75bb36

Default registry region to centralus, httprequestoptions, and config …

fd865f3

…parsing

Merge branch 'main' of https://github.com/microsoft/Foundry-Local int…

c872980

…o baijumeswani/catalog

Address pull-request review comments

22f5b0f

Address pull-request review comments

d4b61f3

add max_version and continuation token

6739874

Change all defaults to centralus

f0fb73c

Address pull-request review comments

859d927

add a new test with empty alias and sort the results

7a3d089

Merge remote-tracking branch 'origin/baijumeswani/catalog' into syang…

b73c5e1

…/integrate-get-model-versions-into-cpp

Merge remote-tracking branch 'origin/main' into syang/integrate-get-m…

e7802c7

…odel-versions-into-cpp

selenayang888 marked this pull request as ready for review June 18, 2026 01:27

Copilot AI review requested due to automatic review settings June 18, 2026 01:27

Copilot started reviewing on behalf of selenayang888 June 18, 2026 01:29 View session

Copilot AI reviewed Jun 18, 2026

Copilot stopped reviewing on behalf of selenayang888 due to an error June 18, 2026 01:49
Copilot had to stop work due to a timeout.

skottmckay reviewed Jun 18, 2026

View reviewed changes

Comment thread sdk_v2/cpp/src/catalog/azure_catalog_client.cc

Resolve all the comments

2a37f65

vercel Bot deployed to Preview June 23, 2026 06:28 View deployment

put "CompareCaseInsensitive" in utils/string_utils.h

ffb7f46

vercel Bot deployed to Preview June 23, 2026 07:06 View deployment

skottmckay reviewed Jun 24, 2026

View reviewed changes

Merge remote-tracking branch 'origin/main' into syang/integrate-get-m…

850358e

…odel-versions-into-cpp

vercel Bot deployed to Preview June 24, 2026 23:55 View deployment

Resolved three comments

1a31fc3

vercel Bot deployed to Preview June 25, 2026 07:24 View deployment

Resolve two comments

7164318

vercel Bot deployed to Preview June 25, 2026 23:49 View deployment

Resolved GetModelVersions , local copy and refactor catalog_urls_

e9aa2f4

vercel Bot deployed to Preview June 26, 2026 06:56 View deployment

skottmckay reviewed Jun 26, 2026

View reviewed changes

Addressing the comment for ordering of `CompareModelPointersForVersio…

e4401ce

…nList`

vercel Bot deployed to Preview June 26, 2026 08:58 View deployment

Merge remote-tracking branch 'origin/main' into syang/integrate-get-m…

94a1796

…odel-versions-into-cpp

vercel Bot deployed to Preview June 26, 2026 19:11 View deployment

Align with CompareModelsForSort

bbfb6b4

vercel Bot deployed to Preview June 27, 2026 07:25 View deployment

4 participants

vercel Bot commented Jun 18, 2026 •

edited

Loading

selenayang888 Jun 26, 2026 •

edited

Loading

selenayang888 Jun 26, 2026 •

edited

Loading