Skip to content

fix(mcp): report llm_available via the chat-model credential resolver (#200)#204

Open
wernerkasselman-au wants to merge 1 commit into
NVIDIA:mainfrom
wernerkasselman-au:fix/mcp-llm-available-fallback-creds
Open

fix(mcp): report llm_available via the chat-model credential resolver (#200)#204
wernerkasselman-au wants to merge 1 commit into
NVIDIA:mainfrom
wernerkasselman-au:fix/mcp-llm-available-fallback-creds

Conversation

@wernerkasselman-au

Copy link
Copy Markdown
Contributor

What this fixes

Closes the bug I filed as #200. The MCP server decided whether the semantic pass could run by calling resolve_provider_credentials(), which only ever resolves the active provider's credentials. The model the graph actually constructs, though, comes from create_chat_model(), and that one falls back to a standard OpenAI client (OPENAI_API_KEY / OPENAI_BASE_URL) when the active provider is not configured. So on an OpenAI-only setup (the default NVIDIA provider with no NVIDIA_INFERENCE_KEY, but OPENAI_API_KEY set), the gate came out false, the server reported scan_mode: static-only, and it skipped the LLM pass that the CLI would have run on the very same environment.

The change

One line of behaviour: gate llm_available on resolve_chat_model_credentials() (providers/__init__.py:111-117) instead of resolve_provider_credentials(). That resolver already mirrors the OpenAI fallback that create_chat_model() relies on, so the MCP accounting now lines up with the model path the graph actually takes. I left a comment at the call site explaining why the fallback-aware resolver is the right one, so nobody narrows it back to the active provider later.

Tests

  • Updated the three existing accounting tests to patch the new resolver (resolve_chat_model_credentials), since that is the name the server now consults.
  • Added test_run_scan_reports_llm_available_via_openai_fallback, a regression test that patches the active-provider-only resolver to None and the chat-model resolver to a credential, then asserts the server reports the LLM as available. On the old code this fails (it consulted the active-only resolver), so it pins the fix.
uv run ruff check src/ tests/        # clean
uv run ruff format --check src/ tests/  # clean
uv run pytest tests/unit/test_mcp_server.py -q  # 6 passed

One thing I want to flag, in case it was deliberate: if the MCP path was scoped to the active provider on purpose (say, to keep the server from quietly using a developer's ambient OPENAI_API_KEY), then this change is the wrong call and I would rather hear that and close it. Otherwise the accounting should tell the truth about what the scan can do.

Refs #200

run_scan gated the LLM pass on resolve_provider_credentials(), which resolves
only the active provider's credentials. create_chat_model() falls back to a
standard OpenAI client (OPENAI_API_KEY / OPENAI_BASE_URL) when the active
provider is unconfigured, so an OpenAI-only setup reported llm_available=false
and the semantic pass was skipped even though the CLI would have run it.

Gate on resolve_chat_model_credentials(), which includes that fallback, so the
MCP server's accounting matches the model path the graph actually takes. Update
the existing accounting tests to patch the new resolver and add a regression
test for the OpenAI-fallback case.

Refs NVIDIA#200

Signed-off-by: Werner Kasselman <145896621+wernerkasselman-au@users.noreply.github.com>

@rng1995 rng1995 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving — correct fix for #200. Gating MCP llm_available on resolve_chat_model_credentials() (the chat-model resolver) instead of resolve_provider_credentials() (active-provider only) makes availability match what create_chat_model actually does, including the OpenAI fallback. The regression test patches the resolver and asserts llm_available is reported truthfully. Note this edits mcp_server.py alongside #196 — expect a rebase.

@vcolombo

vcolombo commented Jul 2, 2026

Copy link
Copy Markdown

Ran into #200 independently while embedding SkillSpector in an agent runtime, and I'd like to suggest going one step further than resolve_chat_model_credentials(): gating on llm_utils.is_llm_available() instead.

The credential-resolver fix correctly handles the OpenAI-fallback case from #200, but it still reports llm_available=false for every credential-less provider — the CLI providers (claude_cli, codex_cli, gemini_cli) return None from resolve_credentials() by design, so under the MCP server they're judged unavailable even though get_chat_model() would happily serve them via their is_available() capability check. The CLI already gets this right; run_scan doesn't.

is_llm_available() is capability-aware (it delegates to is_available() for CLI-capable providers and falls back to credential resolution — including the OpenAI fallback — for HTTP providers), so:

llm_available, _ = is_llm_available()

fixes #200 and the CLI-provider case in one move, and keeps run_scan consistent with what get_chat_model() will actually do. It also future-proofs the gate for any external duck-typed provider (has_cli_capability explicitly invites those).

I have a tested implementation of this (including the monkeypatch updates to tests/unit/test_mcp_server.py, which currently patch the old resolver name) that I'm happy to contribute here or as a follow-up — whichever is easier for this PR.

(Also relevant to #243 — an injected embedded provider is credential-less by definition, so it depends on the gate being capability-aware.)

@wernerkasselman-au

Copy link
Copy Markdown
Contributor Author

Thanks, good catch. I agree run_scan should use is_llm_available() as the single availability gate rather than duplicating credential-specific logic in the MCP wrapper.

For this PR as it stands, that is behaviorally equivalent to resolve_chat_model_credentials() because is_llm_available() is still credential-backed, so the #200 OpenAI fallback fix remains covered. But it is the better abstraction once the CLI-capable providers from #234 land, and it also keeps the MCP server aligned with report.py.

I’m fine either way on how to land it: I can update this PR to use is_llm_available() and adjust the test_mcp_server.py monkeypatches, or you can contribute your tested implementation as a follow-up.

@rng1995 deferring to you on which route you’d prefer for this PR.

@vcolombo

vcolombo commented Jul 3, 2026

Copy link
Copy Markdown

One clarification that actually strengthens the case: the CLI providers (claude_cli/codex_cli/gemini_cli) and the capability-aware is_llm_available() are already on main — they don't wait on #234. So the switch isn't just future-proofing: today, SKILLSPECTOR_PROVIDER=claude_cli with the binary authenticated (and no OPENAI_API_KEY) gets llm_available=false from the credential resolver but true from is_llm_available() — a live behavioral gap under the MCP server, not an equivalent refactor.

Simplest path is folding it into this PR — please go ahead. The only test churn is the three monkeypatches in tests/unit/test_mcp_server.py: they patch the resolver name and want to become monkeypatch.setattr(mcp_server, "is_llm_available", lambda: (True, None)) (or (False, "no credentials") for the unavailable cases), with the assertions unchanged. Happy to review once updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants