fix(deps): add AAD bearer fallback and HTML sign-in detection in download_ado_file (closes #1671)#1675
fix(deps): add AAD bearer fallback and HTML sign-in detection in download_ado_file (closes #1671)#1675danielmeppiel wants to merge 2 commits into
Conversation
…load_ado_file (closes #1671) - When ADO_APM_PAT is absent, resolve an AAD bearer token via AuthResolver.resolve() (which internally uses az-cli); inject it as Authorization: Bearer <token>. PAT path is first and unchanged. - Detect ADO's silent HTTP 200 + text/html sign-in-page response; raise an actionable RuntimeError (via build_error_context) instead of writing the HTML to disk. Detection fires on both the primary and the main/master fallback requests. - Auth-boundary lint (scripts/lint-auth-signals.sh Rule A) is satisfied: download_strategies.py routes through AuthResolver, never imports the bearer provider directly. - Regression tests added: HTML-200 raises, bearer header set when no PAT, PAT path unchanged, resolver bypassed when PAT present, fallback-ref HTML detection, mutation-break assertions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes Azure DevOps file downloads so download_ado_file() (a) falls back to an AAD bearer token resolved via AuthResolver when no ADO PAT is configured, and (b) fail-closes when ADO returns an interactive sign-in HTML page with HTTP 200 (preventing corrupt HTML being written as a .agent.md payload).
Changes:
- Add bearer-token fallback (via
auth_resolver.resolve()) whenado_tokenis absent. - Add
Content-Type: text/htmldetection to treat ADO sign-in HTML pages as an auth failure even when status is 200. - Add unit regressions covering bearer fallback and HTML sign-in detection (including fallback ref path).
Show a summary per file
| File | Description |
|---|---|
| src/apm_cli/deps/download_strategies.py | Adds bearer fallback header injection and HTML sign-in detection for ADO downloads. |
| tests/unit/deps/test_download_strategies_phase3.py | Adds regression tests for #1671 covering bearer fallback + HTML 200 detection behavior. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 2
| def _check_html_signin(response) -> None: | ||
| """Fail-closed when ADO returns an interactive sign-in HTML page. | ||
|
|
||
| Azure DevOps responds with HTTP 200 + text/html when auth is | ||
| missing or insufficient instead of a 401. Writing that HTML to | ||
| disk produces a corrupt file (the #1671 bug). Detect it by | ||
| Content-Type and raise an actionable error before any bytes are | ||
| returned to the caller. | ||
| """ | ||
| content_type = response.headers.get("Content-Type", "") | ||
| if "text/html" in content_type: | ||
| error_msg = ( |
| # Must raise, never return html bytes as if they were file content | ||
| with pytest.raises(RuntimeError): | ||
| result = d.download_ado_file(self._dep(), "apm.yml") | ||
| # If we reach here, the bug is present: HTML was returned as content | ||
| assert result != html_body, "Bug: HTML sign-in page returned as file content" | ||
|
|
…Type, fix test assertions - Add early return in _check_html_signin when response.status_code != 200 so 404/403 responses with text/html bodies fall through to raise_for_status and the existing 404-fallback / 401-403 error paths (addresses Copilot review and panel blocking finding: misleading 'sign-in page' error on non-200 responses). - Lowercase Content-Type before substring match per RFC 7230 case-insensitivity (addresses Copilot review and panel blocking finding: bypass possible with 'Text/HTML' or 'TEXT/HTML'). - Restructure test_html_200_response_writes_no_content to place assertions outside the pytest.raises context manager so they execute in the non-raising (buggy) path (addresses Copilot review and test-coverage-expert: dead assertions). - Add test_404_with_html_content_type_does_not_trigger_sign_in_detection to lock in the status-code guard: 404+text/html must reach the main/master fallback. - Add test_bearer_fallback_skipped_when_scheme_not_bearer to confirm non-bearer scheme tokens are not injected as Bearer headers. - Add CHANGELOG entry for #1671 fix. All folds: addresses panel CEO follow-ups FU-1 (status-code guard), FU-2 (case-normalization), FU-3 (dead assertions), FU-4 (CHANGELOG), FU-5 (404+html regression trap); also addresses Copilot inline comments id:3362036952 and id:3362036988. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
[shepherd-driver] Advisory panel comment posting deferred - will retry. |
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 2 | 1 | 1 | Bearer fallback routed correctly through AuthResolver; HTML detection had status-code ordering bug (FOLDED). |
| CLI Logging Expert | 0 | 1 | 2 | Error message is well-structured; misleading sign-in error on 404 addressed (FOLDED). |
| DevX UX Expert | 0 | 1 | 2 | Error messages are actionable; HTML check on wrong status codes addressed (FOLDED). |
| Supply Chain Security Expert | 0 | 2 | 1 | Bearer fallback routes through AuthResolver correctly; case-normalization folded. |
| OSS Growth Hacker | 0 | 1 | 1 | CHANGELOG entry added; zero-config ADO support is a strong enterprise story. |
| Auth Expert | 0 | 0 | 2 | Reservation satisfied: auth boundary clean, PAT first, no token leak. |
| Doc Writer | 0 | 1 | 0 | CHANGELOG entry added (FOLDED); existing docs already cover auth precedence. |
| Test Coverage Expert | 0 | 2 | 1 | Dead assertions fixed; 404+html regression trap added (FOLDED). |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Reservations carried from strategic-alignment
- auth-surface: Confirm the AAD bearer path routes through AuthResolver/get_bearer_provider and NOT a raw azure-identity import;
scripts/lint-auth-signals.shmust stay green; PAT path remains FIRST/unchanged; text/html sign-in 200 fails closed viabuild_error_contextand writes no corrupt file; no secret/token is logged.
-- addressed:scripts/lint-auth-signals.shexits 0 on this commit; auth-expert confirmed the full contract; no raw azure-identity import present.
Folded in this run (commit 101bf4e)
- [panel / python-architect + cli-logging-expert + devx-ux-expert] Guard
_check_html_signinonresponse.status_code == 200-- prevents misleading "sign-in page" error on 404+html responses. - [panel / python-architect + supply-chain-security-expert] Case-normalize Content-Type via
.lower()per RFC 7230 -- prevents bypass withText/HTML. - [panel / test-coverage-expert + Copilot id:3362036988] Fixed dead assertions in
test_html_200_response_writes_no_content-- moved outsidepytest.raisesblock so they execute in the non-raising path. - [panel / doc-writer + oss-growth-hacker] Added CHANGELOG.md entry for [BUG] Azure DevOps authentication via 'az login' does not always work #1671 fix under
[Unreleased] > Fixed. - [panel / test-coverage-expert] Added
test_404_with_html_content_type_does_not_trigger_sign_in_detection-- regression trap for the status-code guard. - [panel / test-coverage-expert] Added
test_bearer_fallback_skipped_when_scheme_not_bearer-- confirms non-bearer scheme tokens are not injected as Bearer headers.
Copilot signals reviewed
- id:3362036952 (path:
src/apm_cli/deps/download_strategies.py) -- LEGIT:_check_html_signinfires on any status code and lacks case-normalization. Resolved in101bf4e. - id:3362036988 (path:
tests/unit/deps/test_download_strategies_phase3.py) -- LEGIT: Dead assertions insidepytest.raisescontext. Resolved in101bf4e.
Deferred
None. All follow-ups were within the stated scope and have been folded.
Lint evidence
uv run --extra dev ruff check src/ tests/ -> All checks passed (exit 0)
uv run --extra dev ruff format --check src/ tests/ -> 1224 files already formatted (exit 0)
uv run --extra dev python -m pylint --disable=all --enable=R0801 --min-similarity-lines=10 --fail-on=R0801 src/apm_cli/ -> 10.00/10 (exit 0)
bash scripts/lint-auth-signals.sh -> [+] auth-signal lint clean (exit 0)
CI evidence
All CI checks green on commit 101bf4e (run: https://github.com/microsoft/apm/actions/runs/27010185367)
| Job | Result |
|---|---|
| Lint | pass |
| Build & Test Shard 1 (Linux) | pass |
| Build & Test Shard 2 (Linux) | pass |
| PR Binary Smoke (Linux) | pass |
| Coverage Combine (Linux) | pass |
| APM Self-Check | pass |
| Spec conformance gate | pass |
| CodeQL | pass |
| NOTICE Drift Check | pass |
| gate | pass |
Mergeability
| #PR | sha | CEO stance | iterations | folds | deferrals | copilot rounds | ci_status | mergeable | merge_state_status | notes |
|---|---|---|---|---|---|---|---|---|---|---|
| #1675 | 101bf4e | ship_now | 1 | 6 | 0 | 1 | green | MERGEABLE | BLOCKED | awaiting maintainer review |
Recommendation
The PR correctly solves #1671. The two blocking correctness issues (status-code guard and Content-Type normalization) have been folded and all CI checks are green. The PR is landing-ready; the maintainer may merge.
Full per-persona findings
Python Architect
- [blocking]
_check_html_signinfires on non-200 responses, breaking 404 fallback logic atsrc/apm_cli/deps/download_strategies.py:550
The check runs BEFOREraise_for_status(). If a 404 response carriesContent-Type: text/html,_check_html_signinraisesRuntimeError('sign-in page')instead of lettingraise_for_status()fire and reach the 404-fallback branch.
Suggested: Guard withif response.status_code == 200:. FOLDED in 101bf4e. - [blocking] Content-Type comparison is not case-normalized at
src/apm_cli/deps/download_strategies.py:549
HTTP headers are case-insensitive per RFC 7230. A server returningText/HTMLbypasses the guard and corrupt HTML bytes flow to disk.
Suggested:content_type = response.headers.get("Content-Type", "").lower(). FOLDED in 101bf4e. - [recommended] Dead assertions in
test_html_200_response_writes_no_contentattests/unit/deps/test_download_strategies_phase3.py:935
Assertions insidepytest.raisesnever run when exception is raised; test passes vacuously. FOLDED in 101bf4e. - [nit] Missing test for 404+html edge case. FOLDED in 101bf4e (test_404_with_html_content_type_does_not_trigger_sign_in_detection added).
CLI Logging Expert
- [recommended]
_check_html_signinshould only fire on 2xx responses. FOLDED in 101bf4e. - [nit] Content-Type case-insensitivity. FOLDED in 101bf4e.
- [nit] Error message wordy -- two sentences say the same thing. Not folded (nit; message is already actionable via build_error_context).
DevX UX Expert
- [recommended]
_check_html_signinfires on wrong status codes. FOLDED in 101bf4e. - [nit] Content-Type not case-normalized. FOLDED in 101bf4e.
- [nit] No debug log when no auth credentials available. Not folded (separate diagnostic enhancement; would need logger dependency in download_strategies.py which was not introduced in this PR scope).
Supply Chain Security Expert
- [recommended] Silent unauthenticated request possible when resolver returns no token. Not folded -- the HTML detection catches the resulting sign-in redirect downstream; a pre-flight fail-close would be a separate behavioral change. Sufficient defense-in-depth for v1.
- [recommended] Content-Type not case-normalized. FOLDED in 101bf4e.
- [nit] Status-code issue. FOLDED in 101bf4e.
OSS Growth Hacker
- [recommended] Error message should include docs URL. Not folded -- build_error_context output is managed separately; adding a URL requires knowing the stable docs page, which is out of scope for this bugfix.
- [nit] CHANGELOG entry missing. FOLDED in 101bf4e.
Auth Expert
- [nit] Debug log when no auth available. Not folded -- same rationale as devx-ux nit (logger not available in this module's scope without introducing a new dependency).
- [nit]
import base64moved from local to module-level. No action needed.
Doc Writer
- Missing CHANGELOG entry. FOLDED in 101bf4e.
- Existing docs already cover auth precedence -- no prose update needed.
Test Coverage Expert
This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.
TL;DR
download_ado_file()now falls back to an AAD bearer token (viaaz login) when no ADO PAT is configured, and fail-closes on the Azure DevOps sign-in HTML page that ADO silently returns as HTTP 200 instead of a 401.Closes #1671
Problem (WHY)
Azure DevOps REST API returns an interactive sign-in HTML page with HTTP 200 (not 401) when authentication is absent or insufficient. Before this fix:
Authorizationheader; ADO returned HTML. The HTML was written to disk as if it were the requested.agent.mdfile. No error was surfaced to the user.az logincredential was already wired for git-clone operations but never for the REST file-download path.Approach (WHAT)
Two independent changes in
DownloadDelegate.download_ado_file():AAD bearer fallback — when
self._host.ado_token(the ADO PAT) is absent, callself._host.auth_resolver.resolve()to obtain anAuthContext. Ifauth_ctx.auth_scheme == "bearer", injectAuthorization: Bearer <token>. The PAT branch is first and byte-identical to the pre-fix behaviour; bearer is strictly the fallback.HTML sign-in detection — after every HTTP 200, inspect
Content-Type. If it containstext/html, raise aRuntimeErrorwith an actionable message built viabuild_error_context(). The check fires on both the primary request and themain/masterfallback request.Auth-boundary compliance:
download_strategies.pyroutes entirely throughAuthResolver.resolve()(which internally handles bearer acquisition incore/auth.py). The module never imports the bearer provider directly, keepinglint-auth-signals.shRule A clean.Implementation (HOW)
flowchart TD A[download_ado_file called] --> B{ado_token set?} B -- yes --> C[Basic auth header PAT unchanged] B -- no --> D[auth_resolver.resolve host org] D --> E{auth_scheme == bearer?} E -- yes --> F[Bearer auth header] E -- no --> G[No auth header] C & F & G --> H[HTTP GET api_url] H --> I{200?} I -- no --> J[raise_for_status / 404 fallback / 401-403 error] I -- yes --> K{Content-Type text/html?} K -- yes --> L[RuntimeError: sign-in page + build_error_context] K -- no --> M[return response.content]Changed files
src/apm_cli/deps/download_strategies.py_check_html_signin()inner function (13 lines); removed one redundantimport base64from inside the function bodytests/unit/deps/test_download_strategies_phase3.pyTrade-offs
AuthResolver.resolve()already caches per(host, port, org), so repeated calls within one install plan are O(1) after the first.Validation evidence
test_pat_path_unchanged_when_pat_present+test_bearer_auth_resolver_not_called_when_pat_presentAuthorization: Bearerheadertest_bearer_fallback_used_when_no_pattest_html_200_response_raises_actionable_error+test_html_200_response_writes_no_contenttest_html_detection_on_fallback_ref_responsebash scripts/lint-auth-signals.sh-> exit 0uv run --extra dev ruff check src/ tests/ && ruff format --checkuv run --extra dev python -m pylint --disable=all --enable=R0801 ...uv run --extra dev pytest tests/unit/deps/ -qHow to test