test: improve JSON parsing module test coverage#1
Draft
tpsaint wants to merge 1 commit into
Draft
Conversation
- Add comprehensive tests for extract_dict_from_json function - Add comprehensive tests for extract_list_from_json function - Add additional error handling tests for json_loads function - Achieve 100% test coverage for forge/json/parsing.py module - Fix conftest.py to remove problematic VCR plugin reference The new tests cover: - JSON extraction from code blocks (both lowercase and uppercase) - JSON extraction from text with embedded JSON - Plain JSON parsing for dicts and lists - Nested data structures - JSON with fixable syntax issues - Comprehensive error cases for type validation - Edge cases like empty strings and whitespace
tpsaint
pushed a commit
that referenced
this pull request
Jun 26, 2026
…failure exit (Significant-Gravitas#12918) ## Background [SECRT-2275](https://linear.app/autogpt/issue/SECRT-2275). User report: when a copilot ("autopilot") turn is interrupted by a usage-limit, tool-call-limit, or other run interruption, the user's recent work disappears. User described it as: "my initial message was lost 3 times and it disappeared, then when I would say 'continue' it would do a random old task." Investigation surfaced two distinct failure modes. This PR addresses both. - **Mode 1** — rate-limit (or other pre-stream rejection) at turn start: the user's text only ever lives in the optimistic `useChat` bubble; the backend rejects before the message is persisted, so the bubble is a lie and a refresh / retry would lose the text. - **Mode 2** — long-running turn interrupted mid-stream: the entire turn's progress (assistant text, tool calls, reasoning) vanishes on interruption — what users describe as "the turn is gone." ## Mode 1 — frontend: restore unsent text on 429 Backend can't recover this on its own: `check_rate_limit` raises before `append_and_save_message`, so by the time the 429 surfaces there is no DB row to roll forward. See `autogpt_platform/backend/backend/api/features/chat/routes.py:916-922` (rate-limit check) and `routes.py:945` (later append-and-save). Frontend fix in `autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotStream.ts`: when `useChat`'s `onError` reports a usage-limit error, we - drop the optimistic user bubble (DB has no record of it, so leaving it would be a phantom), - push `lastSubmittedMsgRef.current` back into the composer via the existing `setInitialPrompt` slot — the same slot URL pre-fills use, so `useChatInput`'s `consumeInitialPrompt` effect picks it up automatically, - clear `lastSubmittedMsgRef` so the dedup guard doesn't block re-send. In-memory only; surviving a hard refresh while rate-limited is a separate follow-up (would need localStorage persistence with TTL). Test: `autogpt_platform/frontend/src/app/(platform)/copilot/__tests__/useCopilotStream.test.ts` — verifies the composer is repopulated and the optimistic bubble is dropped on a 429. ## Mode 2 — backend: preserve interrupted partial in DB ### Root cause The SDK retry loop in `stream_chat_completion_sdk` always rolls back `session.messages` to the pre-attempt watermark on any exception. That rollback is correct **before a retry** so attempt #2 doesn't duplicate attempt #1's content. But it runs **before the retry decision is made**, so when retries are exhausted (or no retry is attempted) the partial work is discarded too. Three branches of the retry loop ended in a final-failure state with side effects worse than just losing the partial: - `_HandledStreamError` non-transient: rollback then add error marker — partial gone - `Exception` with `events_yielded > 0`: rollback then break — **no error marker added either**, so on refresh the chat looks like nothing happened even though the user just watched tokens stream live - `Exception` non-context-non-transient + the while-`else:` exhaustion path: same, no marker - Outer except (cancellation, GeneratorExit cleanup): didn't restore captured partial ### Fix `autogpt_platform/backend/backend/copilot/sdk/service.py`: 1. **`_InterruptedAttempt` dataclass** — holds the rolled-back `partial: list[ChatMessage]` + optional `handled_error: _HandledErrorInfo`. Three methods drive the contract: - `capture(session, transcript_builder, transcript_snap, pre_attempt_msg_count)` — slices `session.messages`, restores the transcript, strips trailing error markers to prevent duplicate markers after restore. - `clear()` — drops captured state on a successful retry so outer cleanup paths don't replay pre-retry content. - `finalize(session, state, display_msg, retryable=...) -> list[StreamBaseResponse]` — re-attaches partial, synthesizes `tool_result` rows for orphan `tool_use` blocks, appends the canonical error marker, and returns the flushed events so the caller can yield them to the client (no double-flush). 2. **`_flush_orphan_tool_uses_to_session(session, state) -> list[StreamBaseResponse]`** — synthesizes `tool_result` rows for any `tool_use` that never resolved before the error so the next turn's LLM context stays API-valid (Anthropic rejects orphan tool_use). Uses the public `adapter.flush_unresolved_tool_calls` and returns the events for the caller to yield. 3. **`_classify_final_failure(...) -> _FinalFailure | None`** — picks the display message + stream code + retryable flag for the final-failure exit. One source of truth for the in-history error marker and the client-facing `StreamError` SSE yield so they can't drift. 4. **Consolidated post-loop emit**: the former three scattered blocks (partial restore + redundant re-flush + two separate `yield StreamError` sites) collapsed to one block driven by `_classify_final_failure` → `_FinalFailure` → `finalize()` → yield events + single `StreamError`. 5. **Adapter `flush_unresolved_tool_calls`** (renamed from `_flush_unresolved_tool_calls` to drop the `# noqa: SLF001` suppressors on cross-module callers). Each retry-loop rollback site calls `interrupted.capture(...)`; the success break calls `interrupted.clear()`; the post-loop failure block calls `interrupted.finalize(...)` exactly once. The baseline service already preserves partial work via its existing finally block — no change needed there. ## Tests Backend (`backend/copilot/sdk/interrupted_partial_test.py`, new, 18 tests): - `TestInterruptedAttemptCapture` — slice semantics + stale-marker stripping - `TestInterruptedAttemptFinalize` — appends partial then marker, handles empty partial, no-op on `None` session, flushes unresolved tools between partial and marker, returns flushed events for caller to yield - `TestFlushOrphanToolUses` — synthesizes `tool_result` rows, returns events, no-op on None state / no unresolved - `TestClassifyFinalFailure` — handled_error wins, attempts_exhausted, transient_exhausted, stream_err fallback, returns None on success path - `TestRetryRollbackContract` — end-to-end: capture + finalize yields the exact content the user saw streaming live plus the error marker 1022 total SDK tests pass (baseline + new). Frontend (`useCopilotStream.test.ts`): 1 new test — `restores the unsent text and drops the optimistic user bubble on 429 usage-limit`. ## Out of scope - Frontend rendering tweaks for the interrupted-turn marker (existing error-marker rendering already works). - Refresh-survival of the unsent text in Mode 1 (would require localStorage persistence with TTL) — separate follow-up. - Hard process-kill / OOM where Python `finally` doesn't run — needs a different mechanism (pod-level checkpoint sweeper). ## Checklist - [x] My code follows the style guidelines of this project (black/isort/ruff via `poetry run format`) - [x] I have performed a self-review of my own code - [x] I have added relevant unit tests - [x] I have run lint and tests locally (1022 SDK tests pass) ## Test plan - [ ] Verify a long-running turn that hits transient-retry exhaustion preserves partial assistant text + tool results in chat history after refresh - [ ] Verify the next user message after an interrupted turn carries enough context that the model can continue the prior task instead of inventing a new one - [ ] Verify a successful retry (attempt #1 fails, attempt #2 succeeds) shows ONLY attempt #2's content (no leaked partial from #1) - [ ] Verify hitting daily usage limit at turn start re-populates the composer with the unsent text and removes the optimistic user bubble --------- Co-authored-by: Claude <noreply@anthropic.com>
tpsaint
pushed a commit
that referenced
this pull request
Jun 26, 2026
…icant-Gravitas#13030) ## Why Egress on the `agpt-prod` and `agpt-test` Supabase projects is dominated by a single query: ```sql SELECT … FROM "platform"."NotificationEvent" WHERE "userNotificationBatchId" IN ($1) OFFSET $2 ``` Per `pg_stat_statements`: | | calls | rows returned | |---|---:|---:| | prod | 455,803 | 281,229,277 | | dev | 820,787 | 347,820,981 | It's the #1 query by row volume in both environments. The shape (`IN ($1)` with a single bind param) is Prisma materializing an eager `include={"Notifications": True}` — one IN-with-one-id per batch, run per polling iteration. The email batcher in `notifications.py::_process_existing_batches` polls open batches per `NotificationType`, and: 1. `get_all_batches_by_type` returns every open batch with all events eager-loaded — but the caller only reads `batch.user_id`. 2. `get_user_notification_oldest_message_in_batch` fetches the batch + all events and sorts them in Python just to grab `[0]`. So every poll iteration drags the full event payload back even when nothing is aged out and nothing will be sent. The existing fix in `create_or_add_to_user_notification_batch` already documents this exact failure mode for heavy `AGENT_RUN` users (see [notifications.py:467-472](https://github.com/Significant-Gravitas/AutoGPT/blob/dev/autogpt_platform/backend/backend/data/notifications.py#L467-L472)), but the read paths still had it. ## What - `get_user_notification_oldest_message_in_batch` queries `NotificationEvent` directly with `order={"createdAt": "asc"}` and returns the single oldest row. - `get_all_batches_by_type` drops the `include={"Notifications": True}`; the caller re-fetches events per batch via `get_user_notification_batch` only when an email actually needs to send. ## How Both changes are equivalent at the call-site level: - `_process_existing_batches` only reads `batch.user_id` from the list result and `oldest_message.created_at` from the oldest-event result — both still satisfied. - `_should_email_user_based_on_aged_batch` (the other caller of the oldest-message query) only reads `oldest_message.created_at` — also satisfied. - `UserNotificationBatchDTO.from_db` already tolerates `model.Notifications` being `None` (`for n in model.Notifications or []`), so the dropped include is safe. Three integration tests added in `notifications_test.py` covering the new contract: oldest event returned by `createdAt`, `None` on empty batch, and `get_all_batches_by_type` not loading events. ## Checklist - [x] My code follows the project's style guidelines - [x] I have performed a self-review of my own code - [ ] I have commented my code, particularly in hard-to-understand areas (n/a — minimal change, behaviour is obvious) - [x] I have made corresponding changes to the documentation (n/a) - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective - [x] New and existing unit tests pass locally with my changes (lint/format clean; integration tests run in CI against the test DB) - [x] Any dependent changes have been merged and published in downstream modules
tpsaint
pushed a commit
that referenced
this pull request
Jun 26, 2026
…ight caps (Significant-Gravitas#13069) ## Why We hard-capped concurrent AutoPilot turns at 15 per user as a hotfix, which rejected the 16th request with HTTP 429 — blunt UX, easy to hit by accident. This PR keeps the safeguard but introduces a **soft running cap of 5** with a **FIFO queue up to 15 in-flight (running + queued)**. The user can submit beyond 5; the dispatcher auto-promotes queued sessions as running slots free. ## What - **Soft cap: 5 running tasks per user** (configurable: `max_running_copilot_turns_per_user`). - **Hard cap: 15 in-flight (running + queued) per user** (existing setting, semantics shifted from "concurrent" to "in-flight"). - **FIFO queue at the SESSION level.** `ChatSession.chatStatus` is a single text enum: `idle` (default) | `queued` | `running` (open enum). The user's pending message is just a normal `ChatMessage` row; the session carries the lifecycle. - **Cancel:** the existing `POST /sessions/{session_id}/cancel` handles both states uniformly. Queued sessions flip back to `idle` (no executor cancel needed); running sessions publish a RabbitMQ cancel event as before. No dedicated `/queued-tasks/*` endpoints. - **Frontend:** Queued-state badge on the latest user message of any session whose `chat_status === 'queued'`. Sidebar shows a green pulsing dot for `running` and a purple hourglass for `queued` so the user can see at a glance which of their chats are in flight. Cancel button on the badge calls the same session-cancel endpoint. ## How | Layer | Storage | Purpose | |---|---|---| | Session lifecycle | `ChatSession.chatStatus` (this PR) | Soft cap (count `running`) + queue (count `queued`), single source of truth | | Dispatcher payload | `ChatMessage.metadata` JSONB (this PR) | Submit-time `file_ids` / `mode` / `model` / `permissions` / `context` / `request_arrival_at` stashed on the queued user row so promotion can replay the turn faithfully | | Execution | RabbitMQ (existing) | Worker pool, retries | ### Queue lifecycle — where it lives, who clears it **Storage.** The queue lives entirely in Postgres (no Redis sorted set, no Lua). Per session: a single `chatStatus` text column (`idle | queued | running`, open enum), plus the user's pending message persisted as a normal `ChatMessage` row with the dispatcher's submit-time payload stashed in `metadata` JSONB. **Submission path** (`backend/copilot/turn_queue.py:enqueue_turn`): 1. HTTP route tries immediate dispatch via `acquire_turn_slot`. If user is at the 5-running cap, `ConcurrentTurnLimitError` is raised. 2. Route catches the error and falls through to `try_enqueue_turn` which: checks the 15 in-flight hard cap, persists the user message + metadata, then CAS-flips `ChatSession.chatStatus` `idle` → `queued`. Returns an empty-stream response so the SSE client knows the message landed. **Promotion path** (`backend/copilot/turn_queue.py:dispatch_next_for_user`) — **fires per-user, not globally**: 1. **Trigger #1:** `mark_session_completed` (in `stream_registry.py`) — after every turn completes (success / failure / cancel), the session's `user_id` is read from Redis meta and `dispatch_next_for_user(user_id)` is invoked. The dispatcher errors are swallowed with a loud log so a queue hiccup never breaks the completion path. 2. **Trigger #2:** periodic timer (covers missed dispatch events / GKE rolling restarts). 3. Picks the user's oldest queued session via `list_chat_sessions_by_status(user_id, queued)` ordered by `updatedAt`. 4. Re-validates paywall + per-window USD cap — paywalled / rate-limited sessions stay `queued` for the next tick. 5. CAS-claims `queued` → `running`. Recovers the user message + submit-time `metadata`, builds a `TurnSlot`, calls `dispatch_turn`. On any error during dispatch, rolls the session back to `queued`. **Cancellation** (`POST /sessions/{id}/cancel`): single endpoint handles both lifecycle states — queued sessions flip back to `idle` (no executor cancel needed); running sessions publish a RabbitMQ cancel event as before. The cap and queue queries are both `count` / `find_many` on `ChatSession` by `chatStatus`. Both running-turn tracking and queue admission are non-locked CAS-then-count — same TOCTOU tolerance the graph-execution credit rate-limit accepts on its INCRBY path. Going briefly to 16/17 in-flight under burst is acceptable; the cap is a safeguard, not a budget. The DB-manager surface is **4 generic ChatSession primitives** (`count_chat_sessions_by_status`, `list_chat_sessions_by_status`, `update_chat_session_status` with optional `expect_status` CAS gate, `get_chat_session_status`) plus the existing `add_chat_message` extended with optional `message_id` + `metadata`. Adding a new lifecycle state is a code-only change at call sites. DB access goes through `backend.data.db_accessors.chat_db()` so the dispatcher works from both the HTTP server (Prisma directly) and the CoPilotExecutor subprocess (RPC via `DatabaseManager`). Route gates (`is_turn_in_flight`, `acquire_turn_slot`) treat both `queued` and `running` as "in flight", so a resubmit to a queued session lands in the pending buffer or falls through to the cross-session queue rather than racing the dispatcher. ## Test plan - [x] Backend unit tests — `active_turns_test.py`, `turn_queue_test.py`, `db_test.py` (60+ tests covering admission, refresh, release, queued-collision, cap-rollback, dispatcher branches: paywall/rate-limited stays queued, rate-limit unavailable, happy path, dispatch failure → restore). - [x] **New integration tests** in `stream_registry_test.py`: pin the per-user slot-free dispatcher invocation on `mark_session_completed`, plus the error-swallowing behaviour so a queue hiccup never breaks the turn completion path. - [x] Migration scoped to additive nullable column + an additional non-partial index (cheap on Postgres, no table rewrite, no backfill needed). - [x] Frontend integration tests (Vitest): queued-state badge, cancel button, error paths (404 silent, 5xx destructive toast, network error), sidebar running/queued/idle indicators. - [x] Live UI proof inline on the PR ([scroll down](Significant-Gravitas#13069 (comment))): Queued badge, pill tooltip, cancel-hover red state, after-cancel hidden — all rendered by the production component against a real backend running this branch. Sidebar running-dot + queued-hourglass also screenshotted. - [x] CI: lint, types, integration_test, end-to-end tests. ### Stuck-running bugs found and fixed in this PR Audit during review surfaced multiple "DB says running, executor isn't actually running it" failure modes. Every one fixed in this PR; none deferred. | # | Scenario | Symptom | Fix | |---|---|---|---| | 1 | Executor crashes mid-turn, Redis meta TTLs out, DB `chatStatus` stays `running` | Sidebar green dot forever; new submits buffer indefinitely | `cancel_session_task` now calls `release_turn_slot` on the no-active-session branch — the user clicking Cancel always clears the orphan. `get_session` route also resets when DB says `running` but Redis is empty, so just opening the stuck chat clears it. | | 2 | `acquire_turn_slot` flips DB `idle → running` but the request aborts before `dispatch_turn` runs | Same orphan, no Redis meta written | Same fixes as #1 catch this on the next cancel/open. The new periodic sweep below covers the no-interaction case. | | 3 | `dispatch_turn` succeeds at `create_session` (Redis meta written) but fails at the RabbitMQ enqueue | Redis hash sits at `status='running'` until TTL; `is_turn_in_flight` keeps reporting in-flight | `dispatch_turn` now uses `try/finally` on a `committed` flag — Redis meta is deleted on ANY non-happy-path exit including `CancelledError` (which `except Exception` would miss). Cleanup runs inside `dispatch_turn` itself so both the HTTP `schedule_chat_turn` path and the queue dispatcher path are covered. | | 4 | `mark_session_completed` itself fails mid-completion (Redis blip during the CAS) | DB never flips back to `idle` | Same chat-open / cancel reset fixes handle this. No proactive sweep was viable here because we can't distinguish "legit long-running tool call" from "stuck" without a Redis side check. | | 5 | Dispatcher rollback used `except Exception`, leaking on `CancelledError` | Task cancellation during dispatch could leave DB stuck at `running` | Switched to `except BaseException` in `dispatch_next_for_user` so cancellation still rolls back the DB claim. Pairs with the `try/finally` in `dispatch_turn` for full Redis+DB symmetry. | | 6 | `chatStatus` was an open `TEXT` column with no DB-level validation | A typo like `"runnin"` could persist and break the cap-count | Converted to a Postgres enum `ChatSessionStatus` (`idle | queued | running`) — invalid values now rejected at the DB layer. Future states need a tiny `ALTER TYPE ADD VALUE` migration (cheap on PG 12+). | **Why no periodic sweep.** The reactive cleanups (rows 1 and 2) catch every user-visible path: clicking Cancel and opening the chat both reset the orphan. The existing 6h+5min inline stale-CAS in `get_active_session` remains the ultimate backstop. Adding a per-pod APScheduler job would chase the narrow "user has stuck sessions but never opens them and never cancels" case at the cost of constant operational surface — not worth it for this rare scenario. All paths that mutate `chatStatus` now also clean the matching Redis state (either inline or via the sweep) — the two layers stay in sync. ### Cap + queue safety against double-promotion (carry-over from earlier review) - `mark_session_completed` does `release_turn_slot` (running → idle) BEFORE `dispatch_next_for_user`. Two concurrent completions release two slots first, then race the CAS on the same head — only one wins per slot, so at most N promotions for N releases. Cap holds. - `claim_queued_session` is an atomic Postgres `UPDATE … WHERE chatStatus='queued' AND id=…`: keyed to the specific head row, only one CAS matches. - All db.py functions exposed on `DatabaseManager` return primitives or DTOs (`ChatSessionInfo`, `ChatMessage`) — no raw Prisma rows cross the RPC boundary, so the executor subprocess can safely route through `chat_db()`.
tpsaint
pushed a commit
that referenced
this pull request
Jun 26, 2026
…avitas#13370) ### Why / What / How **Why:** A TAC Security CASA assessment (ESOF AppSec ADA) of `platform.agpt.co` flagged several missing/leaky HTTP response headers. None are critical (overall ESOF score 9.1/10, "Low"), but all must be remediated to pass the CASA Tier 2 & Tier 3 assessments. Verified live today: prod sends `X-Powered-By: Next.js` and is missing both `X-Frame-Options` and `X-Content-Type-Options`. **What:** Adds baseline security response headers in the frontend Next.js config and disables the framework-fingerprinting header. **How:** Defines the headers in `nextConfig.headers()` (applied to `/:path*`) and sets `poweredByHeader: false`. The existing `Document-Policy: js-profiling` header was previously nested inside the Sentry plugin options (2nd arg to `withSentryConfig`), where it does not apply on the non-Sentry build path — it's moved into the Next config so every build path emits the full header set. ### Changes 🏗️ - `autogpt_platform/frontend/next.config.mjs`: - `poweredByHeader: false` → removes `X-Powered-By: Next.js` (finding #5; partially addresses proxy disclosure #3) - `X-Frame-Options: SAMEORIGIN` → anti-clickjacking (finding #2) - `X-Content-Type-Options: nosniff` → anti-MIME-sniffing (finding #4) - Moved `Document-Policy: js-profiling` from the Sentry options into `nextConfig.headers()` so it (and the new headers) apply on all build paths **Not addressed here (platform-level / separate):** - CORS `Access-Control-Allow-Origin: *` on `/_next/static/media/*` fonts (#1) — injected by Vercel for font assets; overriding risks breaking Next's `crossorigin` font preloads. Needs a separate, deploy-verified change. - `Server: Vercel` header and TRACE/TRACK methods (#3) — injected at the Vercel edge, no in-repo lever. - Info-only items #6/#7/#8/#9 are scanner observations or already handled (backend cache-protection middleware sends `no-store` on sensitive routes). `SAMEORIGIN` was chosen over `DENY` so same-origin embedding (share/preview flows) keeps working; easy to tighten if the app is never framed. ### Checklist 📋 #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes (no env changes) - [x] `docker-compose.yml` is updated or already compatible with my changes (no compose changes) - [x] I have included a list of my configuration changes in the PR description (under **Changes**) #### Test plan: - [x] `pnpm format`, `pnpm lint`, `pnpm types` pass locally - [ ] After deploy, verify response headers on `platform.agpt.co`: - [ ] `X-Frame-Options: SAMEORIGIN` present - [ ] `X-Content-Type-Options: nosniff` present - [ ] `X-Powered-By` absent - [ ] `Document-Policy: js-profiling` still present (Sentry profiling unaffected) - [ ] App still renders and fonts load (no clickjacking/embedding regressions) 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Low Risk** > Config-only HTTP header changes with no auth or data-path changes; `SAMEORIGIN` framing is a deliberate tradeoff that should be smoke-tested after deploy for embed/preview and font loading. > > **Overview** > Addresses CASA assessment gaps on the Next.js frontend by **disabling** `X-Powered-By: Next.js` via `poweredByHeader: false` and **adding** global response headers on `/:path*` through `nextConfig.headers()`: `X-Content-Type-Options: nosniff`, `X-Frame-Options: SAMEORIGIN`, and `Document-Policy: js-profiling`. > > **`Document-Policy`** (and the new security headers) were previously defined only inside the optional `withSentryConfig` wrapper, so builds that skip the Sentry webpack plugin never emitted them. They now live on the base Next config so every build path gets the same header set. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 2c6931e. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY --> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tpsaint
pushed a commit
that referenced
this pull request
Jun 26, 2026
…Gravitas#13371) > **Draft — needs a post-deploy header check before merge (see Testing).** ### Why / What / How **Why:** The TAC Security CASA (ESOF AppSec ADA) assessment of `platform.agpt.co` flags an "overly permissive CORS policy" (finding #1, CWE-264): self-hosted Next.js fonts under `/_next/static/media/*` respond with `Access-Control-Allow-Origin: *`. All CASA findings must be cleared for the Tier 2 & Tier 3 assessments (currently re-expiring). **What:** Scope the `Access-Control-Allow-Origin` header on the font path from `*` to our own origin. **How:** Adds a `nextConfig.headers()` rule for `/_next/static/media/:path*` setting `Access-Control-Allow-Origin: https://platform.agpt.co`. These fonts are only loaded by our own same-origin pages, so a same-origin fetch is unaffected by the header value — this should not change how fonts render. ### Changes 🏗️ - `autogpt_platform/frontend/next.config.mjs`: add `headers()` rule restricting font CORS to our origin. ###⚠️ Reviewer note — must verify after deploy Vercel's "Ignored Build Step" cancels the preview build for this PR, so the header could **not** be verified pre-merge. The header on `/_next/static/media/*` is currently emitted by the Vercel/Next.js static layer; this config override may or may not replace it. **Please confirm on a real deploy (dev/staging) before merging:** ```bash curl -sI "https://<deploy-url>/_next/static/media/<any>.woff2" | grep -i access-control-allow-origin # want: a single "access-control-allow-origin: https://platform.agpt.co" # NOT: "*", and NOT two ACAO headers (duplicate = the override didn't replace it) ``` If the override does **not** take (still `*` or duplicated), the fallback is a Vercel project-level header rule or accepting it with a CASA justification (fonts are non-sensitive, same-origin only). Failure mode is safe either way: fonts load the same regardless, since they're fetched same-origin. This overlaps `next.config.mjs` with Significant-Gravitas#13370 (baseline security headers); whichever merges second will need a trivial `headers()` merge. ### Checklist 📋 #### For configuration changes: - [x] `.env.default` / `docker-compose.yml` unaffected (no env/compose changes) - [x] Configuration changes listed above under **Changes** - [x] `pnpm format` / `pnpm types` pass locally - [ ] **Post-deploy:** font `Access-Control-Allow-Origin` is the origin (not `*`, not duplicated) - [ ] **Post-deploy:** app renders and fonts load normally 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Swifty <craigswift13@gmail.com> Co-authored-by: Ubbe <hi@ubbe.dev>
|
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR significantly improves test coverage for the JSON parsing module in the AutoGPT Forge framework. The
forge/json/parsing.pymodule had only 42% test coverage, with two important functions (extract_dict_from_jsonandextract_list_from_json) completely untested. This creates a risk for regressions and makes it difficult to maintain code quality.Changes 🏗️
Added comprehensive tests for
extract_dict_from_jsonfunction (10 new test cases)jsonandJSONmarkers)Added comprehensive tests for
extract_list_from_jsonfunction (10 new test cases)jsonandJSONmarkers)Enhanced
json_loadserror handling tests (3 new test cases)Fixed
conftest.pyto remove problematic VCR plugin reference that was preventing tests from runningTest Coverage Results:
forge/json/parsing.py: 42% → 100% coverage (all 43 lines now covered)Checklist 📋
For code changes:
pytest forge/json/test_parsing.py -v)pytest --cov=forge.json.parsing --cov-report=term-missing forge/json/test_parsing.py)pytest forge/utils/test_file_operations.py forge/utils/test_url_validator.py)For configuration changes:
.env.exampleis updated or already compatible with my changes (no env changes needed)docker-compose.ymlis updated or already compatible with my changes (no docker changes needed)@tpsaint can click here to continue refining the PR