[codex] Add integrated browser preview, annotations, and agent automation#3053
t3dotgg wants to merge 9 commits into
Adds a desktop-only browser preview that lives in the right panel slot alongside plan/diff. Lets the user point an Electron <webview> at any URL: typed into a chrome-style URL bar, clicked from the empty-state list of detected localhost dev servers, or auto-opened by a project script with `previewUrl` set. Single-tab per thread.

Server (Effect/Layers):
- PreviewManager: per-(thread, tab) session metadata via SynchronizedRef + PubSub<PreviewEvent>; survives WS reconnect via `list`/replay.
- PreviewPortScanner: lsof on macOS/Linux, TCP probe fallback on Windows; reference-counted polling so we only scan when subscribed.
- WS RPC + streams (`preview.open|navigate|refresh|close|list|reportStatus`, `subscribePreviewEvents`, `subscribeDiscoveredLocalServers`).

Desktop:
- PreviewViewManager owns Chromium WebContents per tab, mediates navigation/zoom/devtools/clear-storage. registerWebview gates by webContents.getType() === "webview" and host-window match.
- IPC channels for create/close/register/navigate/back/forward/refresh/zoom/hardReload/openDevTools/clearCookies/clearCache/getBrowserPartition.
- Forwards app-level shortcuts (mod+shift+J, mod+K, mod+,, mod+W) from the webview back to the main window.
- Persisted browser session partition (cookies, cache).

Web:
- PreviewPanel/PreviewView/PreviewWebview render the surface; chrome row with back/forward/refresh + URL input + Open-in-browser + 3-dot menu (Hard reload, DevTools, Zoom −/+/reset, Clear cookies/cache).
- usePreviewSession subscribes to server events; usePreviewBridge mirrors desktop state into the store and forwards Loading→Success/LoadFailed back to the server.
- previewStateStore: per-thread snapshot + desktopOverlay + recently-seen URLs (Zustand).
- rightPanelStore arbitrates plan vs. preview vs. diff; ChatView's toggles strip the `?diff=1` URL hint when switching to preview and vice versa so the panels are mutually exclusive.
- Top-nav Globe toggle in ChatHeader (desktop builds only) and a `mod+shift+J` keybinding routed via a typed previewActionBus.
- PreviewEmptyState lists detected localhost servers (scanner + configured project URLs + recently-seen) with live "listening" pulse.
- PreviewUnreachable: theme-aware port of Chromium's "site can't be reached" page.
- Resizable inline panel (RightPanelResizeHandle + useResizableWidth); width persists to localStorage on drag-end.
- Terminal link "Open in preview" context-menu integration for loopback URLs.

Contracts:
- preview.ts schemas (PreviewSessionSnapshot, PreviewNavStatus, PreviewEvent, RPC inputs/results, DiscoveredLocalServer).
- ProjectScript schema gains optional `previewUrl` + `autoOpenPreview`.
- New keybinding commands: preview.toggle/refresh/focusUrl/zoomIn/Out/resetZoom; new `when:` contexts `previewFocus` / `previewOpen`.

Shared:
- @t3tools/shared/preview: normalizePreviewUrl, isPreviewableUrl, isLoopbackHost, newPreviewTabId, LSOF_LOCAL_HOST_TOKENS.

Tests:
- contracts: schema decode tests for all preview events/snapshots/inputs.
- shared: URL normalization coverage.
- server: PreviewManager (open/navigate/reportStatus/refresh/close, multi-subscriber isolation, idempotency); PortScanner (lsof parsing including IPv6, TCP probe, reference-counted polling).
- web: previewStateStore (per-tab event application, dedupe, reconnect recovery); rightPanelStore arbitration.
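To illustrate the kind of URL handling the Shared helpers are described as providing, here is a minimal sketch. It is not the `@t3tools/shared/preview` implementation: the function names `isLoopbackHostSketch` and `normalizePreviewUrlSketch`, the "default to http://" rule, and the http/https-only policy are all assumptions made for illustration.

```typescript
// Hypothetical sketch; NOT the actual @t3tools/shared/preview code.

/** Returns true when the host refers to the local machine. */
function isLoopbackHostSketch(host: string): boolean {
  // WHATWG URL hostnames keep IPv6 brackets ("[::1]"), so strip them first.
  const bare = host.replace(/^\[|\]$/g, "").toLowerCase();
  if (bare === "localhost" || bare.endsWith(".localhost")) return true;
  if (bare === "::1") return true;
  // All of 127.0.0.0/8 is loopback for IPv4.
  return /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(bare);
}

/** Normalizes typed input into an absolute http(s) URL, or null if unusable. */
function normalizePreviewUrlSketch(input: string): string | null {
  const trimmed = input.trim();
  if (trimmed === "") return null;
  // Assumption: bare input such as "localhost:5173" is treated as http://.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : `http://${trimmed}`;
  try {
    const url = new URL(candidate);
    // Assumption: only http and https are considered previewable.
    if (url.protocol !== "http:" && url.protocol !== "https:") return null;
    return url.toString();
  } catch {
    return null; // Not parseable as a URL at all.
  }
}
```

With these assumptions, `normalizePreviewUrlSketch("localhost:5173")` yields `"http://localhost:5173/"`, and `isLoopbackHostSketch(new URL("http://[::1]:3000").hostname)` is `true`.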
Adds an in-page element picker to the preview browser. Clicking the crosshair button in the chrome row activates a blue-highlight picker inside the guest webview; clicking an element captures its component name (via react-grab), source location, html/css preview, and selector, then attaches it to the chat composer as a chip that serializes into an `<element_context>` block in the outgoing message.

Architecture:
- Per-`<webview>` preload bundle (`preview-pick-preload.cjs`) renders the overlay, hosts the picker event loop, and bubbles the picked payload back to main via the per-WebContents `wc.ipc` channel (not `sendToHost`, which only fires on the host renderer's <webview> element and never reaches main).
- Main coordinates via `PreviewViewManager.pickElement(tabId)`, which cancels any in-flight session, force-focuses the guest (so the first click on a remote page actually reaches the preload), then awaits the payload. User-initiated cancels (Escape, beforeunload) echo `null` back to main; main-initiated cancels and supersession tear down silently to avoid the new-pick-resolves-with-stale-null race.
- Renderer fetches partition + webPreferences + preload URL in a single `getPreviewConfig()` IPC call, snapshots the previously-focused host element before triggering a pick, and restores focus when the pick resolves so the user's textarea cursor isn't lost.

Security posture for the guest webview:
- `webpreferences="contextIsolation=false,sandbox=true,nodeIntegration=false"` centralized in `preview-webview-preferences.ts`. contextIsolation off is required so react-grab's `getElementContext` can reach the page's React DevTools hook on `globalThis`. sandbox stays on so the page cannot reach Node APIs even with shared globals (without it, the preload's `require` would land on the page's `globalThis` and any third-party site could send arbitrary IPC to main).
- Defense in depth: a `will-attach-webview` handler in main, gated on the preview partition, force-pins `sandbox: true`, all `nodeIntegration*: false`, and the absolute preload PATH (not URL; that field rejects file:// URLs with "preload script must have absolute path" and silently disables the picker).

Composer + transcript integration:
- New `elementContexts` slice in `composerDraftStore` (mirrors the terminal-context slice: dedup by selector+tag+component+url, persist via partializer, restore on send-failure retry).
- `ComposerPendingElementContexts` chip row above the editor.
- `deriveDisplayedUserMessageState` now strips both `<element_context>` AND `<terminal_context>` blocks (element first, since it's appended last) and exposes element entries to `MessagesTimeline`, which renders them as compact chips beneath the message body.
- Pick button is disabled with explanatory tooltip when the page failed to load (the React `<PreviewUnreachable>` overlay covers the webview, so picks would silently dangle otherwise).

Tests added:
- `preview-webview-preferences.test.ts` locks down the security flags (contextIsolation=false, sandbox=true, nodeIntegration=false, no whitespace, only true/false literal values).
- `preview-pick-label-position.test.ts` covers the floating-label clamp/flip math (no off-screen overflow, flip-below when no room above, etc.).
- `picked-element-payload.test.ts` validator coverage.
- `elementContext.test.ts` for the serialization round-trip, normalization, dedup, and label formatting.
- `composerDraftStore.test.ts` element-contexts slice (add, dedup, remove, set, clear, persistence round-trip).
- `ChatView.logic.test.ts` sendable-content-with-element-only.

Build: new `tsdown` entry inlines react-grab + bippy into the picker preload bundle (~59KB / 19KB gzipped).
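The "no whitespace, only true/false literal values" constraint on the `webpreferences` attribute can be sketched as below. This is an illustration only: `GUEST_WEB_PREFERENCES` and `serializeWebPreferences` are hypothetical names, and the real definitions live in `preview-webview-preferences.ts`, which is not shown in this PR page.

```typescript
// Hypothetical sketch; the actual flags are defined in preview-webview-preferences.ts.
const GUEST_WEB_PREFERENCES = {
  contextIsolation: false, // off so the picker can see the page's React DevTools hook
  sandbox: true,           // keeps Node APIs out of the guest page
  nodeIntegration: false,
} as const;

/** Serializes boolean flags into the comma-separated `webpreferences` attribute format. */
function serializeWebPreferences(flags: Readonly<Record<string, boolean>>): string {
  return Object.entries(flags)
    .map(([name, value]) => `${name}=${value ? "true" : "false"}`)
    .join(","); // no spaces between entries
}

const webpreferencesAttribute = serializeWebPreferences(GUEST_WEB_PREFERENCES);
```

Deriving the attribute string from one typed object keeps the renderer-side attribute and any main-process enforcement from drifting apart, which appears to be the intent of centralizing the flags.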
Co-authored-by: codex <codex@users.noreply.github.com>
- Add structured annotation payload validation and tests
- Update preview preload to capture selected elements, regions, and strokes
- Wire new preview annotation UI into the web app

Co-authored-by: codex <codex@users.noreply.github.com>
const userImages = row.message.attachments ?? [];
const displayedUserMessage = deriveDisplayedUserMessageState(row.message.text);
const terminalContexts = displayedUserMessage.contexts;
const elementContextState = extractTrailingElementContexts(displayedUserMessage.visibleText);
🟠 High chat/MessagesTimeline.tsx:431
`extractTrailingElementContexts` is called on `displayedUserMessage.visibleText`, but `deriveDisplayedUserMessageState` already stripped element contexts from that text. Since `visibleText` is the text after stripping, `elementContextState.contexts` will always be empty and no element context chips render. Use `displayedUserMessage.elementContexts` directly instead.
- const elementContextState = extractTrailingElementContexts(displayedUserMessage.visibleText);
Evidence trail:
apps/web/src/lib/terminalContext.ts lines 248-262 (deriveDisplayedUserMessageState strips element contexts first at line 252, then terminal contexts at line 253; visibleText = extractedTerminal.promptText at line 255; elementContexts = extractedElement.contexts at line 260). apps/web/src/components/chat/MessagesTimeline.tsx line 431 (extractTrailingElementContexts called on already-stripped visibleText). apps/web/src/components/chat/MessagesTimeline.tsx line 470 (elementContextState.contexts.length > 0 check will always be false).
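The failure mode described above can be reproduced in isolation. The sketch below uses a hypothetical stand-in extractor (`extractTrailingElementContextsSketch`) and an assumed block format; it is not the code in `terminalContext.ts`, only a demonstration of why running an extractor on already-stripped text always yields an empty list.

```typescript
// Hypothetical stand-in; the real extractor and block format are not shown in this PR page.
interface ExtractedContexts {
  promptText: string;
  contexts: string[];
}

const OPEN = "<element_context>";
const CLOSE = "</element_context>";

/** Pulls trailing <element_context> blocks off the end of a message. */
function extractTrailingElementContextsSketch(text: string): ExtractedContexts {
  const contexts: string[] = [];
  let promptText = text.trimEnd();
  while (promptText.endsWith(CLOSE)) {
    const start = promptText.lastIndexOf(OPEN);
    if (start === -1) break; // malformed block: stop rather than loop forever
    contexts.unshift(promptText.slice(start + OPEN.length, -CLOSE.length).trim());
    promptText = promptText.slice(0, start).trimEnd();
  }
  return { promptText, contexts };
}

const raw = "Make this button blue\n\n<element_context>button.primary</element_context>";
// First pass: what the derive step has already done before the timeline runs.
const firstPass = extractTrailingElementContextsSketch(raw);
// Second pass on the stripped text: what the reviewed line effectively does.
const secondPass = extractTrailingElementContextsSketch(firstPass.promptText);
```

Here `firstPass.contexts` holds the one picked element while `secondPass.contexts` is empty, which matches the reviewer's point that the chips should come from the first extraction's result.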
const APP_FORWARDED_SHORTCUTS: ReadonlyArray<{
  key: string;
  meta: boolean;
  shift: boolean;
  control: boolean;
}> = Object.freeze([
  // mod+shift+J → preview.toggle
  { key: "j", meta: true, shift: true, control: false },
  // mod+K → command palette
  { key: "k", meta: true, shift: false, control: false },
  // mod+, → settings (macOS convention)
  { key: ",", meta: true, shift: false, control: false },
  // mod+W → close tab/panel
  { key: "w", meta: true, shift: false, control: false },
]);
🟡 Medium src/preview-view-manager.ts:137
The `APP_FORWARDED_SHORTCUTS` entries use `meta: true, control: false`, which only matches macOS Command-key shortcuts. On Windows/Linux, `input.control` is `true` and `input.meta` is `false` for these same shortcuts, so the exact equality check in `isAppShortcut` fails and Ctrl+K, Ctrl+W, etc. are not forwarded to the main window. Since the comments use "mod+" (platform modifier), this appears intended to be cross-platform. Consider normalizing both the definition and input to use the platform modifier key, or match either `meta` or `control` being true.
shift: boolean;
control: boolean;
}> = Object.freeze([
// mod+shift+J → preview.toggle
- { key: "j", meta: true, shift: true, control: false },
+ { key: "j", shift: true },
// mod+K → command palette
- { key: "k", meta: true, shift: false, control: false },
+ { key: "k" },
// mod+, → settings (macOS convention)
- { key: ",", meta: true, shift: false, control: false },
+ { key: "," },
// mod+W → close tab/panel
- { key: "w", meta: true, shift: false, control: false },
+ { key: "w" },
]);
Evidence trail:
apps/desktop/src/preview-view-manager.ts lines 137-151 (APP_FORWARDED_SHORTCUTS definition with meta:true, control:false), lines 557-566 (isAppShortcut strict equality check on meta and control fields), lines 522-538 (before-input-event handler using isAppShortcut). Platform support evidence: apps/desktop/src/shell/DesktopShellEnvironment.ts lines 52-333, apps/desktop/src/app/DesktopLifecycle.ts line 219, apps/desktop/src/window/DesktopWindow.ts lines 84-85 (all showing win32/linux support).
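One way to normalize on a platform modifier, as the comment suggests, is sketched below. It is only an illustration: `KeyInput` models the `key`/`meta`/`control`/`shift`/`alt` fields of Electron's `before-input-event` input, while `ModShortcut`, `FORWARDED`, and `isAppShortcutSketch` are hypothetical names, and passing `platform` as a parameter (instead of reading `process.platform`) is a choice made here to keep the check testable.

```typescript
// Subset of the fields on Electron's before-input-event `input` object.
interface KeyInput {
  key: string;
  meta: boolean;
  control: boolean;
  shift: boolean;
  alt: boolean;
}

// Hypothetical shortcut shape: "mod" is implied, only key and shift vary.
interface ModShortcut {
  key: string;
  shift: boolean;
}

const FORWARDED: ReadonlyArray<ModShortcut> = [
  { key: "j", shift: true },  // mod+shift+J → preview.toggle
  { key: "k", shift: false }, // mod+K → command palette
  { key: ",", shift: false }, // mod+, → settings
  { key: "w", shift: false }, // mod+W → close tab/panel
];

/** Returns true when the input is one of the forwarded mod+key shortcuts. */
function isAppShortcutSketch(input: KeyInput, platform: string): boolean {
  // "mod" is Command on macOS and Control elsewhere; require exactly that one.
  const mod =
    platform === "darwin"
      ? input.meta && !input.control
      : input.control && !input.meta;
  if (!mod || input.alt) return false;
  // Shift changes the reported key case ("J" vs "j"), so compare lowercased.
  const key = input.key.toLowerCase();
  return FORWARDED.some((s) => s.key === key && s.shift === input.shift);
}
```

For example, `isAppShortcutSketch({ key: "k", meta: false, control: true, shift: false, alt: false }, "win32")` is `true`, while the same input on `"darwin"` is `false`.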
interface ManagedListeners {
  navigate: () => void;
  failed: (event: Event, code: number, description: string) => void;
}
🟡 Medium src/preview-view-manager.ts:127
The `before-input-event` handler attached in `attachListeners` is never removed in `detachListeners`. `ManagedListeners` only tracks `navigate` and `failed`, so the shortcut forwarding listener leaks when a tab's `webContentsId` changes. The orphaned handler continues forwarding keystrokes to `mainWindow`, and re-attaching the same webContents creates duplicate handlers.
interface ManagedListeners {
navigate: () => void;
failed: (event: Event, code: number, description: string) => void;
+ shortcut: (event: Electron.Event, input: Electron.Input) => void;
}
Evidence trail:
apps/desktop/src/preview-view-manager.ts lines 127-130 (ManagedListeners interface only has navigate and failed), lines 486-541 (attachListeners adds before-input-event at line 524 as anonymous function, stores only navigate/sync and failed in ManagedListeners at line 540), lines 543-555 (detachListeners removes did-navigate, did-navigate-in-page, page-title-updated, did-start-loading, did-stop-loading, did-fail-load but NOT before-input-event), lines 254-255 (registerWebview calls detachListeners on old webContentsId when it changes).
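The general remedy for this class of leak is to record every listener at attach time so that detach can remove all of them. The sketch below shows that pattern with the standard `EventTarget` as a stand-in for Electron's `WebContents` (which is an `EventEmitter` with `on`/`removeListener` instead). `ListenerRegistry` is a hypothetical helper, not code from this PR.

```typescript
// Hypothetical sketch; EventTarget stands in for Electron's WebContents.
type Listener = (event: Event) => void;

class ListenerRegistry {
  private readonly tracked = new Map<EventTarget, Array<[string, Listener]>>();

  /** Attaches a listener and remembers it so detach() can remove it later. */
  attach(target: EventTarget, type: string, listener: Listener): void {
    target.addEventListener(type, listener);
    const entries = this.tracked.get(target) ?? [];
    entries.push([type, listener]);
    this.tracked.set(target, entries);
  }

  /** Removes every listener this registry attached to the target. */
  detach(target: EventTarget): void {
    for (const [type, listener] of this.tracked.get(target) ?? []) {
      target.removeEventListener(type, listener);
    }
    this.tracked.delete(target);
  }
}

// Usage: the shortcut handler stops firing once the guest is detached.
const guest = new EventTarget();
const registry = new ListenerRegistry();
let forwardedCount = 0;
registry.attach(guest, "before-input-event", () => {
  forwardedCount += 1;
});
guest.dispatchEvent(new Event("before-input-event")); // handler runs
registry.detach(guest);
guest.dispatchEvent(new Event("before-input-event")); // handler no longer runs
```

Because the registry stores the exact function reference, an anonymous handler can still be removed later, which is the piece the reviewed `attachListeners` is reported to be missing.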
{shouldUsePlanSidebarSheet && previewPanelOpen && activeThreadRef ? (
  <RightPanelSheet open onClose={closePreviewPanel}>
    <Suspense fallback={null}>
      <PreviewPanel mode="sheet" threadRef={activeThreadRef} visible />
    </Suspense>
  </RightPanelSheet>
) : null}
🟢 Low components/ChatView.tsx:4243
The mobile preview sheet at line 4243 conditionally renders based on `previewPanelOpen`, so it unmounts instantly when closed. The plan sidebar sheet at line 4250 stays mounted with `open={planSidebarOpen}`, allowing the `@base-ui/react` `Sheet` closing animation to play. This causes the preview panel to disappear jarringly on mobile instead of animating smoothly like the plan sidebar.
- {shouldUsePlanSidebarSheet && previewPanelOpen && activeThreadRef ? (
+ {shouldUsePlanSidebarSheet && activeThreadRef ? (
<RightPanelSheet open onClose={closePreviewPanel}>
<Suspense fallback={null}>
- <PreviewPanel mode="sheet" threadRef={activeThreadRef} visible />
+ <PreviewPanel mode="sheet" threadRef={activeThreadRef} visible={previewPanelOpen} />
</Suspense>
</RightPanelSheet>
) : null}
Evidence trail:
apps/web/src/components/ChatView.tsx lines 4243-4264 (REVIEWED_COMMIT) — preview panel conditional mount vs. plan sidebar staying mounted.
apps/web/src/components/RightPanelSheet.tsx lines 6-29 (REVIEWED_COMMIT) — `keepMounted` on SheetPopup, `open` prop passed through to `Sheet`.
apps/web/src/components/ui/sheet.tsx line 3 — imports `@base-ui/react/dialog` as the Sheet primitive.
useEffect(() => {
  setMountedPreviewThreadKeys((currentThreadIds) => {
    const nextThreadIds = reconcileRetainedMountedThreadIds({
      currentThreadIds,
      openThreadIds: existingPreviewSessionThreadKeys,
      activeThreadId: activeThreadKey,
      activeThreadOpen: Boolean(activeThreadKey && !shouldUsePlanSidebarSheet),
      maxHiddenThreadCount: MAX_HIDDEN_MOUNTED_PREVIEW_THREADS,
      retainInactiveActiveThread: true,
    });
    return currentThreadIds.length === nextThreadIds.length &&
      currentThreadIds.every((nextThreadId, index) => nextThreadId === nextThreadIds[index])
      ? currentThreadIds
      : nextThreadIds;
  });
}, [
  activeThreadKey,
  existingPreviewSessionThreadKeys,
  previewPanelOpen,
  shouldUsePlanSidebarSheet,
]);
🟢 Low components/ChatView.tsx:1149
`previewPanelOpen` is declared in the dependency array of the `setMountedPreviewThreadKeys` effect (line 1167) but is never read inside the effect body. The reconciliation logic uses `activeThreadOpen: Boolean(activeThreadKey && !shouldUsePlanSidebarSheet)` instead. This causes the effect to re-run whenever `previewPanelOpen` toggles without changing the output, wasting renders and obscuring that the mounted preview set does not actually react to preview open/close state.
activeThreadKey,
existingPreviewSessionThreadKeys,
- previewPanelOpen,
shouldUsePlanSidebarSheet,
]);
Evidence trail:
apps/web/src/components/ChatView.tsx lines 1149-1169 (effect body and dependency array at REVIEWED_COMMIT), line 1109 (`previewPanelOpen` definition), line 933 (`shouldUsePlanSidebarSheet` definition), lines 1134-1148 (parallel terminal effect for comparison).
const key = input.key;
const text = key.length === 1 ? key : undefined;
const params = {
  key,
🟠 High src/preview-view-manager.ts:759
The `code` field at line 762 generates invalid CDP key codes for single-character non-letter keys. For digits like `"1"`, the expression `` `Key${key.toUpperCase()}` `` produces `"Key1"` when CDP expects `"Digit1"`; for space it produces `"Key "` when CDP expects `"Space"`. This causes `Input.dispatchKeyEvent` to fail or send the wrong key when automation presses digits, punctuation, or space.
- const code = key.length === 1 ? `Key${key.toUpperCase()}` : key;
Evidence trail:
apps/desktop/src/preview-view-manager.ts lines 759-768 (code generation logic), packages/contracts/src/previewAutomation.ts lines 70-73 (key is TrimmedNonEmptyString, no letter restriction), CDP docs at https://chromedevtools.github.io/devtools-protocol/tot/Input/#method-dispatchKeyEvent (code param: 'Unique DOM defined string value for each physical key (e.g., KeyA)'), W3C UI Events spec defines Digit0-Digit9 for digit keys not Key0-Key9.
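A possible shape for the corrected mapping is sketched below. The `code` values are the W3C UI Events names for a US layout; `keyToCodeSketch` and `PUNCTUATION_CODES` are hypothetical names, the punctuation table is deliberately partial, and passing named keys through unchanged is an assumption that holds for keys such as `Enter` and `Tab` but not for every key (for example `Shift` maps to `ShiftLeft` or `ShiftRight`).

```typescript
// Partial US-layout table of W3C UI Events `code` values for unshifted punctuation.
const PUNCTUATION_CODES: Readonly<Record<string, string>> = {
  " ": "Space",
  ",": "Comma",
  ".": "Period",
  "/": "Slash",
  ";": "Semicolon",
  "'": "Quote",
  "[": "BracketLeft",
  "]": "BracketRight",
  "\\": "Backslash",
  "-": "Minus",
  "=": "Equal",
  "`": "Backquote",
};

/** Maps a `key` value to a physical `code`, or undefined when no mapping is known. */
function keyToCodeSketch(key: string): string | undefined {
  // Assumption: named keys ("Enter", "Tab", "Escape") use the same string for code.
  if (key.length !== 1) return key;
  if (/^[a-z]$/i.test(key)) return `Key${key.toUpperCase()}`;
  if (/^[0-9]$/.test(key)) return `Digit${key}`;
  // Falls back to the punctuation table; unknown characters yield undefined.
  return PUNCTUATION_CODES[key];
}
```

Returning `undefined` for unmapped characters lets the caller omit `code` from the dispatched event instead of sending a string such as `"Key1"` that names no physical key.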
Summary
Adds a complete integrated browser workflow to T3 Code, spanning the web UI, Electron guest webview, environment server, provider sessions, and shared contracts.
`preview_*` automation tools
Agent automation
The environment server now hosts one reusable Streamable HTTP MCP endpoint at `/mcp`. Provider sessions receive short-lived, capability-scoped bearer credentials when they start or resume; only token hashes are retained, and credentials are revoked with the provider session.
The preview toolkit supports:
Automation is routed through a preview broker to the focused desktop owner and then executed against the existing visible Electron webview via CDP. It does not launch a separate headless browser or per-thread MCP process, so the agent and user share the same page, cookies, navigation history, and visual state.
Provider integration covers Codex, Claude, Cursor, Grok, and OpenCode session startup/resume paths.
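The credential handling described in this section (short-lived tokens, only hashes retained, revocation tied to the provider session) could look roughly like the sketch below. It is an assumption-laden illustration, not the server's code: `ScopedCredentialStore` and its methods are hypothetical names, SHA-256 is an assumed hash choice, and capability scoping is left out for brevity.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical sketch; names and hash choice are illustrative, not the server's API.
class ScopedCredentialStore {
  // Maps SHA-256(token) -> session metadata. The raw token is never stored.
  private readonly byHash = new Map<string, { sessionId: string; expiresAt: number }>();

  private static hash(token: string): string {
    return createHash("sha256").update(token).digest("hex");
  }

  /** Issues a random bearer token for a session and retains only its hash. */
  issue(sessionId: string, ttlMs: number, now: number = Date.now()): string {
    const token = randomBytes(32).toString("base64url");
    this.byHash.set(ScopedCredentialStore.hash(token), {
      sessionId,
      expiresAt: now + ttlMs,
    });
    return token;
  }

  /** Returns the owning session id if the token is known and unexpired, else null. */
  verify(token: string, now: number = Date.now()): string | null {
    const entry = this.byHash.get(ScopedCredentialStore.hash(token));
    if (!entry || entry.expiresAt <= now) return null;
    return entry.sessionId;
  }

  /** Revokes every credential that belongs to the given session. */
  revokeSession(sessionId: string): void {
    this.byHash.forEach((entry, hash) => {
      if (entry.sessionId === sessionId) this.byHash.delete(hash);
    });
  }
}
```

Storing only the hash means a leaked copy of the store does not yield usable bearer tokens, and keying revocation on the session id matches the stated rule that credentials end with the provider session.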
Preview and annotation architecture
- `apps/server` owns local-server discovery, preview session state, WebSocket RPCs, MCP authentication, scoped provider credentials, and automation request routing.
- `apps/web` owns the right-side preview experience, per-thread state, focused automation ownership, composer attachments, and preview lifecycle UX.
- `apps/desktop` owns the sandboxed Electron webview, navigation/zoom state, screenshot capture, element picking, annotation overlays, and CDP execution.
- `packages/contracts` and `packages/client-runtime` define the shared preview, IPC, RPC, annotation, and automation protocols.
The picker preload intentionally uses `contextIsolation=false` so React component metadata is visible, while retaining `sandbox=true` and `nodeIntegration=false`; the main process also enforces the security-critical guest preferences before attachment.
Reliability
- `202 Accepted` for Codex Streamable HTTP compatibility
User impact
Users can discover and open a local app inside T3 Code, inspect and annotate the actual rendered page, attach precise visual context to a prompt, and ask the coding agent to operate that same visible browser directly.
Validation
- `vp check`
- `vp run typecheck`
- `vp test` (3,438 passed, 7 skipped)
- ready; `preview_open`, `preview_status`, and `preview_snapshot` executed against the integrated `t3.chat` webview