feat(cloud-agent): cache prepared workspaces#3974

Open
eshurakov wants to merge 2 commits into main from viridian-telephone

Summary

Why

Fresh Cloud Agent sessions repeatedly clone repositories and rebuild the same setup-produced workspace state. Reusing a prepared repository directory can shorten cold starts, but the cache must not become authoritative for Git state, credentials, or session-specific runtime data.

What was done

  • Cache eligible repository workspaces for 24 hours through Cloudflare Sandbox directory backups, with an opaque owner-scoped index stored in R2.
  • Restrict reuse to fresh, non-devcontainer sessions with at least one setup command; sessions without setup work continue to use the cheaper direct-clone path.
  • Derive cache identity from the canonical credential-free repository, clone shape, ordered setup commands, wrapper version, owner, and declared setup environment identities without hashing decrypted secret plaintext.
  • Treat restored directories as warm bases: install current credentials, fetch authoritative Git state, recreate session-specific files, and rerun every setup command before readiness.
  • Fall back to a cold workspace for archive/validation or explicit Git reconciliation failures, while keeping publication failures nonfatal and filesystem/authentication restoration failures visible.
  • Emit volatile restore/publication progress plus bounded structured lifecycle logs, and keep the shared pnpm store outside both session homes and workspace archives.
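One way to picture the cache identity described above is a hash over a fixed-order tuple of non-secret inputs. This is a minimal sketch, not the PR's implementation: `CacheIdentityInput`, `canonicalRepo`, `computeCacheKey`, and every field name are hypothetical, and the environment entries are assumed to carry an opaque version identifier rather than a decrypted value.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape; field names are illustrative only.
interface CacheIdentityInput {
  repoUrl: string;
  cloneShape: { ref: string; depth: number | null };
  setupCommands: string[]; // order is significant
  wrapperVersion: string;
  ownerId: string;
  // Declared identities only (name plus opaque version id), never plaintext.
  envIdentities: { name: string; versionId: string }[];
}

// Strip userinfo and fragment so embedded tokens never reach the hash.
function canonicalRepo(repoUrl: string): string {
  const url = new URL(repoUrl);
  url.username = "";
  url.password = "";
  url.hash = "";
  return url.toString().replace(/\.git$/, "");
}

function computeCacheKey(input: CacheIdentityInput): string {
  // Sort environment identities by name so declaration order is irrelevant.
  const env = [...input.envIdentities]
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    .map((e) => [e.name, e.versionId]);
  // Setup commands keep their original order because order changes the result.
  const material = JSON.stringify([
    canonicalRepo(input.repoUrl),
    input.cloneShape.ref,
    input.cloneShape.depth,
    input.setupCommands,
    input.wrapperVersion,
    input.ownerId,
    env,
  ]);
  return createHash("sha256").update(material).digest("hex");
}
```

Under these assumptions, rotating a token embedded in the clone URL leaves the key unchanged, while reordering setup commands or changing the owner produces a different key.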

High-level architecture

sequenceDiagram
  participant Session as Cloud Agent session
  participant Adapter as Sandbox adapter
  participant R2 as R2 backup index
  participant Sandbox as Cloudflare Sandbox
  participant Wrapper as Session wrapper

  Session->>Adapter: Ensure prepared wrapper
  Adapter->>R2: Load owner-scoped backup record
  alt Valid cache hit
    Adapter->>Sandbox: Restore repository directory
    Adapter->>Wrapper: Ensure session ready from warm base
    Wrapper->>Wrapper: Reconcile Git and rerun setup
    alt Reconciliation fails
      Wrapper-->>Adapter: WORKSPACE_RECONCILIATION_FAILED
      Adapter->>Sandbox: Remove restored workspace
      Adapter->>Wrapper: Retry normal cold bootstrap
      Adapter->>Sandbox: Create directory backup
      Adapter->>R2: Publish validated backup record
    end
  else Miss or ineligible
    Adapter->>Wrapper: Run normal cold bootstrap
    opt Eligible miss
      Adapter->>Sandbox: Create directory backup
      Adapter->>R2: Publish validated backup record
    end
  end
  Adapter-->>Session: Session ready
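
The branches in the diagram can also be read as adapter-side control flow. The sketch below is a simplified illustration with injected dependencies: `AdapterDeps`, `prepareWorkspace`, `publishQuietly`, and the outcome names are hypothetical, only the `WORKSPACE_RECONCILIATION_FAILED` code comes from the diagram, and the authenticated-origin restoration that can make publication fatal is left out.

```typescript
type Outcome = "warm" | "cold_after_reconcile_failure" | "cold";

// Hypothetical dependency surface, injected so the flow can be exercised in isolation.
interface AdapterDeps {
  loadRecord(): Promise<{ backupId: string } | null>;
  restore(backupId: string): Promise<void>;
  ensureReady(mode: "warm" | "cold"): Promise<void>;
  removeWorkspace(): Promise<void>;
  publishBackup(): Promise<void>;
}

function isReconciliationFailure(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "WORKSPACE_RECONCILIATION_FAILED"
  );
}

// Publication is best effort in this sketch: a failure never blocks readiness.
async function publishQuietly(deps: AdapterDeps): Promise<void> {
  try {
    await deps.publishBackup();
  } catch {
    // Nonfatal; a real implementation would log the failure.
  }
}

async function prepareWorkspace(deps: AdapterDeps, eligible: boolean): Promise<Outcome> {
  const record = eligible ? await deps.loadRecord() : null;
  if (record) {
    await deps.restore(record.backupId);
    try {
      // The warm path still reconciles Git and reruns setup inside the wrapper.
      await deps.ensureReady("warm");
      return "warm";
    } catch (err) {
      if (!isReconciliationFailure(err)) throw err; // other failures stay visible
      await deps.removeWorkspace();
      await deps.ensureReady("cold");
      await publishQuietly(deps);
      return "cold_after_reconcile_failure";
    }
  }
  await deps.ensureReady("cold");
  if (eligible) await publishQuietly(deps);
  return "cold";
}
```

Errors other than the reconciliation code are rethrown rather than retried, which mirrors the intent that filesystem and authentication restoration failures remain visible.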

Architecture decision

Decision: Use Cloudflare Sandbox directory backups as owner-scoped warm repository bases, while preserving authoritative Git reconciliation and setup execution for every restored session.

Context: Full runtime snapshots would capture session-specific credentials, process state, and home-directory data, while skipping reconciliation or setup would trust cached state beyond its safe lifetime.

Rationale: Repository-only backups reuse setup-produced filesystem state without changing the existing wrapper's ownership of Git reconciliation, session materialization, and setup semantics. Credential-free cache identity and current-token restoration keep authentication outside the persisted contract.

Alternatives considered:

  • Full container snapshots. Rejected because they widen the persisted state boundary to wrapper processes, session homes, and runtime credentials.
  • Skip setup on cache hits. Rejected because setup commands and current materialized inputs remain authoritative; cached files are an optimization, not proof of readiness.
  • Cache repositories without setup commands. Rejected because direct cloning is expected to be cheaper when there are no post-clone artifacts to reuse.

Consequences: Cache hits still pay Git reconciliation and setup costs, and setup-generated untracked files may survive into the warm base. In return, cache failure can remain isolated from session correctness and most misses degrade to the existing cold path.
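
The eligibility rule that follows from this decision (fresh session, no devcontainer, at least one setup command) fits in a single predicate. The sketch below is illustrative only; `SessionShape` and `isCacheEligible` are hypothetical names, not identifiers from this PR.

```typescript
// Hypothetical description of a session as seen by the cache gate.
interface SessionShape {
  isFresh: boolean;
  usesDevcontainer: boolean;
  setupCommands: string[];
}

// Sessions with no setup commands stay on the direct-clone path,
// since there are no post-clone artifacts worth reusing.
function isCacheEligible(session: SessionShape): boolean {
  return (
    session.isFresh &&
    !session.usesDevcontainer &&
    session.setupCommands.length > 0
  );
}
```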

Verification

  • Not manually verified against a live Cloudflare Sandbox and production R2 environment in this session.

Visual Changes

N/A

Reviewer Notes

  • Review organization ownership as the confidentiality boundary for setup-generated workspace files; personal caches remain user-scoped.
  • Restore/setup equivalence depends on setup commands tolerating pre-existing untracked or ignored files because restored hits rerun setup without treating the cache as authoritative.
  • Expired records are rejected after 24 hours, but physical archive/index cleanup remains an operational lifecycle concern.
  • Backup publication is synchronous after wrapper readiness and before preparation returns; publication failures remain nonfatal except when the authenticated Git origin cannot be restored.
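
For the 24-hour expiry note, a read-time check along these lines would reject stale records without deleting anything, which is why physical cleanup stays a separate concern. This is a sketch under assumptions: `BackupRecord`, `createdAtMs`, and `isRecordUsable` are hypothetical, and rejecting future-dated records is an extra guard not stated in the PR.

```typescript
const CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

// Hypothetical index record; the real record shape is not shown in this PR description.
interface BackupRecord {
  backupId: string;
  createdAtMs: number; // epoch milliseconds at publication time
}

// Expiry is enforced when the record is read; the archive and index entry are left in place.
function isRecordUsable(record: BackupRecord, nowMs: number): boolean {
  const ageMs = nowMs - record.createdAtMs;
  // Assumption: a record dated in the future is treated as invalid.
  return ageMs >= 0 && ageMs < CACHE_TTL_MS;
}
```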

Comment thread services/cloud-agent-next/wrapper/src/session-bootstrap.ts Outdated
kilo-code-bot Bot commented Jun 11, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Executive Summary

Both previously flagged issues have been resolved: backupMode() now logs workspace_backup.configuration.disabled before degrading on an invalid WORKER_URL, and the duplicate writeCloudAgentRules call has been removed from prepareWrapperBootstrapWorkspace.

Resolved Issues

| File | Issue | Status |
| --- | --- | --- |
| services/cloud-agent-next/src/agent-sandbox/cloudflare/cloudflare-agent-sandbox.ts | backupMode() silently disabled backups with no signal on invalid WORKER_URL | ✅ Fixed: logWorkspaceBackupDisabled('invalid_worker_url') now called in the catch block |
| services/cloud-agent-next/wrapper/src/session-bootstrap.ts | writeCloudAgentRules called twice in same execution path | ✅ Fixed: duplicate call removed |

Files Reviewed (20 files)
  • services/cloud-agent-next/.dev.vars.example
  • services/cloud-agent-next/src/agent-sandbox/cloudflare/cloudflare-agent-sandbox.ts
  • services/cloud-agent-next/src/agent-sandbox/cloudflare/cloudflare-agent-sandbox.test.ts
  • services/cloud-agent-next/src/agent-sandbox/protocol.ts
  • services/cloud-agent-next/src/kilo/wrapper-client.ts
  • services/cloud-agent-next/src/kilo/wrapper-client.test.ts
  • services/cloud-agent-next/src/session-service.ts
  • services/cloud-agent-next/src/shared/runtime-environment.ts
  • services/cloud-agent-next/src/shared/wrapper-bootstrap.ts
  • services/cloud-agent-next/src/types.ts
  • services/cloud-agent-next/src/workspace-backup-cache.ts
  • services/cloud-agent-next/src/workspace-backup-observability.ts
  • services/cloud-agent-next/src/workspace-backup-observability.test.ts
  • services/cloud-agent-next/worker-configuration.d.ts
  • services/cloud-agent-next/wrangler.jsonc
  • services/cloud-agent-next/wrapper/src/main.ts
  • services/cloud-agent-next/wrapper/src/server.ts
  • services/cloud-agent-next/wrapper/src/session-bootstrap.ts

Reviewed by claude-4.6-sonnet-20260217 · 265,339 tokens

Review guidance: REVIEW.md from base branch main
