Describe the bug
The built-in grep/search tool, when the copilot_cli_tgrep experiment is assigned, replaces ripgrep with the native Rust tgrep trigram indexer. At session startup the CLI spawns a persistent daemon (tgrep serve) over the entire working-directory tree. The startup orchestrator gates this on a lower file-count threshold only: there is no upper bound and no memory cap. The trigram build holds the whole index plus intermediate structures in RAM (the on-disk index is 1.9 GB; live anon RSS reaches ~45 GB, ≈24×). On a large monorepo this exhausts host memory and the Linux kernel OOM-killer reaps the process.
Observed in an internal monorepo (370,925 tracked text files) in WSL2 (46 GiB RAM):
tgrep was OOM-killed twice at anon-rss ~46–47 GB (total-vm ~60 GB).
A third build was caught live: its resident memory rose without interruption (0.3 GB → 4.7 GB over 20 min, every sample ≥ the previous) while it pegged ~95% of one CPU core — the same trajectory as the two kills.
Because the indexer auto-starts during session startup (it runs tgrep count-files, sees the repo is over threshold, and spawns serve), merely opening or resuming a session in the repo root triggers it — independent of any prompt or tool call. When it OOMs it can take swap and unrelated processes down with it, effectively wedging the WSL VM.
This is not the same as the existing Node.js JavaScript heap out of memory reports (e.g. #841, #1457, #1386, #2132). This is a native child process killed by the kernel OOM-killer, with no JS heap message.
Note: the feature was not opted into. USE_TGREP is unset; it is enabled via a server-side experiment assignment (copilot_cli_tgrep) found in ~/.cache/copilot/exp-cache.json.
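To check which of the two enablement paths applies on a given machine, a small helper along these lines may be useful. It is only a sketch: the structure of exp-cache.json is not known here, so it merely looks for the experiment name as a substring, and the precedence it gives to USE_TGREP over the cache is an assumption, not confirmed CLI behavior. The function name tgrep_enablement is hypothetical.

```python
import os
from pathlib import Path

EXPERIMENT = "copilot_cli_tgrep"
# Cache location as given in the report.
CACHE_PATH = Path.home() / ".cache" / "copilot" / "exp-cache.json"


def tgrep_enablement(cache_text, env):
    """Guess why tgrep is (or is not) active.

    Assumption: an explicit USE_TGREP value wins over the experiment cache.
    The cache is treated as opaque text because its schema is unknown.
    """
    use_tgrep = env.get("USE_TGREP")
    if use_tgrep == "false":
        return "disabled via USE_TGREP=false"
    if use_tgrep == "true":
        return "enabled via USE_TGREP=true"
    if EXPERIMENT in cache_text:
        return "possibly enabled via experiment assignment"
    return "no tgrep signal found"


if __name__ == "__main__":
    text = CACHE_PATH.read_text(errors="replace") if CACHE_PATH.exists() else ""
    print(tgrep_enablement(text, os.environ))
```

A substring match only shows that the experiment is mentioned in the cache, not what value it was assigned, hence the "possibly" in the result.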
Affected version
GitHub Copilot CLI 1.0.66-2
Steps to reproduce the behavior
Be assigned the copilot_cli_tgrep experiment (or set USE_TGREP=true).
cd into a git repository with ≥ 50,000 tracked text files (≥ 10,000 on Windows). Example: 370,925 files (verified: tgrep count-files . → 370925 text files (12504 binary skipped, 0 errors)).
Start or resume a session there: copilot --resume --add-dir <repo-root>.
The CLI logs Starting tgrep serve (index: …, cwd: <repo-root>) and spawns the daemon. From then on its resident memory only ever increases — every sampled RSS is ≥ the previous, with no plateau or drop — while it keeps ~one CPU core busy: an active, unbounded in-memory build.
On a host without ~50+ GB free, the kernel OOM-killer kills tgrep; the CLI keeps polling tgrep status (~1/sec) and can respawn into a fresh cold rebuild.
Kernel OOM-killer evidence (two separate kills):
Jun 29 21:40:52 kernel: Out of memory: Killed process 2817274 (tgrep) total-vm:59968024kB, anon-rss:46125720kB, file-rss:1336kB, shmem-rss:0kB, UID:1000 pgtables:113820kB
Jun 30 00:46:09 kernel: Out of memory: Killed process 3026540 (tgrep) total-vm:60820144kB, anon-rss:47212960kB, file-rss:1436kB, shmem-rss:0kB, UID:1000 pgtables:115496kB
Jun 30 00:46:09 kernel: oom-kill:constraint=CONSTRAINT_NONE,...,task=tgrep,pid=3026540,uid=1000
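For anyone triaging similar kills, the kernel lines above can be reduced to numbers with a short parser. This is a sketch, not part of the CLI; parse_oom_kill is a hypothetical name. The kernel's kB unit is 1024 bytes, so the helper reports GiB; the rounder "46–47 GB" figures quoted in this report correspond to dividing the kB value by 10^6.

```python
import re

# Matches the "Killed process" line emitted by the Linux OOM-killer.
OOM_LINE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)

KIB_PER_GIB = 1024 ** 2


def parse_oom_kill(line):
    """Return pid, command name, and sizes in GiB, or None if not an OOM kill line."""
    m = OOM_LINE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "comm": m["comm"],
        "total_vm_gib": int(m["total_vm"]) / KIB_PER_GIB,
        "anon_rss_gib": int(m["anon_rss"]) / KIB_PER_GIB,
    }


if __name__ == "__main__":
    sample = (
        "Jun 29 21:40:52 kernel: Out of memory: Killed process 2817274 (tgrep) "
        "total-vm:59968024kB, anon-rss:46125720kB, file-rss:1336kB, "
        "shmem-rss:0kB, UID:1000 pgtables:113820kB"
    )
    info = parse_oom_kill(sample)
    print(f"{info['comm']} pid={info['pid']} anon-rss={info['anon_rss_gib']:.1f} GiB")
```

Read in GiB, both kills land at roughly 44 to 45 GiB of anonymous memory on a 46 GiB host, which is essentially all of RAM.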
Live third occurrence (caught mid-build, idle session, no active prompt):
tgrep pid=3114713 cpu=94.8% rss=4.70GB elapsed=20:19 # still climbing; on-disk index already 1.9 GB
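The "every sample ≥ the previous" observation can be reproduced with a small sampler. The sketch below is Linux-only and reads VmRSS from /proc/<pid>/status; the function names, sample count, and interval are illustrative choices, not the tooling that was actually used to capture the figures in this report.

```python
import time
from pathlib import Path


def read_rss_kb(pid):
    """Return the resident set size of a process in kB, from /proc (Linux only)."""
    for line in Path(f"/proc/{pid}/status").read_text().splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError(f"no VmRSS entry for pid {pid}")


def never_decreases(samples):
    """True if each sample is >= the one before it (no plateau-then-drop)."""
    return all(later >= earlier for earlier, later in zip(samples, samples[1:]))


def sample_rss(pid, count, interval_s):
    """Collect `count` RSS samples, sleeping `interval_s` seconds between them."""
    samples = []
    for i in range(count):
        samples.append(read_rss_kb(pid))
        if i < count - 1:
            time.sleep(interval_s)
    return samples
```

For example, never_decreases(sample_rss(pid, 20, 60)) covers a 20-minute window comparable to the one described above, where pid is the tgrep daemon's process ID.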
Expected behavior
The indexer must never be able to exhaust host memory or OOM unrelated processes. Specifically:
Upper bound / ceiling. Add a maximum file-count (and/or total-bytes) above which the tool falls back to ripgrep instead of indexing. Today the only gate is the lower threshold (fileCount < lY → skip); there is no "too large to index safely" gate, so the biggest repos — exactly where the feature is meant to help — are where it fails hardest.
Memory budget for the build. Cap/stream the index build (chunked, bounded working set) so peak RSS is predictable and far below host RAM. tgrep currently exposes no --max-memory flag.
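To make the "chunked, bounded working set" request concrete, here is a toy trigram indexer that flushes a segment whenever a posting budget is reached. It is an illustration of the general technique only: tgrep's real data structures are not known here, and build_segments, max_postings, and the in-memory "flush" (a yield standing in for a write to disk) are all assumptions.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of distinct 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_segments(docs, max_postings):
    """Yield index segments of bounded size.

    docs is an iterable of (doc_id, text). A segment maps trigram -> doc ids.
    The working set stays at or below max_postings postings, except when a
    single document alone exceeds the budget.
    """
    segment = defaultdict(list)
    postings = 0
    for doc_id, text in docs:
        grams = trigrams(text)
        if postings and postings + len(grams) > max_postings:
            yield dict(segment)  # stand-in for flushing the segment to disk
            segment = defaultdict(list)
            postings = 0
        for gram in grams:
            segment[gram].append(doc_id)
        postings += len(grams)
    if postings:
        yield dict(segment)


def candidates(segments, needle):
    """Doc ids that contain every trigram of needle, checked segment by segment."""
    grams = trigrams(needle)
    hits = set()
    for segment in segments:
        sets = [set(segment.get(g, ())) for g in grams]
        if sets:
            hits |= set.intersection(*sets)
    return hits
```

Because each document lands in exactly one segment, a query can intersect postings within a segment and union the results, so peak memory during the build is governed by the budget rather than by repository size.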
Gate: after tgrep count-files, the startup code evaluates else if (u < lY) return { outcome: "skipped_below_threshold" }, where lY is the lower file-count threshold; it then immediately sets the root and spawns tgrep serve. No upper-bound check exists between the threshold test and the spawn.
Off-switch (verified): process.env.USE_TGREP === "false" → "tgrep disabled via USE_TGREP=false" → ripgrep fallback. (USE_BUILTIN_RIPGREP=false also forces fallback.)
On-disk index for 1JS: 1.9 GB; peak live anon RSS at kill: ~45 GB (~24×).
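The requested ceiling amounts to one extra branch between the threshold test and the spawn. The sketch below restates the gate in Python for readability; the lower thresholds (10,000 on win32, 50,000 elsewhere) come from this report, while max_files, the "skipped_above_ceiling" and "spawn_tgrep_serve" outcome names, and the function itself are hypothetical.

```python
from typing import Optional


def choose_search_backend(file_count: int, platform: str, max_files: Optional[int]) -> str:
    """Decide whether to index, mirroring the existing gate plus a ceiling.

    max_files=None reproduces today's behavior (no upper bound).
    """
    lower = 10_000 if platform == "win32" else 50_000
    if file_count < lower:
        return "skipped_below_threshold"  # existing outcome: ripgrep is used
    if max_files is not None and file_count > max_files:
        return "skipped_above_ceiling"  # hypothetical outcome: fall back to ripgrep
    return "spawn_tgrep_serve"


if __name__ == "__main__":
    # The repo from this report, with and without an illustrative ceiling.
    print(choose_search_backend(370_925, "linux", None))
    print(choose_search_backend(370_925, "linux", 200_000))
```

A file-count ceiling is the cheapest guard because tgrep count-files already produces the number; a total-bytes limit or a memory budget would be tighter but needs more information than the current gate collects.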
Workarounds:
export USE_TGREP=false (forces ripgrep; applies to new sessions).
Launch from a subdirectory under the 50k-file threshold.
Cap the WSL VM ([wsl2] memory=… in .wslconfig) to contain the blast radius.
Environment:
OS: Linux (WSL2), kernel 6.6.87.2-microsoft-standard-WSL2
CPU architecture: x86_64 (32 logical CPUs)
RAM: 46 GiB total, 12 GiB swap
Terminal: tmux (TERM=tmux-256color)
Shell: zsh (/usr/bin/zsh)
Repo under test: large monorepo, 370,925 tracked text files
Possibly related: #277 (CLI doesn't detect that a child process it started was killed). It appears as a secondary symptom of this failure (the kernel kills tgrep; the CLI keeps polling status).
Additional context
Root cause (from the bundled app.js, pkg 1.0.66-2):
TGREP = "copilot_cli_tgrep": "When true, enables tgrep indexed search for large repositories" (sdk/index.d.ts:7310). Default availability off; here enabled via experiment assignment.
lY = process.platform === "win32" ? 1e4 : 5e4 (10,000 / 50,000 files).
A 50k-file repo and a 370k-file repo currently take the same unbounded code path; only the latter OOMs.