Install gate: warn-only npm audit second opinion#107
Conversation
resolve_tree now returns TreeResolution{packages, audit}: for npm, the
dry-run temp dir moves into a detached thread that runs `npm audit --json
--package-lock-only` (5s deadline, kill on timeout; stdout parsed
regardless of exit code — audit exits 1 when it finds advisories).
run_tree_pass collects the summary via recv_timeout(2s) after the
verdict pool so the two overlap; any failure is a silent skip.
The signal is supplementary only: should_block_install never consults
it. When total>0 a note prints to stderr; --json carries an `npm_audit`
object (or null) in the tree arm. CORGEA_NO_NPM_AUDIT=1 disables.
| let stdout_path = work.join("corgea-npm-audit.json"); | ||
| let stdout_file = std::fs::File::create(&stdout_path).ok()?; | ||
| let mut child = Command::new(npm) | ||
| .args(["audit", "--json", "--package-lock-only"]) |
There was a problem hiding this comment.
Blocking: the audit subprocess ignores npm CLI configuration from the install command. The lockfile resolver above runs npm install with .args(install_args), so --registry, --userconfig, --scope:registry, and auth-related flags can affect the resolved lockfile. This audit invocation is always just npm audit --json --package-lock-only, so a command like corgea npm install --registry https://registry.company @scope/pkg can resolve against the private registry but then run audit against the user's default/public registry. Impact: private package metadata can be sent to the wrong registry, or the second-opinion audit silently disappears because it cannot authenticate to the intended registry. Fix by carrying the relevant npm config/auth flags (or the same sanitized npm config/env used for resolution) into the audit command, and add an e2e test proving --registry/--userconfig reaches the audit invocation.
| // Collect the warn-only npm audit second opinion only after the verdict | ||
| // pool so the two truly overlap; any failure (timeout, disconnected | ||
| // sender) is a silent skip. | ||
| let audit = audit_rx.and_then(|rx| rx.recv_timeout(Duration::from_secs(2)).ok()); |
There was a problem hiding this comment.
Blocking: after this 2s timeout, the code drops the only audit handle and proceeds, but the audit thread/subprocess is still running. spawn_audit returns only an mpsc::Receiver, so the main path cannot cancel or join the thread; if the real install is fast, the corgea process can exit before run_audit reaches its 5s kill path, leaving a hidden npm audit child doing network work and leaking the temp dir. The new hang test only asserts the wrapper returns quickly; it does not prove the audit child is killed or cleaned up. Fix by returning a cancellable/joinable audit handle whose timeout/drop path kills the child process (ideally the process group on Unix) and joins/cleans the temp dir before the wrapper exits, and add a test that observes the hung audit process is actually terminated.
Summary
Unit 6 of the install-vuln-gate dogfood frictions: run
npm auditas a concurrent, warn-only second opinion during the gatedcorgea npm installflow. npm's own signal comes first but never decides anything — the vuln-api verdict alone blocks.resolve_treenow returnsTreeResolution { packages, audit: Option<mpsc::Receiver<AuditSummary>> }(pip:None). After the lockfile parse, the npm resolver moves its dry-run temp dir into a detached thread (spawn_audit) that runsnpm audit --json --package-lock-onlywith a 5s deadline (kill on timeout). Stdout is parsed regardless of exit code — audit exits 1 when it finds advisories, which is the success case. Stdout goes through a file, not a pipe, so the deadline poll can't deadlock.run_tree_passcollects the summary viarecv_timeout(2s)only afterverdict_pool, so audit and verdicts truly overlap. Any failure (spawn error, timeout, unparsable output, disconnect) is a silent skip.TreeReport::Fullgainsaudit: Option<AuditSummary { total, critical, high, moderate, low, info, top }>.should_block_installnever consults it — exit codes are unchanged in both directions (covered by tests).total > 0—note: npm audit reports N advisories (M high/critical) — supplementary signal, not blocking;--jsoncarries annpm_auditobject (ornull) in the tree arm.CORGEA_NO_NPM_AUDIT=1skips the audit subprocess entirely. Documented in the SKILL.md overrides sentence.Testing
./harness checkgreen (clippy -D warnings, fmt, 263 tests).tests/cli_npm_audit.rs(7 tests): note-without-blocking, clean-report silence, JSON object shape, env kill switch, broken-audit silent skip, hung-audit bounded by the 2s collect window, and audit-never-unblocks-a-vulnerable-verdict.totalsummed, garbage rejected).minimist@0.0.8in a scratch project): stderr showsnote: npm audit reports 1 advisories (1 high/critical) — supplementary signal, not blocking; exit code stayed the verdict's block. WithCORGEA_NO_NPM_AUDIT=1the note disappears.Notes
--package-lock-onlybecause the throwaway work dir holds only manifests plus the generated lockfile, never anode_modules.unverifiable→ fail-closed block rather than avulnerableblock. The audit-note and kill-switch assertions were unaffected; the stub tests are the source of truth per the recipe.