feat(config): add `include` to force gitignored first-party source into the index #1063
Open
luoyxy wants to merge 1 commit into
…to the index

`.gitignore` keeps files out of the index, which is wrong when the gitignored files are real first-party source tracked by a SECOND VCS. In a project that uses SVN/Perforce alongside Git, some source is committed to that VCS and deliberately `.gitignore`d so it never lands in Git. Git never lists those files, so CodeGraph never indexed them, and neither existing knob helped: `includeIgnored` only revives *embedded git repos* inside an ignored dir (findIgnoredEmbeddedRepos/findNestedGitRepos), and `exclude` is the opposite (drop tracked files, colbymchenry#999).

Add a project-level `include` whitelist to `codegraph.json`: gitignore-style patterns whose matching source files are indexed even when gitignored / not git-tracked.

    { "include": ["Tools/", "Tools/**", "Local/typescript/"] }

Implementation:
- project-config: parse/validate/cache `include` exactly like `exclude` (warn-and-skip on malformed input, never throw), exposed as `loadIncludePatterns`.
- extraction:
  - `collectIncludedFiles` actively discovers the whitelisted files off disk (git can't list them) via a targeted walk of each pattern's static prefix (`includeStaticRoots`), overriding `.gitignore`, but never resurfacing a built-in default-ignored dir (node_modules/dist/…), `.git`, or the CodeGraph data dir, and honoring `exclude`.
  - union those files into BOTH enumeration paths: `getGitVisibleFiles` (git) and `scanDirectoryWalk` (non-git). `sync` rides the same scan, so adds/mods/removals of included files reconcile automatically.
  - make `ScopeIgnore` include-aware so the file watcher watches the included files and the gitignored directories leading to them; `exclude` still wins and default-ignored dirs are still pruned.

Precedence: exclude > include > .gitignore/defaults; built-in default-ignored dirs are never re-included.

Docs: configuration.md, README, and a CHANGELOG [Unreleased] entry.

Tests: `__tests__/include-config.test.ts` covers the loader (parse/validate/cache), `scanDirectory` behavior on git and non-git paths, recursive `**` glob, exclude-wins, no node_modules resurrection, and `buildScopeIgnore` (watcher-facing) scope.

Co-authored-by: Cursor <cursoragent@cursor.com>
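For readers unfamiliar with the "static prefix" idea mentioned in the commit message, here is a minimal sketch of how a walk root could be derived from each pattern. It is not the PR's `includeStaticRoots`: the names `staticRoot` and `staticRoots` are hypothetical, and the sketch assumes every pattern is anchored at the project root, which is stricter than gitignore's any-depth matching for patterns without an inner slash.

```typescript
// Hypothetical helpers for illustration; not the PR's actual implementation.
// Assumption: patterns are treated as anchored at the project root.

// Characters that make a path segment a glob rather than a literal name.
const GLOB_CHARS = /[*?\[\]]/;

// Longest leading run of literal segments, e.g. "Tools/**" -> "Tools".
// An empty string means "no static prefix, walk from the project root".
function staticRoot(pattern: string): string {
  const segments = pattern
    .replace(/^\/+/, "")
    .split("/")
    .filter((segment) => segment.length > 0);
  const prefix: string[] = [];
  for (const segment of segments) {
    if (GLOB_CHARS.test(segment)) break;
    prefix.push(segment);
  }
  return prefix.join("/");
}

// Deduplicate roots and drop any root that sits inside another root,
// so each directory tree is walked at most once.
function staticRoots(patterns: string[]): string[] {
  const roots = Array.from(new Set(patterns.map(staticRoot))).sort();
  return roots.filter(
    (root) =>
      !roots.some(
        (other) =>
          other !== root && (other === "" || root.startsWith(other + "/")),
      ),
  );
}

console.log(staticRoots(["Tools/", "Tools/**", "Local/typescript/"]));
// → [ 'Local/typescript', 'Tools' ]
```

Walking only these roots keeps discovery proportional to the whitelisted trees rather than the whole gitignored portion of the checkout. A fully literal pattern such as `Tools/gen.ts` yields a root that is a file, so a real walker would need to handle that case.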
Summary
`.gitignore` keeps files out of the index, which is wrong when those files are real first-party source tracked by a second VCS. In an SVN/Perforce + Git project, some source is committed to that VCS and deliberately `.gitignore`d so it never lands in Git; git never lists it, so CodeGraph never indexed it. Neither existing knob helps: `includeIgnored` only revives embedded git repos inside an ignored dir, and `exclude` is the opposite (drop tracked files, #999).
What this adds
A project-level `include` whitelist in `codegraph.json`: gitignore-style patterns whose matching source is indexed even when gitignored / not git-tracked:
{ "include": ["Tools/", "Tools/**", "Local/typescript/"] }
How
- project-config: parse/validate/cache `include` like `exclude` (warn-and-skip, never throw); exposed as `loadIncludePatterns`.
- extraction:
  - `collectIncludedFiles` discovers the files off disk (git can't list them) via a targeted walk of each pattern's static prefix (`includeStaticRoots`), overriding `.gitignore`, but never resurfacing built-in default-ignored dirs (node_modules/dist/…), `.git`, or the CodeGraph data dir, and honoring `exclude`.
  - union the discovered files into BOTH enumeration paths (`getGitVisibleFiles`, `scanDirectoryWalk`); `sync` rides the same scan, so adds/mods/removals reconcile automatically.
  - `ScopeIgnore` is now include-aware so the watcher watches included files and the gitignored dirs leading to them.
Precedence: exclude > include > .gitignore/defaults; default-ignored dirs are never re-included.
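The precedence rule can be read as a short decision function. The sketch below is illustrative only: `Matchers`, `decide`, and the prefix-based `under` helper are hypothetical names, real matching in the PR is gitignore-style rather than prefix-based, and the sketch assumes the "never re-included" clause acts as a hard guard checked before anything else.

```typescript
// Hypothetical sketch of the stated precedence; not code from the PR.
type Verdict = "indexed" | "skipped";

interface Matchers {
  isDefaultIgnored(path: string): boolean; // node_modules, .git, ...
  isExcluded(path: string): boolean; // `exclude` in codegraph.json
  isIncluded(path: string): boolean; // `include` in codegraph.json
  isGitignored(path: string): boolean; // .gitignore rules
}

function decide(path: string, m: Matchers): Verdict {
  if (m.isDefaultIgnored(path)) return "skipped"; // never re-included
  if (m.isExcluded(path)) return "skipped"; // exclude beats include
  if (m.isIncluded(path)) return "indexed"; // include beats .gitignore
  return m.isGitignored(path) ? "skipped" : "indexed";
}

// Simplified stand-in for pattern matching: plain directory prefixes.
const under = (dirs: string[]) => (path: string): boolean =>
  dirs.some((dir) => path === dir || path.startsWith(dir + "/"));

// Example configuration (assumed values, chosen for the demonstration).
const demoMatchers: Matchers = {
  isDefaultIgnored: (path) =>
    path.split("/").some((s) => s === "node_modules" || s === ".git"),
  isExcluded: under(["Tools/vendor"]),
  isIncluded: under(["Tools"]),
  isGitignored: under(["Tools", "build"]),
};

for (const path of [
  "Tools/gen/a.ts", // gitignored but included
  "Tools/vendor/b.ts", // included but excluded
  "Tools/node_modules/x/index.js", // default-ignored inside an included dir
  "build/out.js", // gitignored, not included
  "src/main.ts", // ordinary tracked source
]) {
  console.log(path, "->", decide(path, demoMatchers));
}
```

With these example matchers, only `Tools/gen/a.ts` and `src/main.ts` come out as indexed; the other three are skipped by the exclude, default-ignore, and `.gitignore` checks respectively.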
Tests
`__tests__/include-config.test.ts` (16): loader parse/validate/cache; `scanDirectory` on git and non-git paths; recursive `**` glob; exclude-wins; no node_modules resurrection; `buildScopeIgnore` watcher scope. `tsc` clean; no regressions in the `exclude` / `includeIgnored` / watch-policy suites.
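As a rough picture of the warn-and-skip loader behavior these tests exercise, the following sketch validates an `include` value without ever throwing. It is an assumption-laden illustration, not the PR's `loadIncludePatterns`: the function names and warning texts are hypothetical, it reads config from a string rather than from disk, and it omits the caching the PR describes.

```typescript
// Hypothetical sketch of warn-and-skip validation; not the PR's loader.
type Warn = (message: string) => void;

// Keep well-formed string entries, warn about everything else.
function parseIncludePatterns(raw: unknown, warn: Warn = console.warn): string[] {
  if (raw === undefined) return []; // key absent: nothing to include
  if (!Array.isArray(raw)) {
    warn("codegraph.json: `include` must be an array of strings; ignoring it");
    return [];
  }
  const patterns: string[] = [];
  for (const entry of raw) {
    if (typeof entry !== "string" || entry.trim() === "") {
      warn(`codegraph.json: skipping invalid include entry ${JSON.stringify(entry)}`);
      continue;
    }
    patterns.push(entry.trim());
  }
  return patterns;
}

// Parse the config text; malformed JSON degrades to "no include patterns".
function loadIncludeFromText(jsonText: string, warn: Warn = console.warn): string[] {
  try {
    const config = JSON.parse(jsonText) as { include?: unknown } | null;
    return parseIncludePatterns(config?.include, warn);
  } catch {
    warn("codegraph.json: could not parse; treating `include` as empty");
    return [];
  }
}

const collected: string[] = [];
const patterns = loadIncludeFromText(
  '{ "include": ["Tools/", 42, "", "Tools/**"] }',
  (message) => { collected.push(message); },
);
console.log(patterns); // → [ 'Tools/', 'Tools/**' ]
console.log(collected.length); // → 2
```

Returning an empty list on any failure means a broken config file degrades to the pre-existing behavior (nothing force-included) instead of aborting indexing.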
Docs
configuration.md, README.md, and a CHANGELOG [Unreleased] entry.