feat(config): add include to force gitignored first-party source into the index #1063

Open
luoyxy wants to merge 1 commit into colbymchenry:main from luoyxy:feat/include-gitignored-source

luoyxy commented Jun 29, 2026

Summary

.gitignore keeps files out of the index — wrong when those files are real
first-party source tracked by a second VCS. In an SVN/Perforce + Git project,
some source is committed to that VCS and deliberately .gitignored so it never
lands in Git; git never lists it, so CodeGraph never indexed it. Neither existing
knob helps: includeIgnored only revives embedded git repos inside an ignored
dir, and exclude is the opposite (drop tracked files, #999).

What this adds

A project-level include whitelist in codegraph.json — gitignore-style patterns
whose matching source is indexed even when gitignored / not git-tracked:

{ "include": ["Tools/", "Tools/**", "Local/typescript/"] }
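The PR says malformed `include` input is warned about and skipped rather than thrown on. A minimal sketch of that warn-and-skip style is below. `parseIncludeSketch` and its `warn` callback are hypothetical names for illustration; this is not the PR's `loadIncludePatterns`, and the real validation rules may differ.

```typescript
// Hypothetical warn-and-skip validator for the `include` field.
// Not CodeGraph's loader: names and rules here are assumptions.
function parseIncludeSketch(
  raw: unknown,
  warn: (msg: string) => void = console.warn,
): string[] {
  if (raw === undefined) return []; // field absent: nothing to include
  if (!Array.isArray(raw)) {
    warn("include must be an array of patterns; ignoring it");
    return [];
  }
  const patterns: string[] = [];
  for (const entry of raw) {
    if (typeof entry === "string" && entry.trim() !== "") {
      patterns.push(entry.trim());
    } else {
      // Bad entries are reported and dropped; the rest still load.
      warn(`ignoring invalid include entry: ${JSON.stringify(entry)}`);
    }
  }
  return patterns;
}
```

With this shape, one bad entry in `codegraph.json` costs only that entry; the remaining patterns are still applied.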

How

- project-config: parse/validate/cache `include` like `exclude` (warn-and-skip, never throw); exposed as `loadIncludePatterns`.
- extraction:
  - `collectIncludedFiles` discovers the files off disk (git can't list them) via a targeted walk of each pattern's static prefix (`includeStaticRoots`), overriding `.gitignore`, but never resurfacing built-in default-ignored dirs (node_modules/dist/…), `.git`, or the CodeGraph data dir, and honoring `exclude`.
  - union into BOTH enumeration paths (`getGitVisibleFiles`, `scanDirectoryWalk`); `sync` rides the same scan, so adds/mods/removals reconcile automatically.
  - `ScopeIgnore` is now include-aware so the watcher watches included files and the gitignored dirs leading to them.

Precedence: exclude > include > .gitignore/defaults; default-ignored dirs are never re-included.
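
The precedence rule can be read as a small decision function. The sketch below is illustrative only: `PathFacts` and `decide` are hypothetical names, and it assumes the pattern matching has already been done elsewhere and reduced to booleans.

```typescript
// Hypothetical model of: exclude > include > .gitignore/defaults,
// with built-in default-ignored dirs never re-included.
type Verdict = "indexed" | "skipped";

interface PathFacts {
  matchesExclude: boolean;      // matched by an `exclude` pattern
  matchesInclude: boolean;      // matched by an `include` pattern
  inDefaultIgnoredDir: boolean; // under node_modules, dist, .git, the data dir, ...
  gitignored: boolean;          // ignored by .gitignore / not git-tracked
}

function decide(f: PathFacts): Verdict {
  if (f.matchesExclude) return "skipped";      // exclude always wins
  if (f.inDefaultIgnoredDir) return "skipped"; // include cannot resurrect these
  if (f.matchesInclude) return "indexed";      // include overrides .gitignore
  return f.gitignored ? "skipped" : "indexed"; // otherwise the usual rules apply
}
```

For example, a gitignored file under `Tools/` that matches `include` comes out as `"indexed"`, while the same file also matched by `exclude` comes out as `"skipped"`.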

Tests

__tests__/include-config.test.ts (16): loader parse/validate/cache; scanDirectory on git and non-git paths; recursive ** glob; exclude-wins; no node_modules resurrection; buildScopeIgnore watcher scope. tsc clean; no regressions in the exclude / includeIgnored / watch-policy suites.

Docs

configuration.md, README.md, and a CHANGELOG [Unreleased] entry.

…to the index

`.gitignore` keeps files out of the index, which is wrong when the gitignored
files are real first-party source tracked by a SECOND VCS. In a project that
uses SVN/Perforce alongside Git, some source is committed to that VCS and
deliberately `.gitignore`d so it never lands in Git. git never lists those
files, so CodeGraph never indexed them — and neither existing knob helped:
`includeIgnored` only revives *embedded git repos* inside an ignored dir
(findIgnoredEmbeddedRepos/findNestedGitRepos), and `exclude` is the opposite
(drop tracked files, colbymchenry#999).

Add a project-level `include` whitelist to `codegraph.json`: gitignore-style
patterns whose matching source files are indexed even when gitignored / not
git-tracked.

  { "include": ["Tools/", "Tools/**", "Local/typescript/"] }

Implementation:
- project-config: parse/validate/cache `include` exactly like `exclude`
  (warn-and-skip on malformed input, never throw), exposed as
  `loadIncludePatterns`.
- extraction:
  - `collectIncludedFiles` actively discovers the whitelisted files off disk
    (git can't list them) via a targeted walk of each pattern's static prefix
    (`includeStaticRoots`), overriding `.gitignore` — but never resurfacing a
    built-in default-ignored dir (node_modules/dist/…), `.git`, or the
    CodeGraph data dir, and honoring `exclude`.
  - union those files into BOTH enumeration paths: `getGitVisibleFiles` (git)
    and `scanDirectoryWalk` (non-git). `sync` rides the same scan, so adds/
    mods/removals of included files reconcile automatically.
  - make `ScopeIgnore` include-aware so the file watcher watches the included
    files and the gitignored directories leading to them; `exclude` still wins
    and default-ignored dirs are still pruned.
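
One way to picture the "static prefix" walk described above: take the leading path segments of each pattern up to the first segment containing a glob character, then walk only those directories. The sketch below is an assumption-laden illustration, not the PR's `includeStaticRoots`. `staticPrefix` and `staticRootsSketch` are hypothetical names, patterns are treated as anchored at the project root (real gitignore semantics do not guarantee that for slash-free patterns), and negated (`!`) patterns are not handled.

```typescript
// Hypothetical: derive the directories worth walking from include patterns.
const GLOB_CHARS = /[*?\[]/;

// Leading literal segments of a pattern, e.g. "Tools/**" -> "Tools".
function staticPrefix(pattern: string): string {
  const segments = pattern
    .replace(/^\/+/, "")
    .split("/")
    .filter((s) => s.length > 0);
  const prefix: string[] = [];
  for (const seg of segments) {
    if (GLOB_CHARS.test(seg)) break; // stop at the first wildcard segment
    prefix.push(seg);
  }
  return prefix.join("/"); // "" means "walk from the project root"
}

// Deduplicate prefixes and drop any root nested under another root.
function staticRootsSketch(patterns: string[]): string[] {
  const prefixes = [...new Set(patterns.map(staticPrefix))].sort();
  if (prefixes.includes("")) return [""]; // root already covers everything
  const roots: string[] = [];
  for (const p of prefixes) {
    // A parent always sorts before its children, so it is already in `roots`.
    if (!roots.some((r) => p === r || p.startsWith(r + "/"))) roots.push(p);
  }
  return roots;
}
```

Under these assumptions, the example config `["Tools/", "Tools/**", "Local/typescript/"]` reduces to two walk roots, `Local/typescript` and `Tools`, so the walk never has to traverse unrelated gitignored directories.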

Precedence: exclude > include > .gitignore/defaults; built-in default-ignored
dirs are never re-included.

Docs: configuration.md, README, and a CHANGELOG [Unreleased] entry.

Tests: __tests__/include-config.test.ts — loader (parse/validate/cache),
`scanDirectory` behavior on git and non-git paths, recursive `**` glob,
exclude-wins, no node_modules resurrection, and `buildScopeIgnore`
(watcher-facing) scope.

Co-authored-by: Cursor <cursoragent@cursor.com>