3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,9 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### New Features

- You can now force gitignored first-party source **into** the index with an `include` list in `codegraph.json`. The case this solves: a project tracked by a second VCS (SVN, Perforce, …) alongside Git, where some real source is committed to that VCS and deliberately listed in `.gitignore` so it never lands in Git. Git never lists those files, so CodeGraph never indexed them, and neither `includeIgnored` (which only revives *embedded git repositories* inside a gitignored directory) nor `exclude` (which keeps tracked files *out*) could help. Add a root `codegraph.json` with, e.g., `{ "include": ["Tools/", "Local/typescript/"] }` and CodeGraph discovers those files directly off disk, overriding `.gitignore`, and indexes them during the full index, incremental `sync`, and file-watching, on both git and non-git projects. Patterns are gitignore-style and matched against project-root-relative paths (a directory, a recursive `**` glob, or a single file). An explicit `exclude` still wins, and built-in skips like `node_modules`, `dist`, and `.git` are never re-included.

## [1.1.3] - 2026-06-29

15 changes: 15 additions & 0 deletions README.md
@@ -618,6 +618,21 @@ watch:
}
```

Conversely, when real source is gitignored on purpose — a project under a second
VCS (SVN, Perforce) that `.gitignore`s its own source so it stays out of Git —
force it back in with `include` (the opposite of `exclude`; `includeIgnored`
only revives embedded git repos, not plain source):

```json
{
  "include": ["Tools/", "Local/typescript/"]
}
```

CodeGraph discovers those files off disk, overriding `.gitignore`, on index,
sync, and watch. An explicit `exclude` still wins, and built-in skips
(`node_modules`, `dist`, `.git`) are never re-included.
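
The precedence described above can be sketched in TypeScript. This is a minimal illustration, not CodeGraph's implementation: `shouldIndex`, `ScopeConfig`, `BUILT_IN_SKIPS`, and the prefix-only `matches` helper are hypothetical names, and real gitignore-style matching (globs, negation) is deliberately left out.

```typescript
// Hypothetical sketch: these names are illustrative, not CodeGraph's API.
type ScopeConfig = { include: string[]; exclude: string[] };

// Built-in skips named in the text; the real list may well be longer.
const BUILT_IN_SKIPS = ['node_modules', 'dist', '.git'];

// Simplified stand-in for gitignore-style matching: handles only directory
// patterns ("Tools/") and exact file paths, not globs.
function matches(pattern: string, relPath: string): boolean {
  return pattern.endsWith('/') ? relPath.startsWith(pattern) : relPath === pattern;
}

// relPath is project-root-relative with forward slashes.
function shouldIndex(relPath: string, gitIgnored: boolean, cfg: ScopeConfig): boolean {
  // Assumption: a built-in skip anywhere in the path keeps the file out.
  const segments = relPath.split('/');
  if (segments.some((s) => BUILT_IN_SKIPS.includes(s))) return false;
  // An explicit exclude wins over include.
  if (cfg.exclude.some((p) => matches(p, relPath))) return false;
  // include overrides .gitignore.
  if (cfg.include.some((p) => matches(p, relPath))) return true;
  // Otherwise the usual rule applies: gitignored files stay out.
  return !gitIgnored;
}
```

With `{ include: ["Tools/"], exclude: ["Tools/secret/"] }`, a gitignored `Tools/gen.py` is indexed while `Tools/secret/drop.py` stays out, which mirrors the cases the accompanying tests exercise.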

### Custom file extensions

If your project uses a non-standard extension for a [supported
242 changes: 242 additions & 0 deletions __tests__/include-config.test.ts
@@ -0,0 +1,242 @@
/**
 * `codegraph.json` `include`: force first-party source INTO the index even when
 * `.gitignore` would drop it.
 *
 * This is the whitelist that `includeIgnored` never was: `includeIgnored` only
 * revives *embedded git repos* inside ignored dirs (#622/#699), so plain source
 * gitignored out of Git (the SVN+Git dual-VCS case: committed to SVN and
 * `.gitignore`d so it never lands in Git) had no way in. Three layers under test:
 *   1. Loader: parse/validate/cache, mirroring the `exclude` loader.
 *   2. Behavior: `scanDirectory` adds included paths on BOTH the git
 *      (`git ls-files`) and non-git (filesystem walk) enumeration paths.
 *   3. Scope: `buildScopeIgnore` (the watcher's source of truth) treats an
 *      included file, and the gitignored dirs leading to it, as not-ignored.
 *
 * Invariants: an explicit `exclude` still wins; built-in default-ignored dirs
 * (`node_modules`, …) are never resurfaced; every loader failure mode degrades
 * to the zero-config default (force nothing in), never a throw.
 */
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import * as fs from 'node:fs';
import * as path from 'node:path';
import * as os from 'node:os';
import { execFileSync } from 'node:child_process';
import {
  loadIncludePatterns,
  loadExcludePatterns,
  loadExtensionOverrides,
  loadIncludeIgnoredPatterns,
  clearProjectConfigCache,
} from '../src/project-config';
import { scanDirectory, buildScopeIgnore } from '../src/extraction';

describe('include loader (codegraph.json)', () => {
  let dir: string;
  beforeEach(() => {
    dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-include-'));
    clearProjectConfigCache();
  });
  afterEach(() => {
    clearProjectConfigCache();
    fs.rmSync(dir, { recursive: true, force: true });
  });
  const writeConfig = (obj: unknown) =>
    fs.writeFileSync(
      path.join(dir, 'codegraph.json'),
      typeof obj === 'string' ? obj : JSON.stringify(obj)
    );

  it('returns an empty list when there is no codegraph.json (the default)', () => {
    expect(loadIncludePatterns(dir)).toEqual([]);
  });

  it('loads a well-formed pattern array', () => {
    writeConfig({ include: ['Tools/', 'Local/**'] });
    expect(loadIncludePatterns(dir)).toEqual(['Tools/', 'Local/**']);
  });

  it('trims whitespace and drops blank / non-string entries', () => {
    writeConfig({ include: [' Tools/ ', '', ' ', 42, null, 'Local/'] });
    expect(loadIncludePatterns(dir)).toEqual(['Tools/', 'Local/']);
  });

  it('ignores a non-array include value without throwing', () => {
    writeConfig({ include: 'Tools/' });
    expect(loadIncludePatterns(dir)).toEqual([]);
  });

  it('ignores malformed JSON without throwing', () => {
    writeConfig('{ not: valid json ');
    expect(loadIncludePatterns(dir)).toEqual([]);
  });

  it('coexists with extensions / includeIgnored / exclude in one file (shared single parse)', () => {
    writeConfig({
      extensions: { '.foo': 'typescript' },
      includeIgnored: ['pkgs/'],
      exclude: ['static/'],
      include: ['Tools/'],
    });
    expect(loadExtensionOverrides(dir)).toEqual({ '.foo': 'typescript' });
    expect(loadIncludeIgnoredPatterns(dir)).toEqual(['pkgs/']);
    expect(loadExcludePatterns(dir)).toEqual(['static/']);
    expect(loadIncludePatterns(dir)).toEqual(['Tools/']);
  });

  it('picks up a changed config (mtime-invalidated cache)', () => {
    writeConfig({ include: ['Tools/'] });
    expect(loadIncludePatterns(dir)).toEqual(['Tools/']);

    writeConfig({ include: ['Local/'] });
    const future = new Date(Date.now() + 2000);
    fs.utimesSync(path.join(dir, 'codegraph.json'), future, future);

    expect(loadIncludePatterns(dir)).toEqual(['Local/']);
  });

  it('drops the patterns again when the config file is removed', () => {
    writeConfig({ include: ['Tools/'] });
    expect(loadIncludePatterns(dir)).toEqual(['Tools/']);
    fs.rmSync(path.join(dir, 'codegraph.json'));
    expect(loadIncludePatterns(dir)).toEqual([]);
  });
});

describe('include behavior — scanDirectory force-indexes gitignored source', () => {
  let dir: string;
  const mk = (rel: string, content = 'export const x = 1;\n') => {
    const p = path.join(dir, rel);
    fs.mkdirSync(path.dirname(p), { recursive: true });
    fs.writeFileSync(p, content);
  };
  const writeConfig = (obj: unknown) =>
    fs.writeFileSync(path.join(dir, 'codegraph.json'), JSON.stringify(obj));
  const scan = () => scanDirectory(dir).map((f) => f.replace(/\\/g, '/'));

  beforeEach(() => {
    dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-include-scan-'));
    clearProjectConfigCache();
  });
  afterEach(() => {
    clearProjectConfigCache();
    fs.rmSync(dir, { recursive: true, force: true });
  });

  const gitInit = () => {
    execFileSync('git', ['init', '-q'], { cwd: dir });
    execFileSync('git', ['add', '-A'], { cwd: dir });
    execFileSync('git', ['-c', 'user.email=a@b.c', '-c', 'user.name=t', 'commit', '-qm', 'x'], { cwd: dir });
  };

  it('indexes a .gitignored source dir when include opts it in (git path) — the core fix', () => {
    mk('app/main.ts');
    mk('Tools/gen.py', 'def gen():\n return 1\n');
    fs.writeFileSync(path.join(dir, '.gitignore'), 'Tools/\n'); // SVN-only source, kept out of Git
    gitInit(); // Tools/ is gitignored → NOT tracked

    // Sanity: without include the gitignored source is invisible.
    let files = scan();
    expect(files).toContain('app/main.ts');
    expect(files.some((f) => f.startsWith('Tools/'))).toBe(false);

    // With include the gitignored source is forced in, app code still there.
    writeConfig({ include: ['Tools/'] });
    clearProjectConfigCache();
    files = scan();
    expect(files).toContain('app/main.ts');
    expect(files).toContain('Tools/gen.py');
  });

  it('forces gitignored source in on the non-git filesystem-walk path too', () => {
    mk('app/main.ts');
    mk('Tools/gen.py', 'def gen():\n return 1\n');
    fs.writeFileSync(path.join(dir, '.gitignore'), 'Tools/\n');
    // No git init → scanDirectory falls back to the filesystem walk (which still
    // honours .gitignore), so Tools/ must be re-added by include.
    writeConfig({ include: ['Tools/'] });
    clearProjectConfigCache();
    const files = scan();
    expect(files).toContain('app/main.ts');
    expect(files).toContain('Tools/gen.py');
  });

  it('supports a recursive ** glob and nested dirs', () => {
    mk('src/a.ts');
    mk('Local/ts/a.ts');
    mk('Local/ts/nested/b.ts');
    fs.writeFileSync(path.join(dir, '.gitignore'), 'Local/\n');
    gitInit();
    writeConfig({ include: ['Local/**'] });
    clearProjectConfigCache();
    const files = scan();
    expect(files).toContain('Local/ts/a.ts');
    expect(files).toContain('Local/ts/nested/b.ts');
  });

  it('lets an explicit exclude win over include', () => {
    mk('Tools/keep.py', 'def k():\n return 1\n');
    mk('Tools/secret/drop.py', 'def d():\n return 1\n');
    fs.writeFileSync(path.join(dir, '.gitignore'), 'Tools/\n');
    gitInit();
    writeConfig({ include: ['Tools/'], exclude: ['Tools/secret/'] });
    clearProjectConfigCache();
    const files = scan();
    expect(files).toContain('Tools/keep.py');
    expect(files.some((f) => f.startsWith('Tools/secret/'))).toBe(false);
  });

  it('never resurrects a built-in default-ignored dir (node_modules) via include', () => {
    mk('src/a.ts');
    mk('node_modules/pkg/index.js');
    gitInit();
    // Even explicitly opting node_modules in must not pull it into the graph.
    writeConfig({ include: ['node_modules/'] });
    clearProjectConfigCache();
    const files = scan();
    expect(files).toContain('src/a.ts');
    expect(files.some((f) => f.startsWith('node_modules/'))).toBe(false);
  });

  it('is a no-op with no include config (gitignored source stays out)', () => {
    mk('app/main.ts');
    mk('Tools/gen.py', 'def gen():\n return 1\n');
    fs.writeFileSync(path.join(dir, '.gitignore'), 'Tools/\n');
    gitInit();
    const files = scan();
    expect(files).toContain('app/main.ts');
    expect(files.some((f) => f.startsWith('Tools/'))).toBe(false);
  });
});

describe('include scope — buildScopeIgnore keeps included paths watchable', () => {
  let dir: string;
  beforeEach(() => {
    dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-include-scope-'));
    clearProjectConfigCache();
    execFileSync('git', ['init', '-q'], { cwd: dir });
    fs.writeFileSync(path.join(dir, '.gitignore'), 'Tools/\nOther/\n');
    fs.writeFileSync(path.join(dir, 'codegraph.json'), JSON.stringify({ include: ['Tools/'] }));
  });
  afterEach(() => {
    clearProjectConfigCache();
    fs.rmSync(dir, { recursive: true, force: true });
  });

  it('does not ignore an included file, nor the gitignored dir leading to it', () => {
    const scope = buildScopeIgnore(dir);
    // The included file and its (gitignored) directory are watchable.
    expect(scope.ignores('Tools/gen.py')).toBe(false);
    expect(scope.ignores('Tools/')).toBe(false);
    // A different gitignored dir that was NOT opted in stays ignored.
    expect(scope.ignores('Other/')).toBe(true);
    expect(scope.ignores('Other/x.py')).toBe(true);
  });

  it('still ignores everything when no include is configured', () => {
    fs.writeFileSync(path.join(dir, 'codegraph.json'), JSON.stringify({}));
    clearProjectConfigCache();
    const scope = buildScopeIgnore(dir);
    expect(scope.ignores('Tools/gen.py')).toBe(true);
    expect(scope.ignores('Tools/')).toBe(true);
  });
});
26 changes: 24 additions & 2 deletions site/src/content/docs/getting-started/configuration.md
@@ -1,9 +1,9 @@
---
title: Configuration
-description: CodeGraph is zero-config by default, with one optional codegraph.json for custom extensions, excluding tracked directories, and indexing nested git repositories.
+description: CodeGraph is zero-config by default, with one optional codegraph.json for custom extensions, excluding tracked directories, indexing gitignored source, and indexing nested git repositories.
---

-Next to none — CodeGraph is **zero-config by default**, with nothing to write or keep in sync to get started. Language support is automatic from the file extension; there's nothing to wire up per language. The one optional file, `codegraph.json`, covers [custom file extensions](#custom-file-extensions), [excluding tracked directories](#excluding-a-tracked-directory), and [indexing nested git repositories](#indexing-nested-git-repositories).
+Next to none — CodeGraph is **zero-config by default**, with nothing to write or keep in sync to get started. Language support is automatic from the file extension; there's nothing to wire up per language. The one optional file, `codegraph.json`, covers [custom file extensions](#custom-file-extensions), [excluding tracked directories](#excluding-a-tracked-directory), [indexing gitignored source](#indexing-gitignored-source-a-second-vcs), and [indexing nested git repositories](#indexing-nested-git-repositories).

## What it skips out of the box

@@ -31,6 +31,28 @@ Each entry is a gitignore-style pattern, matched against project-root-relative p

Re-index (`codegraph index`) after adding or changing `exclude`.

## Indexing gitignored source (a second VCS)

`.gitignore` keeps files out of the index — which is usually what you want, but not when the gitignored files are real first-party source. The case this exists for: a project tracked by **SVN, Perforce, or another VCS alongside Git**, where some source is committed to that VCS and deliberately listed in `.gitignore` so it never lands in Git. That source is still yours and you want it in the graph, but git never lists it, so CodeGraph never sees it. (`includeIgnored` doesn't help — it only revives *embedded git repositories* inside a gitignored directory, not plain source.)

List those paths under `include` in `codegraph.json` to force them in:

```json
{
  "include": ["Tools/", "Local/typescript/"]
}
```

Each entry is a gitignore-style pattern, matched against project-root-relative paths (a directory like `"Tools/"`, a recursive `"Tools/**"` glob, or a single file all work). CodeGraph discovers the matching files directly off disk — overriding `.gitignore` — and indexes them everywhere it looks at files: the full index, incremental `sync`, and file-watching.

A few things to know:

- An explicit [`exclude`](#excluding-a-tracked-directory) still wins — listing the same path in both keeps it out.
- Built-in skips like `node_modules`, `dist`, and `.git` are never re-included, even when an `include` pattern would match inside them.
- This is the opposite of `exclude` (which keeps tracked files *out*); it's for source git itself never tracks.

Re-index (`codegraph index`) after adding or changing `include`.
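
The tests accompanying this change expect a broken `include` to degrade quietly: malformed JSON, a non-array value, and blank or non-string entries all fall back to forcing nothing in. A minimal sketch of that normalization follows, assuming a hypothetical `parseIncludePatterns` helper that receives the raw file text. The real loader in `src/project-config` also caches by mtime and is not reproduced here.

```typescript
// Hypothetical helper: tolerant parsing of the `include` key from the raw
// text of codegraph.json. Not CodeGraph's actual loader.
function parseIncludePatterns(rawJson: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return []; // malformed JSON: fall back to the zero-config default
  }
  if (typeof parsed !== 'object' || parsed === null) return [];

  const include = (parsed as Record<string, unknown>).include;
  if (!Array.isArray(include)) return []; // e.g. "include": "Tools/"

  return include
    .filter((entry): entry is string => typeof entry === 'string') // drop 42, null, ...
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0); // drop blank entries
}
```

Returning an empty list on every failure path keeps a typo in `codegraph.json` from breaking indexing; the cost is that a mistyped `include` silently does nothing, so it is worth checking the file when expected paths do not show up.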

## Custom file extensions

If your project uses a non-standard extension for a [supported language](/codegraph/reference/languages/) — say `.dota_lua` for Lua, or `.tpl` for PHP — those files are skipped by default, because the extension isn't one CodeGraph recognizes. Map them with an optional `codegraph.json` at your project root: