Summary
Running java-codebase-rag increment (or reprocess/init) while another process has the graph open in read-write mode fails with a ladybug (kuzu) exclusive-lock error. The lock is held for the entire graph rebuild, so on a large repo the window can be minutes.
When an agent is connected to the MCP server it's tempting to blame that — but the MCP server opens the graph read-only, which does not conflict. The error requires a second read-write opener.
Exact error
RuntimeError: IO exception: Could not set lock on file : …/code_graph.lbug (Lock is held by PID 24383)
See the docs: https://docs.ladybugdb.com/concurrency for more information.
(ladybug == rebadged kuzu 0.17.1; internal symbols are lbug::*.)
Root cause (reproduced)
ladybug/kuzu takes an exclusive file lock (flock on the DB) on every read-write open. Single-writer across processes — inherent to its model, no multi-writer mode.
Confirmed empirically:
| Scenario | Result |
|---|---|
| Read-only held + read-write open (server vs increment) | ✅ coexist, no error |
| Active reader mid-scan + writer doing drop+rewrite | ✅ no error either side |
| Two read-write openers (A holds, B opens) | ❌ the lock error above |
| LanceDB read+write, write+write (all combos) | ✅ no errors, no lock file |
So the connected MCP server cannot by itself block increment:
- Server opens read-only: ladybug_queries.py:359 → ladybug.Database(db_path, read_only=True).
- The only writer reachable from server.py (run_refresh_pipeline, which shells out to build_ast_graph.py) is not exposed as an MCP tool; it's invoked only by the CLI reprocess (java_codebase_rag/cli.py:480).
- Read-write opens live only in the indexer path: build_ast_graph.py:3486 and write_ladybug at build_ast_graph.py:3817.
Within a single increment there's only one ladybug writer at a time (the full-rebuild fallback releases its first handle at build_ast_graph.py:3501 before write_ladybug reopens). So the conflict is always cross-process: a genuinely separate index-modifying process is holding the write lock when increment tries to acquire it.
Likely real-world trigger
A second concurrent indexer — e.g. a background agent/host (Cursor, another Claude Code instance) auto-running increment/reprocess via file-watch or on-save, or a previous long full-rebuild still finishing while you kick off another. (The LanceDB commit-conflict in #308 shows this repo already gets hit by multi-indexer concurrency.)
To confirm on your end: next time it happens,
ps aux | grep -E 'java-codebase-rag|build_ast_graph|cocoindex'
should show a second indexer whose PID matches the one named in the error.
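To make that check scriptable, here is a minimal sketch that pulls the PID out of the error text and probes whether that process is still running. The helper names (`lock_holder_pid`, `pid_is_alive`) are hypothetical, and the regex is based only on the message quoted in this report; the wording may differ in other ladybug/kuzu versions.

```python
import os
import re
from typing import Optional

# Pattern taken from the error text quoted in this report (assumption:
# other ladybug/kuzu versions may word the message differently).
LOCK_PID_RE = re.compile(r"Lock is held by PID (\d+)")


def lock_holder_pid(message: str) -> Optional[int]:
    """Return the PID named in a lock error message, or None if absent."""
    match = LOCK_PID_RE.search(message)
    return int(match.group(1)) if match else None


def pid_is_alive(pid: int) -> bool:
    """Best-effort POSIX liveness check: signal 0 probes without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

If the PID is alive and shows up in the `ps` output as an indexer, that is the second writer.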
Proposed fix
The fix belongs at the read-write open in the indexer (build_ast_graph.py:3486 / :3817):
1. Catch + bounded retry on the specific lock error. Wrap the ladybug.Database(path) open; on RuntimeError matching "Could not set lock on file … Lock is held by PID", back off and retry for ~60–90s (absorbs the common case, where the other index command finishes), then fail with a clean message naming the PID and remediation instead of a raw traceback.
2. (Optional, stronger) advisory coordination lock (fcntl.flock on index_dir/.index.lock) acquired by init/increment/reprocess, so our own CLI runs serialize with a friendly "waiting for another index operation (PID X)…" message. Doesn't override a rogue holder, so (1) is still needed for the error path.
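A rough sketch of both pieces, not the actual patch. Everything named here (`open_with_retry`, `IndexLockedError`, `index_lock`, the `open_db` callable) is hypothetical; the regex assumes the error wording quoted above; the 90s default comes from the ~60–90s budget proposed here; `fcntl` makes the advisory lock POSIX-only.

```python
import fcntl
import os
import re
import time
from contextlib import contextmanager

# Assumption: matches the message quoted in this report; wording may vary.
LOCK_ERROR_RE = re.compile(
    r"Could not set lock on file.*Lock is held by PID (\d+)", re.S
)


class IndexLockedError(RuntimeError):
    """Raised when the graph stays write-locked for the whole retry budget."""


def open_with_retry(open_db, timeout_s=90.0, initial_delay_s=0.5,
                    max_delay_s=5.0, sleep=time.sleep, clock=time.monotonic):
    """Call open_db() until it succeeds or the lock error outlasts timeout_s."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        try:
            return open_db()
        except RuntimeError as exc:
            match = LOCK_ERROR_RE.search(str(exc))
            if match is None:
                raise  # unrelated RuntimeError: do not mask it
            if clock() >= deadline:
                raise IndexLockedError(
                    f"graph is write-locked by PID {match.group(1)}; "
                    "wait for that index run to finish or stop it, then retry"
                ) from exc
            sleep(delay)
            delay = min(delay * 2, max_delay_s)  # capped exponential backoff


@contextmanager
def index_lock(index_dir):
    """Hold an exclusive advisory lock on index_dir/.index.lock."""
    lock_path = os.path.join(index_dir, ".index.lock")
    with open(lock_path, "a+") as fh:
        try:
            fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            fh.seek(0)
            holder = fh.read().strip() or "unknown"
            print(f"waiting for another index operation (PID {holder})...")
            fcntl.flock(fh, fcntl.LOCK_EX)  # block until the holder releases
        # We own the lock now: record our PID for the next waiter to report.
        fh.seek(0)
        fh.truncate()
        fh.write(str(os.getpid()))
        fh.flush()
        try:
            yield
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

Intended use at the indexer's read-write open would be along the lines of `with index_lock(index_dir): db = open_with_retry(lambda: ladybug.Database(path))`. The retry stays inside the advisory lock because a process that never takes `.index.lock` can still hold the graph's own write lock.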
Secondary note (separate from this bug)
write_ladybug does _drop_all + full rewrite while the server may be querying. ladybug allows read+write concurrently (no lock), so a query mid-rewrite can briefly see a torn/empty schema. That's a correctness concern, not the lock conflict — the fix there is atomic temp-write + swap, which is more invasive and should be scoped separately.
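For scoping purposes, the temp-write + swap idea looks roughly like this. It is a sketch under assumptions: `rebuild_atomically` and the `build` callback are hypothetical, the graph is treated as a single file, and nothing here has been checked against how ladybug handles sidecar files or how a reader holding the old file open behaves after the swap.

```python
import os
import shutil
import tempfile


def rebuild_atomically(final_path, build):
    """Build a new graph in a scratch dir next to final_path, then swap it in."""
    final_path = os.path.abspath(final_path)
    parent = os.path.dirname(final_path)
    # Scratch dir lives in the same parent, so the rename stays on one filesystem.
    tmp_dir = tempfile.mkdtemp(dir=parent, prefix=".rebuild-")
    tmp_path = os.path.join(tmp_dir, os.path.basename(final_path))
    try:
        build(tmp_path)  # hypothetical callback: writes the complete new graph
        os.replace(tmp_path, final_path)  # atomic rename on POSIX
    finally:
        # Removes the scratch dir, plus any partial output if build() failed.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

A failed build leaves the existing graph untouched, and anything that opens the path sees either the old file or the complete new one, never a half-written one.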
Investigation includes AI-assisted reproduction/analysis.