Summary
Add 7 architecture analysis metrics from Emerge: Louvain modularity (community detection), TF-IDF keyword extraction, whitespace complexity (fast proxy for cyclomatic complexity), fan-in/fan-out, inheritance graph, change coupling (git history), and code churn + author diversity. Also add architectural layer detection from UnderstandAnything (UA), for 8 metrics in total.
Worker consensus (2 reports — mostly Emerge, with UA additions)
| Worker | Source | Contribution |
| --- | --- | --- |
| Emerge | `update!/CodeLens_Upgrade_Issues_from_Emerge.md` CL-024 | Louvain modularity: `python-louvain` library, 5x optimization runs, resolution 1.5. 3 commands: `modularity`, `clusters`, `ball-of-mud`. Run on dependency/call/inheritance/complete graph. |
| Emerge | same file CL-025 | TF-IDF semantic keyword extraction per file: `scikit-learn` `TfidfVectorizer`, 12 language-specific stopword sets. 2 commands: `keywords`, `semantic-search`. |
| Emerge | same file CL-026 | Whitespace complexity metric: counts indentation per line, 10-100x faster than cyclomatic. `--quick` mode skips AST. |
| Emerge | same file CL-027 | Fan-in / fan-out graph metrics: `avg_fan_in`, `avg_fan_out`, `max_fan_in_name`, `max_fan_out_name`. New command `fan-in-out`. Integrate with `impact`. |
| Emerge | same file CL-028 | Inheritance graph: extend parsers to extract class + parent. New graph type `inheritance_graph`. 2 commands: `inheritance`, `god-class` (>20 children or >5 depth). |
| Emerge | same file CL-029 | Change coupling graph (git history): `PyDriller` traverses commit history, files committed together = coupled. 3 commands: `change-coupling`, `shotgun-surgery`, `coupled-with`. |
| Emerge | same file CL-030 | Git code churn & author diversity: extend `ownership_engine.py` with multi-commit churn. 2 commands: `hotspot`, `bus-factor`. |
| UnderstandAnything | `update!/CodeLens_vs_UnderstandAnything_Upgrade_Analysis.md` U2 | Architectural layer detection: 9 layers (API, Service, Data, UI, Middleware, External, Background, Utility, Test) via directory-path heuristic. New command `layers`. Update `summary` + `impact`. |
Proposed scope (P2, 6-10 weeks total — can be split across multiple PRs)
Each metric can ship separately; the only ordering constraint is that Metric 7 depends on Metric 6:
Metric 1 — Whitespace complexity (P2, 3 days, quick win)
- Copy `emerge/metrics/whitespace/whitespace.py` (81 LOC, MIT)
- Add `--metric ws|cyclomatic|cognitive|all` to `complexity` command
- Quick mode: `codelens complexity --quick --top 20` skips AST, <5s for 5000 files
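To illustrate why an indentation-based score is cheap, the sketch below makes a single pass over raw text with no parsing. It is not Emerge's `whitespace.py`; the function name `whitespace_complexity`, the tab width of 4, and the choice to sum indentation widths are assumptions made for illustration.

```python
def whitespace_complexity(source: str, tab_width: int = 4) -> int:
    """Sum the leading-whitespace width of every non-blank line.

    Hypothetical scoring rule: deeper nesting means more indentation,
    so the total grows with both file length and nesting depth.
    """
    total = 0
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no nesting information
        expanded = line.expandtabs(tab_width)
        total += len(expanded) - len(expanded.lstrip(" "))
    return total


flat = "a = 1\nb = 2\n"
nested = "for x in xs:\n    if x:\n        print(x)\n"
print(whitespace_complexity(flat), whitespace_complexity(nested))
```

Because nothing is tokenized or parsed, the same function works on any language's source, which is presumably what lets `--quick` skip the AST.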
Metric 2 — Fan-in / fan-out (P1, 3 days, quick win)
- Add `calculate_fan_in_out()` to `callgraph_engine.py`
- New command `codelens fan-in-out [workspace] [--name FN] [--top N]`
- Update `impact_engine.py` to include fan-in/out in output
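A minimal sketch of what `calculate_fan_in_out()` might compute, assuming the call graph is available as `(caller, callee)` pairs. The signature, the edge-list input, and the tie-breaking by name are assumptions; only the four output field names come from the proposal.

```python
from collections import Counter


def calculate_fan_in_out(edges):
    """edges: iterable of (caller, callee) pairs from a call graph."""
    unique_edges = set(edges)  # repeated calls between the same pair count once
    fan_out = Counter(caller for caller, _ in unique_edges)
    fan_in = Counter(callee for _, callee in unique_edges)
    nodes = {name for edge in unique_edges for name in edge}
    if not nodes:
        return {"avg_fan_in": 0.0, "avg_fan_out": 0.0,
                "max_fan_in_name": None, "max_fan_out_name": None}
    return {
        "avg_fan_in": sum(fan_in.values()) / len(nodes),
        "avg_fan_out": sum(fan_out.values()) / len(nodes),
        # Ties are broken by name so the result is deterministic.
        "max_fan_in_name": max(nodes, key=lambda n: (fan_in[n], n)),
        "max_fan_out_name": max(nodes, key=lambda n: (fan_out[n], n)),
    }


edges = [("main", "parse"), ("main", "render"),
         ("cli", "parse"), ("parse", "tokenize")]
stats = calculate_fan_in_out(edges)
print(stats["max_fan_in_name"], stats["max_fan_out_name"])
```

Here `parse` has the highest fan-in (two callers) and `main` the highest fan-out (two callees), which is the kind of signal the `impact` output could surface.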
Metric 3 — Louvain modularity (P2, 1 week)
- Adapt `emerge/metrics/modularity/modularity.py` (188 LOC, MIT)
- Add `python-louvain` dependency
- 3 commands: `modularity`, `clusters`, `ball-of-mud`
- MCP tools: `codelens_modularity`, `codelens_clusters`, `codelens_ball_of_mud`
- Benchmark: 5000-file dependency graph <30s
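The Louvain search itself would come from `python-louvain`, so the sketch below covers only the quantity that search tries to maximize: the modularity score Q of a given partition, for an undirected, unweighted graph. The function name and input shapes are assumptions for illustration, not Emerge's code.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ].

    edges: list of (u, v) pairs; community_of: dict node -> community id.
    L_c is the number of edges inside community c, d_c the total degree
    of its nodes, and m the number of edges in the graph.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
lumped = {node: 0 for node in split}
print(modularity(edges, split), modularity(edges, lumped))
```

Splitting the two triangles scores 5/14, while putting every node in one community scores 0, so a higher Q indicates a cleaner separation into clusters.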
Metric 4 — TF-IDF keyword extraction (P2, 1 week)
- Adapt `emerge/metrics/tfidf/tfidf.py` (118 LOC, MIT)
- Add `scikit-learn` dependency
- 2 commands: `keywords`, `semantic-search`
- Cache at `.codelens/keywords_cache.json`
- MCP tools: `codelens_keywords`, `codelens_semantic_search`
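For intuition about what per-file keywords look like, here is a dependency-free sketch of TF-IDF ranking. It uses the smoothed IDF formula `ln((1 + N) / (1 + df)) + 1` but skips the vector normalization that `TfidfVectorizer` applies by default, and the tokenizer, the tiny `STOPWORDS` set, and the `top_keywords` name are all assumptions for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword set for the example.
STOPWORDS = {"def", "return", "import", "from", "self", "class"}


def top_keywords(files, k=3):
    """files: dict path -> source text. Returns dict path -> top-k terms."""
    tokens = {
        path: [t for t in re.findall(r"[a-z]{3,}", text.lower())
               if t not in STOPWORDS]
        for path, text in files.items()
    }
    # Document frequency: in how many files does each term appear?
    doc_freq = Counter(t for toks in tokens.values() for t in set(toks))
    n_docs = len(files)
    result = {}
    for path, toks in tokens.items():
        term_freq = Counter(toks)
        scores = {t: n * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
                  for t, n in term_freq.items()}
        # Highest score first; alphabetical order breaks ties.
        result[path] = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    return result


files = {
    "auth.py": "def login(user): return check_password()",
    "db.py": "def connect(user): return open_pool()",
}
keywords = top_keywords(files)
print(keywords)
```

The term `user` appears in both files, so it ranks below the terms that are specific to each file and drops out of the top three.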
Metric 5 — Inheritance graph (P2, 2-3 weeks)
- Extend parsers (Python, JS, TS, TSX, Rust, Vue, Svelte + fallback for Java/Kotlin/Swift/C++/C#/PHP/Ruby) to extract class + parent
- New graph type `inheritance_graph` in `callgraph_engine.py`
- 2 commands: `inheritance [workspace] [--class NAME] [--depth N]`, `god-class [workspace]`
- MCP tools: `codelens_inheritance`, `codelens_god_class`
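The class + parent extraction can be sketched for Python alone with the standard `ast` module (Python 3.9+ for `ast.unparse`). The function names are hypothetical, and reading "depth" as the length of a class's ancestor chain is an assumption; the default thresholds mirror the ">20 children or >5 depth" rule from the worker report.

```python
import ast


def extract_inheritance(source: str):
    """Return {class_name: [parent names]} for one Python module."""
    graph = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            graph[node.name] = [ast.unparse(base) for base in node.bases]
    return graph


def depth(graph, name, _seen=frozenset()):
    """Longest chain of known ancestors above `name` (0 for a root)."""
    if name in _seen:  # guard against malformed cyclic input
        return 0
    parents = [p for p in graph.get(name, []) if p in graph]
    return max((1 + depth(graph, p, _seen | {name}) for p in parents),
               default=0)


def god_classes(graph, max_children=20, max_depth=5):
    """Classes with too many direct children or too deep a hierarchy."""
    children = {name: 0 for name in graph}
    for parents in graph.values():
        for parent in parents:
            if parent in children:
                children[parent] += 1
    return sorted(name for name in graph
                  if children[name] > max_children
                  or depth(graph, name) > max_depth)


src = "class Base: pass\nclass Mid(Base): pass\nclass Leaf(Mid, Mixin): pass\n"
graph = extract_inheritance(src)
print(graph, depth(graph, "Leaf"))
```

Parents defined outside the analyzed source (here `Mixin`) stay in the graph as names but do not contribute to depth, which is one possible way to handle third-party base classes.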
Metric 6 — Change coupling (P1, 1-2 weeks, high impact unique feature)
- Add `pydriller` dependency
- Adapt `emerge/metrics/git/git.py` (234 LOC)
- 3 commands: `change-coupling`, `shotgun-surgery`, `coupled-with <file>`
- Update `impact_engine.py` to include coupled files
- Performance: 1000 commits <60s
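The core of "files committed together = coupled" is a pair count over commits. The sketch below assumes the per-commit file lists have already been collected (in the real feature they would come from PyDriller); the function names and the `min_shared` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def change_coupling(commits, min_shared=2):
    """commits: iterable of file-path collections, one per commit.

    Returns {(file_a, file_b): count} for pairs that changed together
    in at least `min_shared` commits. Pairs are stored in sorted order.
    """
    pair_counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_shared}


def coupled_with(coupling, target):
    """Files coupled to `target`, strongest coupling first."""
    partners = [(a if b == target else b, n)
                for (a, b), n in coupling.items() if target in (a, b)]
    return sorted(partners, key=lambda item: (-item[1], item[0]))


commits = [
    ["api.py", "schema.py"],
    ["api.py", "schema.py", "README.md"],
    ["api.py", "cli.py"],
]
coupling = change_coupling(commits)
print(coupling, coupled_with(coupling, "api.py"))
```

Pair counting is quadratic in the number of files per commit, so very large commits (mass renames, formatting sweeps) may need to be capped or skipped to stay within the stated performance target.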
Metric 7 — Code churn & author diversity (P2, 1 week, depends on Metric 6)
- Refactor `ownership_engine.py` to traverse git history (not just `git blame`)
- 2 commands: `hotspot`, `bus-factor`
- Extend `ownership` command output with `code_churn_30d`, `code_churn_90d`, `number_authors`, `top_contributors`
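A sketch of how the four new `ownership` fields could be derived from commit history, assuming churn means lines added plus lines deleted and that history is available as flat `(path, author, when, added, deleted)` records. Both the record shape and the `churn_stats` name are assumptions, not the existing `ownership_engine.py` interface.

```python
from collections import Counter
from datetime import datetime, timedelta


def churn_stats(changes, path, now, top_n=3):
    """changes: iterable of (path, author, when, added, deleted)."""
    churn_by_author = Counter()
    churn_30d = churn_90d = 0
    for changed_path, author, when, added, deleted in changes:
        if changed_path != path:
            continue
        churn = added + deleted  # assumed definition of churn
        churn_by_author[author] += churn
        age = now - when
        if age <= timedelta(days=30):
            churn_30d += churn
        if age <= timedelta(days=90):
            churn_90d += churn
    return {
        "code_churn_30d": churn_30d,
        "code_churn_90d": churn_90d,
        "number_authors": len(churn_by_author),
        "top_contributors": [a for a, _ in churn_by_author.most_common(top_n)],
    }


now = datetime(2024, 6, 30)
changes = [
    ("core.py", "ana", datetime(2024, 6, 20), 10, 5),
    ("core.py", "ben", datetime(2024, 5, 1), 40, 0),
    ("core.py", "ana", datetime(2023, 1, 1), 100, 100),
    ("util.py", "cy", datetime(2024, 6, 25), 1, 1),
]
stats = churn_stats(changes, "core.py", now)
print(stats)
```

A file with high recent churn and a single author would be a natural candidate for both `hotspot` and `bus-factor` reporting.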
Metric 8 — Architectural layer detection (P2, 1 week)
- New `scripts/layer_detector.py` with 9 layer patterns (port from UA `layer-detector.ts`)
- Directory-path heuristic, first-match-wins
- New command `codelens layers [workspace]`
- Update `summary` to include layer breakdown
- Update `impact` to show affected layer
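The directory-path, first-match-wins heuristic can be sketched as an ordered pattern list. The nine layer names come from the proposal; the directory names, their ordering, and the `"Unknown"` fallback are hypothetical and are not taken from UA's `layer-detector.ts`.

```python
from pathlib import PurePosixPath

# (layer, directory names), checked in order; earlier entries win.
# Directory names here are illustrative guesses only.
LAYER_PATTERNS = [
    ("Test", {"tests", "test", "__tests__"}),
    ("API", {"api", "routes", "controllers"}),
    ("Service", {"services", "service"}),
    ("Data", {"models", "db", "repositories"}),
    ("UI", {"components", "views", "ui"}),
    ("Middleware", {"middleware"}),
    ("External", {"integrations", "clients"}),
    ("Background", {"jobs", "workers", "tasks"}),
    ("Utility", {"utils", "helpers", "lib"}),
]


def detect_layer(path: str) -> str:
    """Classify a file by the directories on its path."""
    parts = {p.lower() for p in PurePosixPath(path).parent.parts}
    for layer, names in LAYER_PATTERNS:
        if parts & names:
            return layer  # first match wins
    return "Unknown"


for example in ("src/api/users.py", "tests/api/test_users.py", "README.md"):
    print(example, "->", detect_layer(example))
```

Pattern order matters: with `Test` listed first, `tests/api/test_users.py` is classified as a test rather than as API code.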
Acceptance criteria
- `codelens complexity --quick` <5s for 5000 files
- `codelens modularity` produces stable results across 5 runs (Louvain non-determinism controlled)
- `codelens change-coupling` correctly identifies files often committed together
- `codelens layers` correctly classifies files into 9 architectural layers
Files
- New: `scripts/{ws_complexity,modularity,tfidf,change_coupling,hotspot,layer_detector}_engine.py`, `scripts/commands/{fan_in_out,modularity,clusters,ball_of_mud,keywords,semantic_search,inheritance,god_class,change_coupling,shotgun_surgery,coupled_with,hotspot,bus_factor,layers}.py`
- Update: `scripts/{callgraph,ownership,impact,summary}_engine.py`, `scripts/{python,js_backend,ts_backend,tsx,rust,vue,svelte}_parser.py`