[FEATURE] LLM integration framework — multi-provider abstraction, cache, cost tracking, explanation generator

## Summary

Add LLM integration framework for CodeLens features that benefit from LLM reasoning (taint validation, secret FP check, smell justification, dead-code reason, bug explanation). Multi-provider (OpenAI/Anthropic/Google/DeepSeek/Z.ai GLM), disk cache, token cost tracking, opt-in.

## Worker consensus (4 reports)

| Worker | Source | Contribution |
|---|---|---|
| RepoAudit | `update!/CodeLens_Upgrade_Issues_from_RepoAudit.md` CL-037 | `LLMTool` ABC with `invoke(input, output_cls)`, `_get_prompt`, `_parse_response`, built-in caching, retry, token-cost tracking. `LLMToolInput`/`LLMToolOutput` ABCs with `__hash__`/`__eq__`. |
| RepoAudit | same file CL-042 | LLM response cache at `~/.codelens/llm_cache/<tool_name>/<input_hash>.json` (key = SHA-256 of `(tool_name, model_name, input_hash)`). Hit ratio in output. `pricing.json` per model. `codelens llm-cache stats` / `clear`. `--no-cache`, `--max-cost-usd N` flags. |
| RepoAudit | same file CL-043 | Multi-provider LLM abstraction — 6 providers: OpenAI, Anthropic, Bedrock, Google, DeepSeek, Z.ai GLM. Dispatch by `model_name` prefix. Lazy import. 60s timeout, 3-retry backoff. |
| RepoAudit | same file CL-044 | Bug report with LLM-generated CoT explanation — `explanation: str`, `fix_suggestion: str`, `confidence: float`. After PathValidator confirms `is_reachable: True`, invoke `ExplanationGenerator`. `<10s per finding. |
| CodeGraph | `update!/CodeLens_CodeGraph_Upgrade_Analysis.md` #19 | Reasoning offload — `codelens_explore` sends assembled context to remote OpenAI-compatible reasoning model, returns tight self-contained answer. Strictly degradable (any failure → null → caller falls back to local source). NEVER throws. BYO endpoint (`CODELENS_OFFLOAD_URL`, `CODELENS_OFFLOAD_MODEL`, `CODELENS_OFFLOAD_API_KEY` or `keyEnv`). |
| Semgrep | `update!/CodeLens_Upgrade_Issues_from_Semgrep.md` CL-014 | MCP prompt `write_custom_codelens_rule(description)` returns ready-to-use rule YAML. `get_codelens_rule_schema`, `get_codelens_rule_yaml(rule_id)`. |

## Proposed scope (P2, 4-6 weeks total)

**Phase 1 — LLMTool ABC + provider abstraction (P2, 1-2 weeks)**
- New `scripts/llm/base_tool.py` with `LLMTool` ABC
- New `scripts/llm/provider.py` with 6 providers (OpenAI, Anthropic, Bedrock, Google, DeepSeek, Z.ai GLM)
- Dispatch by `model_name` prefix (`gpt-*`, `claude-*`, `gemini-*`, `deepseek-*`, `glm-*`)
- Lazy import per provider
- 60s timeout, 3-retry exponential backoff
- API keys from env vars per provider
- Config via `CODELENS_LLM_PROVIDER`, `CODELENS_LLM_MODEL`, `CODELENS_LLM_API_KEY` env vars + `codelens.yaml`

**Phase 2 — Disk cache + cost tracking (P2, 1 week)**
- Cache at `~/.codelens/llm_cache/<tool_name>/<input_hash>.json`
- Key = SHA-256 of `(tool_name, model_name, input_hash)` (invalidates on model change)
- Cache value: `{output, input_token_cost, output_token_cost, timestamp, model_name}`
- `pricing.json` per model (GPT-4o, Claude 3.7, Gemini 1.5 Pro, DeepSeek V3, Z.ai GLM-4)
- `codelens llm-cache stats` / `codelens llm-cache clear` commands
- `--no-cache` and `--max-cost-usd N` flags
- Auto-evict entries >30 days
- Thread-safe for concurrent agents
- Hit ratio in output: `{cache: {hits, misses, hit_ratio}}`

**Phase 3 — Explanation generator (P2, 1-2 weeks, depends on taint analysis depth issue Phase 7)**
- `ExplanationGenerator` LLMTool subclass
- After taint PathValidator confirms `is_reachable: True`, invoke generator with bug_type/buggy_value/relevant_functions/path
- Output: `explanation: str` (CoT), `fix_suggestion: str` (code snippet), `confidence: float` (0.0-1.0)
- Embed in JSON, SARIF (`result.message.text`), Markdown (blockquote)
- `codelens explain <finding-id>` command (re-generate)
- MCP tool `codelens_explain_finding`
- Target <10s per finding

**Phase 4 — Reasoning offload (P3, 2-3 weeks, optional, depends on Phase 1)**
- `codelens_explore` does local retrieval, then sends context to remote reasoning model
- Returns tight self-contained answer that becomes tool call result
- Strictly degradable: any failure → null → caller falls back to local source verbatim
- NEVER throws, NEVER `isError`
- BYO endpoint via `CODELENS_OFFLOAD_URL`, `CODELENS_OFFLOAD_MODEL`, `CODELENS_OFFLOAD_API_KEY`
- Token storage at `~/.codelens/credentials.json` (revocable, org-scoped)

**Phase 5 — MCP prompts for rule authoring (P3, 1 week, optional)**
- 3 MCP prompts: `write_custom_codelens_rule(description)`, `get_codelens_rule_schema`, `get_codelens_rule_yaml(rule_id)`
- Let Claude Code invoke `/write_custom_codelens_rule description="detect SQL injection in Flask"` and get validated rule

## Acceptance criteria

- [ ] Phase 1: all 6 providers work with valid API key
- [ ] Phase 2: cache hit ratio >80% on repeated queries
- [ ] Phase 2: `--max-cost-usd 1.0` aborts before exceeding budget
- [ ] Phase 3: explanation embedded in SARIF renders in GitHub code scanning UI
- [ ] Phase 4: reasoning offload falls back gracefully on network failure

## License note

RepoAudit is Purdue Non-Commercial — design influenced, reimplement from scratch. Use Z.ai GLM provider as default for CodeLens (matches existing `z-ai-web-dev-sdk` integration pattern).

## Related

- Phase 3 depends on taint analysis depth issue (Phase 7 LLM validator)
- Phase 4 depends on `codelens_explore` consolidation from #23 split


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEATURE] LLM integration framework — multi-provider abstraction, cache, cost tracking, explanation generator #63

Summary

Worker consensus (4 reports)

Proposed scope (P2, 4-6 weeks total)

Acceptance criteria

License note

Related

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Worker	Source	Contribution
RepoAudit	`update!/CodeLens_Upgrade_Issues_from_RepoAudit.md` CL-037	`LLMTool` ABC with `invoke(input, output_cls)`, `_get_prompt`, `_parse_response`, built-in caching, retry, token-cost tracking. `LLMToolInput`/`LLMToolOutput` ABCs with `__hash__`/`__eq__`.
RepoAudit	same file CL-042	LLM response cache at `~/.codelens/llm_cache/<tool_name>/<input_hash>.json` (key = SHA-256 of `(tool_name, model_name, input_hash)`). Hit ratio in output. `pricing.json` per model. `codelens llm-cache stats` / `clear`. `--no-cache`, `--max-cost-usd N` flags.
RepoAudit	same file CL-043	Multi-provider LLM abstraction — 6 providers: OpenAI, Anthropic, Bedrock, Google, DeepSeek, Z.ai GLM. Dispatch by `model_name` prefix. Lazy import. 60s timeout, 3-retry backoff.
RepoAudit	same file CL-044	Bug report with LLM-generated CoT explanation — `explanation: str`, `fix_suggestion: str`, `confidence: float`. After PathValidator confirms `is_reachable: True`, invoke `ExplanationGenerator`. `<10s per finding.
CodeGraph	`update!/CodeLens_CodeGraph_Upgrade_Analysis.md` #19	Reasoning offload — `codelens_explore` sends assembled context to remote OpenAI-compatible reasoning model, returns tight self-contained answer. Strictly degradable (any failure → null → caller falls back to local source). NEVER throws. BYO endpoint (`CODELENS_OFFLOAD_URL`, `CODELENS_OFFLOAD_MODEL`, `CODELENS_OFFLOAD_API_KEY` or `keyEnv`).
Semgrep	`update!/CodeLens_Upgrade_Issues_from_Semgrep.md` CL-014	MCP prompt `write_custom_codelens_rule(description)` returns ready-to-use rule YAML. `get_codelens_rule_schema`, `get_codelens_rule_yaml(rule_id)`.

[FEATURE] LLM integration framework — multi-provider abstraction, cache, cost tracking, explanation generator #63

Description

Summary

Worker consensus (4 reports)

Proposed scope (P2, 4-6 weeks total)

Acceptance criteria

License note

Related

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions