refactor(models): narrow Any-typed parse boundary with TypedDict unions by clean6378-max-it · Pull Request #72 · cppalliance/claude-code-chat-browser

clean6378-max-it · 2026-06-09T15:33:30Z

closes #68

Summary

Replaces NotRequired[Any] on MessageDict.tool_result and MessageDict.data with typed unions so mypy strict constrains the JSONL parse boundary without weakening tests or adding # type: ignore.
Rebased onto Monday's ruff/pip-audit CI merge (#70).

Problem

tool_result and data are the highest-risk fields in the Claude Code JSONL format — structurally complex and prone to upstream drift. Any let schema ambiguity flow silently into rendering, export, and search.

Changes

models/tool_results.py (new) — per-tool TypedDicts for all 15 _TOOL_RESULT_DISPATCH shapes (Bash, Read, Write, Edit, Glob, Grep, Task variants, TodoWrite, WebFetch, WebSearch, UserInput, Plan), plus ToolResultUnion, ToolNameLiteral, and TypeGuard helpers
models/record_data.py (new) — RecordDataUnion for progress data payloads (bash_progress, hook_progress, agent_progress, summary, compact_boundary) with dict[str, object] fallback
models/session.py — tool_result: ToolResultUnion | None, data: RecordDataUnion; adds ToolUseDict and MessageUsageDict
utils/tool_dispatch.py — predicates/builders typed with ToolResultDict; _parse_tool_result accepts ToolResultUnion | None
utils/jsonl_parser.py — is_tool_result_dict narrowing at image-extraction site; typed ToolUseDict assembly in _process_assistant
utils/md_exporter.py — _render_tool_use consumes ToolUseDict with explicit dict[str, object] narrowing
Unknown upstream shapes use dict[str, object] instead of Any to force explicit narrowing at use sites.

Acceptance criteria

tool_result typed with union covering all known tool result shapes from fixtures
data typed with union covering known record types
Narrowing via TypeGuard / Literal (ToolNameLiteral, is_tool_result_dict)
No new # type: ignore in changed production code
mypy -p api -p utils -p models passes
ruff check . passes (rebased on feat: add ruff and pip-audit CI gates, fix style violations, and add … #70)
All existing tests pass (350)

Test plan

mypy -p api -p utils -p models
pytest -q
ruff check .

 tests/test_jsonl_parser.py — all _parse_tool_result parametrized cases green

 tests/test_real_session_fixtures.py — dispatch predicate coverage green

 Full suite green locally after rebase onto #70
Out of scope
Hypothesis fuzz tests (Wednesday #4 — this PR defines the fuzz oracle)
Export partial-failure surfacing (Tuesday PR 2)
Frontend TypeScript types
Dependencies
Depends on: #70 (ruff CI) — rebased
Blocks: Wednesday fuzz testing (#4)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **Refactor**
  * Improved structured parsing and runtime validation for progress, tool-result, and tool-use data for more reliable handling.
* **Bug Fixes**
  * Safer handling of malformed or legacy payloads to prevent errors and incorrect fields.
  * Markdown export now skips invalid todo/question entries and maps statuses to display icons.
  * Image extraction and tool-result parsing only run when payloads are recognized as valid.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Replace NotRequired[Any] on MessageDict.tool_result and data with per-tool and per-record unions, type guards, and dispatch-aligned narrowing so mypy strict constrains schema drift without weakening tests.

coderabbitai · 2026-06-09T15:33:53Z

📝 Walkthrough

Walkthrough

Adds TypedDict schemas and TypeGuard predicates for tool results and record data, narrows session message typings to those unions, and updates tool dispatch, JSONL parsing, and markdown exporter to use and validate the new typed contracts.

Changes

Type-narrowing refactor from Any to structured unions

Layer / File(s)	Summary
Tool result TypedDict schemas and type guards `models/tool_results.py`	Defines TypedDict schemas for many tool-result shapes; exports `ToolResultDict`, `ToolResultUnion`, runtime `TypeGuard` predicates, and `ToolNameLiteral`.
Record data TypedDict schemas and type guard `models/record_data.py`	Defines TypedDicts for progress/record `data` variants, exports `RecordDataUnion`, and `is_bash_progress_data` type guard.
Session message type narrowing `models/session.py`, `models/__init__.py`	Adds `ToolUseDict`, `MessageUsageDict`, `SystemSubtypeLiteral`; narrows `MessageDict` fields: `tool_result: ToolResultUnion
Tool result dispatch with type-narrowed predicates `utils/tool_dispatch.py`	Updates predicates/builders to `ToolResultDict`, adds defensive handling for malformed payloads, and validates inputs with `is_tool_result_dict()` before dispatching.
Parser and exporter updates with typed models `utils/jsonl_parser.py`, `utils/md_exporter.py`	`jsonl_parser` parses `toolUseResult` via `_parse_tool_result()`, guards image extraction on `is_tool_result_dict()`, normalizes `tool_uses` and `data`; `md_exporter._render_tool_use` accepts `ToolUseDict`, normalizes `input`, and guards todos/questions iteration.
Package type exports `models/__init__.py`	Re-exports `RecordDataUnion`, `ToolResultUnion`, and `ToolUseDict` for downstream use.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

cppalliance/claude-code-chat-browser#38: Both PRs add/propagate stricter typed models into the same session parsing pipeline—especially utils/jsonl_parser.py.
cppalliance/claude-code-chat-browser#57: Both PRs modify the JSONL tool-result parsing/dispatch pipeline and typed boundary handling.
cppalliance/claude-code-chat-browser#31: Touches the same parsing/export pipeline (jsonl_parser, tool_dispatch) and relates to dispatch/table refactor.

Suggested labels

enhancement

Suggested reviewers

timon0305
wpak-ai
jonathanMLDev

"I hopped through types both stout and small,
Replaced wild Any with unions for all.
Guards now check shapes before parse begins,
Exports let callers find each typed bin.
A rabbit cheers the tidy typing thrall! 🐇"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 8.51% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main refactoring: narrowing Any-typed parse boundaries with TypedDict unions, which is the core objective.
Linked Issues check	✅ Passed	The PR successfully implements all coding requirements from `#68`: TypedDict unions for tool_result and data fields, new modules for tool results and record data, TypeGuard helpers, defensive normalization, and mypy strict compliance without new type ignores.
Out of Scope Changes check	✅ Passed	All changes directly support the goal of replacing Any with typed unions in the parse boundary. Updates to utils files apply the new types consistently without introducing unrelated modifications.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch refactor/parse-boundary-typeddict

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

models/__init__.py (1)

3-16: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix import sorting to pass ruff check.

The import block is flagged as unsorted by the pipeline. The imports should be in alphabetical order by module name.

📦 Proposed fix

 from models.errors import ErrorResponse
 from models.export import ExportStateDict
 from models.project import ProjectDict, ProjectSessionRowDict, SessionListItemDict
-from models.search import SearchHitDict
 from models.record_data import RecordDataUnion
+from models.search import SearchHitDict
 from models.session import (
     MessageDict,
     QuickSessionInfoDict,
     SessionDict,
     SessionMetadataDict,
     ToolUseDict,
 )
-from models.tool_results import ToolResultUnion
 from models.stats import FilesTouchedDict, SessionStatsDict
+from models.tool_results import ToolResultUnion

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@models/__init__.py` around lines 3 - 16, The import block in
models/__init__.py is unsorted causing Ruff to fail; reorder the import lines
alphabetically by module name so modules appear as: errors (ErrorResponse),
export (ExportStateDict), project (ProjectDict, ProjectSessionRowDict,
SessionListItemDict), record_data (RecordDataUnion), search (SearchHitDict),
session (MessageDict, QuickSessionInfoDict, SessionDict, SessionMetadataDict,
ToolUseDict), stats (FilesTouchedDict, SessionStatsDict), tool_results
(ToolResultUnion); update the file to reflect that alphabetical module ordering
while keeping the same imported symbols.

Source: Pipeline failures

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@models/record_data.py`:
- Line 3: The import line in record_data.py is unsorted and fails ruff; reorder
the imports alphabetically (by module/member name) so the from typing import ...
line lists members in sorted order (e.g., ensure Literal, TypeGuard, TypedDict
are alphabetized) or split into multiple lines per project style so ruff
import-sorting passes; update the import of Literal, TypeGuard, TypedDict
accordingly in the file.

In `@models/tool_results.py`:
- Line 7: The import block in models/tool_results.py is unsorted; reorder the
imported names in the from typing import ... line so they are alphabetically
sorted (e.g., ensure Literal, TypeGuard, TypedDict are in the correct
alphabetical order) to satisfy the ruff sorting rule and rerun the linter; keep
the single from typing import ... statement and do not add or remove imports.

---

Outside diff comments:
In `@models/__init__.py`:
- Around line 3-16: The import block in models/__init__.py is unsorted causing
Ruff to fail; reorder the import lines alphabetically by module name so modules
appear as: errors (ErrorResponse), export (ExportStateDict), project
(ProjectDict, ProjectSessionRowDict, SessionListItemDict), record_data
(RecordDataUnion), search (SearchHitDict), session (MessageDict,
QuickSessionInfoDict, SessionDict, SessionMetadataDict, ToolUseDict), stats
(FilesTouchedDict, SessionStatsDict), tool_results (ToolResultUnion); update the
file to reflect that alphabetical module ordering while keeping the same
imported symbols.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 588107cc-972b-4a26-b183-6be6a98fbc80

📥 Commits

Reviewing files that changed from the base of the PR and between 32e60f4 and 0269b8c.

📒 Files selected for processing (7)

models/__init__.py
models/record_data.py
models/session.py
models/tool_results.py
utils/jsonl_parser.py
utils/md_exporter.py
utils/tool_dispatch.py

clean6378-max-it · 2026-06-09T20:38:03Z

@CodeRabbit , Can you reeviewe this PR fully?

coderabbitai · 2026-06-09T20:38:15Z

@clean6378-max-it Sure! I'll kick off a full review of PR #72 right away.

✅ Action performed

Full review finished.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@models/record_data.py`:
- Around line 6-8: The TypedDict BashProgressDataDict is currently total (all
keys required) which makes the TypeGuard is_bash_progress_data unsafe — change
the TypedDict to allow missing keys by declaring it as non-total (e.g., class
BashProgressDataDict(TypedDict, total=False): with the same type:
Literal["bash_progress"] and output: str), and/or mark output as
NotRequired[str]; additionally ensure the TypeGuard function
is_bash_progress_data either checks for the presence of "output" (e.g., "output"
in data) before narrowing or leave the TypedDict non-total so mypy won't assume
output exists after the type guard.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b2ee96e-d606-435b-9762-a7866daf186b

📥 Commits

Reviewing files that changed from the base of the PR and between 32e60f4 and 805ef93.

📒 Files selected for processing (7)

models/__init__.py
models/record_data.py
models/session.py
models/tool_results.py
utils/jsonl_parser.py
utils/md_exporter.py
utils/tool_dispatch.py

coderabbitai

🧹 Nitpick comments (1)

models/record_data.py (1)
46-47: Don’t add extra TypeGuards for progress variants unless code actually narrows on data["type"]

utils/jsonl_parser.py:_process_progress stores data: RecordDataUnion and derives only progress_type = str(data.get("type", "")); there are no consumption sites that branch on data["type"] for hook_progress, agent_progress, or summary. Repo-wide, those discriminator strings only appear in models/record_data.py TypedDict definitions (and compact_boundary is handled as a system subtype, not as record data.type). Also, is_bash_progress_data (models/record_data.py:46-47) is currently unused—add its intended consumption site or remove/defer it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@models/record_data.py` around lines 46 - 47, The is_bash_progress_data
TypeGuard is currently unused; either remove this TypeGuard from
models/record_data.py or actually use it to narrow progress records where code
branches on data["type"]. If you choose removal, delete the
is_bash_progress_data function and any unused TypedDict-only discriminators; if
you choose to keep it, update utils/jsonl_parser.py::_process_progress to
inspect and branch on data["type"] (e.g., call is_bash_progress_data(data) and
handle the bash_progress shape) so the TypeGuard is consumed and provides real
narrowing.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@models/record_data.py`:
- Around line 46-47: The is_bash_progress_data TypeGuard is currently unused;
either remove this TypeGuard from models/record_data.py or actually use it to
narrow progress records where code branches on data["type"]. If you choose
removal, delete the is_bash_progress_data function and any unused TypedDict-only
discriminators; if you choose to keep it, update
utils/jsonl_parser.py::_process_progress to inspect and branch on data["type"]
(e.g., call is_bash_progress_data(data) and handle the bash_progress shape) so
the TypeGuard is consumed and provides real narrowing.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6c45113f-de28-44c8-aaac-c296d48c2e10

📥 Commits

Reviewing files that changed from the base of the PR and between 805ef93 and 3bc6b06.

📒 Files selected for processing (1)

models/record_data.py

timon0305

Solid parse-boundary refactor — mypy strict is green, the isinstance-guarded narrowing in the parser/exporter is the right shape, and no # type: ignore crept in. Main thing: the per-tool TypeGuards in tool_results.py are unused and duplicate tool_dispatch.py's predicates (drift risk) — wire them in or drop them. Two minor notes inline.

timon0305 · 2026-06-09T21:41:15Z

+def is_bash_tool_result(tr: ToolResultDict) -> TypeGuard[BashToolResultDict]:
+    return "stdout" in tr or "stderr" in tr
+
+
+def is_file_edit_tool_result(tr: ToolResultDict) -> TypeGuard[FileEditToolResultDict]:
+    return "structuredPatch" in tr or ("filePath" in tr and "newString" in tr)
+
+
+def is_plan_tool_result(tr: ToolResultDict) -> TypeGuard[PlanToolResultDict]:
+    return "plan" in tr and "filePath" in tr
+
+
+def is_file_write_tool_result(tr: ToolResultDict) -> TypeGuard[FileWriteToolResultDict]:
+    return "filePath" in tr and "content" in tr
+
+
+def is_glob_tool_result(tr: ToolResultDict) -> TypeGuard[GlobToolResultDict]:
+    filenames = tr.get("filenames")
+    return "filenames" in tr and isinstance(filenames, list)
+
+
+def is_grep_tool_result(tr: ToolResultDict) -> TypeGuard[GrepToolResultDict]:
+    return "mode" in tr and "numFiles" in tr
+
+
+def is_read_tool_result(tr: ToolResultDict) -> TypeGuard[ReadToolResultDict]:
+    file_obj = tr.get("file")
+    return "file" in tr and isinstance(file_obj, dict)
+
+
+def is_web_search_tool_result(tr: ToolResultDict) -> TypeGuard[WebSearchToolResultDict]:
+    return "query" in tr and "results" in tr
+
+
+def is_web_fetch_tool_result(tr: ToolResultDict) -> TypeGuard[WebFetchToolResultDict]:
+    return "url" in tr and "code" in tr


These nine per-tool TypeGuards (is_bash_tool_result … is_web_fetch_tool_result) have zero callers — utils/tool_dispatch.py does the actual dispatch with its own parallel _tool_result_pred_* functions carrying identical logic (e.g. "stdout" in tr or "stderr" in tr exists in both files). As written this is dead code that duplicates the dispatch predicates, so the two will silently drift when one side changes. Either wire tool_dispatch.py to use these TypeGuards (so the narrowing flows to the builders) or drop them and keep tool_dispatch.py as the single source of truth. (is_tool_result_dict at 153 is fine — it IS used.)

timon0305 · 2026-06-09T21:41:15Z

+)
+
+
+def is_bash_progress_data(data: RecordDataUnion) -> TypeGuard[BashProgressDataDict]:


is_bash_progress_data is the only TypeGuard among the five RecordDataUnion members and it's also unused. Either it's the start of a pattern (then the other four deserve the same + a caller), or drop it — a lone unused guard reads like a leftover.

timon0305 · 2026-06-09T21:41:15Z

+
+class ToolUseDict(TypedDict, total=False):
+    id: str
+    name: ToolNameLiteral | str


ToolNameLiteral | str collapses to str for type-checking, so the Literal gives no narrowing — it's documentation only. Fine as a "known names + escape hatch" intent, but worth a one-line comment saying that's deliberate, otherwise a reader expects exhaustiveness checking it won't get.

refactor(models): narrow Any-typed parse boundary with TypedDict unions

0269b8c

Replace NotRequired[Any] on MessageDict.tool_result and data with per-tool and per-record unions, type guards, and dispatch-aligned narrowing so mypy strict constrains schema drift without weakening tests.

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

Comment thread models/record_data.py Outdated

Comment thread models/tool_results.py Outdated

clean6378-max-it added 2 commits June 10, 2026 00:01

fix(models): sort imports for ruff I001 in models package

e72280e

style: ruff format jsonl_parser and tool_dispatch

805ef93

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

Comment thread models/record_data.py Outdated

fix(models): make BashProgressDataDict non-total for safe TypeGuard

3bc6b06

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

timon0305 reviewed Jun 9, 2026

View reviewed changes

timon0305 requested changes Jun 9, 2026

View reviewed changes

		)


		def is_bash_progress_data(data: RecordDataUnion) -> TypeGuard[BashProgressDataDict]:

Conversation

clean6378-max-it commented Jun 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Changes

Acceptance criteria

Test plan

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

clean6378-max-it commented Jun 9, 2026

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

timon0305 left a comment

Choose a reason for hiding this comment

Uh oh!

timon0305 Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

timon0305 Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

timon0305 Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

clean6378-max-it commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading