Phase 3: Rebuild FLASHApp viewers on OpenMS-Insight via frozen template by t0mdavid-m · Pull Request #93 · OpenMS/FLASHApp

t0mdavid-m · 2026-06-05T09:21:03Z

Summary

This PR completes Phase 3 of the FLASHApp migration: rebuild the three viewer pages (FLASHDeconv, FLASHTnT, FLASHQuant) on top of OpenMS-Insight using a reusable, tool-agnostic grid module from a frozen streamlit-template package. The old bespoke Vue mega-component and index-based selection logic are replaced with value-based filters/interactivity and a shared StateManager for cross-linked panels.

Key Changes

New Modules

src/view/grid.py — Tool-agnostic linked-grid renderer (render_linked_grid() + LayoutManager class). Distills the old render.py::render_grid() + both FLASH*LayoutManager classes into a single reusable template that can be frozen and vendored byte-for-byte unchanged into downstream apps.
src/render/schema.py — Adapter layer that converts FLASHApp's wide, list-column, index-addressed FileManager caches into tidy parquet with stable value IDs (scan_id, mass_id, peak_id, protein_id, etc.). Enables OpenMS-Insight's value-based filters/interactivity model. Includes helpers for exploding nested signal/noise peaks, KDE distributions, and protein/tag data.
src/render/render.py (repurposed) — Now a builder factory. make_builders(file_manager, dataset_id, tool, settings=None) returns a {comp_name: () -> BaseComponent} map. Each zero-arg factory closes over dataset context and uses file_manager.result_path() to feed data_path= to Insight components. Includes oracle Tabulator column definitions (titles, formatters, sorters) ported verbatim so migrated Tables show the same curated columns and formatting.

Removed Modules

src/render/update.py — Index-based selection logic (scanIndex / massIndex / proteinIndex / xRange/yRange slicing). Replaced by value-based filters/interactivity in Insight components.
src/render/initialize.py — Per-panel cache loading. Now handled by Insight's data_path= + subprocess preprocessing.
src/render/StateTracker.py — Local state reconciler. Replaced 1:1 by openms_insight.StateManager.
src/render/components.py — Vue component registration. No longer needed.
src/render/util.py — Hash utility (unused).
.gitmodules + openms-streamlit-vue-component/ — Old Vue submodule removed.

Page Updates

content/FLASHDeconv/FLASHDeconvViewer.py, content/FLASHTnT/FLASHTnTViewer.py, content/FLASHQuant/FLASHQuantViewer.py — Simplified to call show_linked_grid() with a layout and builders. No more render_grid() / StateTracker / initialize_data / update_data / filter_data calls.
src/common/common.py — Added show_linked_grid() convenience wrapper for rendering N-experiment grids with the frozen template.

Data Layer

src/workflow/FileManager.py — Added result_path(dataset_id, name_tag) method to expose parquet file paths for data_path= (Insight's subprocess preprocessing entry point).

Testing & Migration Harness

tests/test_render_schema.py — Construct-smoke for build_insight_caches(): synthetic FileManager caches → tidy parquet with stable IDs and correct explode shapes.
**`tests/test_render

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne

Summary by CodeRabbit

New Features
- Viewer pages rebuilt to use a unified linked-grid with per-experiment layout editor, value-based selection, reusable Insight builders, and best-effort cache warming.
Documentation
- Added Phase 3 migration plan, review guidance, and convergence tooling docs.
Tests
- Large new suites validating builders, schema transforms, linked-grid wiring, and cache warming.
Chores
- Runtime now installs openms-insight; legacy local JS submodule removed.

Scaffolds Phase-3 tracking: rebuild FLASHApp viewer pages on OpenMS-Insight via a reusable, frozen streamlit-template grid (no FLASHApp fork). No app code changes. - units.yaml: 10 units (template build+freeze, then FLASHApp rebuild) with oracles. - run_review.py: shared record/gate/report convergence driver. - nondivergence.py: asserts FLASHApp grid code == frozen template (normalized hash). - review-log/ ledger, REVIEW.md rollup, README.md. https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne

… non-divergence) Concrete spec from the planning fan-out: streamlit-template src/view/grid.py (render_linked_grid + LayoutManager), FileManager data layer, show_linked_grid, visualization_template demo; FLASHApp src/render/schema.py (tidy parquet, stable IDs); builders factory + StateManager per (tool,experiment); 3 viewer rebuilds; index->value selection map; non-divergence via vendored grid.py == frozen template grid.py. https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne MSG

- src/view/grid.py: byte-identical vendored copy of the frozen streamlit-template src/view/grid.py (the render_linked_grid + LayoutManager SSOT). Plain copy synced from the template, NOT a git submodule. - src/view/__init__.py: package marker. - migration/units.yaml: register nondivergence_pairs = [FLASHApp/src/view/grid.py, streamlit-template/src/view/grid.py] so nondivergence.py enforces byte-identical (normalized-hash) equality of the two side-by-side checkouts. nondivergence GREEN (vendored == frozen template). https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne

…rozen template grid - src/render/schema.py (NEW): build_insight_caches(file_manager, dataset_id, tool) -> Insight-ready tidy parquet (stable scan_id/mass_id/peak_id/protein_id/tag_id/feature_id; exploded SignalPeaks/NoisyPeaks; comma-split quant traces; long-format target/decoy KDE). - src/render/render.py: make_builders(file_manager, dataset_id, tool) factory (comp_name -> zero-arg -> BaseComponent), cache_id=f"{tool}__{dataset_id}__{comp}", data_path=result_path; index->value selection map (scanIndex->scan_id, massIndex->mass_in_scan/mass_id, proteinIndex+proteoform_scan_map->protein_id filter, StateTracker->StateManager). - 3 viewer pages shrink to page_setup -> pick experiment -> load layout -> show_linked_grid (imports render_linked_grid/LayoutManager from the vendored src/view/grid.py, unchanged). - src/common/common.py: show_linked_grid; FileManager: get_results(as_path)/result_path. - DELETED dead layer: components.py, initialize.py, update.py, StateTracker.py, util.py. - requirements.txt: + openms-insight. - tests/: conftest + test_render_schema (9) + test_render_builders (8). - Known seams (for review loop): Plot3D x/y/z config not forwarded through Insight's data_path subprocess (worked around with in-process data= for the small Plot3D frames); FLASHTnT SequenceView peaks not scan->protein remapped on protein selection. Verified: pytest 45 passed/2 skipped, parse OK, nondivergence GREEN (grid.py untouched). https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne

Replace the Phase-0 ast.parse placeholders with: nondivergence + template pytest (74, incl. test_view_grid construct-smoke) + FLASHApp pytest (45, incl. render schema/builders construct-smoke) + viewer-parse. Real machine gate for the review loop. https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne

Resolves the high/med round-1 review findings on the FLASHApp rebuild: schema.py: - _explode_list_cols: drop null/empty list cells before exploding (no more phantom null rows for empty spectra / zero-mass scans). - deconv_spectrum_tidy: alias MonoMass->mass (SequenceView requires a 'mass' column) and expose the per-scan ordinal as mass_in_scan (the oracle massIndex space the 3D / mass-table share) instead of dropping it. - proteins: denormalize scan_id (=deconv_index via build_proteoform_scan_map) so a protein-row click resolves to its scan (value-based proteoform_scan_map). - tags: key on scan_id (not a collapsed last-wins protein_id) so every tag on the selected proteoform's scan shows for ANY proteoform on that scan (oracle filtered by Scan). render.py: - mass_table / deconv_spectrum: 'mass' selection = mass_in_scan (per-scan ordinal the 3D consumes), not a global id; deconv x_column -> mass. - anno_spectrum: remove mass interactivity (raw-m/z click never matched the deconvolved MonoMass in the oracle, so it selected nothing). - protein_table: click sets BOTH protein and scan; tag_table + augmented spectrum + sequence-view peaks follow via scan; combined_spectrum filters by scan (was an unset 'spectrum' slot -> blank). - SequenceView (tnt): filters by protein (sequence) AND scan (peaks). viewers: blank-until-pick experiment selector (oracle parity; also avoids eager cache builds on page load). tests updated to assert the parity-correct wiring + a null-guard regression.

- combined_spectrum: resolve the tag-table's scalar tag_id to the tag's fragment masses + sequence via the tags frame (FLASHTnT only; FLASHDeconv has no tags frame so the resolve kwargs are omitted). - sequence_view (tnt): publish residue clicks as the 'aa' selection the tagger consumes for the tag-relative selectedAA (gold highlight). - test: the tagger resolves tag_id -> {sequence, masses, selectedAA}, tag/aa are re-render dependencies, and FLASHDeconv has no tag resolution. Resolves the last round-1 finding (3-tnt-003, tagger overlay dead).

…rowing Resolves the round-2 review findings: - 3-builders-003 (high): 3D Precursor Signals showed nothing until a mass was clicked because mass was a hard filter. Now optional_filters=["mass"] so the scan's full Signal/Noisy scatter shows immediately, narrowing to one mass only when selected (oracle parity). - 3-quant-001 (med): quant feature-trace 3D axes were swapped/mislabeled. Now x=m/z, y=retention time, z=intensity with explicit labels (oracle FLASHQuantView), title 'Feature group signals'. - 3-tnt-005 (low): clicking a sequence residue now narrows the tag table to tags spanning it via interval_filters={"aa": ("StartPos", "EndPos")}, reusing the SequenceView 'aa' selection (oracle TabulatorTagTable secondary filter). migration: gate now also runs the OpenMS-Insight gate (pytest/build/vitest/ parity) since Phase 3 re-opened Insight; insight:tagger-seqview tracked as a no-regression unit. Tests added for all three fixes.

…utive clean round

…or grid - 3-schema-004 (high): the Precursor Signals 3D plotted raw m/z on a 'Mass' axis. The oracle (get3DplotInputFromSNRPeaks) uses x = mz*charge; add a precomputed 'mass' column to precursor_signals and point Plot3D x_column there (Plot3D's default x_label 'Mass' matches the oracle axis title). Charge states no longer collapse to their m/z positions. - re-vendor src/view/grid.py from the frozen template (empty-experiment upload fix); nondivergence GREEN. - tests: precursor_signals carries mass==mz*charge (explode unit + integration); 3D builder x/y/z = mass/charge/intensity with label 'Mass'. Also records round-5 review (9 clean, 2 findings: 3-schema-004, 3-grid-003).

…ution lines Resolves round-6 review findings: - 3-builders-005 (med): revert the 3D Precursor Signals optional_filters=[mass]. The oracle frontend (getPrecursorSignal) renders EMPTY when no mass is selected (the precursor-scan lookup fails on the scan-filtered per_scan_data); it only draws SignalPeaks[mass_index] once a mass is chosen. mass is again a REQUIRED filter. (Round-2's 3-builders-003 misread update.py's data-prep as the displayed behavior.) - 3-builders-004 (low): pass oracle axis titles to the plain spectra + heatmaps (Monoisotopic Mass / m/z / Intensity / Retention Time) instead of raw column names. - 3-quant-002 (med): quant feature-trace 3D uses stem=False so each charge is one connected elution line (oracle FLASHQuantView mode:lines), not per-point spikes. - 3-schema-006 (nit): correct the anno is_signal test comment. tests: axis-title coverage; quant stem=False; corrected is_signal comment. (OpenMS-Insight tagger config-key leak 3-tnt-006 fixed in the Insight commit.)

…ch-oracle-chrome) Per the chosen full-presentation-parity bar: - All 5 tables now pass column_definitions (oracle Tabulator titles, fixed + placeholder(-1->'-') formatters, sorters) + per-table initial_sort (Protein/Tag = Score desc). The Table renders ONLY the curated oracle columns; internal carrier columns (scan_id, mzs, ProteinIndex, full sequence, ...) stay in the data for filters/interactivity/index but are no longer displayed. Coverage(%) omitted (oracle commented it out); de-duped FLASHQuant's duplicated 'Feature Group Quantity' column. - quant_traces_3d: series_column='isotope' (breaks the polyline between isotopes within a charge, reproducing the oracle -1000 z-sentinel gaps) + category_name_template='Charge: {}' (legend 'Charge: 2'), via the new Plot3D features. - conftest: synthetic protein_dfs/tag_dfs fixtures extended with the real FLASHTagger columns (Score, MatchingFragments, ModCount, TagCount, Coverage(%), Nmass/Cmass/DeltaMass) so initial_sort + the placeholder formatter are exercised. Resolves 3-tables-001, 3-quant-003, 3-quant-004. (ProteinTable best-per-spectrum toggle still pending.)

Round 8 (match-oracle-chrome bar): template/grid/common/filemanager/page + insight + deconv-viewer CLEAN; table chrome + quant isotope-breaks/legend verified resolved. Remaining findings: 3-quant-005 (per-trace-id for same-isotope duplicate traces), 3-tables-002 (best-per-spectrum toggle), 3-tables-003 (go_to_fields), 3-fdr-001/002 (FDR title+trace names), 3-feat-001 (feature-table title), 3-anno-001 (anno spectrum selection-driven highlight model). User directive: match the full interaction model.

…can flag Data-layer support for the full-interaction-model parity work (render wiring in a follow-up): - anno_highlight_link (NEW tidy frame: scan_id, peak_id, mass_in_scan, charge): maps each annotated signal peak to the deconvolved mass(es) it belongs to + its charge, keyed by anno_spectrum_tidy.peak_id, so the annotated spectrum can highlight ONLY the selected mass's raw peaks + show per-charge z=N labels. Confirmed 1:MANY (a raw m/z peak can be a signal peak for several masses at different charges) -> one row per (peak, mass). - quant_traces + trace_in_feature: per-feature running trace id minted at the trace explode, carried through the point comma-split, so the 3D breaks the polyline PER-TRACE (fixes 3-quant-005: two same-(charge,isotope) traces no longer merge). - proteins + is_best_per_scan: 1 for the max-Score row per Scan (ordinal rank, ties keep-first) -> backs the ProteinTable best-per-spectrum default view. Additive; tests + fixtures updated (59 passed).

… chrome Wire the new OpenMS-Insight selection-driven LinePlot/Plot3D features + schema data into the viewers (match-full-interaction-model directive): - anno_spectrum: selection-driven highlight of the SELECTED mass via the anno_highlight_link frame (highlight_selection=mass, highlight_link_path, key=peak_id, match=mass_in_scan, charge labels z={}, deconv_peaks_toggle=True); exposes peak_id via a private anno_peak interactivity slot (does not drive the shared mass slot). Removes the static is_signal highlight. (3-anno-001) - deconv_spectrum: selective highlight of the selected mass (highlight_match_column =mass_in_scan). (residual: selected-mass value label needs a match-column label producer in the LinePlot -- noted for follow-up) - 3D_SN_plot: title_selection -> dynamic "" / "Precursor signals" / "Mass signals". (3-3d-001) - quant_traces_3d: series_column=trace_in_feature (per-trace break; fixes the same-(charge,isotope) merge). (3-quant-005) - go_to_fields per oracle for every table (no internal-carrier leakage). (3-tables-003) - FDR: title "FDR Plot" + trace names "Target QScores"/"Decoy QScores". (3-fdr-001/002) - feature table title "Feature groups". (3-feat-001) - best-per-spectrum toggle: per-experiment checkbox -> make_builders(best_per_spectrum) -> protein_table sources is_best_per_scan-filtered data under a distinct cache_id. (3-tables-002) gate GREEN; 69 passed; nondivergence GREEN.

Wire the new LinePlot match-column value-label producer into deconv_spectrum: highlight_value_column="mass" + highlight_value_template="{:.2f}" so the selected mass deconvolved stick gets its MonoMass value label (oracle mass.toFixed(2)), closing the round-9 finding. Test extended to assert exactly one value label at the selected stick. 69 passed.

…forms (3-best-002) is_best_per_scan now flags EVERY proteoform whose Scan is missing (NaN / null / non-numeric), matching the oracle filterBestPerSpectrum (keeps each row where typeof scan !== "number" || isNaN(scan)) instead of collapsing them into a single .over(Scan) group (which flagged only one -> hid the rest in best-per-spectrum mode). A missing Scan from protein.tsv arrives as float NaN, so the check casts to f64 (non-numeric -> null) and treats null|NaN as missing (dtype-safe; is_nan errors on an int column without the cast). Edge-case-gated (real protein.tsv populates Scan) but a real parity divergence. Test added.

Round 10: 9 units CLEAN (template grid/common/filemanager/page, nondivergence, builders, deconv-viewer, quant-viewer, insight). Deconv value-label (3-deconv-001) verified resolved. Two findings: 3-best-002 (best-per-spectrum must pass through missing-Scan proteoforms; fixed) and 3-cascade-001 (protein selection must cascade-clear the dependent aa/tag selections; fix in progress).

Wire clears_selections=["aa","tag"] onto the protein_table, reproducing the oracle ProteinTable.updateSelectedProtein cascade (clears selectedAA/selectedTag/tagData on every protein click). Switching proteoform now resets the dependent residue (aa) + tag selections, so the tag-table interval filter + tagger overlay follow the new proteoform instead of going stale/empty. Test asserts the protein_table args carry clearsSelections=["aa","tag"].

3-best-003 (schema): is_best_per_scan now ranks the highest REAL Score first -- NaN/missing/non-numeric Score is pushed to -inf before rank(descending) so it sorts LAST, mirroring the oracle toScore (NaN -> -Infinity). Previously polars ranked NaN as largest, flagging a no-score proteoform best. Test added. 3-cascade-002 (render): scan_table gains clears_selections=["mass"], reproducing the oracle TabulatorScanTable updateSelectedScan -> updateSelectedMass(0): a scan click clears the mass selection, and the mass_table (default_row=0) re-defaults to mass_in_scan 0 of the new scan via the bridge _auto_selection -- so the deconv/anno spectra + 3D show the new scan first mass, not a stale ordinal. Test asserts the cascade arg AND that the mass_table auto-selects 0 when mass is unset.

Round 12: 10 units clean (grid empty-row divergence sanctioned as exception-b: degenerate malformed-upload-only, no well-formed-data difference). Round-11 fixes (Score-NaN 3-best-003, scan->mass 3-cascade-002) + anno-peak click all verified resolved/correct. One new finding 3-seqview-001: SequenceView residue-click model (coverage- vs fragment-gating, no toggle-clear, no residue->mass) diverges from the oracle; re-opens the converged SequenceView -- escalating scope.

…001) Add fragment_mass_identifier="mass" to the FLASHTnT sequence_view builder so the oracle two-path residue click is fully reproduced (maintainer: "both should be supported as in the FLASHTnT Viewer"): - PATH 1 (aa / sequence-tag): coverage_column="coverage" (already wired) now makes residue_identifier="aa" coverage-gated + toggling via the Insight SequenceView (tag-covered residues drive the tagger/tag table; re-click clears). - PATH 2 (mass / fragment): a residue click on a fragment-matched residue publishes that fragment peak mass_in_scan to "mass" (resolved via the existing interactivity "mass" column), reproducing updateMassTableFromFragmentMass -> updateSelectedMass. FLASHDeconv sequence_view (no coverage/residue wiring) is unchanged. Test asserts the two-path wiring.

…ied)

…view-002) Add fragment_mass_identifier="mass" to the FLASHDeconv sequence_view branch so a fragment-residue click in the (global) deconv Sequence View publishes the fragment peak mass_in_scan to the shared mass selection -- driving the mass table, deconv/anno spectra, and 3D S/N plot (all in the deconv default layout), matching the oracle aminoAcidSelected -> updateSelectedMass which runs on every tool. PATH 2 only (no tags/coverage on the global sequence). Test asserts deconv fragment_mass_identifier == mass and coverage_column is None.

Round 13: 9 units clean (SequenceView two-path PATH-1/PATH-2 verified for TnT; gate GREEN). 3 findings, all SequenceView: 3-seqview-002 deconv residue->mass (fixed), 3-seqview-003 inbound mass->fragment-table highlight, 3-seqview-004 mass-info header. Comprehensive SequenceView parity pass in progress.

…ghlight (3-seqview-003/004) - schema._build_seq_tnt: add observed_mass = computed_mass (= ProteoformMass), the oracle SequenceView header observed proteoform mass. - render._sequence_view (tnt): observed_mass_column="observed_mass" + mass_header_title="Proteoform" (renders Theoretical/Observed/Delta Mass header) + mass_selection_identifier="mass" (a mass selected elsewhere highlights the matching fragment-table row -- oracle updateFragmentTableFromMassSelection). - render._sequence_view (deconv): mass_selection_identifier="mass" (inbound highlight; no header -- global sequence, not a proteoform). Tests assert the header + inbound wiring on both branches.

…ighlight verified)

FLASHTnT sequence_view passes theoretical_mass_label="Theoretical protein mass" + observed_mass_label="Observed proteoform mass" (oracle preparePrecursorInfo proteoform branch) instead of the generic precursor-branch defaults. Test asserts.

…erified)

- schema._build_seq_deconv: add per-scan observed_mass = PrecursorMass (NULL for MS1 scans where PrecursorMass==0, so the SequenceView hides the header there -- matching the oracle precursor branch which renders no header for MS1; vs the TnT proteoform branch which shows "-" for a missing mass). - render._sequence_view (deconv): observed_mass_column="observed_mass" + mass_header_title="Precursor" (generic "Theoretical mass"/"Observed mass" labels = oracle preparePrecursorInfo precursor branch). Completes the deconv half of the mass-info header (3-seqview-004 did the TnT proteoform half). Test asserts deconv observed_mass_column/title + seq_deconv observed_mass column.

…ader fixes verified)

ms1_raw_heatmap/ms2_raw_heatmap plot raw m/z (from the annotated spectra), so the oracle PlotlyHeatmap yAxisLabel returns "m/z" for Raw MS1/MS2 Heatmaps -- only the DECONV heatmaps are "Monoisotopic Mass". Fix the two raw heatmaps y_label="m/z"; test now expects "m/z" for raw, "Monoisotopic Mass" for deconv.

…t ion) Round 16: 8 units clean (X-residue + deconv precursor header verified). 2 findings: 3-heatmap-001 raw-heatmap y-label (fixed: "m/z"); 3-seqview-008 SequenceView TSG path omits the full-length terminal fragment ion vs oracle (fix in progress).

…tmap fixes verified)

Round 17: 9 units clean (terminal-ion + raw-heatmap fixes verified; internal-fragment _terminal_collision_masses expansion confirmed oracle-faithful improvement, not a regression). 3 findings: 3-seqview-009/010 (HIGH) SequenceView computes fragments on the FULL protein not the proteoform sub-sequence (wrong masses/grid for truncated proteoforms) + no undetermined-terminus suppression; 3-seqview-011 (low) docstring accuracy. Proteoform-aware fragment-handling fix in progress.

…ng (3-seqview-009/010) seq_tnt carries the full sequence + 0-based proteoform terminals; end-to-end SequenceView-from-seq_tnt fragment grid == oracle sub-region (truncated + undetermined). No FLASHApp source change needed (tnt.py already slices; render.py already wires proteoform_start/end_column -- the Insight side now consumes them for fragments).

…ling verified)

Wire interactivity on all 4 heatmap builders (oracle PlotlyHeatmap click -> updateSelectedScan for all + updateSelectedMass for deconv MS1/MS2): deconv heatmaps interactivity={scan:scan_idx, mass:mass_idx}; raw heatmaps interactivity={scan:scan_idx}. The reused heatmap caches carry scan_idx(=scan_id)/mass_idx(=mass_in_scan) from getMSSignalDF. conftest heatmap fixture extended with scan_idx/mass_idx; test asserts the per-heatmap interactivity mapping.

…ority fixes verified)

First fully-clean round since round 11 -- the SequenceView interaction tail (residue click, mass header, X-residue, terminal ion, proteoform sub-region, ion-priority) + heatmap navigation are all resolved. All 53 findings raised across rounds 5-18 marked resolved (each fixed + confirmed by the all-units-clean round 19). Need rounds 20-21 clean to converge.

… rounds 19-20-21) Rounds 19, 20, 21 each fully clean (all 11 Phase-3 units pass the three critics + GREEN machine gate); 0 open findings. STATUS: CONVERGED. Round 21 note: an environment restart mid-round killed 4 in-flight review agents; they were re-launched and all confirmed CLEAN (schema 1/0, builders+deconv 2/0, tnt+quant 2/0, insight 1/0).

…pin openms-insight==0.1.15 The migration replaced the local Vue component (openms-streamlit-vue-component) with the openms-insight package, which ships its own Vue bundle. Per PHASE3_PLAN.md 5.5: - requirements.txt: openms-insight 0.1.11 -> 0.1.15 (the published 0.1.14 is Phase-1&2 WITHOUT the Phase-3 component features; 0.1.15 is the new Insight release carrying them). - Dockerfile + Dockerfile.arm: remove the node:21 js-build stage that cloned+built openms-streamlit-vue-component and the COPY of its dist into /app/js-component/dist. - Remove the openms-streamlit-vue-component git submodule (.gitmodules + gitlink). The migrated app has no js-component references (verified); local tests unaffected. REMAINING EXTERNAL STEP: publish openms-insight 0.1.15 to PyPI (maintainer release action) so a clean pip install -r requirements.txt / Docker build resolves it.

…PyPI dep) Add an insight-build stage (node:21) that clones the openms-insight branch claude/kind-heisenberg-u6dVm, builds its Vue bundle (npm install && npm run build), and syncs dist into openms_insight/js-component/dist. The compile-openms stage then pip-installs that source tree (hatchling force-includes the bundled dist) BEFORE requirements, so openms-insight==0.1.15 resolves from the branch instead of PyPI. Makes the image self-contained without a PyPI publish. (Verified: a source build produces a wheel that bundles assets/index.js.)

…lim image)

…esh-clone fix) In a fresh clone openms_insight/js-component/ does not exist (it only ever held the gitignored dist/), so cp had no parent dir and the build failed. mkdir -p it first. Validated end-to-end against a fresh clone of the branch: clone -> npm build -> sync -> pip wheel produces a wheel bundling the Vue dist (7 files incl. index.js).

coderabbitai · 2026-06-05T09:33:13Z

Warning

Review limit reached

@t0mdavid-m, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 35 minutes and 19 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 70be6150-9827-41e6-8fa9-77b63abf905c

📥 Commits

Reviewing files that changed from the base of the PR and between 988ff11 and 30d6f59.

📒 Files selected for processing (2)

src/render/render.py
tests/test_render_builders.py

📝 Walkthrough

Walkthrough

This PR migrates FLASHApp viewers to OpenMS‑Insight: adds tidy-parquet schema adapters, a builder factory, a reusable linked-grid + LayoutManager, viewer rewires for lazy per-dataset rendering, Docker/CI integration to build/install openms‑insight, migration governance scripts/config, and comprehensive tests/fixtures.

Changes

FLASHApp Phase 3 Migration

Layer / File(s)	Summary
Phase 3 governance, gates, and review harness `migration/README.md`, `migration/REVIEW.md`, `migration/specs/PHASE3_PLAN.md`, `migration/units.yaml`, `migration/nondivergence.py`, `migration/run_review.py`	Adds Phase 3 plan/specs, units registry, nondivergence comparator, and a review ledger CLI with `record`/`gate`/`report`.
Docker & CI: build and install openms-insight `Dockerfile`, `Dockerfile.arm`, `requirements.txt`, `.github/workflows/unit-tests.yml`	Adds Node 21 `insight-build` stage to build OpenMS‑Insight JS dist and pip-install the package during image build; removes legacy local js-build artifact copy; pins `openms-insight==0.1.15`; extends unit-tests workflow to build/install openms-insight before pytest.
FileManager: parquet path access for Insight `src/workflow/FileManager.py`	Adds `as_path` precedence to `get_results()`, guards missing DB rows, and adds `result_path()` helper returning on-disk parquet path for a single `(dataset_id, name_tag)`.
Schema Adapter: build_insight_caches (tidy Parquet) `src/render/schema.py`	New Polars-based adapter with primitives to explode list/nested columns, comma-split traces, normalize KDEs, mint stable IDs, and tool-specific builders producing idempotent tidy Parquet outputs for flashdeconv/flashtnt/flashquant behind `build_insight_caches()`.
Render Builder Factory: make_builders & warm_insight_caches `src/render/render.py`	Replaces index/state-tracking renderer with `make_builders()` returning zero-arg component factories, defines table column chrome and sequence-view wiring, and adds `warm_insight_caches()` for best-effort cache warm-up.
Frozen Grid & LayoutManager `src/view/grid.py`	Adds `render_linked_grid()` (shared `StateManager` for cross-panel interactivity), `Store` protocol, and `LayoutManager` UI for editing/validating/persisting layouts with JSON import/export; adds `show_linked_grid` helper in `src/common/common.py`.
Viewer rewires: FLASHDeconv / FLASHQuant / FLASHTnT `content/FLASHDeconv/FLASHDeconvViewer.py`, `content/FLASHQuant/FLASHQuantViewer.py`, `content/FLASHTnT/FLASHTnTViewer.py`	Rewrites viewers to use id-based per-experiment selectboxes, lazily call `build_insight_caches()` and `make_builders()`, render via `show_linked_grid()`, persist layout metadata, and add FLASHTnT best-per-spectrum/settings controls.
Workflow hooks: cache warming `src/Workflow.py`	Adds non-fatal `warm_insight_caches()` calls in TagWorkflow and DeconvWorkflow guarded by try/except.
Test infra: fixtures and end-to-end coverage `tests/conftest.py`, `tests/test_render_builders.py`, `tests/test_render_schema.py`	Adds Streamlit-mocking fixtures, synthetic oracle caches, and comprehensive tests for schema primitives, `build_insight_caches()` idempotency/edge-cases, `make_builders()` outputs, `render_linked_grid` wiring, table chrome, highlighting, and warm-insight behavior.
Legacy stack removals `src/render/StateTracker.py`, `src/render/components.py`, `src/render/initialize.py`, `src/render/update.py`, `src/render/util.py`, `tests/test_selection_clear.py`, `.gitmodules`, `openms-streamlit-vue-component`	Removes prior StateTracker/component wrappers/initialize/update utilities/hash helper and selection-clear tests; removes submodule entry for the old Vue component and stops copying its dist artifact.

Possibly Related PRs

OpenMS/FLASHApp#22: Also touches the openms-streamlit-vue-component submodule metadata.
OpenMS/FLASHApp#57: Related to previous viewer selection UI changes this migration replaces.
OpenMS/FLASHApp#21: Earlier viewer selection/validation work that overlaps with the refactor.

Poem

🐇 I hopped through parquet rows and frozen grid delight,
Swapped index keys for values, tuned the builder light.
Vue crumbs tucked away, Insight bakes the view,
Layouts freeze, caches warm — the panels hum anew.
Phase Three stitched, the rabbit claps tonight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 59.21% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and concisely summarizes the main objective: rebuilding FLASHApp viewers on OpenMS-Insight using a frozen template from Phase 3 of the migration.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch claude/kind-heisenberg-u6dVm

coderabbitai

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/workflow/FileManager.py (1)

480-486: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle missing cache rows before iterating fetchone().

If dataset_id has no row in one of the cache tables, fetchone() returns None and both loops raise TypeError before partial/KeyError handling runs. That turns a normal cache miss into a hard crash and breaks the new result_path() contract.

Suggested fix

         result = self.cache_cursor.fetchone()
+        if result is None:
+            result = (None,) * len(file_columns)
         for c, r in zip(file_columns, result):
             if r is None:
                 if partial:
                     continue
                 else:
@@
         result = self.cache_cursor.fetchone()
+        if result is None:
+            result = (None,) * len(data_columns)
         for c, r in zip(data_columns, result):
             if r is None:
                 if partial:
                     continue
                 else:

Also applies to: 498-504

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/FileManager.py` around lines 480 - 486, The SELECT followed by
fetchone() can return None for a cache miss, so before zipping over file_columns
and result ensure you check "result is None" and treat it as a cache miss (do
not iterate). For both occurrences (the block using self.cache_cursor,
file_columns and dataset_id starting at the shown snippet and the similar block
at 498-504) either raise KeyError(dataset_id) so upstream partial/KeyError
handling runs, or return the missing-value expected by result_path(); do this
check immediately after result = self.cache_cursor.fetchone() to avoid the
TypeError.

🧹 Nitpick comments (1)

migration/nondivergence.py (1)

56-57: ⚡ Quick win

Resolve nondivergence pair paths from a fixed base, not process CWD.

Line 56-Line 57 currently makes relative paths caller-dependent. Resolve relative entries from repo/config root so checks are deterministic across runners.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@migration/nondivergence.py` around lines 56 - 57, Current code creates
Path(pair[0]) and Path(pair[1]) which resolves relative paths against the
process CWD; change this to resolve against the repository/config root instead.
Modify the logic around a, b (and subsequent ea, eb) so you first compute a
stable base Path (e.g., repo_root or config_root) and then construct a =
base.joinpath(pair[0]) and b = base.joinpath(pair[1]) (or otherwise resolve pair
entries against that base) so existence checks ea, eb are deterministic across
runners; update any uses of a and b accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@content/FLASHDeconv/FLASHDeconvViewer.py`:
- Around line 47-49: The current mapping using to_id =
{file_manager.get_display_name(r): r for r in results} is lossy when display
names are duplicated; update the lookup to preserve all datasets by either (a)
keying by a unique identifier (use the raw dataset id from each result instead
of file_manager.get_display_name) where the UI can still show display names, or
(b) build a multimap from display name to a list of result ids (e.g., to_id:
dict[str, list] and append each r) so duplicates are retained; apply the same
change to the other mapping logic around the block handling lines 57–64 to
ensure no dataset is overwritten or lost.
- Line 67: The tool key passed to show_linked_grid is not panel-scoped and can
collide when two panels use the same ds; update the call to generate a unique
panel-scoped tool string (e.g., include a panel identifier such as an explicit
panel index, a unique id derived from exp_layout or builders, or a UUID) and
pass that as the tool argument to show_linked_grid (replace the current
f"flashdeconv_{ds}" with something like f"flashdeconv_{panel_id}_{ds}" or
similar) so the linked-grid state is isolated per panel; change the code that
constructs the tool key where show_linked_grid([exp_layout], builders, tool=...)
is invoked.

In `@content/FLASHQuant/FLASHQuantViewer.py`:
- Around line 29-30: The current code builds names =
[file_manager.get_display_name(r) for r in results] and to_id =
{file_manager.get_display_name(r): r for r in results}, which loses datasets
when display names collide; update the mapping and selection logic to use a
unique key instead of the display name (or map a display name to a list of
results). For example, keep a list of tuples (display_name, result) or build
to_id using a unique identifier from each result (e.g., r.id or r.path) while
still presenting file_manager.get_display_name(r) in the UI, and change the
lookup code that currently uses the display name (the selection at Line 38 and
the surrounding block that spans the same section) to resolve selection via that
unique identifier or disambiguate by choosing the correct entry from the list of
results for that display name.

In `@content/FLASHTnT/FLASHTnTViewer.py`:
- Around line 42-43: The code builds names = [file_manager.get_display_name(r)
for r in results] and to_id = {file_manager.get_display_name(r): r for r in
results}, which will drop entries when multiple results share the same display
name; change the mapping so resolution is based on a unique identifier instead
of display name (e.g., use r.id or r.path) or map display names to lists of
results: create to_id_by_unique = {r.id: r for r in results} and keep names as
display labels, or build to_id = defaultdict(list) and append each r under
file_manager.get_display_name(r) so duplicates are preserved; then update any
lookup code that uses to_id (the lookup at or around the code that resolves
selections) to use the unique-id map or to handle list values accordingly (e.g.,
disambiguate by id or present multiple choices).
- Line 76: The linked-grid tool key is currently only namespaced by dataset id
via tool=f"flashtnt_{ds}", which allows two side-by-side panels selecting the
same dataset to collide; update the call to show_linked_grid to include the
panel index (e.g., tool=f"flashtnt_{ds}_{panel_idx}" or similar) so the key is
unique per panel (locate the show_linked_grid([exp_layout], builders, tool=...)
invocation and add the panel index variable used when rendering panels).

In `@migration/units.yaml`:
- Around line 8-14: Replace hard-coded absolute paths in migration/units.yaml
(oracle_root, template_repo, nondivergence_pairs) with repo-relative or
environment-parameterized tokens (e.g., use a ${REPO_ROOT} or ${TEMPLATE_REPO}
placeholder) and update the code that reads units.yaml to expand those tokens at
runtime (resolve via an env var or compute repo root with os.getcwd()/git
rev-parse and join paths). Specifically change the values for oracle_root and
template_repo and the paths inside nondivergence_pairs to use placeholders, and
update the loader/validator that reads units.yaml to perform
environment-variable expansion and path normalization so CI and other machines
can resolve the locations correctly.

In `@src/render/render.py`:
- Around line 624-626: The sequence_view factory is always registered and
immediately triggers result_path() for seq_deconv/seq_tnt even if
build_insight_caches skipped them, causing cache misses; update the registration
site to first check for the presence of the sequence cache(s) before adding the
"sequence_view" entry (e.g. query the cache mapping returned by
build_insight_caches or call file_manager.has_cache/exists for the
seq_deconv/seq_tnt result paths) and only assign "sequence_view": lambda:
_sequence_view(...) when that check succeeds so datasets without sequence data
won't construct the panel.

In `@src/render/schema.py`:
- Around line 551-555: The lambda passed to pl.col("Scan").map_elements() only
checks for s is not None, so NaN/pd.NA still reach int(s) and crash; update the
guard to treat all missing/null-like values as missing (return -1) by using a
null check that covers NaN/pd.NA (e.g., pd.isna(s) or math.isnan where
appropriate) before calling int(s), keeping the rest of the expression
(scan_to_deconv lookup, alias("scan_id"), return_dtype=pl.Int64) unchanged so
missing scans yield -1 and valid scans are int-cast and mapped via
scan_to_deconv.

In `@src/view/grid.py`:
- Around line 243-266: The upload flow needs to validate JSON and avoid index
errors in expand: wrap json.load(uploaded) in a try/except to catch
JSONDecodeError and return/raise a controlled validation error, verify the
decoded object is the expected list-of-lists structure before calling expand,
and inside Grid.expand change the name-lookup to a safe pattern (e.g., check
membership of col in self.component_names or use a dict mapping name->option and
skip unknown IDs) rather than calling self.component_names.index(col) which
raises ValueError; ensure expand returns a consistent validated structure (or
raises a clear validation error) instead of letting runtime exceptions
propagate.

---

Outside diff comments:
In `@src/workflow/FileManager.py`:
- Around line 480-486: The SELECT followed by fetchone() can return None for a
cache miss, so before zipping over file_columns and result ensure you check
"result is None" and treat it as a cache miss (do not iterate). For both
occurrences (the block using self.cache_cursor, file_columns and dataset_id
starting at the shown snippet and the similar block at 498-504) either raise
KeyError(dataset_id) so upstream partial/KeyError handling runs, or return the
missing-value expected by result_path(); do this check immediately after result
= self.cache_cursor.fetchone() to avoid the TypeError.

---

Nitpick comments:
In `@migration/nondivergence.py`:
- Around line 56-57: Current code creates Path(pair[0]) and Path(pair[1]) which
resolves relative paths against the process CWD; change this to resolve against
the repository/config root instead. Modify the logic around a, b (and subsequent
ea, eb) so you first compute a stable base Path (e.g., repo_root or config_root)
and then construct a = base.joinpath(pair[0]) and b = base.joinpath(pair[1]) (or
otherwise resolve pair entries against that base) so existence checks ea, eb are
deterministic across runners; update any uses of a and b accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9ef92ede-ee8d-47e5-83d9-a18e8557d9c0

📥 Commits

Reviewing files that changed from the base of the PR and between a89427d and de4b7d5.

📒 Files selected for processing (31)

.gitmodules
Dockerfile
Dockerfile.arm
content/FLASHDeconv/FLASHDeconvViewer.py
content/FLASHQuant/FLASHQuantViewer.py
content/FLASHTnT/FLASHTnTViewer.py
migration/README.md
migration/REVIEW.md
migration/nondivergence.py
migration/review-log/phase-3.jsonl
migration/run_review.py
migration/specs/.gitkeep
migration/specs/PHASE3_PLAN.md
migration/units.yaml
openms-streamlit-vue-component
requirements.txt
src/common/common.py
src/render/StateTracker.py
src/render/components.py
src/render/initialize.py
src/render/render.py
src/render/schema.py
src/render/update.py
src/render/util.py
src/view/__init__.py
src/view/grid.py
src/workflow/FileManager.py
tests/conftest.py
tests/test_render_builders.py
tests/test_render_schema.py
tests/test_selection_clear.py

💤 Files with no reviewable changes (8)

src/render/StateTracker.py
.gitmodules
tests/test_selection_clear.py
src/render/util.py
src/render/initialize.py
src/render/components.py
src/render/update.py
openms-streamlit-vue-component

coderabbitai · 2026-06-05T09:33:17Z

+  oracle_root: /home/user/FLASHApp/src/render
+  template_repo: /home/user/streamlit-template
+  # Pairs checked by nondivergence.py: FLASHApp must reuse the template module verbatim.
+  # Filled in once the template grid module is frozen (Phase 3 step 2).
+  nondivergence_pairs:
+    - [/home/user/FLASHApp/src/view/grid.py, /home/user/streamlit-template/src/view/grid.py]
+


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Hard-coded /home/user/... paths make the phase harness non-portable.

This config will fail on any machine/CI runner that doesn’t mirror that exact filesystem layout. Please switch targets/oracles/cwd entries to repo-relative or env-parameterized paths and resolve them in the scripts.

Also applies to: 22-50, 55-119

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@migration/units.yaml` around lines 8 - 14, Replace hard-coded absolute paths in migration/units.yaml (oracle_root, template_repo, nondivergence_pairs) with repo-relative or environment-parameterized tokens (e.g., use a ${REPO_ROOT} or ${TEMPLATE_REPO} placeholder) and update the code that reads units.yaml to expand those tokens at runtime (resolve via an env var or compute repo root with os.getcwd()/git rev-parse and join paths). Specifically change the values for oracle_root and template_repo and the paths inside nondivergence_pairs to use placeholders, and update the loader/validator that reads units.yaml to perform environment-variable expansion and path normalization so CI and other machines can resolve the locations correctly.

coderabbitai · 2026-06-05T09:33:17Z

+def show_linked_grid(
+    layout,
+    builders,
+    *,
+    tool,
+    side_by_side=False,
+    grid_key="linked_grid",
+    height=None,
+    column_heights=None,
+):
+    """Render an N-experiment OpenMS-Insight linked grid.
+
+    Thin convenience over ``src.view.grid.render_linked_grid`` (the frozen,
+    tool-agnostic grid) that handles the multi-experiment + side-by-side page
+    concern. ``layout`` is ``List[experiment]``; each experiment is the nested
+    rows list consumed by ``render_linked_grid``. One independent
+    ``StateManager`` is created per experiment (``session_key=f"{tool}__exp{i}"``)
+    so experiments never cross-link. When exactly two experiments and
+    ``side_by_side=True`` they render in two ``st.columns``; otherwise they stack
+    with ``st.divider()``.
+
+    Experiment *selection* (the per-experiment ``st.selectbox``) stays in the
+    viewer page because it is tool/data specific; this helper only owns the
+    grid + side-by-side rendering.
+    """
+    from src.view.grid import render_linked_grid
+
+    def _one(exp_idx, exp_layout, container):
+        with container:
+            render_linked_grid(
+                exp_layout,
+                builders,
+                state_key=f"{tool}__exp{exp_idx}",
+                grid_key=f"{grid_key}_{exp_idx}",
+                height=height,
+                column_heights=column_heights,
+            )
+
+    if len(layout) == 2 and side_by_side:
+        c1, c2 = st.columns(2)
+        _one(0, layout[0], c1)
+        _one(1, layout[1], c2)
+    else:
+        for i, exp_layout in enumerate(layout):
+            if i:
+                st.divider()
+            _one(i, exp_layout, st.container())


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make experiment rendering input/state explicitly per-experiment.

This helper currently applies one builders map to every experiment and uses state_key=f"{tool}__exp{idx}". That can leak stale selections across dataset switches and prevents clean per-dataset isolation when experiments differ.

💡 Suggested API shape

def show_linked_grid( layout, - builders, + builders, *, tool, + state_keys=None, side_by_side=False, @@ ): @@ - def _one(exp_idx, exp_layout, container): + def _one(exp_idx, exp_layout, container): + exp_builders = ( + builders[exp_idx] if isinstance(builders, (list, tuple)) else builders + ) + exp_state_key = ( + state_keys[exp_idx] + if state_keys is not None + else f"{tool}__exp{exp_idx}" + ) with container: render_linked_grid( exp_layout, - builders, - state_key=f"{tool}__exp{exp_idx}", + exp_builders, + state_key=exp_state_key, grid_key=f"{grid_key}_{exp_idx}", height=height, column_heights=column_heights, )

coderabbitai · 2026-06-05T09:33:17Z

+    def expand(self, trimmed: list, drop_empty_experiments: bool = True) -> list:
+        """internal names -> labels, dropping empty cells/rows.
+
+        ``drop_empty_experiments`` (default True, the edit-mode behavior) also drops
+        a wholly-empty experiment. The upload path passes False to match the oracle
+        ``handleSettingButtons``, whose inline expand keeps an empty experiment as a
+        ``[]`` stub so ``num_experiments`` stays ``len(uploaded)`` and the
+        reset-on-count-mismatch never fires (which would wipe the upload).
+        """
+        expanded = []
+        for exp in trimmed:
+            rows = []
+            for row in exp:
+                cols = []
+                for col in row:
+                    if col:
+                        cols.append(
+                            self.component_options[self.component_names.index(col)]
+                        )
+                if cols:
+                    rows.append(cols)
+            if rows or not drop_empty_experiments:
+                expanded.append(rows)
+        return expanded


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden uploaded layout handling to avoid runtime crashes.

json.load(uploaded) and the expand() name lookup path can throw on invalid JSON or unknown/stale component IDs, which currently crashes the layout page instead of surfacing a validation error.

💡 Suggested fix

def _handle_setting_buttons(self) -> None: @@ - if uploaded is not None: - uploaded_layout = json.load(uploaded) + if uploaded is not None: + try: + uploaded_layout = json.load(uploaded) + except json.JSONDecodeError: + st.session_state[self._k("component_error")] = "Invalid JSON layout file" + return + if not isinstance(uploaded_layout, list): + st.session_state[self._k("component_error")] = "Invalid layout format" + return @@ - st.session_state[self._k("layout")] = self.expand( + st.session_state[self._k("layout")] = self.expand( uploaded_layout, drop_empty_experiments=False )

def expand(self, trimmed: list, drop_empty_experiments: bool = True) -> list: @@ for col in row: if col: - cols.append( - self.component_options[self.component_names.index(col)] - ) + if col not in self.component_names: + st.session_state[self._k("component_error")] = ( + f"Unknown component in saved layout: {col}" + ) + continue + cols.append( + self.component_options[self.component_names.index(col)] + )

Also applies to: 423-438

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/view/grid.py` around lines 243 - 266, The upload flow needs to validate JSON and avoid index errors in expand: wrap json.load(uploaded) in a try/except to catch JSONDecodeError and return/raise a controlled validation error, verify the decoded object is the expected list-of-lists structure before calling expand, and inside Grid.expand change the name-lookup to a safe pattern (e.g., check membership of col in self.component_names or use a dict mapping name->option and skip unknown IDs) rather than calling self.component_names.index(col) which raises ValueError; ensure expand returns a consistent validated structure (or raises a clear validation error) instead of letting runtime exceptions propagate.

…ache-miss guards - Viewers: select experiments by stable dataset id (display name via format_func) so duplicate display names can't collapse distinct datasets; scope linked-grid tool keys by experiment index to isolate side-by-side panel state. - render.make_builders: only register sequence_view when its seq_deconv/seq_tnt cache exists, so datasets without sequence data don't hit a cache-miss crash. - schema._build_tags: treat NaN/None/pd.NA tag scans as missing (scan_id = -1) instead of crashing in int(). - FileManager.get_results: handle a missing cache row (fetchone() is None) as a normal cache miss instead of raising TypeError.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/workflow/FileManager.py (1)

535-542: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

result_path() should enforce parquet-only contract.

result_path() is documented/typed to return a parquet path string, but it currently forwards whatever get_results(..., as_path=True) returns. For non-.pq tags, that can be a loaded Python object, which breaks the method contract and can cause downstream data_path= misuse.

Proposed fix

 def result_path(self, dataset_id: str, name_tag: str) -> str:
@@
-    return self.get_results(dataset_id, [name_tag], as_path=True)[name_tag]
+    value = self.get_results(dataset_id, [name_tag], as_path=True)[name_tag]
+    if not isinstance(value, str):
+        raise KeyError(
+            f"{name_tag} for {dataset_id} is not a parquet-backed result path"
+        )
+    return value

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/FileManager.py` around lines 535 - 542, The result_path method
currently returns whatever get_results(..., as_path=True)[name_tag] yields,
which can be a non-path object; change result_path to fetch value =
self.get_results(dataset_id, [name_tag], as_path=True)[name_tag], then validate
that value is a string path ending with ".pq" (and keep letting KeyError bubble
if the tag is missing); if the value is not a string or does not end with ".pq"
raise a TypeError with a clear message indicating result_path enforces a parquet
path contract. Ensure references inside the method remain to result_path and
get_results so the behavior is localized.

.github/workflows/unit-tests.yml (1)

44-49: ⚠️ Potential issue | 🟠 Major

Exclude openms-insight from the subsequent requirements.txt install.

requirements.txt includes openms-insight==0.1.15 (line 62), but the workflow already installs the built local /tmp/openms-insight. Leaving the pin in place can still make pip install -r requirements.txt try to resolve openms-insight again (despite it requiring the prebuilt/mirrored js-component/dist to be installable in this repo).

Suggested fix

-          pip install -r requirements.txt
+          grep -vi 'openms-insight' requirements.txt > /tmp/requirements-no-insight.txt
+          pip install -r /tmp/requirements-no-insight.txt

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/unit-tests.yml around lines 44 - 49, The workflow
currently installs the local wheel with "pip install /tmp/openms-insight" and
then re-runs "pip install -r requirements.txt" which contains a pinned
openms-insight entry; modify the job in .github/workflows/unit-tests.yml so the
requirements install does not attempt to reinstall openms-insight — e.g.,
generate or use a filtered requirements file before running "pip install -r
requirements.txt" or run pip install with a constraint/exclude to skip
"openms-insight==0.1.15"; ensure the change targets the sequence around the
existing "pip install /tmp/openms-insight" and "pip install -r requirements.txt"
commands so only the local package is installed.

Source: Learnings

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/unit-tests.yml:
- Around line 27-30: The workflow is using the mutable tag actions/setup-node@v4
which weakens CI supply-chain security; update the uses line in
.github/workflows/unit-tests.yml to pin to a specific commit SHA for
actions/setup-node (replace actions/setup-node@v4 with
actions/setup-node@<full-commit-sha>), keeping the existing node-version setting
intact and using the commit SHA from the official actions/setup-node repository
release you trust.
- Around line 38-39: The CI job currently clones the mutable branch reference
"claude/kind-heisenberg-u6dVm" via the git clone command and can drift; change
the workflow to fetch and checkout a specific immutable commit SHA instead of
relying on -b branch (or add an explicit git checkout <COMMIT_SHA> after clone),
and apply the same hardening to the Dockerfile/Dockerfile.arm flow that consumes
insight-ref.json by resolving and pinning the referenced commit SHA before
building; update references to the git clone invocation and the insight-ref.json
fetch so all checkouts use a concrete commit SHA rather than a branch name.

---

Outside diff comments:
In @.github/workflows/unit-tests.yml:
- Around line 44-49: The workflow currently installs the local wheel with "pip
install /tmp/openms-insight" and then re-runs "pip install -r requirements.txt"
which contains a pinned openms-insight entry; modify the job in
.github/workflows/unit-tests.yml so the requirements install does not attempt to
reinstall openms-insight — e.g., generate or use a filtered requirements file
before running "pip install -r requirements.txt" or run pip install with a
constraint/exclude to skip "openms-insight==0.1.15"; ensure the change targets
the sequence around the existing "pip install /tmp/openms-insight" and "pip
install -r requirements.txt" commands so only the local package is installed.

In `@src/workflow/FileManager.py`:
- Around line 535-542: The result_path method currently returns whatever
get_results(..., as_path=True)[name_tag] yields, which can be a non-path object;
change result_path to fetch value = self.get_results(dataset_id, [name_tag],
as_path=True)[name_tag], then validate that value is a string path ending with
".pq" (and keep letting KeyError bubble if the tag is missing); if the value is
not a string or does not end with ".pq" raise a TypeError with a clear message
indicating result_path enforces a parquet path contract. Ensure references
inside the method remain to result_path and get_results so the behavior is
localized.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5cf58efb-9598-46e6-8c10-83db94c2b574

📥 Commits

Reviewing files that changed from the base of the PR and between de4b7d5 and 94ed99a.

📒 Files selected for processing (7)

.github/workflows/unit-tests.yml
content/FLASHDeconv/FLASHDeconvViewer.py
content/FLASHQuant/FLASHQuantViewer.py
content/FLASHTnT/FLASHTnTViewer.py
src/render/render.py
src/render/schema.py
src/workflow/FileManager.py

🚧 Files skipped from review as they are similar to previous changes (5)

content/FLASHQuant/FLASHQuantViewer.py
content/FLASHTnT/FLASHTnTViewer.py
content/FLASHDeconv/FLASHDeconvViewer.py
src/render/render.py
src/render/schema.py

coderabbitai · 2026-06-07T11:52:28Z

+      - name: Set up Node
+        uses: actions/setup-node@v4
+        with:
+          node-version: "21"


⚠️ Potential issue | 🟠 Major

Pin actions/setup-node to a full commit SHA.

.github/workflows/unit-tests.yml uses the mutable action tag actions/setup-node@v4, which weakens CI supply-chain integrity—pin it to a specific commit SHA instead of v4.

🧰 Tools

🪛 zizmor (1.25.2)

[error] 28-28: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/unit-tests.yml around lines 27 - 30, The workflow is using the mutable tag actions/setup-node@v4 which weakens CI supply-chain security; update the uses line in .github/workflows/unit-tests.yml to pin to a specific commit SHA for actions/setup-node (replace actions/setup-node@v4 with actions/setup-node@<full-commit-sha>), keeping the existing node-version setting intact and using the commit SHA from the official actions/setup-node repository release you trust.

coderabbitai · 2026-06-07T11:52:28Z

+          git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \
+            https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate and print the relevant section of the workflow around the cited lines FILE=".github/workflows/unit-tests.yml" if [ ! -f "$FILE" ]; then echo "Missing $FILE" exit 1 fi nl -ba "$FILE" | sed -n '1,120p' echo "----" nl -ba "$FILE" | sed -n '20,80p' echo "----" # Also search for any other openms-insight clone/install logic in the repo rg -n "openms-insight|/tmp/openms-insight|npm run build|js-component/dist|requirements\.txt" .github/workflows/unit-tests.yml .claude/hooks/session-start.sh Dockerfile* . || true

Repository: OpenMS/FLASHApp

Length of output: 103

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE=".github/workflows/unit-tests.yml" if [ ! -f "$FILE" ]; then echo "Missing $FILE" exit 1 fi echo "---- $FILE (lines ~20-80) ----" awk 'NR>=20 && NR<=80 {printf "%5d %s\n", NR, $0}' "$FILE" echo "---- matches for openms-insight related logic ----" rg -n "openms-insight|/tmp/openms-insight|js-component/dist|npm run build|requirements\.txt" .github/workflows .claude/hooks Dockerfile* 2>/dev/null || true echo "---- any checkout/pin patterns in the workflow ----" rg -n "git clone|checkout .*sha|checkout --detach|fetch --depth|--depth 1" .github/workflows/unit-tests.yml 2>/dev/null || true

Repository: OpenMS/FLASHApp

Length of output: 5956

🏁 Script executed:

#!/bin/bash set -euo pipefail for f in Dockerfile Dockerfile.arm; do echo "===== $f (around insight clone) =====" [ -f "$f" ] || continue # print likely relevant region awk 'NR>=60 && NR<=160 {printf "%5d %s\n", NR, $0}' "$f" echo done echo "===== search for insight-ref.json usage =====" rg -n "insight-ref\.json|INSIGHT_BRANCH|openms-insight" Dockerfile Dockerfile.arm

Repository: OpenMS/FLASHApp

Length of output: 12567

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "insight-ref\.json|insight-ref|INSIGHT_BRANCH=.*|git.*checkout.*openms-insight" -S . \ || true # Specifically ensure there is no follow-up that extracts a SHA from insight-ref.json rg -n "jq |\.object\.sha|\"sha\"|grep .*insight-ref\.json|cat .*insight-ref\.json" Dockerfile Dockerfile.arm .github/workflows .claude/hooks 2>/dev/null || true # List any other workflows that clone openms-insight rg -n "openms-insight\.git|/tmp/openms-insight|openms_insight/js-component/dist" .github/workflows .claude/hooks Dockerfile* 2>/dev/null || true

Repository: OpenMS/FLASHApp

Length of output: 2552

Use an immutable openms-insight revision in CI checkout.

.github/workflows/unit-tests.yml clones claude/kind-heisenberg-u6dVm via git clone -b ... --depth 1 with no commit SHA pinning, so CI results can silently drift as the branch moves. The same mutable-branch pattern is also used in Dockerfile/Dockerfile.arm (the insight-ref.json fetch is not followed by a SHA checkout).

Suggested hardening

- git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \ - https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight + git clone https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight + git -C /tmp/openms-insight checkout <pinned-openms-insight-commit-sha>

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/unit-tests.yml around lines 38 - 39, The CI job currently clones the mutable branch reference "claude/kind-heisenberg-u6dVm" via the git clone command and can drift; change the workflow to fetch and checkout a specific immutable commit SHA instead of relying on -b branch (or add an explicit git checkout <COMMIT_SHA> after clone), and apply the same hardening to the Dockerfile/Dockerfile.arm flow that consumes insight-ref.json by resolving and pinning the referenced commit SHA before building; update references to the git clone invocation and the insight-ref.json fetch so all checkouts use a concrete commit SHA rather than a branch name.

The viewer built every panel's Insight cache lazily on first experiment selection: build_insight_caches() for the tidy parquet, then constructing each default-layout component (data_path= -> subprocess preprocessing -> on-disk cache). That one-time cost landed on the viewer's hot path. Move it into the workflow run. After the tidy parse, TagWorkflow (flashtnt) and DeconvWorkflow (flashdeconv) call a new render.warm_insight_caches(): build the tidy caches, then construct each default-layout builder once so its {cache_id}/ cache is written. The viewer (with OpenMS-Insight M1) then reconstructs from those caches instead of preprocessing, so opening an experiment is fast. Scope + safety: - Warms only DEFAULT_WARM_PANELS per tool (mirrors each viewer DEFAULT_LAYOUT) rather than every builder make_builders returns, so secondary heatmaps and the other tools' panels are not over-built or spawned with missing data. - Best-effort: a missing source parquet or any construction error is logged and skipped, and the call site guards the import too, so cache warming can never fail the workflow that produced the results. - Construction only (no rendering), so it is Streamlit-free and safe in the worker. The cache_path matches the viewer's for all tools (share_cache). - dataset_id is timestamped per run, so each run warms fresh caches (no cross-run staleness). FLASHQuant (no execution() here, 2 light panels, no heatmap) is left to M1's lazy reuse. Adds tests: warming writes the flashdeconv/flashtnt default-panel caches and never raises on a dataset with no backing data.

Wrap render_linked_grid's panel loop in OpenMS-Insight's batch_rerun() so a cross-link cascade settles in one pass + a single rerun, instead of one rerun (one full-page pass) per panel. The shared StateManager already lets a downstream panel read the upstream selection an earlier panel set in the same pass, so one pass applies the whole scan->mass->spectra->3D cascade. The import is soft: against an OpenMS-Insight build without batch_rerun it falls back to a no-op context manager (the previous per-panel rerun behavior), so the grid keeps working regardless of the installed Insight version.

deconv.py already stores, beside each full-resolution heatmap, logspaced pre-downsampled levels ({base}_{N}, N from compute_compression_levels) -- but the Insight Heatmap builders read full-res and the levels went unused. The Heatmap keeps whatever frame it is handed as its FINEST pyramid level and re-downsamples per zoom region, so handing it the LARGEST pre-built level (closest to full-res) lets it build its pyramid from less data, trimming the one-time cache-build cost + memory. Add _heatmap_data_path: pick the largest {base}_{N} level available for the dataset (stored_data columns are a cross-dataset union, so try largest-first and skip ones unset for this dataset), falling back to full-res {base} when no levels were built (small datasets, full_count <= 20000) or a cache predates them. Wire the four heatmap builders through it via heat(...). Tradeoff (bounded): the finest level is ~the next logspace step below full (e.g. 200000 of 500000), so max-zoom detail and click-target density drop by that factor -- imperceptible at default zoom (the component renders min_points either way), only visible under very tight zoom. Falls back transparently, so existing full-res-only caches are unaffected. Adds test_heatmap_data_path_prefers_finest_level (finest-first, cross-dataset skip, full-res fallback). Full render-builders suite (33) stays green.

Flip _heatmap_data_path to pick the smallest {base}_{N} downsample level (was largest). The Insight Heatmap renders only ~min_points (default 10000) at any view -- downsampling to min_points at full zoom and re-binning the visible region to min_points on zoom -- and caps its own cache at 2*min_points (=20000, exactly the smallest stored level). So the rendered default/moderate view is identical regardless of which level is fed; the input size only sets how much data Insight reads to build its pyramid. Picking the largest level therefore minimized the very build-cost/memory win the levels were precomputed for (a 5M-point heatmap built from 2M rows instead of 20k). The smallest level (20000) saturates the detail Insight keeps while giving the full build-cost saving. The only cost is click-target density under extreme zoom; precise selection there remains available via the scan/mass tables. Verified by fan-out review: the downsample preserves scan_idx/mass_idx, so the heatmap click cascade is intact; full render-builders suite (33) stays green. Updates the test to assert smallest-first selection (+ the cross-dataset skip).

claude added 30 commits June 2, 2026 18:46

Phase 3: record round 3 (all 11 units clean, gate green) — 1st consec…

13e9e06

…utive clean round

Phase 3: record round 4 (all 11 units clean, gate green) — 2nd consec…

2e74076

…utive clean round

Phase 3: round-11 machine gate GREEN (round-10 fixes verified)

fb8680e

Phase 3: round-12 machine gate GREEN (round-11 fixes verified)

414a7d4

Phase 3: round-13 machine gate GREEN (SequenceView two-path fix verif…

0531cdc

…ied)

Phase 3: round-14 machine gate GREEN (SequenceView header + inbound h…

a392da0

…ighlight verified)

claude added 21 commits June 4, 2026 09:10

Phase 3: round-15 machine gate GREEN (SequenceView header label fix v…

50abbd4

…erified)

Phase 3: round-16 machine gate GREEN (X-residue + deconv precursor he…

9f3e9fb

…ader fixes verified)

Phase 3: round-17 machine gate GREEN (terminal fragment ion + raw hea…

377b654

…tmap fixes verified)

Phase 3: round-18 machine gate GREEN (proteoform-region fragment hand…

67c437e

…ling verified)

Phase 3: round-19 machine gate GREEN (heatmap interactivity + ion-pri…

c6ee4af

…ority fixes verified)

Phase 3: round-20 machine gate GREEN

93195d8

Phase 3: round 20 ALL UNITS CLEAN (streak 2/3)

14cfece

Phase 3: round-21 machine gate GREEN (final convergence round)

340444b

Docker: drop js-component/node_modules from the insight-build COPY (s…

dcd9b49

…lim image)

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

claude added 2 commits June 5, 2026 12:02

CI: build openms-insight from source before pytest (mirrors Dockerfile)

94ed99a

coderabbitai Bot reviewed Jun 7, 2026

View reviewed changes

claude added 4 commits June 11, 2026 12:49

		git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \
		https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight

Conversation

t0mdavid-m commented Jun 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Key Changes

New Modules

Removed Modules

Page Updates

Data Layer

Testing & Migration Harness

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Possibly Related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

t0mdavid-m commented Jun 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 5, 2026 •

edited

Loading