Skip to content

Phase 3: Rebuild FLASHApp viewers on OpenMS-Insight via frozen template#93

Open
t0mdavid-m wants to merge 57 commits into
developfrom
claude/kind-heisenberg-u6dVm
Open

Phase 3: Rebuild FLASHApp viewers on OpenMS-Insight via frozen template#93
t0mdavid-m wants to merge 57 commits into
developfrom
claude/kind-heisenberg-u6dVm

Conversation

@t0mdavid-m

@t0mdavid-m t0mdavid-m commented Jun 5, 2026

Copy link
Copy Markdown
Member

Summary

This PR completes Phase 3 of the FLASHApp migration: rebuild the three viewer pages (FLASHDeconv, FLASHTnT, FLASHQuant) on top of OpenMS-Insight using a reusable, tool-agnostic grid module from a frozen streamlit-template package. The old bespoke Vue mega-component and index-based selection logic are replaced with value-based filters/interactivity and a shared StateManager for cross-linked panels.

Key Changes

New Modules

  • src/view/grid.py — Tool-agnostic linked-grid renderer (render_linked_grid() + LayoutManager class). Distills the old render.py::render_grid() + both FLASH*LayoutManager classes into a single reusable template that can be frozen and vendored byte-for-byte unchanged into downstream apps.

  • src/render/schema.py — Adapter layer that converts FLASHApp's wide, list-column, index-addressed FileManager caches into tidy parquet with stable value IDs (scan_id, mass_id, peak_id, protein_id, etc.). Enables OpenMS-Insight's value-based filters/interactivity model. Includes helpers for exploding nested signal/noise peaks, KDE distributions, and protein/tag data.

  • src/render/render.py (repurposed) — Now a builder factory. make_builders(file_manager, dataset_id, tool, settings=None) returns a {comp_name: () -> BaseComponent} map. Each zero-arg factory closes over dataset context and uses file_manager.result_path() to feed data_path= to Insight components. Includes oracle Tabulator column definitions (titles, formatters, sorters) ported verbatim so migrated Tables show the same curated columns and formatting.

Removed Modules

  • src/render/update.py — Index-based selection logic (scanIndex / massIndex / proteinIndex / xRange/yRange slicing). Replaced by value-based filters/interactivity in Insight components.
  • src/render/initialize.py — Per-panel cache loading. Now handled by Insight's data_path= + subprocess preprocessing.
  • src/render/StateTracker.py — Local state reconciler. Replaced 1:1 by openms_insight.StateManager.
  • src/render/components.py — Vue component registration. No longer needed.
  • src/render/util.py — Hash utility (unused).
  • .gitmodules + openms-streamlit-vue-component/ — Old Vue submodule removed.

Page Updates

  • content/FLASHDeconv/FLASHDeconvViewer.py, content/FLASHTnT/FLASHTnTViewer.py, content/FLASHQuant/FLASHQuantViewer.py — Simplified to call show_linked_grid() with a layout and builders. No more render_grid() / StateTracker / initialize_data / update_data / filter_data calls.

  • src/common/common.py — Added show_linked_grid() convenience wrapper for rendering N-experiment grids with the frozen template.

Data Layer

  • src/workflow/FileManager.py — Added result_path(dataset_id, name_tag) method to expose parquet file paths for data_path= (Insight's subprocess preprocessing entry point).

Testing & Migration Harness

  • tests/test_render_schema.py — Construct-smoke for build_insight_caches(): synthetic FileManager caches → tidy parquet with stable IDs and correct explode shapes.
  • **`tests/test_render

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne

Summary by CodeRabbit

  • New Features

    • Viewer pages rebuilt to use a unified linked-grid with per-experiment layout editor, value-based selection, reusable Insight builders, and best-effort cache warming.
  • Documentation

    • Added Phase 3 migration plan, review guidance, and convergence tooling docs.
  • Tests

    • Large new suites validating builders, schema transforms, linked-grid wiring, and cache warming.
  • Chores

    • Runtime now installs openms-insight; legacy local JS submodule removed.

claude added 30 commits June 2, 2026 18:46
Scaffolds Phase-3 tracking: rebuild FLASHApp viewer pages on OpenMS-Insight
via a reusable, frozen streamlit-template grid (no FLASHApp fork). No app code changes.

- units.yaml: 10 units (template build+freeze, then FLASHApp rebuild) with oracles.
- run_review.py: shared record/gate/report convergence driver.
- nondivergence.py: asserts FLASHApp grid code == frozen template (normalized hash).
- review-log/ ledger, REVIEW.md rollup, README.md.

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
… non-divergence)

Concrete spec from the planning fan-out: streamlit-template src/view/grid.py
(render_linked_grid + LayoutManager), FileManager data layer, show_linked_grid,
visualization_template demo; FLASHApp src/render/schema.py (tidy parquet, stable IDs);
builders factory + StateManager per (tool,experiment); 3 viewer rebuilds; index->value
selection map; non-divergence via vendored grid.py == frozen template grid.py.

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
MSG
- src/view/grid.py: byte-identical vendored copy of the frozen streamlit-template
  src/view/grid.py (the render_linked_grid + LayoutManager SSOT). Plain copy synced from
  the template, NOT a git submodule.
- src/view/__init__.py: package marker.
- migration/units.yaml: register nondivergence_pairs = [FLASHApp/src/view/grid.py,
  streamlit-template/src/view/grid.py] so nondivergence.py enforces byte-identical
  (normalized-hash) equality of the two side-by-side checkouts.

nondivergence GREEN (vendored == frozen template).

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
…rozen template grid

- src/render/schema.py (NEW): build_insight_caches(file_manager, dataset_id, tool) ->
  Insight-ready tidy parquet (stable scan_id/mass_id/peak_id/protein_id/tag_id/feature_id;
  exploded SignalPeaks/NoisyPeaks; comma-split quant traces; long-format target/decoy KDE).
- src/render/render.py: make_builders(file_manager, dataset_id, tool) factory (comp_name ->
  zero-arg -> BaseComponent), cache_id=f"{tool}__{dataset_id}__{comp}", data_path=result_path;
  index->value selection map (scanIndex->scan_id, massIndex->mass_in_scan/mass_id,
  proteinIndex+proteoform_scan_map->protein_id filter, StateTracker->StateManager).
- 3 viewer pages shrink to page_setup -> pick experiment -> load layout -> show_linked_grid
  (imports render_linked_grid/LayoutManager from the vendored src/view/grid.py, unchanged).
- src/common/common.py: show_linked_grid; FileManager: get_results(as_path)/result_path.
- DELETED dead layer: components.py, initialize.py, update.py, StateTracker.py, util.py.
- requirements.txt: + openms-insight.
- tests/: conftest + test_render_schema (9) + test_render_builders (8).
- Known seams (for review loop): Plot3D x/y/z config not forwarded through Insight's
  data_path subprocess (worked around with in-process data= for the small Plot3D frames);
  FLASHTnT SequenceView peaks not scan->protein remapped on protein selection.

Verified: pytest 45 passed/2 skipped, parse OK, nondivergence GREEN (grid.py untouched).

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
Replace the Phase-0 ast.parse placeholders with: nondivergence + template pytest (74,
incl. test_view_grid construct-smoke) + FLASHApp pytest (45, incl. render schema/builders
construct-smoke) + viewer-parse. Real machine gate for the review loop.

https://claude.ai/code/session_017kD4FyAsNvW6VFTZwVvSne
Resolves the high/med round-1 review findings on the FLASHApp rebuild:

schema.py:
- _explode_list_cols: drop null/empty list cells before exploding (no more
  phantom null rows for empty spectra / zero-mass scans).
- deconv_spectrum_tidy: alias MonoMass->mass (SequenceView requires a 'mass'
  column) and expose the per-scan ordinal as mass_in_scan (the oracle
  massIndex space the 3D / mass-table share) instead of dropping it.
- proteins: denormalize scan_id (=deconv_index via build_proteoform_scan_map)
  so a protein-row click resolves to its scan (value-based proteoform_scan_map).
- tags: key on scan_id (not a collapsed last-wins protein_id) so every tag on
  the selected proteoform's scan shows for ANY proteoform on that scan (oracle
  filtered by Scan).

render.py:
- mass_table / deconv_spectrum: 'mass' selection = mass_in_scan (per-scan
  ordinal the 3D consumes), not a global id; deconv x_column -> mass.
- anno_spectrum: remove mass interactivity (raw-m/z click never matched the
  deconvolved MonoMass in the oracle, so it selected nothing).
- protein_table: click sets BOTH protein and scan; tag_table + augmented
  spectrum + sequence-view peaks follow via scan; combined_spectrum filters
  by scan (was an unset 'spectrum' slot -> blank).
- SequenceView (tnt): filters by protein (sequence) AND scan (peaks).

viewers: blank-until-pick experiment selector (oracle parity; also avoids
eager cache builds on page load).

tests updated to assert the parity-correct wiring + a null-guard regression.
- combined_spectrum: resolve the tag-table's scalar tag_id to the tag's
  fragment masses + sequence via the tags frame (FLASHTnT only; FLASHDeconv
  has no tags frame so the resolve kwargs are omitted).
- sequence_view (tnt): publish residue clicks as the 'aa' selection the
  tagger consumes for the tag-relative selectedAA (gold highlight).
- test: the tagger resolves tag_id -> {sequence, masses, selectedAA}, tag/aa
  are re-render dependencies, and FLASHDeconv has no tag resolution.

Resolves the last round-1 finding (3-tnt-003, tagger overlay dead).
…rowing

Resolves the round-2 review findings:

- 3-builders-003 (high): 3D Precursor Signals showed nothing until a mass was
  clicked because mass was a hard filter. Now optional_filters=["mass"] so the
  scan's full Signal/Noisy scatter shows immediately, narrowing to one mass only
  when selected (oracle parity).
- 3-quant-001 (med): quant feature-trace 3D axes were swapped/mislabeled. Now
  x=m/z, y=retention time, z=intensity with explicit labels (oracle
  FLASHQuantView), title 'Feature group signals'.
- 3-tnt-005 (low): clicking a sequence residue now narrows the tag table to tags
  spanning it via interval_filters={"aa": ("StartPos", "EndPos")}, reusing the
  SequenceView 'aa' selection (oracle TabulatorTagTable secondary filter).

migration: gate now also runs the OpenMS-Insight gate (pytest/build/vitest/
parity) since Phase 3 re-opened Insight; insight:tagger-seqview tracked as a
no-regression unit. Tests added for all three fixes.
…or grid

- 3-schema-004 (high): the Precursor Signals 3D plotted raw m/z on a 'Mass'
  axis. The oracle (get3DplotInputFromSNRPeaks) uses x = mz*charge; add a
  precomputed 'mass' column to precursor_signals and point Plot3D x_column there
  (Plot3D's default x_label 'Mass' matches the oracle axis title). Charge states
  no longer collapse to their m/z positions.
- re-vendor src/view/grid.py from the frozen template (empty-experiment upload
  fix); nondivergence GREEN.
- tests: precursor_signals carries mass==mz*charge (explode unit + integration);
  3D builder x/y/z = mass/charge/intensity with label 'Mass'.

Also records round-5 review (9 clean, 2 findings: 3-schema-004, 3-grid-003).
…ution lines

Resolves round-6 review findings:
- 3-builders-005 (med): revert the 3D Precursor Signals optional_filters=[mass].
  The oracle frontend (getPrecursorSignal) renders EMPTY when no mass is selected
  (the precursor-scan lookup fails on the scan-filtered per_scan_data); it only
  draws SignalPeaks[mass_index] once a mass is chosen. mass is again a REQUIRED
  filter. (Round-2's 3-builders-003 misread update.py's data-prep as the displayed
  behavior.)
- 3-builders-004 (low): pass oracle axis titles to the plain spectra + heatmaps
  (Monoisotopic Mass / m/z / Intensity / Retention Time) instead of raw column names.
- 3-quant-002 (med): quant feature-trace 3D uses stem=False so each charge is one
  connected elution line (oracle FLASHQuantView mode:lines), not per-point spikes.
- 3-schema-006 (nit): correct the anno is_signal test comment.

tests: axis-title coverage; quant stem=False; corrected is_signal comment.
(OpenMS-Insight tagger config-key leak 3-tnt-006 fixed in the Insight commit.)
…ch-oracle-chrome)

Per the chosen full-presentation-parity bar:
- All 5 tables now pass column_definitions (oracle Tabulator titles, fixed +
  placeholder(-1->'-') formatters, sorters) + per-table initial_sort
  (Protein/Tag = Score desc). The Table renders ONLY the curated oracle columns;
  internal carrier columns (scan_id, mzs, ProteinIndex, full sequence, ...) stay
  in the data for filters/interactivity/index but are no longer displayed.
  Coverage(%) omitted (oracle commented it out); de-duped FLASHQuant's duplicated
  'Feature Group Quantity' column.
- quant_traces_3d: series_column='isotope' (breaks the polyline between isotopes
  within a charge, reproducing the oracle -1000 z-sentinel gaps) +
  category_name_template='Charge: {}' (legend 'Charge: 2'), via the new Plot3D
  features.
- conftest: synthetic protein_dfs/tag_dfs fixtures extended with the real
  FLASHTagger columns (Score, MatchingFragments, ModCount, TagCount, Coverage(%),
  Nmass/Cmass/DeltaMass) so initial_sort + the placeholder formatter are exercised.

Resolves 3-tables-001, 3-quant-003, 3-quant-004. (ProteinTable best-per-spectrum
toggle still pending.)
Round 8 (match-oracle-chrome bar): template/grid/common/filemanager/page + insight
+ deconv-viewer CLEAN; table chrome + quant isotope-breaks/legend verified resolved.
Remaining findings: 3-quant-005 (per-trace-id for same-isotope duplicate traces),
3-tables-002 (best-per-spectrum toggle), 3-tables-003 (go_to_fields), 3-fdr-001/002
(FDR title+trace names), 3-feat-001 (feature-table title), 3-anno-001 (anno spectrum
selection-driven highlight model). User directive: match the full interaction model.
…can flag

Data-layer support for the full-interaction-model parity work (render wiring in a
follow-up):
- anno_highlight_link (NEW tidy frame: scan_id, peak_id, mass_in_scan, charge):
  maps each annotated signal peak to the deconvolved mass(es) it belongs to + its
  charge, keyed by anno_spectrum_tidy.peak_id, so the annotated spectrum can
  highlight ONLY the selected mass's raw peaks + show per-charge z=N labels.
  Confirmed 1:MANY (a raw m/z peak can be a signal peak for several masses at
  different charges) -> one row per (peak, mass).
- quant_traces + trace_in_feature: per-feature running trace id minted at the
  trace explode, carried through the point comma-split, so the 3D breaks the
  polyline PER-TRACE (fixes 3-quant-005: two same-(charge,isotope) traces no
  longer merge).
- proteins + is_best_per_scan: 1 for the max-Score row per Scan (ordinal rank,
  ties keep-first) -> backs the ProteinTable best-per-spectrum default view.
Additive; tests + fixtures updated (59 passed).
… chrome

Wire the new OpenMS-Insight selection-driven LinePlot/Plot3D features + schema
data into the viewers (match-full-interaction-model directive):
- anno_spectrum: selection-driven highlight of the SELECTED mass via the
  anno_highlight_link frame (highlight_selection=mass, highlight_link_path,
  key=peak_id, match=mass_in_scan, charge labels z={}, deconv_peaks_toggle=True);
  exposes peak_id via a private anno_peak interactivity slot (does not drive the
  shared mass slot). Removes the static is_signal highlight. (3-anno-001)
- deconv_spectrum: selective highlight of the selected mass (highlight_match_column
  =mass_in_scan). (residual: selected-mass value label needs a match-column label
  producer in the LinePlot -- noted for follow-up)
- 3D_SN_plot: title_selection -> dynamic "" / "Precursor signals" / "Mass signals".
  (3-3d-001)
- quant_traces_3d: series_column=trace_in_feature (per-trace break; fixes the
  same-(charge,isotope) merge). (3-quant-005)
- go_to_fields per oracle for every table (no internal-carrier leakage). (3-tables-003)
- FDR: title "FDR Plot" + trace names "Target QScores"/"Decoy QScores". (3-fdr-001/002)
- feature table title "Feature groups". (3-feat-001)
- best-per-spectrum toggle: per-experiment checkbox -> make_builders(best_per_spectrum)
  -> protein_table sources is_best_per_scan-filtered data under a distinct cache_id.
  (3-tables-002)
gate GREEN; 69 passed; nondivergence GREEN.
Wire the new LinePlot match-column value-label producer into deconv_spectrum:
highlight_value_column="mass" + highlight_value_template="{:.2f}" so the selected
mass deconvolved stick gets its MonoMass value label (oracle mass.toFixed(2)),
closing the round-9 finding. Test extended to assert exactly one value label at
the selected stick. 69 passed.
…forms (3-best-002)

is_best_per_scan now flags EVERY proteoform whose Scan is missing (NaN / null /
non-numeric), matching the oracle filterBestPerSpectrum (keeps each row where
typeof scan !== "number" || isNaN(scan)) instead of collapsing them into a single
.over(Scan) group (which flagged only one -> hid the rest in best-per-spectrum
mode). A missing Scan from protein.tsv arrives as float NaN, so the check casts to
f64 (non-numeric -> null) and treats null|NaN as missing (dtype-safe; is_nan errors
on an int column without the cast). Edge-case-gated (real protein.tsv populates
Scan) but a real parity divergence. Test added.
Round 10: 9 units CLEAN (template grid/common/filemanager/page, nondivergence,
builders, deconv-viewer, quant-viewer, insight). Deconv value-label (3-deconv-001)
verified resolved. Two findings: 3-best-002 (best-per-spectrum must pass through
missing-Scan proteoforms; fixed) and 3-cascade-001 (protein selection must
cascade-clear the dependent aa/tag selections; fix in progress).
Wire clears_selections=["aa","tag"] onto the protein_table, reproducing the oracle
ProteinTable.updateSelectedProtein cascade (clears selectedAA/selectedTag/tagData
on every protein click). Switching proteoform now resets the dependent residue (aa)
+ tag selections, so the tag-table interval filter + tagger overlay follow the new
proteoform instead of going stale/empty. Test asserts the protein_table args carry
clearsSelections=["aa","tag"].
3-best-003 (schema): is_best_per_scan now ranks the highest REAL Score first --
NaN/missing/non-numeric Score is pushed to -inf before rank(descending) so it
sorts LAST, mirroring the oracle toScore (NaN -> -Infinity). Previously polars
ranked NaN as largest, flagging a no-score proteoform best. Test added.

3-cascade-002 (render): scan_table gains clears_selections=["mass"], reproducing
the oracle TabulatorScanTable updateSelectedScan -> updateSelectedMass(0): a scan
click clears the mass selection, and the mass_table (default_row=0) re-defaults to
mass_in_scan 0 of the new scan via the bridge _auto_selection -- so the
deconv/anno spectra + 3D show the new scan first mass, not a stale ordinal.
Test asserts the cascade arg AND that the mass_table auto-selects 0 when mass is
unset.
Round 12: 10 units clean (grid empty-row divergence sanctioned as exception-b:
degenerate malformed-upload-only, no well-formed-data difference). Round-11 fixes
(Score-NaN 3-best-003, scan->mass 3-cascade-002) + anno-peak click all verified
resolved/correct. One new finding 3-seqview-001: SequenceView residue-click model
(coverage- vs fragment-gating, no toggle-clear, no residue->mass) diverges from the
oracle; re-opens the converged SequenceView -- escalating scope.
…001)

Add fragment_mass_identifier="mass" to the FLASHTnT sequence_view builder so the
oracle two-path residue click is fully reproduced (maintainer: "both should be
supported as in the FLASHTnT Viewer"):
- PATH 1 (aa / sequence-tag): coverage_column="coverage" (already wired) now makes
  residue_identifier="aa" coverage-gated + toggling via the Insight SequenceView
  (tag-covered residues drive the tagger/tag table; re-click clears).
- PATH 2 (mass / fragment): a residue click on a fragment-matched residue publishes
  that fragment peak mass_in_scan to "mass" (resolved via the existing interactivity
  "mass" column), reproducing updateMassTableFromFragmentMass -> updateSelectedMass.
FLASHDeconv sequence_view (no coverage/residue wiring) is unchanged. Test asserts
the two-path wiring.
…view-002)

Add fragment_mass_identifier="mass" to the FLASHDeconv sequence_view branch so a
fragment-residue click in the (global) deconv Sequence View publishes the fragment
peak mass_in_scan to the shared mass selection -- driving the mass table, deconv/anno
spectra, and 3D S/N plot (all in the deconv default layout), matching the oracle
aminoAcidSelected -> updateSelectedMass which runs on every tool. PATH 2 only (no
tags/coverage on the global sequence). Test asserts deconv fragment_mass_identifier
== mass and coverage_column is None.
Round 13: 9 units clean (SequenceView two-path PATH-1/PATH-2 verified for TnT; gate
GREEN). 3 findings, all SequenceView: 3-seqview-002 deconv residue->mass (fixed),
3-seqview-003 inbound mass->fragment-table highlight, 3-seqview-004 mass-info header.
Comprehensive SequenceView parity pass in progress.
…ghlight (3-seqview-003/004)

- schema._build_seq_tnt: add observed_mass = computed_mass (= ProteoformMass), the
  oracle SequenceView header observed proteoform mass.
- render._sequence_view (tnt): observed_mass_column="observed_mass" +
  mass_header_title="Proteoform" (renders Theoretical/Observed/Delta Mass header) +
  mass_selection_identifier="mass" (a mass selected elsewhere highlights the matching
  fragment-table row -- oracle updateFragmentTableFromMassSelection).
- render._sequence_view (deconv): mass_selection_identifier="mass" (inbound highlight;
  no header -- global sequence, not a proteoform).
Tests assert the header + inbound wiring on both branches.
claude added 21 commits June 4, 2026 09:10
FLASHTnT sequence_view passes theoretical_mass_label="Theoretical protein mass" +
observed_mass_label="Observed proteoform mass" (oracle preparePrecursorInfo
proteoform branch) instead of the generic precursor-branch defaults. Test asserts.
- schema._build_seq_deconv: add per-scan observed_mass = PrecursorMass (NULL for MS1
  scans where PrecursorMass==0, so the SequenceView hides the header there -- matching
  the oracle precursor branch which renders no header for MS1; vs the TnT proteoform
  branch which shows "-" for a missing mass).
- render._sequence_view (deconv): observed_mass_column="observed_mass" +
  mass_header_title="Precursor" (generic "Theoretical mass"/"Observed mass" labels =
  oracle preparePrecursorInfo precursor branch). Completes the deconv half of the
  mass-info header (3-seqview-004 did the TnT proteoform half).
Test asserts deconv observed_mass_column/title + seq_deconv observed_mass column.
ms1_raw_heatmap/ms2_raw_heatmap plot raw m/z (from the annotated spectra), so the
oracle PlotlyHeatmap yAxisLabel returns "m/z" for Raw MS1/MS2 Heatmaps -- only the
DECONV heatmaps are "Monoisotopic Mass". Fix the two raw heatmaps y_label="m/z";
test now expects "m/z" for raw, "Monoisotopic Mass" for deconv.
…t ion)

Round 16: 8 units clean (X-residue + deconv precursor header verified). 2 findings:
3-heatmap-001 raw-heatmap y-label (fixed: "m/z"); 3-seqview-008 SequenceView TSG path
omits the full-length terminal fragment ion vs oracle (fix in progress).
Round 17: 9 units clean (terminal-ion + raw-heatmap fixes verified; internal-fragment
_terminal_collision_masses expansion confirmed oracle-faithful improvement, not a
regression). 3 findings: 3-seqview-009/010 (HIGH) SequenceView computes fragments on
the FULL protein not the proteoform sub-sequence (wrong masses/grid for truncated
proteoforms) + no undetermined-terminus suppression; 3-seqview-011 (low) docstring
accuracy. Proteoform-aware fragment-handling fix in progress.
…ng (3-seqview-009/010)

seq_tnt carries the full sequence + 0-based proteoform terminals; end-to-end
SequenceView-from-seq_tnt fragment grid == oracle sub-region (truncated + undetermined).
No FLASHApp source change needed (tnt.py already slices; render.py already wires
proteoform_start/end_column -- the Insight side now consumes them for fragments).
Wire interactivity on all 4 heatmap builders (oracle PlotlyHeatmap click ->
updateSelectedScan for all + updateSelectedMass for deconv MS1/MS2): deconv heatmaps
interactivity={scan:scan_idx, mass:mass_idx}; raw heatmaps interactivity={scan:scan_idx}.
The reused heatmap caches carry scan_idx(=scan_id)/mass_idx(=mass_in_scan) from
getMSSignalDF. conftest heatmap fixture extended with scan_idx/mass_idx; test asserts
the per-heatmap interactivity mapping.
First fully-clean round since round 11 -- the SequenceView interaction tail
(residue click, mass header, X-residue, terminal ion, proteoform sub-region,
ion-priority) + heatmap navigation are all resolved. All 53 findings raised across
rounds 5-18 marked resolved (each fixed + confirmed by the all-units-clean round 19).
Need rounds 20-21 clean to converge.
… rounds 19-20-21)

Rounds 19, 20, 21 each fully clean (all 11 Phase-3 units pass the three critics +
GREEN machine gate); 0 open findings. STATUS: CONVERGED.

Round 21 note: an environment restart mid-round killed 4 in-flight review agents;
they were re-launched and all confirmed CLEAN (schema 1/0, builders+deconv 2/0,
tnt+quant 2/0, insight 1/0).
…pin openms-insight==0.1.15

The migration replaced the local Vue component (openms-streamlit-vue-component) with
the openms-insight package, which ships its own Vue bundle. Per PHASE3_PLAN.md 5.5:
- requirements.txt: openms-insight 0.1.11 -> 0.1.15 (the published 0.1.14 is Phase-1&2
  WITHOUT the Phase-3 component features; 0.1.15 is the new Insight release carrying them).
- Dockerfile + Dockerfile.arm: remove the node:21 js-build stage that cloned+built
  openms-streamlit-vue-component and the COPY of its dist into /app/js-component/dist.
- Remove the openms-streamlit-vue-component git submodule (.gitmodules + gitlink).
The migrated app has no js-component references (verified); local tests unaffected.

REMAINING EXTERNAL STEP: publish openms-insight 0.1.15 to PyPI (maintainer release
action) so a clean pip install -r requirements.txt / Docker build resolves it.
…PyPI dep)

Add an insight-build stage (node:21) that clones the openms-insight branch
claude/kind-heisenberg-u6dVm, builds its Vue bundle (npm install && npm run build),
and syncs dist into openms_insight/js-component/dist. The compile-openms stage then
pip-installs that source tree (hatchling force-includes the bundled dist) BEFORE
requirements, so openms-insight==0.1.15 resolves from the branch instead of PyPI.
Makes the image self-contained without a PyPI publish. (Verified: a source build
produces a wheel that bundles assets/index.js.)
…esh-clone fix)

In a fresh clone openms_insight/js-component/ does not exist (it only ever held the
gitignored dist/), so cp had no parent dir and the build failed. mkdir -p it first.
Validated end-to-end against a fresh clone of the branch: clone -> npm build -> sync
-> pip wheel produces a wheel bundling the Vue dist (7 files incl. index.js).
@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@t0mdavid-m, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 35 minutes and 19 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 70be6150-9827-41e6-8fa9-77b63abf905c

📥 Commits

Reviewing files that changed from the base of the PR and between 988ff11 and 30d6f59.

📒 Files selected for processing (2)
  • src/render/render.py
  • tests/test_render_builders.py
📝 Walkthrough

Walkthrough

This PR migrates FLASHApp viewers to OpenMS‑Insight: adds tidy-parquet schema adapters, a builder factory, a reusable linked-grid + LayoutManager, viewer rewires for lazy per-dataset rendering, Docker/CI integration to build/install openms‑insight, migration governance scripts/config, and comprehensive tests/fixtures.

Changes

FLASHApp Phase 3 Migration

Layer / File(s) Summary
Phase 3 governance, gates, and review harness
migration/README.md, migration/REVIEW.md, migration/specs/PHASE3_PLAN.md, migration/units.yaml, migration/nondivergence.py, migration/run_review.py
Adds Phase 3 plan/specs, units registry, nondivergence comparator, and a review ledger CLI with record/gate/report.
Docker & CI: build and install openms-insight
Dockerfile, Dockerfile.arm, requirements.txt, .github/workflows/unit-tests.yml
Adds Node 21 insight-build stage to build OpenMS‑Insight JS dist and pip-install the package during image build; removes legacy local js-build artifact copy; pins openms-insight==0.1.15; extends unit-tests workflow to build/install openms-insight before pytest.
FileManager: parquet path access for Insight
src/workflow/FileManager.py
Adds as_path precedence to get_results(), guards missing DB rows, and adds result_path() helper returning on-disk parquet path for a single (dataset_id, name_tag).
Schema Adapter: build_insight_caches (tidy Parquet)
src/render/schema.py
New Polars-based adapter with primitives to explode list/nested columns, comma-split traces, normalize KDEs, mint stable IDs, and tool-specific builders producing idempotent tidy Parquet outputs for flashdeconv/flashtnt/flashquant behind build_insight_caches().
Render Builder Factory: make_builders & warm_insight_caches
src/render/render.py
Replaces index/state-tracking renderer with make_builders() returning zero-arg component factories, defines table column chrome and sequence-view wiring, and adds warm_insight_caches() for best-effort cache warm-up.
Frozen Grid & LayoutManager
src/view/grid.py
Adds render_linked_grid() (shared StateManager for cross-panel interactivity), Store protocol, and LayoutManager UI for editing/validating/persisting layouts with JSON import/export; adds show_linked_grid helper in src/common/common.py.
Viewer rewires: FLASHDeconv / FLASHQuant / FLASHTnT
content/FLASHDeconv/FLASHDeconvViewer.py, content/FLASHQuant/FLASHQuantViewer.py, content/FLASHTnT/FLASHTnTViewer.py
Rewrites viewers to use id-based per-experiment selectboxes, lazily call build_insight_caches() and make_builders(), render via show_linked_grid(), persist layout metadata, and add FLASHTnT best-per-spectrum/settings controls.
Workflow hooks: cache warming
src/Workflow.py
Adds non-fatal warm_insight_caches() calls in TagWorkflow and DeconvWorkflow guarded by try/except.
Test infra: fixtures and end-to-end coverage
tests/conftest.py, tests/test_render_builders.py, tests/test_render_schema.py
Adds Streamlit-mocking fixtures, synthetic oracle caches, and comprehensive tests for schema primitives, build_insight_caches() idempotency/edge-cases, make_builders() outputs, render_linked_grid wiring, table chrome, highlighting, and warm-insight behavior.
Legacy stack removals
src/render/StateTracker.py, src/render/components.py, src/render/initialize.py, src/render/update.py, src/render/util.py, tests/test_selection_clear.py, .gitmodules, openms-streamlit-vue-component
Removes prior StateTracker/component wrappers/initialize/update utilities/hash helper and selection-clear tests; removes submodule entry for the old Vue component and stops copying its dist artifact.

Possibly Related PRs

  • OpenMS/FLASHApp#22: Also touches the openms-streamlit-vue-component submodule metadata.
  • OpenMS/FLASHApp#57: Related to previous viewer selection UI changes this migration replaces.
  • OpenMS/FLASHApp#21: Earlier viewer selection/validation work that overlaps with the refactor.

Poem

🐇 I hopped through parquet rows and frozen grid delight,
Swapped index keys for values, tuned the builder light.
Vue crumbs tucked away, Insight bakes the view,
Layouts freeze, caches warm — the panels hum anew.
Phase Three stitched, the rabbit claps tonight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 59.21% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main objective: rebuilding FLASHApp viewers on OpenMS-Insight using a frozen template from Phase 3 of the migration.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/kind-heisenberg-u6dVm

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/workflow/FileManager.py (1)

480-486: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle missing cache rows before iterating fetchone().

If dataset_id has no row in one of the cache tables, fetchone() returns None and both loops raise TypeError before partial/KeyError handling runs. That turns a normal cache miss into a hard crash and breaks the new result_path() contract.

Suggested fix
         result = self.cache_cursor.fetchone()
+        if result is None:
+            result = (None,) * len(file_columns)
         for c, r in zip(file_columns, result):
             if r is None:
                 if partial:
                     continue
                 else:
@@
         result = self.cache_cursor.fetchone()
+        if result is None:
+            result = (None,) * len(data_columns)
         for c, r in zip(data_columns, result):
             if r is None:
                 if partial:
                     continue
                 else:

Also applies to: 498-504

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/FileManager.py` around lines 480 - 486, The SELECT followed by
fetchone() can return None for a cache miss, so before zipping over file_columns
and result ensure you check "result is None" and treat it as a cache miss (do
not iterate). For both occurrences (the block using self.cache_cursor,
file_columns and dataset_id starting at the shown snippet and the similar block
at 498-504) either raise KeyError(dataset_id) so upstream partial/KeyError
handling runs, or return the missing-value expected by result_path(); do this
check immediately after result = self.cache_cursor.fetchone() to avoid the
TypeError.
🧹 Nitpick comments (1)
migration/nondivergence.py (1)

56-57: ⚡ Quick win

Resolve nondivergence pair paths from a fixed base, not process CWD.

Line 56-Line 57 currently makes relative paths caller-dependent. Resolve relative entries from repo/config root so checks are deterministic across runners.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@migration/nondivergence.py` around lines 56 - 57, Current code creates
Path(pair[0]) and Path(pair[1]) which resolves relative paths against the
process CWD; change this to resolve against the repository/config root instead.
Modify the logic around a, b (and subsequent ea, eb) so you first compute a
stable base Path (e.g., repo_root or config_root) and then construct a =
base.joinpath(pair[0]) and b = base.joinpath(pair[1]) (or otherwise resolve pair
entries against that base) so existence checks ea, eb are deterministic across
runners; update any uses of a and b accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@content/FLASHDeconv/FLASHDeconvViewer.py`:
- Around line 47-49: The current mapping using to_id =
{file_manager.get_display_name(r): r for r in results} is lossy when display
names are duplicated; update the lookup to preserve all datasets by either (a)
keying by a unique identifier (use the raw dataset id from each result instead
of file_manager.get_display_name) where the UI can still show display names, or
(b) build a multimap from display name to a list of result ids (e.g., to_id:
dict[str, list] and append each r) so duplicates are retained; apply the same
change to the other mapping logic around the block handling lines 57–64 to
ensure no dataset is overwritten or lost.
- Line 67: The tool key passed to show_linked_grid is not panel-scoped and can
collide when two panels use the same ds; update the call to generate a unique
panel-scoped tool string (e.g., include a panel identifier such as an explicit
panel index, a unique id derived from exp_layout or builders, or a UUID) and
pass that as the tool argument to show_linked_grid (replace the current
f"flashdeconv_{ds}" with something like f"flashdeconv_{panel_id}_{ds}" or
similar) so the linked-grid state is isolated per panel; change the code that
constructs the tool key where show_linked_grid([exp_layout], builders, tool=...)
is invoked.

In `@content/FLASHQuant/FLASHQuantViewer.py`:
- Around line 29-30: The current code builds names =
[file_manager.get_display_name(r) for r in results] and to_id =
{file_manager.get_display_name(r): r for r in results}, which loses datasets
when display names collide; update the mapping and selection logic to use a
unique key instead of the display name (or map a display name to a list of
results). For example, keep a list of tuples (display_name, result) or build
to_id using a unique identifier from each result (e.g., r.id or r.path) while
still presenting file_manager.get_display_name(r) in the UI, and change the
lookup code that currently uses the display name (the selection at Line 38 and
the surrounding block that spans the same section) to resolve selection via that
unique identifier or disambiguate by choosing the correct entry from the list of
results for that display name.

In `@content/FLASHTnT/FLASHTnTViewer.py`:
- Around line 42-43: The code builds names = [file_manager.get_display_name(r)
for r in results] and to_id = {file_manager.get_display_name(r): r for r in
results}, which will drop entries when multiple results share the same display
name; change the mapping so resolution is based on a unique identifier instead
of display name (e.g., use r.id or r.path) or map display names to lists of
results: create to_id_by_unique = {r.id: r for r in results} and keep names as
display labels, or build to_id = defaultdict(list) and append each r under
file_manager.get_display_name(r) so duplicates are preserved; then update any
lookup code that uses to_id (the lookup at or around the code that resolves
selections) to use the unique-id map or to handle list values accordingly (e.g.,
disambiguate by id or present multiple choices).
- Line 76: The linked-grid tool key is currently only namespaced by dataset id
via tool=f"flashtnt_{ds}", which allows two side-by-side panels selecting the
same dataset to collide; update the call to show_linked_grid to include the
panel index (e.g., tool=f"flashtnt_{ds}_{panel_idx}" or similar) so the key is
unique per panel (locate the show_linked_grid([exp_layout], builders, tool=...)
invocation and add the panel index variable used when rendering panels).

In `@migration/units.yaml`:
- Around line 8-14: Replace hard-coded absolute paths in migration/units.yaml
(oracle_root, template_repo, nondivergence_pairs) with repo-relative or
environment-parameterized tokens (e.g., use a ${REPO_ROOT} or ${TEMPLATE_REPO}
placeholder) and update the code that reads units.yaml to expand those tokens at
runtime (resolve via an env var or compute repo root with os.getcwd()/git
rev-parse and join paths). Specifically change the values for oracle_root and
template_repo and the paths inside nondivergence_pairs to use placeholders, and
update the loader/validator that reads units.yaml to perform
environment-variable expansion and path normalization so CI and other machines
can resolve the locations correctly.

In `@src/render/render.py`:
- Around line 624-626: The sequence_view factory is always registered and
immediately triggers result_path() for seq_deconv/seq_tnt even if
build_insight_caches skipped them, causing cache misses; update the registration
site to first check for the presence of the sequence cache(s) before adding the
"sequence_view" entry (e.g. query the cache mapping returned by
build_insight_caches or call file_manager.has_cache/exists for the
seq_deconv/seq_tnt result paths) and only assign "sequence_view": lambda:
_sequence_view(...) when that check succeeds so datasets without sequence data
won't construct the panel.

In `@src/render/schema.py`:
- Around line 551-555: The lambda passed to pl.col("Scan").map_elements() only
checks for s is not None, so NaN/pd.NA still reach int(s) and crash; update the
guard to treat all missing/null-like values as missing (return -1) by using a
null check that covers NaN/pd.NA (e.g., pd.isna(s) or math.isnan where
appropriate) before calling int(s), keeping the rest of the expression
(scan_to_deconv lookup, alias("scan_id"), return_dtype=pl.Int64) unchanged so
missing scans yield -1 and valid scans are int-cast and mapped via
scan_to_deconv.

In `@src/view/grid.py`:
- Around line 243-266: The upload flow needs to validate JSON and avoid index
errors in expand: wrap json.load(uploaded) in a try/except to catch
JSONDecodeError and return/raise a controlled validation error, verify the
decoded object is the expected list-of-lists structure before calling expand,
and inside Grid.expand change the name-lookup to a safe pattern (e.g., check
membership of col in self.component_names or use a dict mapping name->option and
skip unknown IDs) rather than calling self.component_names.index(col) which
raises ValueError; ensure expand returns a consistent validated structure (or
raises a clear validation error) instead of letting runtime exceptions
propagate.

---

Outside diff comments:
In `@src/workflow/FileManager.py`:
- Around line 480-486: The SELECT followed by fetchone() can return None for a
cache miss, so before zipping over file_columns and result ensure you check
"result is None" and treat it as a cache miss (do not iterate). For both
occurrences (the block using self.cache_cursor, file_columns and dataset_id
starting at the shown snippet and the similar block at 498-504) either raise
KeyError(dataset_id) so upstream partial/KeyError handling runs, or return the
missing-value expected by result_path(); do this check immediately after result
= self.cache_cursor.fetchone() to avoid the TypeError.

---

Nitpick comments:
In `@migration/nondivergence.py`:
- Around line 56-57: Current code creates Path(pair[0]) and Path(pair[1]) which
resolves relative paths against the process CWD; change this to resolve against
the repository/config root instead. Modify the logic around a, b (and subsequent
ea, eb) so you first compute a stable base Path (e.g., repo_root or config_root)
and then construct a = base.joinpath(pair[0]) and b = base.joinpath(pair[1]) (or
otherwise resolve pair entries against that base) so existence checks ea, eb are
deterministic across runners; update any uses of a and b accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9ef92ede-ee8d-47e5-83d9-a18e8557d9c0

📥 Commits

Reviewing files that changed from the base of the PR and between a89427d and de4b7d5.

📒 Files selected for processing (31)
  • .gitmodules
  • Dockerfile
  • Dockerfile.arm
  • content/FLASHDeconv/FLASHDeconvViewer.py
  • content/FLASHQuant/FLASHQuantViewer.py
  • content/FLASHTnT/FLASHTnTViewer.py
  • migration/README.md
  • migration/REVIEW.md
  • migration/nondivergence.py
  • migration/review-log/phase-3.jsonl
  • migration/run_review.py
  • migration/specs/.gitkeep
  • migration/specs/PHASE3_PLAN.md
  • migration/units.yaml
  • openms-streamlit-vue-component
  • requirements.txt
  • src/common/common.py
  • src/render/StateTracker.py
  • src/render/components.py
  • src/render/initialize.py
  • src/render/render.py
  • src/render/schema.py
  • src/render/update.py
  • src/render/util.py
  • src/view/__init__.py
  • src/view/grid.py
  • src/workflow/FileManager.py
  • tests/conftest.py
  • tests/test_render_builders.py
  • tests/test_render_schema.py
  • tests/test_selection_clear.py
💤 Files with no reviewable changes (8)
  • src/render/StateTracker.py
  • .gitmodules
  • tests/test_selection_clear.py
  • src/render/util.py
  • src/render/initialize.py
  • src/render/components.py
  • src/render/update.py
  • openms-streamlit-vue-component

Comment thread content/FLASHDeconv/FLASHDeconvViewer.py Outdated
Comment thread content/FLASHDeconv/FLASHDeconvViewer.py Outdated
Comment thread content/FLASHQuant/FLASHQuantViewer.py Outdated
Comment thread content/FLASHTnT/FLASHTnTViewer.py Outdated
Comment thread content/FLASHTnT/FLASHTnTViewer.py Outdated
Comment thread migration/units.yaml
Comment on lines +8 to +14
oracle_root: /home/user/FLASHApp/src/render
template_repo: /home/user/streamlit-template
# Pairs checked by nondivergence.py: FLASHApp must reuse the template module verbatim.
# Filled in once the template grid module is frozen (Phase 3 step 2).
nondivergence_pairs:
- [/home/user/FLASHApp/src/view/grid.py, /home/user/streamlit-template/src/view/grid.py]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Hard-coded /home/user/... paths make the phase harness non-portable.

This config will fail on any machine/CI runner that doesn’t mirror that exact filesystem layout. Please switch targets/oracles/cwd entries to repo-relative or env-parameterized paths and resolve them in the scripts.

Also applies to: 22-50, 55-119

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@migration/units.yaml` around lines 8 - 14, Replace hard-coded absolute paths
in migration/units.yaml (oracle_root, template_repo, nondivergence_pairs) with
repo-relative or environment-parameterized tokens (e.g., use a ${REPO_ROOT} or
${TEMPLATE_REPO} placeholder) and update the code that reads units.yaml to
expand those tokens at runtime (resolve via an env var or compute repo root with
os.getcwd()/git rev-parse and join paths). Specifically change the values for
oracle_root and template_repo and the paths inside nondivergence_pairs to use
placeholders, and update the loader/validator that reads units.yaml to perform
environment-variable expansion and path normalization so CI and other machines
can resolve the locations correctly.

Comment thread src/common/common.py
Comment on lines +965 to +1011
def show_linked_grid(
layout,
builders,
*,
tool,
side_by_side=False,
grid_key="linked_grid",
height=None,
column_heights=None,
):
"""Render an N-experiment OpenMS-Insight linked grid.

Thin convenience over ``src.view.grid.render_linked_grid`` (the frozen,
tool-agnostic grid) that handles the multi-experiment + side-by-side page
concern. ``layout`` is ``List[experiment]``; each experiment is the nested
rows list consumed by ``render_linked_grid``. One independent
``StateManager`` is created per experiment (``session_key=f"{tool}__exp{i}"``)
so experiments never cross-link. When exactly two experiments and
``side_by_side=True`` they render in two ``st.columns``; otherwise they stack
with ``st.divider()``.

Experiment *selection* (the per-experiment ``st.selectbox``) stays in the
viewer page because it is tool/data specific; this helper only owns the
grid + side-by-side rendering.
"""
from src.view.grid import render_linked_grid

def _one(exp_idx, exp_layout, container):
with container:
render_linked_grid(
exp_layout,
builders,
state_key=f"{tool}__exp{exp_idx}",
grid_key=f"{grid_key}_{exp_idx}",
height=height,
column_heights=column_heights,
)

if len(layout) == 2 and side_by_side:
c1, c2 = st.columns(2)
_one(0, layout[0], c1)
_one(1, layout[1], c2)
else:
for i, exp_layout in enumerate(layout):
if i:
st.divider()
_one(i, exp_layout, st.container())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make experiment rendering input/state explicitly per-experiment.

This helper currently applies one builders map to every experiment and uses state_key=f"{tool}__exp{idx}". That can leak stale selections across dataset switches and prevents clean per-dataset isolation when experiments differ.

💡 Suggested API shape
 def show_linked_grid(
     layout,
-    builders,
+    builders,
     *,
     tool,
+    state_keys=None,
     side_by_side=False,
@@
 ):
@@
-    def _one(exp_idx, exp_layout, container):
+    def _one(exp_idx, exp_layout, container):
+        exp_builders = (
+            builders[exp_idx] if isinstance(builders, (list, tuple)) else builders
+        )
+        exp_state_key = (
+            state_keys[exp_idx]
+            if state_keys is not None
+            else f"{tool}__exp{exp_idx}"
+        )
         with container:
             render_linked_grid(
                 exp_layout,
-                builders,
-                state_key=f"{tool}__exp{exp_idx}",
+                exp_builders,
+                state_key=exp_state_key,
                 grid_key=f"{grid_key}_{exp_idx}",
                 height=height,
                 column_heights=column_heights,
             )

Comment thread src/render/render.py Outdated
Comment thread src/render/schema.py
Comment thread src/view/grid.py
Comment on lines +243 to +266
def expand(self, trimmed: list, drop_empty_experiments: bool = True) -> list:
"""internal names -> labels, dropping empty cells/rows.

``drop_empty_experiments`` (default True, the edit-mode behavior) also drops
a wholly-empty experiment. The upload path passes False to match the oracle
``handleSettingButtons``, whose inline expand keeps an empty experiment as a
``[]`` stub so ``num_experiments`` stays ``len(uploaded)`` and the
reset-on-count-mismatch never fires (which would wipe the upload).
"""
expanded = []
for exp in trimmed:
rows = []
for row in exp:
cols = []
for col in row:
if col:
cols.append(
self.component_options[self.component_names.index(col)]
)
if cols:
rows.append(cols)
if rows or not drop_empty_experiments:
expanded.append(rows)
return expanded

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden uploaded layout handling to avoid runtime crashes.

json.load(uploaded) and the expand() name lookup path can throw on invalid JSON or unknown/stale component IDs, which currently crashes the layout page instead of surfacing a validation error.

💡 Suggested fix
 def _handle_setting_buttons(self) -> None:
@@
-        if uploaded is not None:
-            uploaded_layout = json.load(uploaded)
+        if uploaded is not None:
+            try:
+                uploaded_layout = json.load(uploaded)
+            except json.JSONDecodeError:
+                st.session_state[self._k("component_error")] = "Invalid JSON layout file"
+                return
+            if not isinstance(uploaded_layout, list):
+                st.session_state[self._k("component_error")] = "Invalid layout format"
+                return
@@
-                st.session_state[self._k("layout")] = self.expand(
+                st.session_state[self._k("layout")] = self.expand(
                     uploaded_layout, drop_empty_experiments=False
                 )
 def expand(self, trimmed: list, drop_empty_experiments: bool = True) -> list:
@@
                 for col in row:
                     if col:
-                        cols.append(
-                            self.component_options[self.component_names.index(col)]
-                        )
+                        if col not in self.component_names:
+                            st.session_state[self._k("component_error")] = (
+                                f"Unknown component in saved layout: {col}"
+                            )
+                            continue
+                        cols.append(
+                            self.component_options[self.component_names.index(col)]
+                        )

Also applies to: 423-438

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/view/grid.py` around lines 243 - 266, The upload flow needs to validate
JSON and avoid index errors in expand: wrap json.load(uploaded) in a try/except
to catch JSONDecodeError and return/raise a controlled validation error, verify
the decoded object is the expected list-of-lists structure before calling
expand, and inside Grid.expand change the name-lookup to a safe pattern (e.g.,
check membership of col in self.component_names or use a dict mapping
name->option and skip unknown IDs) rather than calling
self.component_names.index(col) which raises ValueError; ensure expand returns a
consistent validated structure (or raises a clear validation error) instead of
letting runtime exceptions propagate.

claude added 2 commits June 5, 2026 12:02
…ache-miss guards

- Viewers: select experiments by stable dataset id (display name via format_func)
  so duplicate display names can't collapse distinct datasets; scope linked-grid
  tool keys by experiment index to isolate side-by-side panel state.
- render.make_builders: only register sequence_view when its seq_deconv/seq_tnt
  cache exists, so datasets without sequence data don't hit a cache-miss crash.
- schema._build_tags: treat NaN/None/pd.NA tag scans as missing (scan_id = -1)
  instead of crashing in int().
- FileManager.get_results: handle a missing cache row (fetchone() is None) as a
  normal cache miss instead of raising TypeError.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/workflow/FileManager.py (1)

535-542: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

result_path() should enforce parquet-only contract.

result_path() is documented/typed to return a parquet path string, but it currently forwards whatever get_results(..., as_path=True) returns. For non-.pq tags, that can be a loaded Python object, which breaks the method contract and can cause downstream data_path= misuse.

Proposed fix
 def result_path(self, dataset_id: str, name_tag: str) -> str:
@@
-    return self.get_results(dataset_id, [name_tag], as_path=True)[name_tag]
+    value = self.get_results(dataset_id, [name_tag], as_path=True)[name_tag]
+    if not isinstance(value, str):
+        raise KeyError(
+            f"{name_tag} for {dataset_id} is not a parquet-backed result path"
+        )
+    return value
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/FileManager.py` around lines 535 - 542, The result_path method
currently returns whatever get_results(..., as_path=True)[name_tag] yields,
which can be a non-path object; change result_path to fetch value =
self.get_results(dataset_id, [name_tag], as_path=True)[name_tag], then validate
that value is a string path ending with ".pq" (and keep letting KeyError bubble
if the tag is missing); if the value is not a string or does not end with ".pq"
raise a TypeError with a clear message indicating result_path enforces a parquet
path contract. Ensure references inside the method remain to result_path and
get_results so the behavior is localized.
.github/workflows/unit-tests.yml (1)

44-49: ⚠️ Potential issue | 🟠 Major

Exclude openms-insight from the subsequent requirements.txt install.

requirements.txt includes openms-insight==0.1.15 (line 62), but the workflow already installs the built local /tmp/openms-insight. Leaving the pin in place can still make pip install -r requirements.txt try to resolve openms-insight again (despite it requiring the prebuilt/mirrored js-component/dist to be installable in this repo).

Suggested fix
-          pip install -r requirements.txt
+          grep -vi 'openms-insight' requirements.txt > /tmp/requirements-no-insight.txt
+          pip install -r /tmp/requirements-no-insight.txt
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/unit-tests.yml around lines 44 - 49, The workflow
currently installs the local wheel with "pip install /tmp/openms-insight" and
then re-runs "pip install -r requirements.txt" which contains a pinned
openms-insight entry; modify the job in .github/workflows/unit-tests.yml so the
requirements install does not attempt to reinstall openms-insight — e.g.,
generate or use a filtered requirements file before running "pip install -r
requirements.txt" or run pip install with a constraint/exclude to skip
"openms-insight==0.1.15"; ensure the change targets the sequence around the
existing "pip install /tmp/openms-insight" and "pip install -r requirements.txt"
commands so only the local package is installed.

Source: Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/unit-tests.yml:
- Around line 27-30: The workflow is using the mutable tag actions/setup-node@v4
which weakens CI supply-chain security; update the uses line in
.github/workflows/unit-tests.yml to pin to a specific commit SHA for
actions/setup-node (replace actions/setup-node@v4 with
actions/setup-node@<full-commit-sha>), keeping the existing node-version setting
intact and using the commit SHA from the official actions/setup-node repository
release you trust.
- Around line 38-39: The CI job currently clones the mutable branch reference
"claude/kind-heisenberg-u6dVm" via the git clone command and can drift; change
the workflow to fetch and checkout a specific immutable commit SHA instead of
relying on -b branch (or add an explicit git checkout <COMMIT_SHA> after clone),
and apply the same hardening to the Dockerfile/Dockerfile.arm flow that consumes
insight-ref.json by resolving and pinning the referenced commit SHA before
building; update references to the git clone invocation and the insight-ref.json
fetch so all checkouts use a concrete commit SHA rather than a branch name.

---

Outside diff comments:
In @.github/workflows/unit-tests.yml:
- Around line 44-49: The workflow currently installs the local wheel with "pip
install /tmp/openms-insight" and then re-runs "pip install -r requirements.txt"
which contains a pinned openms-insight entry; modify the job in
.github/workflows/unit-tests.yml so the requirements install does not attempt to
reinstall openms-insight — e.g., generate or use a filtered requirements file
before running "pip install -r requirements.txt" or run pip install with a
constraint/exclude to skip "openms-insight==0.1.15"; ensure the change targets
the sequence around the existing "pip install /tmp/openms-insight" and "pip
install -r requirements.txt" commands so only the local package is installed.

In `@src/workflow/FileManager.py`:
- Around line 535-542: The result_path method currently returns whatever
get_results(..., as_path=True)[name_tag] yields, which can be a non-path object;
change result_path to fetch value = self.get_results(dataset_id, [name_tag],
as_path=True)[name_tag], then validate that value is a string path ending with
".pq" (and keep letting KeyError bubble if the tag is missing); if the value is
not a string or does not end with ".pq" raise a TypeError with a clear message
indicating result_path enforces a parquet path contract. Ensure references
inside the method remain to result_path and get_results so the behavior is
localized.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5cf58efb-9598-46e6-8c10-83db94c2b574

📥 Commits

Reviewing files that changed from the base of the PR and between de4b7d5 and 94ed99a.

📒 Files selected for processing (7)
  • .github/workflows/unit-tests.yml
  • content/FLASHDeconv/FLASHDeconvViewer.py
  • content/FLASHQuant/FLASHQuantViewer.py
  • content/FLASHTnT/FLASHTnTViewer.py
  • src/render/render.py
  • src/render/schema.py
  • src/workflow/FileManager.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • content/FLASHQuant/FLASHQuantViewer.py
  • content/FLASHTnT/FLASHTnTViewer.py
  • content/FLASHDeconv/FLASHDeconvViewer.py
  • src/render/render.py
  • src/render/schema.py

Comment on lines +27 to +30
- name: Set up Node
uses: actions/setup-node@v4
with:
node-version: "21"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Pin actions/setup-node to a full commit SHA.

.github/workflows/unit-tests.yml uses the mutable action tag actions/setup-node@v4, which weakens CI supply-chain integrity—pin it to a specific commit SHA instead of v4.

🧰 Tools
🪛 zizmor (1.25.2)

[error] 28-28: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/unit-tests.yml around lines 27 - 30, The workflow is using
the mutable tag actions/setup-node@v4 which weakens CI supply-chain security;
update the uses line in .github/workflows/unit-tests.yml to pin to a specific
commit SHA for actions/setup-node (replace actions/setup-node@v4 with
actions/setup-node@<full-commit-sha>), keeping the existing node-version setting
intact and using the commit SHA from the official actions/setup-node repository
release you trust.

Comment on lines +38 to +39
git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \
https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate and print the relevant section of the workflow around the cited lines
FILE=".github/workflows/unit-tests.yml"
if [ ! -f "$FILE" ]; then
  echo "Missing $FILE"
  exit 1
fi

nl -ba "$FILE" | sed -n '1,120p'
echo "----"
nl -ba "$FILE" | sed -n '20,80p'
echo "----"

# Also search for any other openms-insight clone/install logic in the repo
rg -n "openms-insight|/tmp/openms-insight|npm run build|js-component/dist|requirements\.txt" .github/workflows/unit-tests.yml .claude/hooks/session-start.sh Dockerfile* . || true

Repository: OpenMS/FLASHApp

Length of output: 103


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".github/workflows/unit-tests.yml"
if [ ! -f "$FILE" ]; then
  echo "Missing $FILE"
  exit 1
fi

echo "---- $FILE (lines ~20-80) ----"
awk 'NR>=20 && NR<=80 {printf "%5d %s\n", NR, $0}' "$FILE"

echo "---- matches for openms-insight related logic ----"
rg -n "openms-insight|/tmp/openms-insight|js-component/dist|npm run build|requirements\.txt" .github/workflows .claude/hooks Dockerfile* 2>/dev/null || true

echo "---- any checkout/pin patterns in the workflow ----"
rg -n "git clone|checkout .*sha|checkout --detach|fetch --depth|--depth 1" .github/workflows/unit-tests.yml 2>/dev/null || true

Repository: OpenMS/FLASHApp

Length of output: 5956


🏁 Script executed:

#!/bin/bash
set -euo pipefail

for f in Dockerfile Dockerfile.arm; do
  echo "===== $f (around insight clone) ====="
  [ -f "$f" ] || continue
  # print likely relevant region
  awk 'NR>=60 && NR<=160 {printf "%5d %s\n", NR, $0}' "$f"
  echo
done

echo "===== search for insight-ref.json usage ====="
rg -n "insight-ref\.json|INSIGHT_BRANCH|openms-insight" Dockerfile Dockerfile.arm

Repository: OpenMS/FLASHApp

Length of output: 12567


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "insight-ref\.json|insight-ref|INSIGHT_BRANCH=.*|git.*checkout.*openms-insight" -S . \
  || true

# Specifically ensure there is no follow-up that extracts a SHA from insight-ref.json
rg -n "jq |\.object\.sha|\"sha\"|grep .*insight-ref\.json|cat .*insight-ref\.json" Dockerfile Dockerfile.arm .github/workflows .claude/hooks 2>/dev/null || true

# List any other workflows that clone openms-insight
rg -n "openms-insight\.git|/tmp/openms-insight|openms_insight/js-component/dist" .github/workflows .claude/hooks Dockerfile* 2>/dev/null || true

Repository: OpenMS/FLASHApp

Length of output: 2552


Use an immutable openms-insight revision in CI checkout.

.github/workflows/unit-tests.yml clones claude/kind-heisenberg-u6dVm via git clone -b ... --depth 1 with no commit SHA pinning, so CI results can silently drift as the branch moves. The same mutable-branch pattern is also used in Dockerfile/Dockerfile.arm (the insight-ref.json fetch is not followed by a SHA checkout).

Suggested hardening
-          git clone -b claude/kind-heisenberg-u6dVm --single-branch --depth 1 \
-            https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight
+          git clone https://github.com/t0mdavid-m/openms-insight.git /tmp/openms-insight
+          git -C /tmp/openms-insight checkout <pinned-openms-insight-commit-sha>
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/unit-tests.yml around lines 38 - 39, The CI job currently
clones the mutable branch reference "claude/kind-heisenberg-u6dVm" via the git
clone command and can drift; change the workflow to fetch and checkout a
specific immutable commit SHA instead of relying on -b branch (or add an
explicit git checkout <COMMIT_SHA> after clone), and apply the same hardening to
the Dockerfile/Dockerfile.arm flow that consumes insight-ref.json by resolving
and pinning the referenced commit SHA before building; update references to the
git clone invocation and the insight-ref.json fetch so all checkouts use a
concrete commit SHA rather than a branch name.

claude added 4 commits June 11, 2026 12:49
The viewer built every panel's Insight cache lazily on first experiment
selection: build_insight_caches() for the tidy parquet, then constructing
each default-layout component (data_path= -> subprocess preprocessing ->
on-disk cache). That one-time cost landed on the viewer's hot path.

Move it into the workflow run. After the tidy parse, TagWorkflow (flashtnt)
and DeconvWorkflow (flashdeconv) call a new render.warm_insight_caches():
build the tidy caches, then construct each default-layout builder once so
its {cache_id}/ cache is written. The viewer (with OpenMS-Insight M1) then
reconstructs from those caches instead of preprocessing, so opening an
experiment is fast.

Scope + safety:
- Warms only DEFAULT_WARM_PANELS per tool (mirrors each viewer DEFAULT_LAYOUT)
  rather than every builder make_builders returns, so secondary heatmaps and
  the other tools' panels are not over-built or spawned with missing data.
- Best-effort: a missing source parquet or any construction error is logged
  and skipped, and the call site guards the import too, so cache warming can
  never fail the workflow that produced the results.
- Construction only (no rendering), so it is Streamlit-free and safe in the
  worker. The cache_path matches the viewer's for all tools (share_cache).
- dataset_id is timestamped per run, so each run warms fresh caches (no
  cross-run staleness).

FLASHQuant (no execution() here, 2 light panels, no heatmap) is left to M1's
lazy reuse.

Adds tests: warming writes the flashdeconv/flashtnt default-panel caches and
never raises on a dataset with no backing data.
Wrap render_linked_grid's panel loop in OpenMS-Insight's batch_rerun() so a
cross-link cascade settles in one pass + a single rerun, instead of one
rerun (one full-page pass) per panel. The shared StateManager already lets a
downstream panel read the upstream selection an earlier panel set in the same
pass, so one pass applies the whole scan->mass->spectra->3D cascade.

The import is soft: against an OpenMS-Insight build without batch_rerun it
falls back to a no-op context manager (the previous per-panel rerun behavior),
so the grid keeps working regardless of the installed Insight version.
deconv.py already stores, beside each full-resolution heatmap, logspaced
pre-downsampled levels ({base}_{N}, N from compute_compression_levels) -- but
the Insight Heatmap builders read full-res and the levels went unused. The
Heatmap keeps whatever frame it is handed as its FINEST pyramid level and
re-downsamples per zoom region, so handing it the LARGEST pre-built level
(closest to full-res) lets it build its pyramid from less data, trimming the
one-time cache-build cost + memory.

Add _heatmap_data_path: pick the largest {base}_{N} level available for the
dataset (stored_data columns are a cross-dataset union, so try largest-first
and skip ones unset for this dataset), falling back to full-res {base} when no
levels were built (small datasets, full_count <= 20000) or a cache predates
them. Wire the four heatmap builders through it via heat(...).

Tradeoff (bounded): the finest level is ~the next logspace step below full
(e.g. 200000 of 500000), so max-zoom detail and click-target density drop by
that factor -- imperceptible at default zoom (the component renders min_points
either way), only visible under very tight zoom. Falls back transparently, so
existing full-res-only caches are unaffected.

Adds test_heatmap_data_path_prefers_finest_level (finest-first, cross-dataset
skip, full-res fallback). Full render-builders suite (33) stays green.
Flip _heatmap_data_path to pick the smallest {base}_{N} downsample level
(was largest). The Insight Heatmap renders only ~min_points (default 10000)
at any view -- downsampling to min_points at full zoom and re-binning the
visible region to min_points on zoom -- and caps its own cache at 2*min_points
(=20000, exactly the smallest stored level). So the rendered default/moderate
view is identical regardless of which level is fed; the input size only sets
how much data Insight reads to build its pyramid.

Picking the largest level therefore minimized the very build-cost/memory win
the levels were precomputed for (a 5M-point heatmap built from 2M rows instead
of 20k). The smallest level (20000) saturates the detail Insight keeps while
giving the full build-cost saving. The only cost is click-target density under
extreme zoom; precise selection there remains available via the scan/mass
tables.

Verified by fan-out review: the downsample preserves scan_idx/mass_idx, so the
heatmap click cascade is intact; full render-builders suite (33) stays green.
Updates the test to assert smallest-first selection (+ the cross-dataset skip).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants