feat: implement full TMT analysis pipeline, result visualizations, and Windows compatibility.#26
Conversation
* Add Matomo Tag Manager as third analytics tracking mode Adds Matomo Tag Manager support alongside existing Google Analytics and Piwik Pro integrations. Includes settings.json configuration (url + tag), build-time script injection via hook-analytics.py, Klaro GDPR consent banner integration, and runtime consent granting via MTM data layer API. https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h * Fix Matomo Tag Manager snippet to match official docs - Accept full container JS URL instead of separate url + tag fields, supporting both self-hosted and Matomo Cloud URL patterns - Match the official snippet: var _mtm alias, _mtm.push shorthand - Remove redundant type="text/javascript" attribute - Remove unused "tag" field from settings.json https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h * Split Matomo config into base url + tag fields Separate the Matomo setting into `url` (base URL, e.g. https://cdn.matomo.cloud/openms.matomo.cloud) and `tag` (container ID, e.g. yDGK8bfY), consistent with how other providers use a tag field. The script constructs the full path: {url}/container_{tag}.js https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h * install matomo tag --------- Co-authored-by: Claude <noreply@anthropic.com>
* Initial plan * fix: remove duplicate address entry in config.toml Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
…til.SameFileError (#349) * Initial plan * Fix integration test failures: restore sys.modules mocks, handle SameFileError, update CI workflow Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com> * Remove unnecessary pyopenms mock from test_topp_workflow_parameter.py, simplify test_parameter_presets.py Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com> * Fix Windows build: correct site-packages path in cleanup step Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
…(#351) On Windows, 0.0.0.0 is not a valid connect address — the browser fails to open http://0.0.0.0:8501. By removing the address entry from the bundled .streamlit/config.toml, Streamlit defaults to localhost, which works correctly for local deployments. Docker deployments are unaffected as they pass --server.address 0.0.0.0 on the command line. https://claude.ai/code/session_016amsLCZeFogTksmtk1geb5 Co-authored-by: Claude <noreply@anthropic.com>
Resolves the issue where multiple instances of the same TOPP tool could not have independent custom defaults. The .ini file (keyed by tool name) only supported one set of defaults — the first instance's defaults were baked in and subsequent instances were silently ignored. Changes: - Add get_merged_params() to ParameterManager for centralized three-layer parameter resolution (ini defaults < _defaults < user overrides) - Replace .ini file mutation with _defaults seeding in params.json, keyed by tool instance name - Update save_parameters() to compare against _defaults when determining which values are non-default - Update run_topp() and non_default_params_summary() to use merged params - Resolve instance names to real tool names across save_parameters, input_TOPP, run_topp, and get_topp_parameters
The .ini file is no longer mutated with custom defaults — it stays pristine. All parameters (_defaults + user overrides) are already passed as CLI flags via get_merged_params(), making -ini redundant.
…orkflow - Add visualization scripts for abundance, database search, heatmap, etc. - Add `get_abundance_data` helper in `results_helpers.py` to load data. - Update `src/Workflow.py` to support TMT analysis pipeline and group assignment UI. - Register new result visualization pages and clean up legacy menus in `app.py`.
- Update `StreamlitUI` to handle `flag_parameters` and render them as selectboxes. - Persist flag parameter metadata into session state and JSON configurations. - Update `CommandExecutor` to conditionally append flags without appending values. - Add readable command logging/printing before execution.
- Use Windows `taskkill` command via subprocess to forcefully terminate process trees on Windows. - Fallback to `os.kill` with `SIGTERM` on Linux/macOS environments. - Improve exception handling and stale PID file cleanup logic during process stop.
📝 WalkthroughWalkthroughThis PR adds a complete TMT proteomics analysis pipeline (IsobaricAnalyzer → ProteinQuantifier) to ChangesTMT Proteomics Analysis Pipeline and Tool Instance Names
Matomo Analytics Integration
Windows Deployment: Streamlit Address Binding Fix
Sequence DiagramssequenceDiagram
rect rgba(173, 216, 230, 0.5)
Note over Workflow,ProteinQuantifier: TMT Analysis Execution
end
participant Workflow
participant ParameterManager
participant CommandExecutor
participant TOPPTools as TOPP Tool Chain
participant ResultsHelpers
Workflow->>ParameterManager: get_merged_params(tool_instance_name)
ParameterManager-->>CommandExecutor: merged ini < _defaults < user params
Workflow->>CommandExecutor: run_topp("IsobaricAnalyzer", ..., tool_instance_name)
CommandExecutor->>TOPPTools: IsobaricAnalyzer → CometAdapter → PercolatorAdapter
CommandExecutor->>TOPPTools: IDFilter → IDMapper → FileMerger → ProteinInference
CommandExecutor->>TOPPTools: IDConflictResolver → ProteinQuantifier → quant CSV
TOPPTools-->>Workflow: idXML + quant CSV outputs
Workflow->>ResultsHelpers: parse_idxml, build_spectra_cache
Workflow->>ResultsHelpers: get_abundance_data(workspace)
ResultsHelpers-->>Workflow: (pivot_df, expr_df, group_map)
Workflow-->>ResultsPages: visualization cache ready
sequenceDiagram
rect rgba(144, 238, 144, 0.5)
Note over User,MatomoTMC: Matomo Consent Flow
end
participant User
participant captcha_control
participant KlaroConsent as gdpr_consent Klaro
participant page_setup
participant MatomoTMC as Matomo Tag Manager
User->>captcha_control: page load (matomo enabled in settings)
captcha_control->>KlaroConsent: render(matomo=mt, google_analytics=ga)
KlaroConsent-->>User: Klaro dialog with matomo service
User->>KlaroConsent: accept matomo
KlaroConsent-->>page_setup: tracking_consent["matomo"] = True
page_setup->>MatomoTMC: html() → _mtm.push(MTMSetConsentGiven)
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/workflow/StreamlitUI.py (1)
609-622:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMutable default arguments in
input_TOPPsignature.
flag_parameters: List[str] = []andcustom_defaults: dict = {}use mutable defaults, which can cause shared state between calls. Additionally,tool_instance_name: str = Noneshould use explicit| None.🐛 Proposed fix
`@st.fragment` def input_TOPP( self, topp_tool_name: str, num_cols: int = 4, exclude_parameters: List[str] = [], include_parameters: List[str] = [], - flag_parameters: List[str] = [], + flag_parameters: List[str] | None = None, display_tool_name: bool = True, display_subsections: bool = True, display_subsection_tabs: bool = False, - custom_defaults: dict = {}, - tool_instance_name: str = None, + custom_defaults: dict | None = None, + tool_instance_name: str | None = None, ) -> None:Then at the start of the method:
if flag_parameters is None: flag_parameters = [] if custom_defaults is None: custom_defaults = {}Note:
exclude_parametersandinclude_parametershave the same issue (pre-existing).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workflow/StreamlitUI.py` around lines 609 - 622, The input_TOPP method uses mutable default arguments (empty list for flag_parameters and empty dict for custom_defaults) which can cause shared state between function calls, and tool_instance_name should use explicit | None type annotation. Change the method signature to use None as defaults for these three parameters (flag_parameters: List[str] = None, custom_defaults: dict = None, and tool_instance_name: str | None = None), then add initialization checks at the start of the input_TOPP method body to set flag_parameters to an empty list if None and custom_defaults to an empty dict if None.src/workflow/CommandExecutor.py (1)
286-294:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winVariable
iis shadowed inside the loop, causing potential logic errors.The loop variable
ifromfor i in range(n_processes)(line 278) is reassigned to0on line 288 whenlen(value) == 1. This shadows the outer loop variable for the remainder of the iteration, which works coincidentally but is confusing and fragile.🐛 Proposed fix - use a different variable name
# when multiple input/output files exist (e.g., multiple mzMLs and featureXMLs), but only one additional input file (e.g., one input database file) if len(value) == 1: - i = 0 + idx = 0 + else: + idx = i # when the entry is a list of collected files to be passed as one [["sample1", "sample2"]] - if isinstance(value[i], list): - command += value[i] + if isinstance(value[idx], list): + command += value[idx] # standard case, files was a list of strings, take the file name at index else: - command += [value[i]] + command += [value[idx]]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workflow/CommandExecutor.py` around lines 286 - 294, The loop variable i from the outer loop (for i in range(n_processes) at line 278) is being shadowed when reassigned to 0 on line 288 when len(value) == 1. To fix this, use a different variable name (such as value_index or idx) instead of i for the conditional index assignment and all subsequent uses (the isinstance check and the command append operations). This removes the shadowing and makes the intent clearer that this is a local index for the current value, not the outer loop iteration counter.
🧹 Nitpick comments (8)
content/results_abundance.py (1)
6-8: ⚡ Quick winUnused imports.
ttest_indfromscipy.statsandmultipletestsfromstatsmodelsare imported but not used in this file. They're used inresults_helpers.pyinstead.♻️ Proposed fix
import numpy as np from pathlib import Path -from scipy.stats import ttest_ind from src.common.common import page_setup -from statsmodels.stats.multitest import multipletests from src.common.results_helpers import get_workflow_dir, get_abundance_data🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@content/results_abundance.py` around lines 6 - 8, Remove the unused imports from the file. Delete the import statement for ttest_ind from scipy.stats on line 6 and the import statement for multipletests from statsmodels.stats.multitest on line 8. Keep the import for page_setup from src.common.common as it is used in this file. These imports are not used in results_abundance.py and should be removed to clean up the imports.src/workflow/ParameterManager.py (2)
168-199: ⚡ Quick winUse explicit
| Nonetype annotation fortool_instance_name.♻️ Proposed fix
- def get_topp_parameters(self, tool: str, tool_instance_name: str = None) -> dict: + def get_topp_parameters(self, tool: str, tool_instance_name: str | None = None) -> dict:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workflow/ParameterManager.py` around lines 168 - 199, The type annotation for the tool_instance_name parameter in the get_topp_parameters method is incomplete. Currently it is annotated as str with a default value of None, which creates a type mismatch. Update the type annotation to explicitly indicate that tool_instance_name can be either a string or None by changing it from str to str | None, ensuring the type annotation matches the default value and actual usage where the parameter can be None.
144-166: ⚡ Quick winUse explicit
| Nonetype annotation instead of implicitOptional.PEP 484 prohibits using
= Noneas an implicit optional. The type hint should explicitly indicate the parameter can beNone.♻️ Proposed fix
- def get_merged_params(self, tool_instance_name: str, ini_params: dict = None) -> dict: + def get_merged_params(self, tool_instance_name: str, ini_params: dict | None = None) -> dict:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/workflow/ParameterManager.py` around lines 144 - 166, The parameter ini_params in the get_merged_params method has a default value of None but uses an implicit optional type annotation (dict = None) which violates PEP 484. Update the type hint for ini_params to explicitly indicate it accepts None by changing the annotation from dict to dict | None, resulting in ini_params: dict | None = None.src/Workflow.py (1)
1-1: ⚡ Quick winUnused import
from altair import value.This import is not used anywhere in the file and should be removed.
♻️ Proposed fix
-from altair import value import streamlit as st🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/Workflow.py` at line 1, Remove the unused import statement `from altair import value` from the file since it is not referenced anywhere in the code. Simply delete this import line to clean up the imports section.src/common/results_helpers.py (3)
171-181: ⚡ Quick winAdd
strict=Truetozip()to catch length mismatches.If
mz_arrayandint_arrayhave different lengths due to a data anomaly, silent truncation would occur. Addingstrict=Trueensures an error is raised if the arrays differ in length.♻️ Proposed fix
- for mz, intensity in zip(mz_array, int_array): + for mz, intensity in zip(mz_array, int_array, strict=True):🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/common/results_helpers.py` around lines 171 - 181, The zip() call in the peak iteration loop lacks the strict=True parameter, which means if mz_array and int_array have different lengths due to data anomalies, silent truncation would occur instead of raising an error. Add strict=True to the zip(mz_array, int_array) call to ensure that an error is raised immediately if the two arrays have mismatched lengths, improving data integrity and making bugs easier to detect.
40-43: 💤 Low valueRedundant type conversion for Charge column.
Line 41 converts
Chargeto string, then line 42 converts it back to int forCharge_num. If both columns are needed, createCharge_numfirst and derive the string version from it.♻️ Suggested simplification
if not df.empty: - df["Charge"] = df["Charge"].astype(str) - df["Charge_num"] = df["Charge"].astype(int) + df["Charge_num"] = df["Charge"] + df["Charge"] = df["Charge"].astype(str)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/common/results_helpers.py` around lines 40 - 43, The code in the condition block where df is not empty is performing redundant type conversions on the Charge column. Instead of converting Charge to string first and then converting it back to int for Charge_num, reverse the order: first convert the Charge column to int and assign it to Charge_num, then convert Charge (or derive it from Charge_num) to string. This eliminates the unnecessary intermediate string conversion and makes the code more efficient.
371-458: ⚡ Quick winRemove commented-out dead code.
This large block of commented-out alternative logic (88 lines) adds noise and maintenance burden. If it's needed for reference, consider moving it to documentation or a separate branch. If it's obsolete, remove it entirely.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/common/results_helpers.py` around lines 371 - 458, Remove the entire commented-out code block that spans 88 lines in the results_helpers.py file. This dead code includes commented logic for group mapping, statistical calculations with t-tests and multiple testing corrections, and pivot table construction. Simply delete all the commented lines from the beginning of the group mapping section through the end of the return statement to clean up the codebase and reduce maintenance burden.content/results_database_search.py (1)
42-46: 🏗️ Heavy liftRefactor interactive sections into
@st.fragmentblocks to avoid full-page reruns. The current pages place interactive widgets/rendering in top-level flow, so every interaction reruns the full script path.
content/results_database_search.py#L42-L46: wrap file-selection + dependent visualization rendering in a fragment function.content/results_filtered.py#L42-L46: wrap file-selection + dependent visualization rendering in a fragment function.content/results_heatmap.py#L32-L33: move slider-driven heatmap rendering into a fragment.content/results_pathway.py#L187-L188: gate expensive enrichment/rendering through a fragmentized interaction flow.content/results_pca.py#L78-L84: wrap PCA figure rendering interaction path in a fragment.content/results_rescoring.py#L43-L47: wrap file-selection + dependent visualization rendering in a fragment function.content/results_volcano.py#L39-L53: wrap threshold controls + figure rendering in a fragment.As per coding guidelines, "Use
@st.fragmentdecorator for interactive UI updates without full page reloads".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@content/results_database_search.py` around lines 42 - 46, Refactor interactive widget interactions to use the `@st.fragment` decorator to avoid full-page script reruns. In content/results_database_search.py (L42-46), wrap the st.selectbox call and its dependent visualization rendering in a fragment function. In content/results_filtered.py (L42-46), apply the same pattern for file-selection and dependent rendering. In content/results_heatmap.py (L32-33), move the slider-driven heatmap rendering into a fragment. In content/results_pathway.py (L187-188), refactor to gate expensive enrichment and rendering through a fragmentized interaction flow. In content/results_pca.py (L78-84), wrap the PCA figure rendering interaction path in a fragment. In content/results_rescoring.py (L43-47), wrap file-selection and visualization rendering in a fragment. In content/results_volcano.py (L39-53), wrap threshold controls and figure rendering in a fragment. For each location, create a fragment function that encapsulates the interactive widgets and their dependent visualization/rendering logic, then call that function from the main page flow to ensure isolated reruns.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 14-15: Update the GitHub Actions in the CI workflow to use the
latest versions pinned to commit SHAs for security. Replace the
`actions/checkout@v4` action with the latest v4 commit SHA (pinned with a
version comment like `# v4.x.x`). Replace the `actions/setup-python@v4` action
with the latest v6 commit SHA (since v6 is the current version as of 2026) and
also include a version comment like `# v6.x.x`. This mitigates supply-chain risk
by preventing version tags from being redirected to malicious code while
ensuring you are using the latest stable releases of these actions.
In `@content/results_abundance.py`:
- Around line 43-70: The code contains extensive debug diagnostics (st.write,
st.error with "Debug:" prefix, ParameterManager inspection) that should not be
in production, and the if result is None block on lines 68-70 is unreachable
because st.stop() on line 66 terminates execution. Remove all debug output
statements from the exception handler and the detailed diagnostic checks
(Path/workflow directory checks, csv file listing, and ParameterManager
inspection). Keep the essential exception logging and remove the first st.stop()
call so the code proceeds to the proper user-facing info message when result is
None from the get_abundance_data call.
In `@content/results_database_search.py`:
- Around line 51-53: The cache preflight validation is incomplete across all
three pages and only checks for the table_ cache prefix, leaving the
visualization components vulnerable to runtime failures from missing cache
artifacts. In content/results_database_search.py at lines 51-53, expand the
cache validation check to validate all four required cache ID prefixes (table_,
heatmap_, seqview_, and lineplot_) instead of only table_. Apply the identical
full preflight cache check validation at content/results_filtered.py lines 51-53
before component initialization. Apply the same complete cache validation check
at content/results_rescoring.py lines 52-54 before component initialization.
Each location should verify the existence of all four cache prefixes in a single
unified check before any Heatmap, SequenceView, or LinePlot components are
instantiated.
In `@content/results_heatmap.py`:
- Around line 40-45: The hierarchical clustering operations using linkage and
pdist will fail when heatmap_z has only one row or one column, as pdist cannot
compute distances for a single sample. Before calling
linkage(pdist(heatmap_z.values)) at line 41, add a guard check that
heatmap_z.shape[0] > 1; if false, set row_order to a default unclustered order
(such as range(heatmap_z.shape[0])). Similarly, before calling
linkage(pdist(heatmap_z.T.values)) at line 44, add a guard check that
heatmap_z.shape[1] > 1; if false, set col_order to a default unclustered order
(such as range(heatmap_z.shape[1])). This ensures the code handles single-row or
single-column edge cases gracefully by falling back to unclustered plotting.
- Line 66: The code uses the `width` parameter in st.plotly_chart() calls which
requires Streamlit 1.51.0 or later but may be incompatible with 1.49.x. Fix this
across all four affected files: in content/results_heatmap.py at line 66,
content/results_pathway.py at line 250, content/results_pca.py at line 93, and
content/results_volcano.py at line 97. Replace the `width="stretch"` parameter
with `use_container_width=True` in each st.plotly_chart() call, or alternatively
upgrade Streamlit to version 1.51.0 or later and keep the width parameter
(preferred approach).
In `@content/results_pathway.py`:
- Around line 63-66: The mg.querymany() call lacks error handling for external
network failures that could crash the page render. Wrap the mg.querymany()
method call in a try/except block to catch potential API and network errors. In
the except clause, log a recoverable warning message (using appropriate logging)
and preserve any previously saved GO results (res_go) by keeping it from prior
iterations or initializing an empty DataFrame rather than letting the exception
propagate and break the page rendering.
- Around line 179-188: The _run_go_enrichment(pivot_df, results_dir) function at
line 187 executes unconditionally on every rerun, causing repeated remote
annotation and enrichment computation. Add a cache/freshness check before
calling _run_go_enrichment() to gate the execution—only run the function if the
output file go_results.json is missing from the results_dir or if the input data
has changed. This prevents unnecessary recomputation while ensuring fresh
results when needed.
In `@content/results_volcano.py`:
- Around line 55-95: Add an explicit empty-data guard immediately after line 55
where volcano_df is created to check if the dataframe is empty after the dropna
operation. If volcano_df is empty, return or raise an informative
exception/error message before proceeding to any figure construction. This
prevents the downstream calculation of max_abs_fc from producing NaN values,
which would result in invalid xaxis_range and broken visualization logic.
In `@src/common/results_helpers.py`:
- Around line 209-212: The broad exception handling in the try-except block
around the pd.read_csv call catches all Exception types and returns None
silently, which hides the actual error. Replace the generic except Exception
clause with either specific exception types like pd.errors.ParserError,
FileNotFoundError, and IOError, or add logging of the exception details before
returning None so that debugging information is preserved. If you choose to log,
use an appropriate logger to capture the full exception context before the
return statement.
In `@src/Workflow.py`:
- Around line 528-604: Extract the duplicated Table, Heatmap, SequenceView, and
LinePlot initialization blocks that appear across three locations
(src/Workflow.py lines 528-604 for Comet results, lines 620-692 for Percolator
results, and lines 711-783 for filtered results) into a new helper method named
_build_visualization_cache. This helper should accept parameters for
idxml_files, cache_dir, spectra_df, filename_to_index, mzml_dir, frag_tol, and
frag_tol_is_ppm, contain the for loop over idxml files and all component
initialization logic, and return the updated spectra_df and filename_to_index.
Then replace each of the three duplicated blocks with a single call to this
helper method, passing in the appropriate arguments for that stage
(comet_results, percolator_results, and filtered_results respectively).
- Around line 135-164: The custom_defaults dictionary passed to the
self.ui.input_TOPP() call for CometAdapter contains a duplicate key
"PeptideIndexing:unmatched_action" that appears twice with the same value
"warn". Remove one of the two occurrences of this key from the dictionary to
eliminate the redundancy and avoid confusion in future maintenance.
- Around line 996-999: The log2 fold change calculation in the ternary operator
checks only if np.mean(g1) > 0, which doesn't fully prevent division by zero
issues. When np.mean(g2) is also zero or when both means are zero, the
expression still produces invalid results like infinity or NaN. Refactor the
ternary condition to check that both np.mean(g1) and np.mean(g2) are greater
than zero before performing the division, and return an appropriate default
value (such as 0 or NaN) when either mean is zero or both are zero, ensuring
robust handling of all edge cases in the log2fc calculation.
In `@src/workflow/CommandExecutor.py`:
- Line 220: The method run_topp has a mutable default argument custom_params:
dict = {} which can cause unexpected state sharing across function calls. Change
the default parameter from custom_params: dict = {} to custom_params: dict =
None in the method signature, then add a check at the start of the method body
that initializes custom_params to an empty dictionary if it is None, following
the pattern: if custom_params is None: custom_params = {}
In `@src/workflow/StreamlitUI.py`:
- Around line 876-889: The type check at line 879 in the bool value handling
section uses `type(p["value"]) == str` which doesn't account for string
subclasses and is not idiomatic Python. Replace `type(p["value"]) == str` with
`isinstance(p["value"], str)` to properly check if the value is a string type
and follow Python best practices.
In `@tests/test_parameter_defaults.py`:
- Around line 180-192: The test method
test_get_topp_parameters_includes_defaults does not actually test the function
named in its title and docstring. The test currently calls get_merged_params()
at line 190, but the test name and docstring claim it validates
get_topp_parameters(). Either rename the test method to accurately reflect that
it tests get_merged_params() (and update the docstring accordingly), or refactor
the test to actually call get_topp_parameters() with a prepared .ini file
fixture to validate the true contract being described in the docstring.
---
Outside diff comments:
In `@src/workflow/CommandExecutor.py`:
- Around line 286-294: The loop variable i from the outer loop (for i in
range(n_processes) at line 278) is being shadowed when reassigned to 0 on line
288 when len(value) == 1. To fix this, use a different variable name (such as
value_index or idx) instead of i for the conditional index assignment and all
subsequent uses (the isinstance check and the command append operations). This
removes the shadowing and makes the intent clearer that this is a local index
for the current value, not the outer loop iteration counter.
In `@src/workflow/StreamlitUI.py`:
- Around line 609-622: The input_TOPP method uses mutable default arguments
(empty list for flag_parameters and empty dict for custom_defaults) which can
cause shared state between function calls, and tool_instance_name should use
explicit | None type annotation. Change the method signature to use None as
defaults for these three parameters (flag_parameters: List[str] = None,
custom_defaults: dict = None, and tool_instance_name: str | None = None), then
add initialization checks at the start of the input_TOPP method body to set
flag_parameters to an empty list if None and custom_defaults to an empty dict if
None.
---
Nitpick comments:
In `@content/results_abundance.py`:
- Around line 6-8: Remove the unused imports from the file. Delete the import
statement for ttest_ind from scipy.stats on line 6 and the import statement for
multipletests from statsmodels.stats.multitest on line 8. Keep the import for
page_setup from src.common.common as it is used in this file. These imports are
not used in results_abundance.py and should be removed to clean up the imports.
In `@content/results_database_search.py`:
- Around line 42-46: Refactor interactive widget interactions to use the
`@st.fragment` decorator to avoid full-page script reruns. In
content/results_database_search.py (L42-46), wrap the st.selectbox call and its
dependent visualization rendering in a fragment function. In
content/results_filtered.py (L42-46), apply the same pattern for file-selection
and dependent rendering. In content/results_heatmap.py (L32-33), move the
slider-driven heatmap rendering into a fragment. In content/results_pathway.py
(L187-188), refactor to gate expensive enrichment and rendering through a
fragmentized interaction flow. In content/results_pca.py (L78-84), wrap the PCA
figure rendering interaction path in a fragment. In content/results_rescoring.py
(L43-47), wrap file-selection and visualization rendering in a fragment. In
content/results_volcano.py (L39-53), wrap threshold controls and figure
rendering in a fragment. For each location, create a fragment function that
encapsulates the interactive widgets and their dependent visualization/rendering
logic, then call that function from the main page flow to ensure isolated
reruns.
In `@src/common/results_helpers.py`:
- Around line 171-181: The zip() call in the peak iteration loop lacks the
strict=True parameter, which means if mz_array and int_array have different
lengths due to data anomalies, silent truncation would occur instead of raising
an error. Add strict=True to the zip(mz_array, int_array) call to ensure that an
error is raised immediately if the two arrays have mismatched lengths, improving
data integrity and making bugs easier to detect.
- Around line 40-43: The code in the condition block where df is not empty is
performing redundant type conversions on the Charge column. Instead of
converting Charge to string first and then converting it back to int for
Charge_num, reverse the order: first convert the Charge column to int and assign
it to Charge_num, then convert Charge (or derive it from Charge_num) to string.
This eliminates the unnecessary intermediate string conversion and makes the
code more efficient.
- Around line 371-458: Remove the entire commented-out code block that spans 88
lines in the results_helpers.py file. This dead code includes commented logic
for group mapping, statistical calculations with t-tests and multiple testing
corrections, and pivot table construction. Simply delete all the commented lines
from the beginning of the group mapping section through the end of the return
statement to clean up the codebase and reduce maintenance burden.
In `@src/Workflow.py`:
- Line 1: Remove the unused import statement `from altair import value` from the
file since it is not referenced anywhere in the code. Simply delete this import
line to clean up the imports section.
In `@src/workflow/ParameterManager.py`:
- Around line 168-199: The type annotation for the tool_instance_name parameter
in the get_topp_parameters method is incomplete. Currently it is annotated as
str with a default value of None, which creates a type mismatch. Update the type
annotation to explicitly indicate that tool_instance_name can be either a string
or None by changing it from str to str | None, ensuring the type annotation
matches the default value and actual usage where the parameter can be None.
- Around line 144-166: The parameter ini_params in the get_merged_params method
has a default value of None but uses an implicit optional type annotation (dict
= None) which violates PEP 484. Update the type hint for ini_params to
explicitly indicate it accepts None by changing the annotation from dict to dict
| None, resulting in ini_params: dict | None = None.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 13dd5c2f-7d70-4269-b7af-2cd7db9558c0
⛔ Files ignored due to path filters (1)
gdpr_consent/dist/bundle.jsis excluded by!**/dist/**
📒 Files selected for processing (31)
.github/workflows/build-windows-executable-app.yaml.github/workflows/ci.yml.github/workflows/test-win-exe-w-embed-py.yaml.streamlit/config.tomlapp.pycontent/results_abundance.pycontent/results_database_search.pycontent/results_filtered.pycontent/results_heatmap.pycontent/results_pathway.pycontent/results_pca.pycontent/results_rescoring.pycontent/results_volcano.pydocs/win_exe_with_embed_py.mddocs/win_exe_with_pyinstaller.mdgdpr_consent/src/main.tshooks/hook-analytics.pysettings.jsonsrc/Workflow.pysrc/common/captcha_.pysrc/common/common.pysrc/common/results_helpers.pysrc/workflow/CommandExecutor.pysrc/workflow/ParameterManager.pysrc/workflow/StreamlitUI.pysrc/workflow/WorkflowManager.pytest_gui.pytests/test_parameter_defaults.pytests/test_parameter_presets.pytests/test_tool_instance_name.pytests/test_topp_workflow_parameter.py
💤 Files with no reviewable changes (2)
- .streamlit/config.toml
- tests/test_topp_workflow_parameter.py
| - uses: actions/checkout@v4 | ||
| - uses: conda-incubator/setup-miniconda@v3 | ||
| - uses: actions/setup-python@v4 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify all workflow action references and locate unpinned versions
rg -n '^\s*-\s*uses:\s*actions/(checkout|setup-python)@' .github/workflowsRepository: OpenMS/quantms-web
Length of output: 1951
🌐 Web query:
actions/setup-python@v4 deprecated end of life status 2024 2025 2026
💡 Result:
As of June 16, 2026, the actions/setup-python@v4 action is not officially marked as deprecated, but it is considered a legacy major version [1][2]. The official recommendation is to use the latest major version, which is v6 (as of the latest release v6.2.0 in January 2026) [1][3]. While v4 may continue to function in many workflows, it relies on older underlying technology (such as older Node.js runtimes) compared to current releases [1][4][3]. Key points regarding the status of setup-python: - Versioning and Support: GitHub Actions maintainers follow a practice of releasing new major versions to accommodate breaking changes, such as upgrades to the underlying Node.js runtime [1][3]. v4 was superseded by v5 (introduced around late 2023) and subsequently v6 [5][3]. - Deprecation Policy: GitHub typically manages the lifecycle of its actions by phasing out older major versions as newer ones become the standard [6]. Users are encouraged to migrate to the latest major version to ensure compatibility with updated GitHub runner environments, which may eventually drop support for the Node.js versions required by legacy actions [1][4][3]. - Usage: If your workflows are currently using @v4, they should be updated to @v6 to leverage the latest features, security patches, and compatibility improvements [1][3]. There is no specific EOL (End of Life) date publicly posted for v4 in the same manner as OS images (like Ubuntu 20.04) [7], but it is effectively obsolete for new or maintained projects [5][3].
Citations:
- 1: https://github.com/actions/setup-python
- 2: https://github.com/actions/setup-python/releases
- 3: https://github.com/actions/setup-python/blob/main/README.md
- 4: Action emits
[DEP0040] DeprecationWarning: Thepunycodemodule is deprecated. Please use a userland alternative instead.actions/setup-python#1208 - 5: actions/setup-python@e9d6f99
- 6: https://github.com/actions/python-versions/blob/main/README.md
- 7: Remove Ubuntu 20.04 from workflows due to deprecation from 2025-04-15 actions/setup-python#1065
🌐 Web query:
GitHub Actions best practices pin commit SHA major version tags security
💡 Result:
The most effective security best practice for GitHub Actions is to pin every action to a full-length commit SHA [1][2][3]. Unlike version tags or branch names, which are mutable labels that can be redirected to malicious code by an attacker with repository access, a full-length commit SHA is cryptographically immutable and guarantees that the exact code you intend to run is executed every time [1][4][5][6]. Security Implications Using version tags (e.g., @v4) or branches (e.g., @main) creates a continuous, implicit trust relationship with the upstream repository [7]. If an attacker compromises the maintainer's account or gains write access to the repository, they can silently force-push updates to existing tags [1][8][6]. Downstream consumers relying on those tags will unknowingly execute the malicious code on their next workflow run, even if their own workflow files remain unchanged [4][7][9]. This has been a recurring vector for supply chain attacks in the GitHub Actions ecosystem [10][8][7]. Implementation and Maintenance Best Practices To manage the security and maintenance burden of SHA pinning: 1. Pin to SHA with a Human-Readable Comment: To maintain clarity, always include the version tag in a comment after the SHA [10]. This allows developers to understand which version they are using while gaining the security of immutability [4][10]. Example: uses: actions/checkout@3df4ab1 # v4.0.0 2. Automate Dependency Updates: Do not manually manage SHAs for all actions. Use dependency management tools like Dependabot or Renovate to automatically create pull requests that update your pinned SHAs [11][1][10]. For example, Renovate supports specific presets (e.g., helpers:pinGitHubActionDigests) that keep these hashes current [11][10]. 3. Enforce via Organization Policy: GitHub provides organization and repository-level policies that can block the use of actions not pinned to a full-length commit SHA [12][2][5]. Enabling this policy is a highly recommended security hardening step, as it forces all workflows to comply with immutable referencing [12][10]. 4. Audit the Dependency Tree: Remember that an action you use may rely on other actions internally. If you are an action maintainer, you should also pin your internal dependencies to full commit SHAs to ensure your consumers remain secure and compliant with organization-level policies [5]. By treating version tags only as discovery labels rather than trust anchors, you significantly reduce the risk of supply chain compromise in your CI/CD pipelines [3].
Citations:
- 1: https://github.com/kubernetes/community/blob/main/github-management/github-actions-policy.md
- 2: https://docs.github.com/en/actions/reference/security/secure-use
- 3: https://samsond.github.io/posts/why-tags-fail/
- 4: https://nearform.com/insights/why-you-should-pin-your-github-actions-by-commit-hash/
- 5: https://www.romainlespinasse.dev/posts/github-actions-commit-sha-pinning/
- 6: https://nefariousplan.com/posts/tj-actions-mutable-tags-were-always-a-lie
- 7: https://safeguard.sh/resources/blog/github-action-tag-redirect-imposter-commit-may-2026
- 8: https://sebastion.dev/intelligence/2026-03-12-poc-supply-chain-compromise-via-mutable-git-tag-poisoning
- 9: https://pipebreach.com/posts/pipeline-pointers-not-code/
- 10: https://architecture.lullabot.com/adr/20260505-pin-github-actions-to-hash/
- 11: https://emmer.dev/blog/pin-your-github-actions-to-protect-against-mutability/
- 12: https://github.blog/changelog/2025-08-15-github-actions-policy-now-supports-blocking-and-sha-pinning-actions/
Update to latest version and pin GitHub Actions to commit SHAs.
Lines 14–15 use outdated action major versions: actions/checkout@v4 and actions/setup-python@v4 are superseded by v4 for checkout and v6 for setup-python (current as of 2026). Additionally, both are unpinned version tags, which introduces supply-chain risk; version tags are mutable and can be redirected to malicious code. Use the latest major versions pinned to full commit SHAs with a version comment for clarity, e.g., actions/checkout@<commit-sha> # v4.x.x and actions/setup-python@<commit-sha> # v6.x.x.
🧰 Tools
🪛 actionlint (1.7.12)
[error] 15-15: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 zizmor (1.25.2)
[warning] 14-14: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 14-14: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 15-15: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/ci.yml around lines 14 - 15, Update the GitHub Actions in
the CI workflow to use the latest versions pinned to commit SHAs for security.
Replace the `actions/checkout@v4` action with the latest v4 commit SHA (pinned
with a version comment like `# v4.x.x`). Replace the `actions/setup-python@v4`
action with the latest v6 commit SHA (since v6 is the current version as of
2026) and also include a version comment like `# v6.x.x`. This mitigates
supply-chain risk by preventing version tags from being redirected to malicious
code while ensuring you are using the latest stable releases of these actions.
Source: Linters/SAST tools
| try: | ||
| result = get_abundance_data(st.session_state["workspace"]) | ||
| except Exception as e: | ||
| st.exception(e) | ||
| result = None | ||
|
|
||
| if result is None: | ||
| ws = st.session_state.get("workspace") | ||
| st.error("Debug: get_abundance_data returned None") | ||
| st.write("workspace:", ws) | ||
| wf = Path(ws) / "topp-workflow" | ||
| st.write("workflow dir exists:", wf.exists(), "->", wf) | ||
| qdir = wf / "results" / "quant_results" | ||
| st.write("quant_dir exists:", qdir.exists(), "->", qdir) | ||
| if qdir.exists(): | ||
| st.write("csv files:", sorted([p.name for p in qdir.glob('*.csv')])) | ||
| # show cached param snapshot if available | ||
| try: | ||
| from src.workflow.ParameterManager import ParameterManager | ||
| pm = ParameterManager(wf) | ||
| st.write("parameters keys (sample):", list(pm.get_parameters_from_json().items())[:20]) | ||
| except Exception as e: | ||
| st.write("Param manager error:", e) | ||
| st.stop() | ||
|
|
||
| if result is None: | ||
| st.info("💡 Please complete the configuration in the 'Configure' page to see results.") | ||
| st.stop() |
There was a problem hiding this comment.
Debug/diagnostic code left in production and unreachable code block.
Lines 43-66 contain extensive debug output (st.exception, st.write, st.error("Debug: ...")) that should be removed or gated behind a debug flag for production. Additionally, lines 68-70 are unreachable because st.stop() on line 66 terminates execution.
♻️ Proposed cleanup
with pre_processing_tab:
- # result = get_abundance_data(st.session_state["workspace"])
- # DEBUG: 상세 원인 출력 (임시)
- try:
- result = get_abundance_data(st.session_state["workspace"])
- except Exception as e:
- st.exception(e)
- result = None
-
- if result is None:
- ws = st.session_state.get("workspace")
- st.error("Debug: get_abundance_data returned None")
- st.write("workspace:", ws)
- wf = Path(ws) / "topp-workflow"
- st.write("workflow dir exists:", wf.exists(), "->", wf)
- qdir = wf / "results" / "quant_results"
- st.write("quant_dir exists:", qdir.exists(), "->", qdir)
- if qdir.exists():
- st.write("csv files:", sorted([p.name for p in qdir.glob('*.csv')]))
- # show cached param snapshot if available
- try:
- from src.workflow.ParameterManager import ParameterManager
- pm = ParameterManager(wf)
- st.write("parameters keys (sample):", list(pm.get_parameters_from_json().items())[:20])
- except Exception as e:
- st.write("Param manager error:", e)
- st.stop()
-
+ result = get_abundance_data(st.session_state["workspace"])
+
if result is None:
st.info("💡 Please complete the configuration in the 'Configure' page to see results.")
st.stop()🧰 Tools
🪛 Ruff (0.15.17)
[warning] 45-45: Do not catch blind exception: Exception
(BLE001)
[warning] 64-64: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@content/results_abundance.py` around lines 43 - 70, The code contains
extensive debug diagnostics (st.write, st.error with "Debug:" prefix,
ParameterManager inspection) that should not be in production, and the if result
is None block on lines 68-70 is unreachable because st.stop() on line 66
terminates execution. Remove all debug output statements from the exception
handler and the detailed diagnostic checks (Path/workflow directory checks, csv
file listing, and ParameterManager inspection). Keep the essential exception
logging and remove the first st.stop() call so the code proceeds to the proper
user-facing info message when result is None from the get_abundance_data call.
| if not (cache_dir / f"table_{cache_id_prefix}").is_dir(): | ||
| st.warning("Visualization cache not found. Please re-run the workflow.") | ||
| st.stop() |
There was a problem hiding this comment.
Preflight cache validation is incomplete across the OpenMS Insight pages. Each page validates only table_* before instantiating Heatmap, SequenceView, and LinePlot, so partial/missing cache artifacts can still trigger runtime failures.
content/results_database_search.py#L51-L53: validate all required cache IDs (table_,heatmap_,seqview_,lineplot_) before component initialization.content/results_filtered.py#L51-L53: add the same full cache preflight check before component initialization.content/results_rescoring.py#L52-L54: add the same full cache preflight check before component initialization.
📍 Affects 3 files
content/results_database_search.py#L51-L53(this comment)content/results_filtered.py#L51-L53content/results_rescoring.py#L52-L54
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@content/results_database_search.py` around lines 51 - 53, The cache preflight
validation is incomplete across all three pages and only checks for the table_
cache prefix, leaving the visualization components vulnerable to runtime
failures from missing cache artifacts. In content/results_database_search.py at
lines 51-53, expand the cache validation check to validate all four required
cache ID prefixes (table_, heatmap_, seqview_, and lineplot_) instead of only
table_. Apply the identical full preflight cache check validation at
content/results_filtered.py lines 51-53 before component initialization. Apply
the same complete cache validation check at content/results_rescoring.py lines
52-54 before component initialization. Each location should verify the existence
of all four cache prefixes in a single unified check before any Heatmap,
SequenceView, or LinePlot components are instantiated.
| if not heatmap_z.empty: | ||
| row_linkage = linkage(pdist(heatmap_z.values), method="average") | ||
| row_order = leaves_list(row_linkage) | ||
|
|
||
| col_linkage = linkage(pdist(heatmap_z.T.values), method="average") | ||
| col_order = leaves_list(col_linkage) |
There was a problem hiding this comment.
Guard hierarchical clustering for 1-row/1-column edge cases.
At Line 41 and Line 44, linkage(pdist(...)) can fail when there is only one protein row or one sample column after filtering (distance matrix is empty). Add a shape guard before clustering and fall back to unclustered plotting.
Proposed fix
-if not heatmap_z.empty:
- row_linkage = linkage(pdist(heatmap_z.values), method="average")
- row_order = leaves_list(row_linkage)
-
- col_linkage = linkage(pdist(heatmap_z.T.values), method="average")
- col_order = leaves_list(col_linkage)
-
- heatmap_clustered = heatmap_z.iloc[row_order, col_order]
+if not heatmap_z.empty:
+ if heatmap_z.shape[0] >= 2 and heatmap_z.shape[1] >= 2:
+ row_linkage = linkage(pdist(heatmap_z.values), method="average")
+ row_order = leaves_list(row_linkage)
+ col_linkage = linkage(pdist(heatmap_z.T.values), method="average")
+ col_order = leaves_list(col_linkage)
+ heatmap_clustered = heatmap_z.iloc[row_order, col_order]
+ else:
+ heatmap_clustered = heatmap_z📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if not heatmap_z.empty: | |
| row_linkage = linkage(pdist(heatmap_z.values), method="average") | |
| row_order = leaves_list(row_linkage) | |
| col_linkage = linkage(pdist(heatmap_z.T.values), method="average") | |
| col_order = leaves_list(col_linkage) | |
| if not heatmap_z.empty: | |
| if heatmap_z.shape[0] >= 2 and heatmap_z.shape[1] >= 2: | |
| row_linkage = linkage(pdist(heatmap_z.values), method="average") | |
| row_order = leaves_list(row_linkage) | |
| col_linkage = linkage(pdist(heatmap_z.T.values), method="average") | |
| col_order = leaves_list(col_linkage) | |
| heatmap_clustered = heatmap_z.iloc[row_order, col_order] | |
| else: | |
| heatmap_clustered = heatmap_z |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@content/results_heatmap.py` around lines 40 - 45, The hierarchical clustering
operations using linkage and pdist will fail when heatmap_z has only one row or
one column, as pdist cannot compute distances for a single sample. Before
calling linkage(pdist(heatmap_z.values)) at line 41, add a guard check that
heatmap_z.shape[0] > 1; if false, set row_order to a default unclustered order
(such as range(heatmap_z.shape[0])). Similarly, before calling
linkage(pdist(heatmap_z.T.values)) at line 44, add a guard check that
heatmap_z.shape[1] > 1; if false, set col_order to a default unclustered order
(such as range(heatmap_z.shape[1])). This ensures the code handles single-row or
single-column edge cases gracefully by falling back to unclustered plotting.
| fig_heatmap.update_xaxes(tickfont=dict(size=10)) | ||
| fig_heatmap.update_yaxes(tickfont=dict(size=8)) | ||
|
|
||
| st.plotly_chart(fig_heatmap, width="stretch") |
There was a problem hiding this comment.
❓ Verification inconclusive
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Streamlit version declarations =="
rg -n --iglob 'requirements*.txt' --iglob 'pyproject.toml' --iglob 'poetry.lock' --iglob 'environment*.yml' '\bstreamlit\b' || true
echo
echo "== plotly_chart call sites (inspect nearby args) =="
rg -n --type=py 'st\.plotly_chart\(' -A6 -B1
echo
echo "If declared Streamlit is 1.49.x, replace width=\"stretch\" with use_container_width=True at the listed sites."Repository: OpenMS/quantms-web
Length of output: 1951
🌐 Web query:
st.plotly_chart width parameter Streamlit API documentation 1.49
💡 Result:
The width parameter was not available in Streamlit version 1.49 [1][2]. The width parameter for st.plotly_chart was introduced in Streamlit version 1.51.0 [1][3]. In Streamlit 1.49, you must use the use_container_width parameter to control the chart's width [4]. For users on versions 1.51.0 and later, the width parameter is available with the following options [5]: - "stretch" (default): Matches the width of the parent container (equivalent to the deprecated use_container_width=True). - "content": Matches the width of the chart content, up to the parent container's width (equivalent to the deprecated use_container_width=False). - Integer: Sets the width in pixels. The use_container_width parameter is deprecated in favor of the width parameter [5][6].
Citations:
- 1: plotly_chart kwargs deprecation warning streamlit/streamlit#13396
- 2: https://github.com/streamlit/streamlit/releases/tag/1.49.0
- 3: https://github.com/streamlit/streamlit/releases/tag/1.51.0
- 4: https://docs.streamlit.io/1.35.0/develop/api-reference/charts/st.plotly_chart
- 5: https://docs.streamlit.io/develop/api-reference/charts/st.plotly_chart
- 6: [AdvancedLayouts] Add width to
st.plotly_chartstreamlit/streamlit#12559
🌐 Web query:
Streamlit st.plotly_chart use_container_width vs width parameter
💡 Result:
In Streamlit, st.plotly_chart has a new width parameter (recommended) and the old use_container_width parameter is deprecated. 1) What width does (recommended) - width="stretch" (default): the chart element width matches the width of the parent container. [1] - width="content": the chart width matches the width of its content, but won’t exceed the parent container width. [1] - width=: fixed pixel width; if larger than the container, it falls back to the container width. [1] 2) What use_container_width does (deprecated) - use_container_width is deprecated and will be removed in a future release. [1][2] - Mapping to width values: - use_container_width=True is equivalent to width="stretch". [1][3] - use_container_width=False is equivalent to width="content". [3] - Parameter precedence: if you explicitly pass use_container_width, it takes precedence over width. [1][3] 3) Practical guidance: which one to use? - Prefer width=... (especially width="stretch" to fill the container). [1] - If you still use use_container_width for backward compatibility, remember it overrides width when provided. [1][3] Example (recommended) st.plotly_chart(fig, width="stretch") st.plotly_chart(fig, width="content") Example (legacy; deprecated) st.plotly_chart(fig, use_container_width=True) # == width="stretch" st.plotly_chart(fig, use_container_width=False) # == width="content"
Citations:
- 1: https://docs.streamlit.io/develop/api-reference/charts/st.plotly_chart
- 2: https://github.com/streamlit/streamlit/blob/develop/lib/streamlit/elements/plotly_chart.py
- 3: [AdvancedLayouts] Add width to
st.plotly_chartstreamlit/streamlit#12559
st.plotly_chart(width="stretch") requires Streamlit 1.51.0 or later; 1.49.x will fail. The width parameter was introduced in Streamlit 1.51.0. If pinned to 1.49.x, replace width="stretch" with use_container_width=True:
content/results_heatmap.py#L66content/results_pathway.py#L250content/results_pca.py#L93content/results_volcano.py#L97
Alternatively, upgrade Streamlit to 1.51.0+ to use the width parameter (preferred).
📍 Affects 4 files
content/results_heatmap.py#L66-L66(this comment)content/results_pathway.py#L250-L250content/results_pca.py#L93-L93content/results_volcano.py#L97-L97
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@content/results_heatmap.py` at line 66, The code uses the `width` parameter
in st.plotly_chart() calls which requires Streamlit 1.51.0 or later but may be
incompatible with 1.49.x. Fix this across all four affected files: in
content/results_heatmap.py at line 66, content/results_pathway.py at line 250,
content/results_pca.py at line 93, and content/results_volcano.py at line 97.
Replace the `width="stretch"` parameter with `use_container_width=True` in each
st.plotly_chart() call, or alternatively upgrade Streamlit to version 1.51.0 or
later and keep the width parameter (preferred approach).
| for idxml_file in comet_results: | ||
| idxml_path = Path(idxml_file) | ||
| cache_id_prefix = idxml_path.stem | ||
|
|
||
| # Parse idXML to DataFrame | ||
| id_df, spectra_data = parse_idxml(idxml_path) | ||
|
|
||
| # Build spectra cache (only once) | ||
| if spectra_df is None: | ||
| filename_to_index = {Path(f).name: i for i, f in enumerate(spectra_data)} | ||
| spectra_df, filename_to_index = build_spectra_cache(mzml_dir, filename_to_index) | ||
|
|
||
| # Initialize Table component (caches itself) | ||
| Table( | ||
| cache_id=f"table_{cache_id_prefix}", | ||
| data=id_df.lazy(), | ||
| cache_path=str(cache_dir), | ||
| interactivity={"file": "file_index", "spectrum": "scan_id", "identification": "id_idx"}, | ||
| column_definitions=[ | ||
| {"field": "sequence", "title": "Sequence"}, | ||
| {"field": "charge", "title": "Z", "sorter": "number"}, | ||
| {"field": "mz", "title": "m/z", "sorter": "number"}, | ||
| {"field": "rt", "title": "RT", "sorter": "number"}, | ||
| {"field": "score", "title": "Score", "sorter": "number"}, | ||
| {"field": "protein_accession", "title": "Proteins"}, | ||
| ], | ||
| initial_sort=[{"column": "score", "dir": "asc"}], | ||
| index_field="id_idx", | ||
| ) | ||
|
|
||
| # Initialize Heatmap component | ||
| Heatmap( | ||
| cache_id=f"heatmap_{cache_id_prefix}", | ||
| data=id_df.lazy(), | ||
| cache_path=str(cache_dir), | ||
| x_column="rt", | ||
| y_column="mz", | ||
| intensity_column="score", | ||
| interactivity={"identification": "id_idx"}, | ||
| ) | ||
|
|
||
| # Initialize SequenceView component | ||
| seq_view = SequenceView( | ||
| cache_id=f"seqview_{cache_id_prefix}", | ||
| sequence_data=id_df.lazy().select(["id_idx", "sequence", "charge", "file_index", "scan_id"]).rename({ | ||
| "id_idx": "sequence_id", | ||
| "charge": "precursor_charge", | ||
| }), | ||
| peaks_data=spectra_df.lazy(), | ||
| filters={ | ||
| "identification": "sequence_id", | ||
| "file": "file_index", | ||
| "spectrum": "scan_id", | ||
| }, | ||
| interactivity={"peak": "peak_id"}, | ||
| cache_path=str(cache_dir), | ||
| deconvolved=False, | ||
| annotation_config={ | ||
| "ion_types": ["b", "y"], | ||
| "neutral_losses": True, | ||
| "tolerance": frag_tol, | ||
| "tolerance_ppm": frag_tol_is_ppm, | ||
| }, | ||
| ) | ||
|
|
||
| # Initialize LinePlot from SequenceView | ||
| LinePlot.from_sequence_view( | ||
| seq_view, | ||
| cache_id=f"lineplot_{cache_id_prefix}", | ||
| cache_path=str(cache_dir), | ||
| title="Annotated Spectrum", | ||
| styling={ | ||
| "unhighlightedColor": "#CCCCCC", | ||
| "highlightColor": "#E74C3C", | ||
| "selectedColor": "#F3A712", | ||
| }, | ||
| ) |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Significant code duplication in visualization cache building.
The Table/Heatmap/SequenceView/LinePlot initialization blocks are nearly identical across Comet results (lines 528-604), Percolator results (lines 620-692), and filtered results (lines 711-783). Extract this into a helper function to reduce duplication and maintenance burden.
♻️ Suggested helper function
def _build_visualization_cache(self, idxml_files: list, cache_dir: Path,
spectra_df, filename_to_index: dict,
mzml_dir: Path, frag_tol: float,
frag_tol_is_ppm: bool) -> tuple:
"""Build Table, Heatmap, SequenceView, LinePlot for a list of idXML files."""
for idxml_file in idxml_files:
idxml_path = Path(idxml_file)
cache_id_prefix = idxml_path.stem
id_df, spectra_data = parse_idxml(idxml_path)
if spectra_df is None:
filename_to_index = {Path(f).name: i for i, f in enumerate(spectra_data)}
spectra_df, filename_to_index = build_spectra_cache(mzml_dir, filename_to_index)
# Initialize components...
Table(cache_id=f"table_{cache_id_prefix}", ...)
Heatmap(cache_id=f"heatmap_{cache_id_prefix}", ...)
# etc.
return spectra_df, filename_to_indexThen call it for each stage:
spectra_df, filename_to_index = self._build_visualization_cache(
comet_results, cache_dir, spectra_df, filename_to_index, mzml_dir, frag_tol, frag_tol_is_ppm
)Also applies to: 620-692, 711-783
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/Workflow.py` around lines 528 - 604, Extract the duplicated Table,
Heatmap, SequenceView, and LinePlot initialization blocks that appear across
three locations (src/Workflow.py lines 528-604 for Comet results, lines 620-692
for Percolator results, and lines 711-783 for filtered results) into a new
helper method named _build_visualization_cache. This helper should accept
parameters for idxml_files, cache_dir, spectra_df, filename_to_index, mzml_dir,
frag_tol, and frag_tol_is_ppm, contain the for loop over idxml files and all
component initialization logic, and return the updated spectra_df and
filename_to_index. Then replace each of the three duplicated blocks with a
single call to this helper method, passing in the appropriate arguments for that
stage (comet_results, percolator_results, and filtered_results respectively).
| g2 = row[group_b_cols].values.astype(float) | ||
|
|
||
| # Calculate Log2 Fold Change | ||
| log2fc = np.log2(np.mean(g2) / np.mean(g1)) if np.mean(g1) > 0 else 0 |
There was a problem hiding this comment.
Potential division by zero when calculating log2 fold change.
If np.mean(g1) equals zero (e.g., all control samples have zero intensity), line 999 will produce either inf or a runtime warning. The ternary only checks if > 0 but doesn't handle the case where np.mean(g2) is also zero.
🐛 Proposed fix
- # Calculate Log2 Fold Change
- log2fc = np.log2(np.mean(g2) / np.mean(g1)) if np.mean(g1) > 0 else 0
+ # Calculate Log2 Fold Change
+ mean_g1, mean_g2 = np.mean(g1), np.mean(g2)
+ if mean_g1 > 0 and mean_g2 > 0:
+ log2fc = np.log2(mean_g2 / mean_g1)
+ else:
+ log2fc = np.nan🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/Workflow.py` around lines 996 - 999, The log2 fold change calculation in
the ternary operator checks only if np.mean(g1) > 0, which doesn't fully prevent
division by zero issues. When np.mean(g2) is also zero or when both means are
zero, the expression still produces invalid results like infinity or NaN.
Refactor the ternary condition to check that both np.mean(g1) and np.mean(g2)
are greater than zero before performing the division, and return an appropriate
default value (such as 0 or NaN) when either mean is zero or both are zero,
ensuring robust handling of all edge cases in the log2fc calculation.
| stderr_thread.join() | ||
|
|
||
| def run_topp(self, tool: str, input_output: dict, custom_params: dict = {}) -> bool: | ||
| def run_topp(self, tool: str, input_output: dict, custom_params: dict = {}, tool_instance_name: str = None) -> bool: |
There was a problem hiding this comment.
Mutable default argument custom_params: dict = {} can cause unexpected state sharing.
Using a mutable default like {} means all calls without an explicit custom_params share the same dictionary object. While the current code doesn't mutate it, this is a latent bug waiting to happen.
🐛 Proposed fix
- def run_topp(self, tool: str, input_output: dict, custom_params: dict = {}, tool_instance_name: str = None) -> bool:
+ def run_topp(self, tool: str, input_output: dict, custom_params: dict | None = None, tool_instance_name: str | None = None) -> bool:And at the start of the method:
if custom_params is None:
custom_params = {}🧰 Tools
🪛 Ruff (0.15.17)
[warning] 220-220: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
[warning] 220-220: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/workflow/CommandExecutor.py` at line 220, The method run_topp has a
mutable default argument custom_params: dict = {} which can cause unexpected
state sharing across function calls. Change the default parameter from
custom_params: dict = {} to custom_params: dict = None in the method signature,
then add a check at the start of the method body that initializes custom_params
to an empty dictionary if it is None, following the pattern: if custom_params is
None: custom_params = {}
| elif isinstance(p["value"], bool): | ||
| bool_value = ( | ||
| (p["value"] == "true") | ||
| if type(p["value"]) == str | ||
| else p["value"] | ||
| ) | ||
| cols[i].selectbox( | ||
| name, | ||
| value=( | ||
| (p["value"] == "true") | ||
| if type(p["value"]) == str | ||
| else p["value"] | ||
| ), | ||
| options=[True, False], | ||
| index=0 if bool_value else 1, | ||
| format_func=lambda x: "True" if x else "False", | ||
| help=p["description"], | ||
| key=key, | ||
| ) |
There was a problem hiding this comment.
Use isinstance() instead of type() == for type checking.
Line 879 uses type(p["value"]) == str which doesn't account for subclasses and is not idiomatic Python. Use isinstance() for type checks.
🐛 Proposed fix
elif isinstance(p["value"], bool):
bool_value = (
- (p["value"] == "true")
- if type(p["value"]) == str
- else p["value"]
+ p["value"].lower() == "true"
+ if isinstance(p["value"], str)
+ else bool(p["value"])
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| elif isinstance(p["value"], bool): | |
| bool_value = ( | |
| (p["value"] == "true") | |
| if type(p["value"]) == str | |
| else p["value"] | |
| ) | |
| cols[i].selectbox( | |
| name, | |
| value=( | |
| (p["value"] == "true") | |
| if type(p["value"]) == str | |
| else p["value"] | |
| ), | |
| options=[True, False], | |
| index=0 if bool_value else 1, | |
| format_func=lambda x: "True" if x else "False", | |
| help=p["description"], | |
| key=key, | |
| ) | |
| elif isinstance(p["value"], bool): | |
| bool_value = ( | |
| p["value"].lower() == "true" | |
| if isinstance(p["value"], str) | |
| else bool(p["value"]) | |
| ) | |
| cols[i].selectbox( | |
| name, | |
| options=[True, False], | |
| index=0 if bool_value else 1, | |
| format_func=lambda x: "True" if x else "False", | |
| help=p["description"], | |
| key=key, | |
| ) |
🧰 Tools
🪛 Ruff (0.15.17)
[error] 879-879: Use is and is not for type comparisons, or isinstance() for isinstance checks
(E721)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/workflow/StreamlitUI.py` around lines 876 - 889, The type check at line
879 in the bool value handling section uses `type(p["value"]) == str` which
doesn't account for string subclasses and is not idiomatic Python. Replace
`type(p["value"]) == str` with `isinstance(p["value"], str)` to properly check
if the value is a string type and follow Python best practices.
| def test_get_topp_parameters_includes_defaults(self, temp_workflow_dir): | ||
| """get_topp_parameters merges _defaults between ini and user values.""" | ||
| pm = ParameterManager(temp_workflow_dir) | ||
| params_json = { | ||
| "_defaults": {"SomeTool": {"algorithm:param": 42.0}}, | ||
| "SomeTool": {"algorithm:other": 99.0} | ||
| } | ||
| with open(pm.params_file, "w") as f: | ||
| json.dump(params_json, f) | ||
|
|
||
| result = pm.get_merged_params("SomeTool", ini_params={"algorithm:param": 1.0}) | ||
| assert result["algorithm:param"] == 42.0 | ||
| assert result["algorithm:other"] == 99.0 |
There was a problem hiding this comment.
This test doesn’t actually exercise get_topp_parameters().
At Line 190, the test calls get_merged_params(...) directly, so the get_topp_parameters(...) contract in the test name/docstring is not being validated. Either rename the test to match behavior or invoke get_topp_parameters with a prepared .ini fixture.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/test_parameter_defaults.py` around lines 180 - 192, The test method
test_get_topp_parameters_includes_defaults does not actually test the function
named in its title and docstring. The test currently calls get_merged_params()
at line 190, but the test name and docstring claim it validates
get_topp_parameters(). Either rename the test method to accurately reflect that
it tests get_merged_params() (and update the docstring accordingly), or refactor
the test to actually call get_topp_parameters() with a prepared .ini file
fixture to validate the true contract being described in the docstring.
Changes
TMT Analysis Workflow & Interactive Dashboard (
feat)src/Workflow.pyto transition from generic widgets to a comprehensive TMT workflow includingIsobaricAnalyzer,CometAdapter,PerocolatorAdapter,IDFilter,ProteinInterference, andProteinQuantifierapp.pylinked to modular visualization scripts (content/results_*.py) for interactive data exploration (Abundance, Volcano plots, PCA, Heatmaps, Pathway Analysis, etc.).2. Core Engine: Support for No-Value CLI Flags (
feat)StreamlitUIandCommandExecutorto cleanly handle boolean parameters that act as value-less CLI flags (e.g., passing-forceinstead of-force true).3. Cross-Platform Compatibility (
fix)WorkflowManager.py.subprocesswith Windows nativetaskkill /F /T /PIDfor robust tree-termination on Windows, while gracefully falling back to standardos.kill(pid, signal.SIGTERM)on Unix-like environments.Related Issues
Testing Checklist
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Documentation