feat: apply multiple testing correction (p-adj) for statistical significance #12

Open
hjn0415a wants to merge 7 commits into OpenMS:main from hjn0415a:feature/p-adj

Conversation

hjn0415a commented Feb 13, 2026

  • Integration of statsmodels: Added statsmodels.stats.multitest to perform robust p-value adjustment.

  • Updated load_abundance_data:

  1. Implemented the Benjamini-Hochberg (FDR) method to compute adjusted p-values.
  2. Integrated the p-adj column into the final pivot_df output.
  3. Added logic to handle NaN values during the correction process to ensure data integrity.
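
A minimal sketch of the adjustment step (mirroring the masked multipletests call visible in this PR's diff; the surrounding load_abundance_data code is elided):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.multitest import multipletests

def add_p_adj(stats_df: pd.DataFrame) -> pd.DataFrame:
    """Attach Benjamini-Hochberg adjusted p-values, leaving NaN p-values as NaN."""
    stats_df["p-adj"] = np.nan
    mask = stats_df["p-value"].notna()
    if mask.any():
        # fdr_bh = Benjamini-Hochberg false discovery rate control
        _, p_adj, _, _ = multipletests(stats_df.loc[mask, "p-value"], method="fdr_bh")
        stats_df.loc[mask, "p-adj"] = p_adj
    return stats_df
```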

Summary by CodeRabbit

  • New Features

    • Added Proteomics LFQ results page with protein abundance analysis and GO enrichment insights
    • GO enrichment results organized by biological process, cellular component, and molecular function
  • Improvements

    • Updated statistical calculations to use adjusted p-values (FDR) in PCA and volcano plot analyses
    • Improved Windows compatibility for workflow management

coderabbitai bot commented Feb 13, 2026

📝 Walkthrough

This PR adds GO enrichment analysis capabilities to a proteomics workflow application. It introduces a new Proteomics LFQ results page, implements GO enrichment analysis with MyGeneInfo integration and Fisher's exact testing, migrates statistical filtering from raw p-values to FDR-adjusted p-values across multiple result views, and adds Windows compatibility to workflow management.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Navigation & UI: app.py | Added a new "Proteomics LFQ" navigation page entry to the Results section. |
| Proteomics LFQ Results Page: content/results_proteomicslfq.py | New page displaying protein-level abundance data with a pivot table sorted by p-values, and GO enrichment results (BP/CC/MF) with Plotly figures and DataFrames. |
| P-value to P-adj Migration: content/results_pca.py, content/results_volcano.py, src/common/results_helpers.py | Migrated statistical filtering and sorting from "p-value" to FDR-adjusted "p-adj". Results helpers now compute adjusted p-values via statsmodels multipletests and include "p-adj" in output DataFrames. |
| GO Enrichment Analysis: src/WorkflowTest.py | Added a new _run_go_enrichment() method performing GO enrichment: cleans UniProt IDs, queries MyGeneInfo for BP/CC/MF terms, splits genes into background and foreground sets, computes Fisher's exact test per term (see the sketch after this table), and generates/saves Plotly visualizations and result DataFrames. |
| Dependencies: requirements.txt | Added mygene and statsmodels packages to support GO enrichment queries and statistical testing. |
| Workflow Management: src/workflow/WorkflowManager.py | Enhanced stop_local_workflow() with Windows compatibility via platform-specific process termination (taskkill vs. os.kill) and improved exception handling for PID file cleanup. |
| Minor Cleanup: content/workflow_run.py | Removed extraneous whitespace between workflow initialization and execution. |
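
For context, the per-term test named in the GO Enrichment Analysis row amounts to a 2x2 contingency test. A minimal sketch, assuming the background set excludes the foreground (the exact contingency construction in _run_go_enrichment may differ):

```python
from scipy.stats import fisher_exact

def fisher_per_term(term_genes: set, fg_set: set, bg_set: set):
    """One GO term: is it over-represented among foreground (significant) genes?"""
    fg_in = len(fg_set & term_genes)    # foreground genes annotated with the term
    fg_out = len(fg_set - term_genes)   # foreground genes without it
    bg_in = len(bg_set & term_genes)    # background genes annotated with the term
    bg_out = len(bg_set - term_genes)   # background genes without it
    # One-sided Fisher's exact test on the 2x2 table
    odds_ratio, p_value = fisher_exact(
        [[fg_in, fg_out], [bg_in, bg_out]], alternative="greater"
    )
    return odds_ratio, p_value
```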

Sequence Diagram(s)

sequenceDiagram
    participant WT as WorkflowTest
    participant AD as get_abundance_data
    participant MGI as MyGeneInfo API
    participant SA as Statistical Analysis
    participant RS as Results Storage

    WT->>AD: request abundance data
    AD-->>WT: return pivot_df with proteins
    WT->>WT: extract UniProt IDs from pivot_df
    WT->>WT: filter genes by p-value & log2FC<br/>(foreground & background sets)
    
    loop for each GO type (BP, CC, MF)
        WT->>MGI: query GO terms for genes
        MGI-->>WT: return GO term mappings
        WT->>SA: compute Fisher's exact test<br/>per GO term
        SA-->>WT: return p-values & odds ratios
        WT->>WT: generate Plotly bar chart
    end
    
    WT->>RS: save figures (JSON) & DataFrames
    WT->>WT: store results in session state
    WT-->>WT: log completion

Poem

🐰 A workflow enhanced with enrichment so bright,
GO terms now shining with statistical might,
From p-values adjusted to genes rearranged,
Windows and Unix in harmony changed—
The proteomics path now delightfully strange! 🧬

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 45.45%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
  • Merge Conflict Detection: ⚠️ Warning. ❌ Merge conflicts detected (15 files):

⚔️ .github/workflows/build-windows-executable-app.yaml (content)
⚔️ .streamlit/config.toml (content)
⚔️ Dockerfile (content)
⚔️ app.py (content)
⚔️ content/results_pca.py (content)
⚔️ content/results_volcano.py (content)
⚔️ content/workflow_run.py (content)
⚔️ docker-compose.yml (content)
⚔️ requirements.txt (content)
⚔️ src/WorkflowTest.py (content)
⚔️ src/common/common.py (content)
⚔️ src/common/results_helpers.py (content)
⚔️ src/workflow/CommandExecutor.py (content)
⚔️ src/workflow/StreamlitUI.py (content)
⚔️ src/workflow/WorkflowManager.py (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the main change, applying multiple testing correction (p-adj) for statistical significance, which is the primary focus across multiple files in the changeset.


coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
content/results_pca.py (1)

46-46: ⚠️ Potential issue | 🟡 Minor

Inconsistent user-facing message: still says "p-value" instead of "p-adj".

Line 46 reads "Not enough proteins after p-value filtering for PCA" but filtering now uses p-adj.

Proposed fix
-    st.info("Not enough proteins after p-value filtering for PCA.")
+    st.info("Not enough proteins after p-adj filtering for PCA.")
🧹 Nitpick comments (8)
requirements.txt (1)

148-152: Consider pinning versions for mygene and statsmodels.

Both new dependencies are unpinned, which is consistent with other deps at the bottom of this file (e.g., scipy, scikit-learn). However, for reproducibility and to avoid surprise breakage, consider adding version constraints — especially for statsmodels, whose multipletests API you depend on.
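
For example (the exact pins below are illustrative placeholders, not vetted recommendations):

```text
mygene==3.2.2
statsmodels>=0.14,<0.15
```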

src/workflow/WorkflowManager.py (1)

207-212: Prefer subprocess.run over os.system for process termination.

While the int() cast on line 206 ensures pid is always a safe integer (mitigating the shell injection flagged by Ruff S605), os.system is still a suboptimal choice: it spawns a shell, doesn't raise on failure, and returns only an opaque exit code.

Proposed fix
+import subprocess
 ...
                 # Windows
                 if platform.system() == "Windows":
-                    os.system(f"taskkill /F /T /PID {pid}")
+                    subprocess.run(
+                        ["taskkill", "/F", "/T", "/PID", str(pid)],
+                        check=False,
+                    )
                 else:
                     # Linux/macOS
                     os.kill(pid, signal.SIGTERM)
src/WorkflowTest.py (4)

875-875: Avoid != True; use .ne(True) or invert with ~ for pandas boolean filtering.

res_go["notfound"] != True works but triggers Ruff E712. The notfound column may contain NaN for found entries, so a safe alternative:

Proposed fix
-                    res_go = res_go[res_go["notfound"] != True]
+                    res_go = res_go[~res_go["notfound"].fillna(False).astype(bool)]

885-888: Lambda captures loop variable go_type — but safe here since .apply() executes immediately.

Ruff B023 flags this, but since the lambda is consumed immediately by .apply() within the same iteration, go_type has the correct value. No functional bug. Optionally, you can bind the variable as a default argument to silence the lint:

Optional fix to silence lint
                     res_go[f"{go_type}_terms"] = res_go["go"].apply(
-                        lambda x: extract_go_terms(x, go_type)
+                        lambda x, _gt=go_type: extract_go_terms(x, _gt)
                     )

893-893: Remove extraneous f prefix from strings without placeholders (lines 893, 942, 961).

Ruff F541 flags these. They're harmless but noisy.

Proposed fix
-                    self.logger.log(f"✅ fg_set bg_set are set")
+                    self.logger.log("✅ fg_set bg_set are set")
-                        self.logger.log(f"✅ Plotly Figure generated")
+                        self.logger.log("✅ Plotly Figure generated")
-                    self.logger.log(f"✅ go_type generated")
+                    self.logger.log("✅ go_type generated")

1013-1037: idxml_to_df is duplicated from src/common/results_helpers.py.

This local function (lines 1013–1037) is identical to the one in results_helpers.py (lines 18–43). Consider importing and reusing it to avoid drift. Note that it's already imported at line 16 via parse_idxml, so adding idxml_to_df to the import would be straightforward.

Proposed fix
-from src.common.results_helpers import parse_idxml, build_spectra_cache
+from src.common.results_helpers import parse_idxml, build_spectra_cache, idxml_to_df

Then remove the local idxml_to_df definition at lines 1013–1037.

content/results_proteomicslfq.py (2)

39-39: Single-element st.tabs — intentional placeholder?

st.tabs(["🧬 Protein Table"]) with a single tab adds visual overhead. If more tabs are planned (e.g., a GO enrichment tab to complement the section below), this makes sense as a placeholder. Otherwise, the tab wrapper can be removed.
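
If a companion tab is indeed planned, a hypothetical sketch (the GO tab label and go_fig are illustrative, not from this PR):

```python
import streamlit as st

tab_table, tab_go = st.tabs(["🧬 Protein Table", "🧭 GO Enrichment"])
with tab_table:
    # pivot_df is assumed to be loaded earlier on the page
    st.dataframe(pivot_df.sort_values("p-adj"), use_container_width=True)
with tab_go:
    # go_fig would be a Plotly figure reconstructed from go_results.json
    st.plotly_chart(go_fig, use_container_width=True)
```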


67-74: Consider handling malformed go_results.json gracefully.

If the JSON file is corrupted or has an unexpected schema, the page will crash. A try/except around the load would improve resilience.

Proposed fix
     import json
     import plotly.io as pio
     
-    with open(go_json_file, "r") as f:
-        go_data = json.load(f)
+    try:
+        with open(go_json_file, "r") as f:
+            go_data = json.load(f)
+    except (json.JSONDecodeError, KeyError) as e:
+        st.error(f"Failed to load GO results: {e}")
+        st.stop()

Comment on line 56 in content/results_proteomicslfq.py

        st.info("No protein-level data available.")
    else:
        st.session_state["pivot_df"] = pivot_df
        st.dataframe(pivot_df.sort_values("p-value"), use_container_width=True)

⚠️ Potential issue | 🟡 Minor

Sorting by p-value instead of p-adj — inconsistent with the PR's goal.

All other result pages (PCA, Volcano) have been migrated to use p-adj. This page still sorts the protein table by raw p-value.

Proposed fix
-        st.dataframe(pivot_df.sort_values("p-value"), use_container_width=True)
+        st.dataframe(pivot_df.sort_values("p-adj"), use_container_width=True)
🤖 Prompt for AI Agents
In `@content/results_proteomicslfq.py` at line 56, The table is being sorted by
"p-value" but should use adjusted p-values; update the call that renders the
dataframe (st.dataframe(...)) to sort pivot_df by "p-adj" instead of "p-value"
(i.e., change pivot_df.sort_values("p-value") to pivot_df.sort_values("p-adj")
when rendering), and ensure pivot_df contains the "p-adj" column or fall back
gracefully if missing (check pivot_df.columns or use
pivot_df.sort_values(by=["p-adj"] if "p-adj" in pivot_df.columns else
["p-value"])).

Comment on lines +264 to +270
    if not stats_df.empty:
        mask = stats_df["p-value"].notna()
        if mask.any():
            _, p_adj, _, _ = multipletests(stats_df.loc[mask, "p-value"], method="fdr_bh")
            stats_df.loc[mask, "p-adj"] = p_adj
        else:
            stats_df["p-adj"] = np.nan

⚠️ Potential issue | 🟡 Minor

FDR correction logic is well-implemented.

Proper NaN handling with the mask, correct multipletests unpacking, and the fdr_bh method is the standard Benjamini–Hochberg choice.

One edge case: if stats_df is empty (no protein groups), the p-adj column is never created, and line 296 will raise a KeyError. While unlikely given the earlier guards, consider adding a fallback:

Proposed defensive addition
     if not stats_df.empty:
         mask = stats_df["p-value"].notna()
         if mask.any():
             _, p_adj, _, _ = multipletests(stats_df.loc[mask, "p-value"], method="fdr_bh")
             stats_df.loc[mask, "p-adj"] = p_adj
         else:
             stats_df["p-adj"] = np.nan
+    else:
+        stats_df["p-adj"] = pd.Series(dtype=float)
🤖 Prompt for AI Agents
In `@src/common/results_helpers.py` around lines 264 - 270, The issue is that when
stats_df is empty the "p-adj" column is never created and later access raises a
KeyError; fix it by ensuring "p-adj" exists regardless of emptiness—before the
existing if not stats_df.empty block (or in an else branch for the empty case)
initialize stats_df["p-adj"] = np.nan (using the same numpy alias) so that
subsequent code referencing stats_df["p-adj"] (and functions using stats_df,
mask, multipletests) always finds the column.

Comment on lines +815 to +828

        # ======================================================
        # ⚠️ 5️⃣ GO Enrichment Analysis (INLINE IN EXECUTION)
        # ======================================================
        workspace_path = Path(self.workflow_dir).parent
        res = get_abundance_data(workspace_path)
        if res is not None:
            pivot_df, _, _ = res
            self.logger.log("✅ pivot_df loaded, starting GO enrichment...")
            self._run_go_enrichment(pivot_df, results_dir)
        else:
            st.warning("GO enrichment skipped: abundance data not available.")

        # ================================
        # 5️⃣ Final report

⚠️ Potential issue | 🟡 Minor

Duplicate section numbering: two sections labeled "5️⃣".

Line 816 labels GO enrichment as step 5️⃣, and line 828 labels the final report as step 5️⃣ as well. Consider renumbering the final report to 6️⃣.

🤖 Prompt for AI Agents
In `@src/WorkflowTest.py` around lines 815 - 828, The inline GO enrichment block
and the subsequent final report comment both use the "5️⃣" section number;
update the final report section marker to "6️⃣" to avoid duplicate
numbering—locate the comment preceding the final report (near the block that
follows workspace_path/get_abundance_data and the call to
self._run_go_enrichment) and change "5️⃣ Final report" to "6️⃣ Final report" so
section numbers are unique.

Comment on lines +856 to +859
                fg_ids = analysis_df[
                    (analysis_df["p-value"] < p_cutoff) &
                    (analysis_df["log2FC"].abs() >= fc_cutoff)
                ]["UniProt"].dropna().astype(str).unique().tolist()

⚠️ Potential issue | 🟠 Major

GO enrichment foreground selection still uses raw p-value — should this use p-adj?

The PR's stated goal is to upgrade from raw p-values to FDR-adjusted p-values. However, the foreground gene set for GO enrichment at line 857 still filters on analysis_df["p-value"] < p_cutoff. Since pivot_df now contains p-adj, consider using it here for consistency:

Proposed fix
-                fg_ids = analysis_df[
-                    (analysis_df["p-value"] < p_cutoff) &
-                    (analysis_df["log2FC"].abs() >= fc_cutoff)
-                ]["UniProt"].dropna().astype(str).unique().tolist()
+                fg_ids = analysis_df[
+                    (analysis_df["p-adj"] < p_cutoff) &
+                    (analysis_df["log2FC"].abs() >= fc_cutoff)
+                ]["UniProt"].dropna().astype(str).unique().tolist()

Also update line 840 to include p-adj in the dropna subset:

-        analysis_df = pivot_df.dropna(subset=["p-value", "log2FC"]).copy()
+        analysis_df = pivot_df.dropna(subset=["p-adj", "log2FC"]).copy()
🤖 Prompt for AI Agents
In `@src/WorkflowTest.py` around lines 856 - 859, The foreground selection for GO
enrichment still filters on analysis_df["p-value"] when the PR intends to use
FDR-adjusted p-values; update the filter in the fg_ids assignment to use the
adjusted p-value column (e.g., "p-adj") instead of "p-value" while still
applying the log2FC cutoff (fc_cutoff) and p_cutoff threshold, and also ensure
the earlier dropna call that constructs pivot_df includes "p-adj" in its subset
so p-adj values are present for filtering; locate and modify the fg_ids
computation and the prior dropna call (references: variable fg_ids, dataframe
analysis_df, thresholds p_cutoff and fc_cutoff, and the "p-adj" column)
accordingly.

Comment on lines +870 to +872
                    res_list = mg.querymany(
                        bg_ids, scopes="uniprot", fields="go", as_dataframe=False
                    )

⚠️ Potential issue | 🟠 Major

External API call to MyGene.info has no timeout or error handling.

mg.querymany(...) is a network call. If the service is slow or unreachable, this will hang the workflow indefinitely. Wrap it in a try/except and consider setting a timeout.

Proposed fix
-                    res_list = mg.querymany(
-                        bg_ids, scopes="uniprot", fields="go", as_dataframe=False
-                    )
+                    try:
+                        res_list = mg.querymany(
+                            bg_ids, scopes="uniprot", fields="go", as_dataframe=False
+                        )
+                    except Exception as e:
+                        self.logger.log(f"❗ MyGene.info API error: {e}")
+                        st.error("Failed to fetch GO terms from MyGene.info. Please try again later.")
+                        return
🤖 Prompt for AI Agents
In `@src/WorkflowTest.py` around lines 870 - 872, The mygene query
mg.querymany(bg_ids, scopes="uniprot", fields="go", as_dataframe=False) is a
network call with no timeout or error handling; wrap this call in a try/except
around the mg.querymany invocation (the call that uses bg_ids and returns
res_list) and add a timeout parameter if supported by the client (or call via a
requests.Session with timeout) so the call fails fast; on exception, log the
error (including exception details) and implement a clear fallback (e.g., return
an empty res_list or re-raise a custom exception) so the workflow does not hang
indefinitely.
