diff --git a/component-readiness.md b/component-readiness.md
new file mode 100644
index 0000000000..ccfee68a64
--- /dev/null
+++ b/component-readiness.md
@@ -0,0 +1,1567 @@
+# Component Readiness: Complete System Documentation
+
+This document describes the Component Readiness system: its data flows, database schemas, queries, and implementation details.
+
+## Overview
+
+Component Readiness is a statistical analysis system that:
+- Compares test pass rates between OpenShift releases (base vs. sample)
+- Detects statistically significant regressions using Fisher's Exact Test
+- Tracks regressions and associates them with bug triages
+- Provides drill-down capabilities from component-level to job-level analysis
+
+**Data Sources:**
+- **BigQuery**: Historical test execution data and metadata (read-only)
+- **PostgreSQL**: Regression tracking, triage management, and audit logs (read-write)
+- **Redis Cache**: Query result caching (1-4 hour TTL)
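+
+To make the regression-detection step concrete, here is a minimal sketch of a one-sided Fisher's Exact Test over base and sample pass/fail counts. The function names and the choice of the left tail are assumptions made for illustration; Sippy's actual implementation may differ.
+
+```go
+package main
+
+import (
+    "fmt"
+    "math"
+)
+
+// logChoose returns ln(C(n, k)) via the log-gamma function.
+func logChoose(n, k int) float64 {
+    a, _ := math.Lgamma(float64(n + 1))
+    b, _ := math.Lgamma(float64(k + 1))
+    c, _ := math.Lgamma(float64(n - k + 1))
+    return a - b - c
+}
+
+// fisherLeftTail (hypothetical helper) returns the probability of seeing
+// sampleSuccess or fewer passes in the sample if base and sample shared a
+// single underlying pass rate (hypergeometric left tail).
+func fisherLeftTail(baseSuccess, baseFailure, sampleSuccess, sampleFailure int) float64 {
+    baseTotal := baseSuccess + baseFailure
+    sampleTotal := sampleSuccess + sampleFailure
+    successes := baseSuccess + sampleSuccess
+    total := baseTotal + sampleTotal
+
+    // Fewest sample passes possible given the fixed margins.
+    lo := successes - baseTotal
+    if lo < 0 {
+        lo = 0
+    }
+    p := 0.0
+    for k := lo; k <= sampleSuccess; k++ {
+        p += math.Exp(logChoose(successes, k) +
+            logChoose(total-successes, sampleTotal-k) -
+            logChoose(total, sampleTotal))
+    }
+    return p
+}
+
+func main() {
+    // Base: 98 passes, 2 failures. Sample: 85 passes, 15 failures.
+    p := fisherLeftTail(98, 2, 85, 15)
+    fmt.Printf("p-value: %.5f, significant at 0.05: %v\n", p, p < 0.05)
+}
+```
+
+A small p-value means the sample's lower pass rate is unlikely to be chance alone, which is the signal the report uses to flag a regression.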
+
+---
+
+## Sequence Diagram: Component Report Generation
+
+```mermaid
+sequenceDiagram
+ participant Client as Frontend Client
+ participant API as API Handler<br/>(component_report.go)
+ participant Cache as Redis Cache
+ participant Generator as Report Generator
+ participant BQ as BigQuery
+ participant PG as PostgreSQL
+
+ Note over BQ: Tables: junit, jobs,<br/>job_variants, component_mapping,<br/>job_labels
+ Note over PG: Tables: test_regressions,<br/>triages, triage_regressions
+
+ Client->>API: GET /api/component_readiness<br/>?baseRelease=4.15&sampleRelease=4.16<br/>&variants[Platform]=aws
+
+ API->>Cache: Check cache key<br/>ComponentReport~4.15~4.16~Platform:aws~...
+
+ alt Cache Hit
+ Cache-->>API: Return cached report
+ API->>PG: Query test_regressions<br/>WHERE release='4.16'
+ PG-->>API: Regression records + triages
+ Note over API: PostAnalysis middleware:<br/>- Inject regression data<br/>- Add test_details links<br/>- Update cell status
+ API-->>Client: Component Report (JSON)
+ else Cache Miss
+ Cache-->>API: Cache miss
+
+ API->>Generator: GetComponentReportFromBigQuery()
+
+ par Parallel BigQuery Queries
+ Generator->>BQ: getBaseQueryStatus()<br/>SELECT test counts, variants<br/>FROM junit j<br/>INNER JOIN component_mapping cm<br/>LEFT JOIN job_variants jv_*<br/>WHERE jv_Release='4.15'<br/>GROUP BY test_id, variants
+ BQ-->>Generator: Base test results<br/>(pass/fail/flake counts)
+
+ Generator->>BQ: getSampleQueryStatus()<br/>SELECT test counts, variants<br/>FROM junit j<br/>INNER JOIN component_mapping cm<br/>LEFT JOIN job_variants jv_*<br/>WHERE jv_Release='4.16'<br/>GROUP BY test_id, variants
+ BQ-->>Generator: Sample test results<br/>(pass/fail/flake counts)
+ end
+
+ Note over Generator: Merge base + sample results<br/>by test_id + variants
+
+ Generator->>Generator: generateComponentTestReport()<br/>- PreAnalysis middleware<br/>- Fisher's Exact Test<br/>- Pass rate comparison<br/>- Detect significant regressions
+
+ Note over Generator: Statistical Analysis:<br/>- p-value < 0.02 → Extreme<br/>- p-value < 0.05 → Significant<br/>- Pass rate drop → Regression<br/>- Missing basis/sample handling
+
+ Generator->>Cache: Store ComponentReport<br/>TTL: configured cache duration
+
+ Generator-->>API: ComponentReport (grid data)
+
+ API->>PG: Query test_regressions<br/>WHERE release='4.16'
+ PG-->>API: Regression records + triages
+
+ Note over API: PostAnalysis middleware:<br/>- Inject regression data<br/>- Add test_details links<br/>- Update cell status with triage info
+
+ API-->>Client: Component Report (JSON)
+ end
+```
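+
+The cache-hit and cache-miss branches in the diagram above follow a cache-aside pattern. The sketch below illustrates it with an in-memory map standing in for Redis; `reportCache`, `cacheKey`, and `getReport` are hypothetical names, and the `~`-joined key only mimics the key shape shown in the diagram.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// reportCache is an in-memory stand-in for the Redis cache (assumption).
+type reportCache map[string]string
+
+// cacheKey joins the request parts with '~', mirroring the diagram's key shape.
+func cacheKey(prefix, base, sample string, variants []string) string {
+    parts := append([]string{prefix, base, sample}, variants...)
+    return strings.Join(parts, "~")
+}
+
+// getReport returns the cached report on a hit; on a miss it calls generate
+// (the expensive BigQuery path) and stores the result before returning it.
+func getReport(c reportCache, key string, generate func() string) (report string, hit bool) {
+    if r, ok := c[key]; ok {
+        return r, true
+    }
+    r := generate()
+    c[key] = r
+    return r, false
+}
+
+func main() {
+    cache := reportCache{}
+    key := cacheKey("ComponentReport", "4.15", "4.16", []string{"Platform:aws"})
+    generate := func() string { return "report built from BigQuery" }
+
+    _, hit := getReport(cache, key, generate)
+    fmt.Println("first request hit:", hit)
+    _, hit = getReport(cache, key, generate)
+    fmt.Println("second request hit:", hit)
+}
+```
+
+In both branches of the diagram the PostgreSQL regression lookup still runs after the report is obtained, so triage data is always read fresh rather than served from the cache.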
+
+---
+
+## Sequence Diagram: Test Details Drill-Down
+
+```mermaid
+sequenceDiagram
+ participant Client as Frontend Client
+ participant API as API Handler<br/>(test_details.go)
+ participant Cache as Redis Cache
+ participant Generator as Details Generator
+ participant BQ as BigQuery
+ participant PG as PostgreSQL
+
+ Client->>API: GET /api/component_readiness/test_details<br/>?baseRelease=4.15&sampleRelease=4.16<br/>&testId=openshift-tests:abc123
+
+ API->>Cache: Check cache key<br/>TestDetails~4.15~4.16~testId:abc123~...
+
+ alt Cache Hit
+ Cache-->>API: Return cached test details
+ API-->>Client: Test Details Report (JSON)
+ else Cache Miss
+ Cache-->>API: Cache miss
+
+ API->>Generator: GetTestDetails()
+
+ par Parallel BigQuery Queries
+ Generator->>BQ: getBaseJobRunTestStatus()<br/>SELECT prowjob_name, prowjob_url,<br/>build_id, prowjob_start,<br/>test counts, variants<br/>FROM junit j<br/>INNER JOIN jobs ON j.prowjob_build_id<br/>INNER JOIN component_mapping cm<br/>LEFT JOIN job_variants jv_*<br/>WHERE jv_Release='4.15'<br/>AND test_id='openshift-tests:abc123'<br/>GROUP BY prowjob_name, variants
+ BQ-->>Generator: Base job runs<br/>(per-job test results)
+
+ Generator->>BQ: getSampleJobRunTestStatus()<br/>SELECT prowjob_name, prowjob_url,<br/>build_id, prowjob_start,<br/>test counts, variants<br/>FROM junit j<br/>INNER JOIN jobs ON j.prowjob_build_id<br/>INNER JOIN component_mapping cm<br/>LEFT JOIN job_variants jv_*<br/>WHERE jv_Release='4.16'<br/>AND test_id='openshift-tests:abc123'<br/>GROUP BY prowjob_name, variants
+ BQ-->>Generator: Sample job runs<br/>(per-job test results)
+ end
+
+ Note over Generator: Merge job runs by prowjob_name
+
+ Generator->>Generator: GenerateDetailsReportForTest()<br/>- Group by job name<br/>- Calculate Fisher's Exact Test per job<br/>- Determine job-level status<br/>- Handle base release fallback
+
+ Note over Generator: Per-Job Analysis:<br/>- Aggregate pass/fail/flake<br/>- Fisher's test significance<br/>- Include prowjob URLs<br/>- Sort by regression severity
+
+ Generator->>Cache: Store TestDetails<br/>TTL: configured cache duration
+
+ Generator-->>API: TestDetails Report
+
+ API-->>Client: Test Details Report (JSON)
+ end
+```
+
+---
+
+## Sequence Diagram: Regression Tracking Flow
+
+```mermaid
+sequenceDiagram
+ participant Daemon as RegressionTracker<br/>Daemon
+ participant API as Component Readiness<br/>API
+ participant PG as PostgreSQL
+ participant Jira as Jira API
+
+ Note over Daemon: Runs periodically<br/>(e.g., every 30 minutes)
+
+ Daemon->>API: Generate component report<br/>for each tracked release
+ API-->>Daemon: ComponentReport with<br/>detected regressions
+
+ loop For each regression in report
+ Daemon->>PG: SELECT FROM test_regressions<br/>WHERE release + test_id + variants
+
+ alt Regression exists
+ PG-->>Daemon: Existing regression record
+ Daemon->>Daemon: UpdateRegression()<br/>- Update last_failure<br/>- Update max_failures<br/>- Check if resolved
+
+ alt Still failing
+ Daemon->>PG: UPDATE test_regressions<br/>SET last_failure, max_failures
+ else Resolved (no longer regressed)
+ Daemon->>PG: UPDATE test_regressions<br/>SET closed = NOW()
+ Daemon->>PG: SELECT triages<br/>WHERE regression_id
+ Daemon->>Daemon: ResolveTriages()<br/>if all regressions closed
+ Daemon->>PG: UPDATE triages<br/>SET resolved = NOW(),<br/>resolution_reason = 'regressions-rolled-off'
+ end
+ else New regression
+ Daemon->>Daemon: OpenRegression()<br/>- 5-day hysteresis check<br/>- Create regression record
+ Daemon->>PG: INSERT INTO test_regressions<br/>(release, test_id, variants, opened, ...)
+
+ opt Auto-triage matching
+ Daemon->>PG: Find matching triages<br/>by test_id pattern
+ Daemon->>PG: INSERT INTO triage_regressions<br/>(triage_id, regression_id)
+ end
+ end
+ end
+
+ Note over Daemon: Jira integration (optional)
+
+ opt User creates triage with "file bug" option
+ Daemon->>Jira: Create/Update Jira issue<br/>with regression details
+ Jira-->>Daemon: Bug ID
+ Daemon->>PG: UPDATE triages<br/>SET bug_id
+ end
+```
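+
+The branching in the tracking loop above reduces to a small decision table. The following sketch captures it under simplified assumptions: `tracked` means an open record already exists in `test_regressions`, and `regressed` means the latest report still flags the test. The type and function names are hypothetical and do not come from the Sippy codebase.
+
+```go
+package main
+
+import "fmt"
+
+// action is the step the tracker would take for one test + variant combination.
+type action int
+
+const (
+    noop action = iota
+    openRegression
+    updateRegression
+    closeRegression
+)
+
+// decide maps (existing record, current report status) to a tracker action.
+func decide(tracked, regressed bool) action {
+    switch {
+    case tracked && regressed:
+        return updateRegression // refresh last_failure and max_failures
+    case tracked && !regressed:
+        return closeRegression // set closed, then check linked triages
+    case !tracked && regressed:
+        return openRegression // insert a new record
+    default:
+        return noop
+    }
+}
+
+func main() {
+    fmt.Println(decide(true, true) == updateRegression)
+    fmt.Println(decide(false, true) == openRegression)
+}
+```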
+
+---
+
+## BigQuery Data Sources
+
+### Configuration
+**File:** `pkg/flags/bigquery.go:27-31`
+
+**Primary BigQuery Project:**
+- **Project:** `openshift-gce-devel` (configurable via `--bigquery-project`)
+- **Datasets:**
+ - `ci_analysis_us` (default, OCP Engineering CI data)
+ - `ci_analysis_qe` (QE CI data, used by QE Component Readiness)
+
+**Releases Table (Different Project):**
+- **Table:** `openshift-ci-data-analysis.ci_data.Releases`
+- **Project:** `openshift-ci-data-analysis` (different from main project)
+- **Dataset:** `ci_data` (contains release metadata)
+
+**BigQuery Architecture:**
+
+Component Readiness queries data from **two different BigQuery projects**:
+
+1. **`openshift-gce-devel`** (Sippy's main project)
+ - `ci_analysis_us` dataset - OCP Engineering CI test results (junit, jobs, job_variants, component_mapping, job_labels)
+ - `ci_analysis_qe` dataset - QE CI test results (same schema)
+ - Used by: Component Readiness API, cache-primer, variant-registry-sync, fetchdata
+
+2. **`openshift-ci-data-analysis`** (job-run-aggregator's project)
+ - `ci_data` dataset - Raw CI job data ingested from GCS (jobs, junit, disruptions, alerts, Releases)
+ - Used by: job-table-updater, disruption-uploader, alert-uploader
+ - Sippy only queries the `Releases` table from this project
+
+**Data Flow Between Projects:**
+```
+GCS Artifacts → job-run-aggregator → openshift-ci-data-analysis.ci_data
+ ↓
+ fetchdata copies to → openshift-gce-devel.ci_analysis_us
+ ↓
+ Component Readiness queries
+```
+
+---
+
+### Table 1: `junit`
+**Purpose:** Core test case execution results
+
+**File:** `pkg/api/componentreadiness/query/querygenerators.go:50-91`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `prowjob_build_id` | STRING | Links to jobs table; primary join key | All queries - `querygenerators.go:70` |
+| `file_path` | STRING | Test file path | Deduplication logic - `querygenerators.go:54` |
+| `test_name` | STRING | Test case name | Test identification, joins with component_mapping - `querygenerators.go:55` |
+| `testsuite` | STRING | Test suite identifier | Joins with component_mapping.suite - `querygenerators.go:56` |
+| `success_val` | INT | Test success indicator (0=fail, 1=pass) | Pass/fail counting - `querygenerators.go:57-60` |
+| `flake_count` | INT | Number of flake occurrences | Flake aggregation - `querygenerators.go:61` |
+| `modified_time` | DATETIME | Record timestamp | Time range filtering - `querygenerators.go:75-76` |
+| `skipped` | BOOL | Test was skipped | Excludes skipped tests - `querygenerators.go:66` |
+
+**Key Query Pattern:** De-duplication CTE
+```sql
+-- File: pkg/api/componentreadiness/query/querygenerators.go:50-91
+-- Handles:
+-- - Test retries (multiple failures) → keep first failure
+-- - Duplicate successes (OCPBUGS-16039) → count once
+-- - Flake detection → aggregate flake_count
+WITH deduped_testcases AS (
+ SELECT
+ file_path,
+ test_name,
+ testsuite,
+ SUM(CASE WHEN success_val = 1 THEN 1 ELSE 0 END) AS success_count,
+ SUM(CASE WHEN success_val = 0 THEN 1 ELSE 0 END) AS failure_count,
+ SUM(flake_count) AS flake_count,
+ prowjob_build_id
+ FROM `{dataset}.junit` j
+ LEFT JOIN `{dataset}.job_labels` jl
+ ON j.prowjob_build_id = jl.prowjob_build_id
+ AND jl.label = 'InfraFailure'
+ WHERE j.skipped = false
+ AND jl.label IS NULL -- Exclude infra failures
+ AND j.modified_time BETWEEN @From AND @To
+ GROUP BY file_path, test_name, testsuite, prowjob_build_id
+)
+```
+
+---
+
+### Table 2: `jobs`
+**Purpose:** Prow job run metadata
+
+**File:** `pkg/api/componentreadiness/query/querygenerators.go:175`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `prowjob_build_id` | STRING | Primary key, join to junit | All queries - `querygenerators.go:175` |
+| `prowjob_job_name` | STRING | Job name for variant lookups | Variant joins - `querygenerators.go:296-304` |
+| `prowjob_start` | DATETIME | Job start time | Time filtering - `querygenerators.go:183-185` |
+| `prowjob_url` | STRING | Link to job artifacts | Test details - `querygenerators.go:465` |
+| `org` | STRING | GitHub organization | PR filtering - `querygenerators.go:100-115` |
+| `repo` | STRING | GitHub repository | PR filtering - `querygenerators.go:100-115` |
+| `pr_number` | INT | Pull request number | PR filtering - `querygenerators.go:100-115` |
+| `pr_sha` | STRING | Pull request SHA | PR filtering - `querygenerators.go:100-115` |
+| `prowjob_annotations` | ARRAY | Job annotations | PR release name extraction - `querygenerators.go:100-115` |
+
+**Key Query Pattern:** Job Name Resolution for PRs
+```sql
+-- File: pkg/api/componentreadiness/query/querygenerators.go:100-115
+-- Extracts the release job name from presubmit annotations
+CASE
+ WHEN EXISTS (
+ SELECT 1 FROM UNNEST(jobs.prowjob_annotations) AS annotation
+ WHERE annotation LIKE 'releaseJobName=%%'
+ ) THEN (
+ SELECT SPLIT(SPLIT(annotation, 'releaseJobName=')[OFFSET(1)], ',')[SAFE_OFFSET(0)]
+ FROM UNNEST(jobs.prowjob_annotations) AS annotation
+ WHERE annotation LIKE 'releaseJobName=%%'
+ LIMIT 1
+ )
+ ELSE jobs.prowjob_job_name
+END AS variant_registry_job_name
+```
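+
+For readers more comfortable with Go than with BigQuery array functions, the same resolution logic can be sketched as a plain function. `variantRegistryJobName` is a hypothetical helper written for this document; it treats each annotation as a string such as `releaseJobName=<job>,<other data>`, which is what the SQL above implies.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// variantRegistryJobName returns the release job name embedded in the first
+// "releaseJobName=" annotation, or falls back to the Prow job name.
+func variantRegistryJobName(jobName string, annotations []string) string {
+    const prefix = "releaseJobName="
+    for _, a := range annotations {
+        if strings.HasPrefix(a, prefix) {
+            rest := strings.TrimPrefix(a, prefix)
+            // Keep only the text before the first comma, like SPLIT(..., ',')[SAFE_OFFSET(0)].
+            name, _, _ := strings.Cut(rest, ",")
+            return name
+        }
+    }
+    return jobName
+}
+
+func main() {
+    annotations := []string{"owner=team-a", "releaseJobName=periodic-ci-example-4.16-e2e-aws-ovn,extra"}
+    fmt.Println(variantRegistryJobName("pull-ci-example-e2e-aws-ovn", annotations))
+}
+```
+
+Resolving presubmit jobs to their release job name lets PR runs reuse the variants already registered for the equivalent periodic job.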
+
+---
+
+### Table 3: `job_variants`
+**Purpose:** Maps job names to variant configurations
+
+**File:** `pkg/api/componentreadiness/query/querygenerators.go:296-304`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `job_name` | STRING | Prow job name | Joins with jobs.prowjob_job_name - `querygenerators.go:298` |
+| `variant_name` | STRING | Variant category | Dynamic variant columns - `querygenerators.go:299` |
+| `variant_value` | STRING | Variant value | Filtering and grouping - `querygenerators.go:156-159` |
+
+**Common Variants:**
+- `Platform` (aws, gcp, azure, metal, vsphere, etc.)
+- `Architecture` (amd64, arm64, ppc64le, s390x)
+- `Network` (ovn, sdn, kuryr)
+- `Topology` (ha, single)
+- `FeatureSet` (default, techpreview)
+- `Upgrade` (upgrade, micro, minor)
+- `Suite` (serial, parallel, disruptive)
+- `Installer` (ipi, upi)
+- `Release` (4.15, 4.16, etc.)
+
+**Key Query Patterns:**
+
+**Dynamic Variant Joins:**
+```sql
+-- File: pkg/api/componentreadiness/query/querygenerators.go:296-304
+-- One LEFT JOIN per variant for dynamic column generation
+LEFT JOIN `{dataset}.job_variants` jv_Platform
+ ON variant_registry_job_name = jv_Platform.job_name
+ AND jv_Platform.variant_name = 'Platform'
+LEFT JOIN `{dataset}.job_variants` jv_Architecture
+ ON variant_registry_job_name = jv_Architecture.job_name
+ AND jv_Architecture.variant_name = 'Architecture'
+-- ... (one join per variant)
+```
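+
+Because there is one join per variant, the clauses are naturally generated in a loop. The sketch below shows one way to do that; `variantJoins` is a hypothetical function, and it assumes the variant names come from trusted configuration, since they are interpolated directly into the SQL text.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// variantJoins emits one LEFT JOIN against job_variants per variant name,
+// aliasing each join as jv_<Variant> so it can be selected as its own column.
+func variantJoins(dataset string, variants []string) string {
+    var b strings.Builder
+    for _, v := range variants {
+        fmt.Fprintf(&b, "LEFT JOIN `%s.job_variants` jv_%s\n", dataset, v)
+        fmt.Fprintf(&b, "  ON variant_registry_job_name = jv_%s.job_name\n", v)
+        fmt.Fprintf(&b, "  AND jv_%s.variant_name = '%s'\n", v, v)
+    }
+    return b.String()
+}
+
+func main() {
+    fmt.Print(variantJoins("ci_analysis_us", []string{"Platform", "Architecture"}))
+}
+```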
+
+**Variant Enumeration:**
+```sql
+-- File: pkg/api/componentreadiness/component_report.go:303-354
+-- Gets all available variant names and values for UI filters
+SELECT variant_name, ARRAY_AGG(DISTINCT variant_value ORDER BY variant_value) AS variant_values
+FROM `{dataset}.job_variants`
+WHERE variant_value != ""
+GROUP BY variant_name
+```
+
+---
+
+### Table 4: `component_mapping`
+**Purpose:** Maps tests to components and capabilities
+
+**File:** `pkg/api/componentreadiness/query/querygenerators.go:137-158`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `id` | STRING | Unique test identifier (test_id) | Test identification - `querygenerators.go:145` |
+| `suite` | STRING | Test suite, joins with junit.testsuite | Test matching - `querygenerators.go:177` |
+| `name` | STRING | Test name, joins with junit.test_name | Test matching - `querygenerators.go:177` |
+| `component` | STRING | OpenShift component owning the test | Grouping and filtering - `querygenerators.go:146` |
+| `capabilities` | ARRAY | Test capabilities | Filtering - `querygenerators.go:208-210, 147` |
+| `jira_component` | STRING | Jira component name | UI display - `querygenerators.go:148` |
+| `jira_component_id` | STRING | Jira component ID | UI linking - `querygenerators.go:149` |
+| `staff_approved_obsolete` | BOOL | Test is obsolete | Exclusion filter - `querygenerators.go:193` |
+| `lifecycle` | STRING | Test lifecycle (blocking, informing) | Filtering - `querygenerators.go:211-213` |
+| `created_at` | DATETIME | Record creation time | Latest version selection - `querygenerators.go:137-143` |
+
+**Key Query Pattern:** Latest Mapping Version
+```sql
+-- File: pkg/api/componentreadiness/query/querygenerators.go:137-143
+-- Uses MAX(created_at) to get the most recent mapping version
+WITH latest_component_mapping AS (
+ SELECT * FROM `{dataset}.component_mapping` cm
+ WHERE created_at = (
+ SELECT MAX(created_at) FROM `{dataset}.component_mapping`
+ )
+)
+```
+
+---
+
+### Table 5: `job_labels`
+**Purpose:** Job run labels/annotations for filtering
+
+**File:** `pkg/api/componentreadiness/query/querygenerators.go:85-89`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `prowjob_build_id` | STRING | Links to junit and jobs | Infra failure filtering - `querygenerators.go:85` |
+| `prowjob_start` | DATETIME | Job start time | Partition filtering |
+| `label` | STRING | Label name | InfraFailure detection - `querygenerators.go:87` |
+
+**Key Query Pattern:** Infra Failure Exclusion
+```sql
+-- File: pkg/api/componentreadiness/query/querygenerators.go:85-89
+-- Excludes job runs marked as infrastructure failures
+LEFT JOIN `{dataset}.job_labels` jl
+ ON j.prowjob_build_id = jl.prowjob_build_id
+ AND jl.label = 'InfraFailure'
+WHERE jl.label IS NULL -- Exclude infrastructure failures
+```
+
+---
+
+### BigQuery Query Types
+
+#### Query Type 1: Component Report (Test Summary)
+**File:** `pkg/api/componentreadiness/query/querygenerators.go:316-423`
+
+**Purpose:** Aggregates test pass/fail/flake statistics grouped by component, capability, and variants
+
+**Key Columns Selected:**
+- `test_id`, `test_name`, `test_suite`
+- `component`, `capabilities`, `jira_component`, `jira_component_id`
+- Dynamic variant columns (Platform, Architecture, Network, etc.)
+- `total_count`, `success_count`, `flake_count`
+- `MAX(modified_time) AS last_failure`
+
+**Grouping:** `test_id` + all variant columns
+
+**Filters:**
+- Time range: `prowjob_start BETWEEN @From AND @To`
+- Release: `jv_Release.variant_value = @Release`
+- Job patterns: `prowjob_job_name` matches any of `LIKE '%periodic%'`, `LIKE '%release%'`, or `LIKE '%aggregator%'`
+- Exclude obsolete: `staff_approved_obsolete = false`
+- Optional: capability, lifecycle, variant filters
+
+**Label:** `bqlabel.CRJunitBase` / `bqlabel.CRJunitSample`
+
+---
+
+#### Query Type 2: Test Details (Job-Level Breakdown)
+**File:** `pkg/api/componentreadiness/query/querygenerators.go:429-523`
+
+**Purpose:** Detailed test results per job run for drill-down analysis
+
+**Key Columns Selected:**
+- Same as Query Type 1, plus:
+- `file_path`
+- `prowjob_url`
+- `prowjob_build_id`
+- `prowjob_start`
+
+**Grouping:** `test_id` + variants + `file_path` + `modified_time`
+
+**Ordering:** `modified_time DESC` (most recent first)
+
+**Additional Filter:** `test_id = @TestId`
+
+**Label:** `bqlabel.TDJunitBase` / `bqlabel.TDJunitSample`
+
+---
+
+#### Query Type 3: Variant Enumeration
+**File:** `pkg/api/componentreadiness/component_report.go:303-354`
+
+**Purpose:** Get all available variant names and values for UI filters
+
+**Label:** `bqlabel.CRJobVariants`
+
+---
+
+#### Query Type 4: Column Value Enumeration
+**File:** `pkg/api/componentreadiness/component_report.go:1138-1159`
+
+**Purpose:** Get unique values for cache validation (last 60 days)
+
+**Query Pattern:**
+```sql
+SELECT DISTINCT {field} as name
+FROM `{dataset}.junit`
+WHERE modified_time > DATETIME_SUB(CURRENT_DATETIME(), INTERVAL 60 DAY)
+ORDER BY name
+```
+
+**Label:** `bqlabel.CRJunitColumnCount`
+
+---
+
+#### Query Type 5: Release Dates
+**File:** `pkg/api/componentreadiness/query/releasedates.go:28-44`
+
+**Purpose:** Get GA dates for releases to determine time ranges
+
+**Table:** Configured via `--bigquery-releases-table`
+
+**Columns:** Release name and GA date
+
+---
+
+## PostgreSQL Data Sources
+
+### Table 1: `test_regressions`
+**Purpose:** Tracks detected test regressions over time
+
+**File:** `pkg/db/models/triage.go:207-230`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `id` | SERIAL | Primary key | All queries |
+| `release` | VARCHAR | Target release (e.g., 4.16) | Filtering - `triage_queries.go:24,36` |
+| `base_release` | VARCHAR | Comparison release (e.g., 4.15) | Metadata |
+| `component` | VARCHAR | OpenShift component | Grouping and display |
+| `capability` | VARCHAR | Test capability | Grouping and display |
+| `test_id` | VARCHAR NOT NULL | Test identifier from BigQuery | Matching - `regressiontracker.go:72` |
+| `test_name` | VARCHAR | Human-readable test name | Display - `regressiontracker.go:74` |
+| `variants` | TEXT[] | Variant key:value pairs | Filtering - `regressiontracker.go:80-84` |
+| `opened` | TIMESTAMP NOT NULL | When regression first detected | Tracking - `regressiontracker.go:75` |
+| `closed` | TIMESTAMP NULL | When regression resolved | Status tracking - `regressiontracker.go:398-403` |
+| `last_failure` | TIMESTAMP NULL | Most recent failure time | Staleness indicator - `regressiontracker.go:346-350` |
+| `max_failures` | INT | Maximum failures seen | Severity indicator - `regressiontracker.go:342-344` |
+
+**Indexes:**
+- `idx_test_regression_release` on `release`
+- `idx_test_regression_test_name` on `test_name`
+
+**Key Queries:**
+
+**List Open Regressions:**
+```go
+// File: pkg/db/query/triage_queries.go:18-30
+dbc.DB.Model(&models.TestRegression{}).
+    Preload("Triages.Bug").
+    Where("test_regressions.release = ?", release).
+    Where("test_regressions.closed IS NULL").
+    Find(&openRegressions)
+```
+
+**List Current Regressions (with hysteresis):**
+```go
+// File: pkg/api/componentreadiness/regressiontracker.go:55-64
+dbc.DB.Table(testRegressionsTable).
+    Where("release = ?", release).
+    Where("closed IS NULL OR closed > ?", time.Now().Add(-5*24*time.Hour)).
+    Scan(&regressions)
+```
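+
+The hysteresis condition in that query can be expressed as a small predicate, which makes the five-day window easy to test in isolation. `isCurrent` is a hypothetical helper for illustration: a regression counts as current while it is open, or if it closed within the window.
+
+```go
+package main
+
+import (
+    "fmt"
+    "time"
+)
+
+// isCurrent reports whether a regression is open (closed == nil) or was
+// closed more recently than window before now.
+func isCurrent(closed *time.Time, now time.Time, window time.Duration) bool {
+    return closed == nil || closed.After(now.Add(-window))
+}
+
+func main() {
+    now := time.Now()
+    window := 5 * 24 * time.Hour
+    recentlyClosed := now.Add(-48 * time.Hour)
+    longClosed := now.Add(-6 * 24 * time.Hour)
+
+    fmt.Println(isCurrent(nil, now, window))
+    fmt.Println(isCurrent(&recentlyClosed, now, window))
+    fmt.Println(isCurrent(&longClosed, now, window))
+}
+```
+
+Keeping recently closed regressions in scope means a test that flaps back into failure within the window can reuse its existing record instead of opening a new one.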
+
+**Create Regression:**
+```go
+// File: pkg/api/componentreadiness/regressiontracker.go:66-98
+prs.dbc.DB.Create(&models.TestRegression{
+ Release: release,
+ TestID: testID,
+ TestName: testName,
+ Opened: time.Now(),
+ Variants: variants,
+ MaxFailures: failures,
+ // ...
+})
+```
+
+**Update Regression:**
+```go
+// File: pkg/api/componentreadiness/regressiontracker.go:319-427
+// Updates max_failures, last_failure, base_release, component, capability
+// Re-opens closed regressions if they regress again
+```
+
+---
+
+### Table 2: `triages`
+**Purpose:** Triage records linking test failures to bugs
+
+**File:** `pkg/db/models/triage.go:14-60`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `id` | SERIAL | Primary key | All queries |
+| `created_at` | TIMESTAMP | Record creation time | Audit trail |
+| `updated_at` | TIMESTAMP | Last update time | Audit trail |
+| `url` | VARCHAR NOT NULL | Bug URL (typically Jira) | Unique identifier - `triage.go:91` |
+| `description` | TEXT | Triage description | Display |
+| `type` | VARCHAR | ci-infra, product-infra, product, test | Categorization |
+| `bug_id` | INT NULL | Foreign key to bugs table | Jira metadata - `triage.go:244` |
+| `resolved` | TIMESTAMP NULL | When triage resolved | Status - `regressiontracker.go:149` |
+| `resolution_reason` | VARCHAR | user, regressions-rolled-off, jira-progression | Audit - `regressiontracker.go:150` |
+
+**Relationships:**
+- `Bug` (belongs to) - `pkg/db/models/triage.go:35`
+- `Regressions` (many-to-many via triage_regressions) - `pkg/db/models/triage.go:49`
+
+**Key Queries:**
+
+**List Triages:**
+```go
+// File: pkg/db/query/triage_queries.go:9-16
+dbc.DB.Preload("Bug").Preload("Regressions").Find(&triages)
+```
+
+**Create Triage:**
+```go
+// File: pkg/api/componentreadiness/triage.go:73-110
+// Transaction includes:
+// 1. Find or create bug by URL
+// 2. Create triage record
+// 3. Link regressions via many-to-many
+```
+
+**Update Triage:**
+```go
+// File: pkg/api/componentreadiness/triage.go:210-283
+// Transaction includes:
+// 1. Load existing triage
+// 2. Update bug association
+// 3. Replace regression associations
+// 4. Save triage
+```
+
+**Auto-Resolve Triages:**
+```go
+// File: pkg/api/componentreadiness/regressiontracker.go:108-160
+// Finds triages where all regressions are closed
+// Sets resolved timestamp and resolution_reason
+```
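+
+The auto-resolve rule above can be sketched as a predicate over a triage's linked regressions. The `regression` type and `canAutoResolve` function are hypothetical, and treating a triage with no linked regressions as not resolvable is an assumption of this sketch rather than documented behavior.
+
+```go
+package main
+
+import (
+    "fmt"
+    "time"
+)
+
+// regression holds only the field this check needs.
+type regression struct {
+    Closed *time.Time // nil while the regression is still open
+}
+
+// canAutoResolve reports whether every linked regression has been closed.
+// A triage with no regressions is left alone (assumption).
+func canAutoResolve(regs []regression) bool {
+    if len(regs) == 0 {
+        return false
+    }
+    for _, r := range regs {
+        if r.Closed == nil {
+            return false
+        }
+    }
+    return true
+}
+
+func main() {
+    closedAt := time.Now().Add(-24 * time.Hour)
+    fmt.Println(canAutoResolve([]regression{{Closed: &closedAt}, {Closed: nil}}))
+    fmt.Println(canAutoResolve([]regression{{Closed: &closedAt}}))
+}
+```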
+
+---
+
+### Table 3: `triage_regressions`
+**Purpose:** Many-to-many join table linking triages to test regressions
+
+**File:** `pkg/db/models/triage.go:49`
+
+| Column | Type | Purpose |
+|--------|------|---------|
+| `triage_id` | INT | Foreign key to triages |
+| `test_regression_id` | INT | Foreign key to test_regressions |
+
+**Cascade Behavior:** `OnDelete:CASCADE` - deleting a triage removes its association rows
+
+**Managed via GORM Association API:**
+```go
+// File: pkg/api/componentreadiness/triage.go:265
+txWithContext.Session(&gorm.Session{SkipHooks: true}).
+    Model(&triage).
+    Association("Regressions").
+    Replace(triage.Regressions)
+```
+
+---
+
+### Table 4: `audit_logs`
+**Purpose:** Audit trail for triage changes
+
+**File:** `pkg/db/models/audit.go:7-16`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `id` | SERIAL | Primary key | All queries |
+| `table_name` | VARCHAR NOT NULL | Always 'triage' | Filtering - `triage.go:568` |
+| `operation` | VARCHAR | CREATE, UPDATE, DELETE | Audit display - `triage.go:590-605` |
+| `row_id` | INT NOT NULL | Triage ID | Filtering - `triage.go:568` |
+| `old_data` | JSONB | Previous state | Diff display - `triage.go:616-666` |
+| `new_data` | JSONB | New state | Diff display - `triage.go:616-666` |
+| `user` | VARCHAR NOT NULL | User who made change | Audit display |
+| `created_at` | TIMESTAMP | When change occurred | Timeline |
+
+**Auto-Created via GORM Hooks:**
+- `AfterCreate()` - `pkg/db/models/triage.go:93-112`
+- `AfterUpdate()` - `pkg/db/models/triage.go:114-133`
+- `AfterDelete()` - `pkg/db/models/triage.go:135-148`
+
+**Key Query:**
+```go
+// File: pkg/api/componentreadiness/triage.go:565-572
+dbc.Where("table_name = 'triage' and row_id = ?", triageID).
+    Order("created_at DESC").
+    Find(&auditLogs)
+```
+
+---
+
+### Table 5: `bugs`
+**Purpose:** Jira bug metadata
+
+**File:** `pkg/db/models/prow.go:122-140`
+
+| Column | Type | Purpose | Used In |
+|--------|------|---------|---------|
+| `id` | SERIAL | Primary key | Foreign key from triages |
+| `key` | VARCHAR | Jira issue key (e.g., OCPBUGS-12345) | Unique identifier |
+| `status` | VARCHAR | Jira issue status | Display |
+| `summary` | TEXT | Issue summary | Display |
+| `affects_versions` | TEXT[] | Affected versions | Display |
+| `fix_versions` | TEXT[] | Fix versions | Display |
+| `target_versions` | TEXT[] | Target versions | Display |
+| `components` | TEXT[] | Jira components | Display |
+| `labels` | TEXT[] | Jira labels | Display |
+| `url` | VARCHAR | Issue URL | Linking |
+
+**Key Query:**
+```go
+// File: pkg/api/componentreadiness/triage.go:91,244
+dbc.Where("url = ?", triage.URL).First(&bug)
+```
+
+---
+
+## Data Loading Jobs
+
+The component readiness system relies on several automated data loading jobs that populate both BigQuery and PostgreSQL databases. These jobs are configured as Kubernetes CronJobs in the `continuous-release-jobs` repository.
+
+### Overview
+
+**Data Flow:**
+1. **CI Job Results** → BigQuery (via job-run-aggregator) → PostgreSQL (via Sippy loaders)
+2. **Component Mappings** → GitHub → BigQuery (via ci-test-mapping)
+3. **Job Variants** → BigQuery (via variant-registry-sync)
+4. **Cache Warming** → Redis (via cache-primer)
+5. **Regression Tracking** → PostgreSQL (via regression-tracker)
+
+**BigQuery Projects & Datasets Used:**
+
+| Job | BigQuery Project | Dataset(s) | Tables |
+|-----|------------------|------------|--------|
+| job-table-updater | `openshift-ci-data-analysis` | `ci_data` | jobs, junit |
+| disruption-uploader | `openshift-ci-data-analysis` | `ci_data` | disruptions |
+| alert-uploader | `openshift-ci-data-analysis` | `ci_data` | alerts |
+| ci-test-mapping | `openshift-gce-devel` | `ci_analysis_us` | component_mapping |
+| variant-registry-sync | `openshift-gce-devel` | `ci_analysis_us`, `ci_analysis_qe` | job_variants |
+| fetchdata | `openshift-gce-devel` | `ci_analysis_us` | (writes to PostgreSQL) |
+| cache-primer | `openshift-gce-devel` | `ci_analysis_us` | junit, jobs, job_variants, component_mapping |
+| Component Readiness API | `openshift-gce-devel` | `ci_analysis_us` or `ci_analysis_qe` | junit, jobs, job_variants, component_mapping, job_labels |
+| Release metadata | `openshift-ci-data-analysis` | `ci_data` | Releases |
+
+---
+
+### Job 1: `fetchdata`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/sippy/fetchdata-cronjob.yaml`
+
+**Schedule:** Every hour (`0 * * * *`)
+
+**Command:**
+```bash
+sippy load --init-database --mode=ocp --config=/config/openshift.yaml \
+ --arch=amd64 --arch=arm64 --arch=multi --arch=s390x --arch=ppc64le
+```
+
+**Loaders Enabled:** (default from `cmd/sippy/load.go:101`)
+- `prow` - Prow job runs and test results
+- `releases` - Release metadata
+- `jira` - Jira bug metadata
+- `github` - GitHub PR data
+- `bugs` - Bug-to-test associations
+- `test-mapping` - Component ownership mappings
+- `feature-gates` - Feature gate configurations
+
+**Data Sources:**
+- **From:** BigQuery (`ci_analysis_us` dataset), GCS (Prow artifacts), GitHub API, Jira API
+- **To:** PostgreSQL tables (prow_job_runs, prow_pull_requests, bugs, etc.)
+
+**Purpose:**
+- Primary data loader that syncs CI job runs from BigQuery into PostgreSQL
+- Downloads and parses Prow job artifacts from GCS (junit XML, build logs)
+- Associates tests with components via component_mapping
+- Links bugs to affected tests
+- Refreshes PostgreSQL materialized views for performance
+
+**Key Code:** `pkg/dataloader/prowloader/prowloader.go`, `cmd/sippy/load.go:211-222`
+
+---
+
+### Job 2: `ci-test-mapping`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/sippy/cronjob-ci-test-mapping.yaml`
+
+**Schedule:** Daily at 4 AM (`0 4 * * *`)
+
+**Command:**
+```bash
+/hack/cronjob-update-mapping.sh
+```
+
+**Data Sources:**
+- **From:** GitHub repositories (openshift/origin, openshift/kubernetes, etc.)
+- **To:** BigQuery `component_mapping` table
+
+**Purpose:**
+- Scrapes test source code from OpenShift repos to extract test metadata
+- Identifies test ownership (component, capabilities, Jira component)
+- Updates the `component_mapping` table in BigQuery with latest test definitions
+- Used by Component Readiness to group tests by component/capability
+
+**Implementation:** Parses test files looking for annotations like:
+```go
+// Component: Networking
+// Capability: Connectivity
+// JiraComponent: Networking / ovn-kubernetes
+```
+
+**Key Code:** `pkg/dataloader/testownershiploader/testownershiploader.go`
+
+**Table Updated:**
+| Column | Example Value |
+|--------|---------------|
+| `id` | `openshift-tests:abc123def456` |
+| `suite` | `openshift-tests` |
+| `name` | `[sig-network] Services should provide secure connectivity` |
+| `component` | `Networking` |
+| `capabilities` | `["Connectivity", "LoadBalancing"]` |
+| `jira_component` | `Networking / ovn-kubernetes` |
+| `created_at` | `2026-02-11 04:00:00` |
+
+---
+
+### Job 3: `variant-registry-sync`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/sippy/variant-registry-cronjob.yaml`
+
+**Schedule:** Daily at 7 AM (`0 7 * * *`)
+
+**Command:**
+```bash
+# OCP Engineering Variants
+sippy variants generate --bigquery-dataset ci_analysis_us \
+ --bigquery-jobs-table jobs --o /tmp/variants.json --config /config/openshift.yaml && \
+sippy load --loader job-variants --bigquery-dataset ci_analysis_us \
+ --job-variants-input-file /tmp/variants.json
+
+# QE Variants
+sippy variants generate --bigquery-dataset ci_analysis_qe \
+ --bigquery-jobs-table jobs --o /tmp/qe_variants.json --config /config/openshift.yaml && \
+sippy load --loader job-variants --bigquery-dataset ci_analysis_qe \
+ --job-variants-input-file /tmp/qe_variants.json
+```
+
+**Data Sources:**
+- **From:** BigQuery `openshift-gce-devel.ci_analysis_us.jobs` and `openshift-gce-devel.ci_analysis_qe.jobs` tables, variant configuration in `config/openshift.yaml`
+- **To:** BigQuery `openshift-gce-devel.ci_analysis_us.job_variants` and `openshift-gce-devel.ci_analysis_qe.job_variants` tables
+
+**Purpose:**
+- Analyzes job names to extract variant information (Platform, Architecture, Network, etc.)
+- Generates variant mappings for each job based on naming patterns and configuration
+- Loads variants into BigQuery `job_variants` table for use by Component Readiness
+- Handles both OCP Engineering jobs (ci_analysis_us) and QE jobs (ci_analysis_qe)
+
+**Variant Extraction Examples:**
+| Job Name | Extracted Variants |
+|----------|-------------------|
+| `periodic-ci-openshift-release-master-nightly-4.16-e2e-gcp-ovn` | Platform=gcp, Network=ovn, Release=4.16 |
+| `periodic-ci-openshift-release-master-ci-4.16-upgrade-from-stable-4.15-e2e-aws-sdn-upgrade` | Platform=aws, Network=sdn, Release=4.16, Upgrade=upgrade |
+| `release-openshift-ocp-installer-e2e-azure-serial-4.16` | Platform=azure, Release=4.16, Suite=serial |
+
+**Key Code:** `pkg/variantregistry/loader.go`, `cmd/sippy/variants.go`
+
+**Table Updated:**
+| Column | Example Value |
+|--------|---------------|
+| `job_name` | `periodic-ci-openshift-release-master-nightly-4.16-e2e-gcp-ovn` |
+| `variant_name` | `Platform` |
+| `variant_value` | `gcp` |
+
+---
+
+### Job 4: `cache-primer`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/sippy/cache-primer-cronjob.yaml`
+
+**Schedule:** Every 4 hours (`0 */4 * * *`)
+
+**Command:**
+```bash
+sippy load --init-database --views=/config/views.yaml \
+ --loader=component-readiness-cache \
+ --loader=regression-tracker \
+ --enable-persistent-cache \
+ --enable-persistent-cache-write \
+ --force-persistent-cache-lookup=true \
+ --persistent-cache-duration-max=48h \
+ --log-level=debug
+```
+
+**Data Sources:**
+- **From:** BigQuery (junit, jobs, job_variants, component_mapping)
+- **To:** Redis cache (cached query results) and PostgreSQL `test_regressions` (written by the regression tracker)
+
+**Purpose:**
+- Pre-generates and caches Component Readiness reports for all configured views
+- Warms the Redis cache before user requests to avoid expensive BigQuery queries
+- Runs regression tracker to detect and track new regressions
+- Reduces API response time from 30-120s (uncached BigQuery query) to <100ms on a cache hit
+- Uses persistent cache to survive application restarts
+
+**Views Cached:**
+- Configured in `/config/views.yaml`
+- Typically includes all active OpenShift releases (e.g., 4.14, 4.15, 4.16, 4.17)
+- Multiple variant combinations per release
+
+**Resource Usage:** 6GB memory request (BigQuery result processing)
+
+**Key Code:** `pkg/dataloader/crcacheloader/crcacheloader.go`, `pkg/api/componentreadiness/regressiontracker.go`
+
+**Cache Keys Generated:**
+```
+ComponentReport~4.15~4.16~Platform:aws~...
+ComponentReport~4.15~4.16~Platform:gcp~...
+ComponentReport~4.15~4.16~Platform:azure~...
+... (hundreds of permutations)
+```
+
+---
+
+### Job 5: `jira-automator`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/sippy/cronjob-automate-jira.yaml`
+
+**Schedule:** Daily at 5 AM (`0 5 * * *`)
+
+**Command:**
+```bash
+sippy automate-jira --views=/config/views.yaml \
+ --jira-account=openshift-trt-privileged \
+ --jira-endpoint=https://issues.redhat.com \
+ --jira-bearer-token-file=/etc/jira/token \
+ --sippy-url=https://sippy.dptools.openshift.org/ \
+ --include-components="Test Framework,Unknown" \
+ --dry-run=true
+```
+
+**Data Sources:**
+- **From:** PostgreSQL (test_regressions, triages)
+- **To:** Jira API (creates/updates Jira issues)
+
+**Purpose:**
+- Automatically creates Jira tickets for detected test regressions
+- Links regressions to existing bugs when patterns match
+- Updates Jira tickets with latest regression status
+- Helps engineering teams track and triage test failures
+
+**Status:** Currently running in dry-run mode (`--dry-run=true`)
+
+**Key Code:** `cmd/sippy/automate_jira.go`
+
+---
+
+### Job 6: `never-stable-update`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/sippy/cronjob-never-stable-update.yaml`
+
+**Schedule:** Daily at 4:30 AM (`30 4 * * *`)
+
+**Status:** **SUSPENDED** (`suspend: true`)
+
+**Command:**
+```bash
+git clone https://github.com/openshift/sippy.git /tmp/sippy
+cd /tmp/sippy
+./scripts/cronjob-update-neverstable.sh
+```
+
+**Purpose:**
+- Updates the list of tests that are never stable (always fail or flake)
+- Creates PRs to mark tests as never-stable in test metadata
+- Helps filter out known-bad tests from regression analysis
+
+**Key Code:** `scripts/cronjob-update-neverstable.sh`
+
+---
+
+### Job 7: `job-table-updater`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/dpcr-ci-job-aggregation/job-table-updater-cronjob.yaml`
+
+**Schedule:** Daily at 11 AM (`0 11 * * *`)
+
+**Command:**
+```bash
+job-run-aggregator prime-job-table \
+ --bigquery-dataset ci_data \
+ --google-service-account-credential-file=/secret/google-serviceaccount-token/google-serviceaccount-credentials.json
+```
+
+**Data Sources:**
+- **From:** GCS (Prow job artifacts in gs://origin-ci-test, gs://test-platform-results)
+- **To:** BigQuery `openshift-ci-data-analysis.ci_data.jobs` table (note: this is a different GCP project from `openshift-gce-devel`, which the Sippy jobs above read from)
+
+**Purpose:**
+- Primary data ingestion job for CI job metadata
+- Scans GCS buckets for new Prow job runs
+- Extracts job metadata (job name, start time, URL, build ID, PR info)
+- Populates the `jobs` table that Sippy queries
+
+**Tool:** `job-run-aggregator` (separate repository: https://github.com/openshift/ci-tools)
+
+**Table Updated:**
+| Column | Source |
+|--------|--------|
+| `prowjob_build_id` | GCS path (e.g., `1234567890`) |
+| `prowjob_job_name` | Job name from metadata |
+| `prowjob_start` | Job start timestamp |
+| `prowjob_url` | Link to Prow job page |
+| `org`, `repo`, `pr_number` | GitHub PR info |
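+
+As an illustration of how the job name and build ID might be recovered from an artifact location, the sketch below splits a GCS path. It assumes a `gs://<bucket>/logs/<job-name>/<build-id>/...` layout; `parseArtifactPath` and `jobRef` are hypothetical names, and this is not the `job-run-aggregator` implementation.
+
+```go
+package main
+
+import (
+	"fmt"
+	"strings"
+)
+
+// jobRef holds the two identifiers recovered from an artifact path.
+type jobRef struct {
+	JobName string
+	BuildID string
+}
+
+// parseArtifactPath splits "gs://<bucket>/logs/<job-name>/<build-id>/..." into
+// job name and build ID. The path layout is an assumption of this sketch.
+func parseArtifactPath(gcsPath string) (jobRef, error) {
+	parts := strings.Split(strings.TrimPrefix(gcsPath, "gs://"), "/")
+	// Expected parts: [bucket, "logs", job-name, build-id, ...]
+	if len(parts) < 4 || parts[1] != "logs" || parts[2] == "" || parts[3] == "" {
+		return jobRef{}, fmt.Errorf("unrecognized artifact path: %q", gcsPath)
+	}
+	return jobRef{JobName: parts[2], BuildID: parts[3]}, nil
+}
+
+func main() {
+	ref, err := parseArtifactPath("gs://test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.16-e2e-gcp-ovn/1234567890/finished.json")
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println(ref.JobName, ref.BuildID)
+}
+```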
+
+---
+
+### Job 8: `disruption-uploader`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/dpcr-ci-job-aggregation/disruption-cronjob.yaml`
+
+**Schedule:** Every 4 hours (`0 */4 * * *`)
+
+**Command:**
+```bash
+job-run-aggregator upload-disruptions \
+ --google-service-account-credential-file=/secret/google-serviceaccount-token/google-serviceaccount-credentials.json
+```
+
+**Data Sources:**
+- **From:** GCS (Prow job artifacts - disruption JSON files)
+- **To:** BigQuery `openshift-ci-data-analysis.ci_data.disruptions` table
+
+**Purpose:**
+- Uploads API server disruption data from e2e tests
+- Tracks API availability and downtime during upgrades and testing
+- Used for reliability analysis and regression detection
+
+**Disruption Data Format:**
+```json
+{
+ "name": "kube-api-new-connections",
+ "backend": "kube-apiserver",
+ "disruptedDuration": "2.5s",
+ "connectionEstablishedDuration": "1.2s"
+}
+```
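+
+A record in this shape can be decoded with the standard library. The sketch below is illustrative only: the `disruption` struct and `parseDisruption` are hypothetical names that mirror the example fields, and the durations are assumed to be Go-style duration strings.
+
+```go
+package main
+
+import (
+	"encoding/json"
+	"fmt"
+	"time"
+)
+
+// disruption mirrors the fields of the example record above.
+type disruption struct {
+	Name                          string `json:"name"`
+	Backend                       string `json:"backend"`
+	DisruptedDuration             string `json:"disruptedDuration"`
+	ConnectionEstablishedDuration string `json:"connectionEstablishedDuration"`
+}
+
+// parseDisruption decodes one record and converts the disrupted duration.
+func parseDisruption(raw []byte) (disruption, time.Duration, error) {
+	var d disruption
+	if err := json.Unmarshal(raw, &d); err != nil {
+		return disruption{}, 0, err
+	}
+	disrupted, err := time.ParseDuration(d.DisruptedDuration)
+	if err != nil {
+		return disruption{}, 0, err
+	}
+	return d, disrupted, nil
+}
+
+func main() {
+	raw := []byte(`{"name":"kube-api-new-connections","backend":"kube-apiserver","disruptedDuration":"2.5s","connectionEstablishedDuration":"1.2s"}`)
+	d, disrupted, err := parseDisruption(raw)
+	if err != nil {
+		panic(err)
+	}
+	fmt.Printf("%s disrupted for %v\n", d.Backend, disrupted)
+}
+```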
+
+**Tool:** `job-run-aggregator`
+
+---
+
+### Job 9: `alert-uploader`
+**File:** `continuous-release-jobs/config/clusters/dpcr/services/dpcr-ci-job-aggregation/alert-cronjob.yaml`
+
+**Schedule:** Every 4 hours (`0 */4 * * *`)
+
+**Command:**
+```bash
+job-run-aggregator upload-alerts \
+ --google-service-account-credential-file=/secret/google-serviceaccount-token/google-serviceaccount-credentials.json
+```
+
+**Data Sources:**
+- **From:** GCS (Prow job artifacts - alert JSON files)
+- **To:** BigQuery `openshift-ci-data-analysis.ci_data.alerts` table
+
+**Purpose:**
+- Uploads Prometheus alerts fired during test runs
+- Tracks which alerts are triggered in CI jobs
+- Used for identifying noisy or broken alerts
+
+**Alert Data Format:**
+```json
+{
+ "alertName": "PodCrashLooping",
+ "severity": "warning",
+ "namespace": "openshift-console",
+ "pod": "console-7d8f5c8f4-abc12"
+}
+```
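+
+One way to spot noisy alerts from records like this is to tally them by name. The sketch below assumes the input is a JSON array of such records; the `alert` struct and `countByAlertName` are hypothetical names used for illustration.
+
+```go
+package main
+
+import (
+	"encoding/json"
+	"fmt"
+)
+
+// alert mirrors the fields of the example record above.
+type alert struct {
+	AlertName string `json:"alertName"`
+	Severity  string `json:"severity"`
+	Namespace string `json:"namespace"`
+	Pod       string `json:"pod"`
+}
+
+// countByAlertName tallies how often each alert name appears in the input.
+func countByAlertName(raw []byte) (map[string]int, error) {
+	var alerts []alert
+	if err := json.Unmarshal(raw, &alerts); err != nil {
+		return nil, err
+	}
+	counts := make(map[string]int)
+	for _, a := range alerts {
+		counts[a.AlertName]++
+	}
+	return counts, nil
+}
+
+func main() {
+	raw := []byte(`[
+		{"alertName":"PodCrashLooping","severity":"warning","namespace":"openshift-console","pod":"console-7d8f5c8f4-abc12"},
+		{"alertName":"PodCrashLooping","severity":"warning","namespace":"openshift-console","pod":"console-7d8f5c8f4-def34"}
+	]`)
+	counts, err := countByAlertName(raw)
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println(counts)
+}
+```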
+
+**Tool:** `job-run-aggregator`
+
+---
+
+## Data Loading Flow Diagram
+
+```mermaid
+graph TB
+ subgraph GCS["Google Cloud Storage"]
+ prow_artifacts[Prow Job Artifacts<br/>JUnit XML, Logs, Disruptions, Alerts]
+ end
+
+ subgraph GitHub["GitHub"]
+ test_source[Test Source Code<br/>Component Annotations]
+ pr_data[Pull Request Data]
+ end
+
+ subgraph Jira["Jira"]
+ bug_data[Bug Metadata]
+ end
+
+ subgraph Loaders["Data Loading Jobs"]
+ job_table[job-table-updater<br/>Daily 11 AM]
+ disruption[disruption-uploader<br/>Every 4h]
+ alert[alert-uploader<br/>Every 4h]
+ test_mapping[ci-test-mapping<br/>Daily 4 AM]
+ variants[variant-registry-sync<br/>Daily 7 AM]
+ fetchdata[fetchdata<br/>Hourly]
+ cache_primer[cache-primer<br/>Every 4h]
+ end
+
+ subgraph BQ["BigQuery"]
+ jobs_table[jobs table]
+ junit_table[junit table]
+ disruptions_table[disruptions table]
+ alerts_table[alerts table]
+ component_mapping_table[component_mapping table]
+ job_variants_table[job_variants table]
+ end
+
+ subgraph PG["PostgreSQL"]
+ prow_job_runs[prow_job_runs]
+ test_regressions[test_regressions]
+ triages[triages]
+ bugs[bugs]
+ end
+
+ subgraph Cache["Redis Cache"]
+ cached_reports[Component Reports<br/>Test Details]
+ end
+
+ prow_artifacts -->|Extract metadata| job_table
+ prow_artifacts -->|Parse disruptions| disruption
+ prow_artifacts -->|Parse alerts| alert
+ job_table --> jobs_table
+ disruption --> disruptions_table
+ alert --> alerts_table
+
+ test_source -->|Scrape annotations| test_mapping
+ test_mapping --> component_mapping_table
+
+ jobs_table -->|Analyze job names| variants
+ variants --> job_variants_table
+
+ prow_artifacts -->|Download & parse| fetchdata
+ jobs_table -->|Query metadata| fetchdata
+ pr_data -->|API calls| fetchdata
+ bug_data -->|API calls| fetchdata
+ fetchdata --> prow_job_runs
+ fetchdata --> bugs
+
+ jobs_table -->|Query| cache_primer
+ junit_table -->|Query| cache_primer
+ job_variants_table -->|Query| cache_primer
+ component_mapping_table -->|Query| cache_primer
+ cache_primer -->|Generate reports| cached_reports
+ cache_primer -->|Detect regressions| test_regressions
+
+ prow_job_runs -->|Track| test_regressions
+ test_regressions -->|Associate| triages
+```
+
+---
+
+## Data Flow Diagram
+
+```mermaid
+graph TB
+ subgraph BigQuery["BigQuery (Read-Only)"]
+ junit[junit<br/>Test Results]
+ jobs[jobs<br/>Job Metadata]
+ job_variants[job_variants<br/>Variant Mapping]
+ component_mapping[component_mapping<br/>Component Mapping]
+ job_labels[job_labels<br/>Job Labels]
+ end
+
+ subgraph PostgreSQL["PostgreSQL (Read-Write)"]
+ test_regressions[test_regressions<br/>Regression Tracking]
+ triages[triages<br/>Triage Records]
+ triage_regressions[triage_regressions<br/>M2M Join]
+ bugs[bugs<br/>Jira Metadata]
+ audit_logs[audit_logs<br/>Audit Trail]
+ end
+
+ subgraph API["Component Readiness API"]
+ component_report[Component Report]
+ test_details[Test Details]
+ regression_tracker[Regression Tracker]
+ triage_api[Triage API]
+ end
+
+ junit -->|JOIN| jobs
+ jobs -->|JOIN| job_variants
+ junit -->|JOIN| component_mapping
+ junit -->|LEFT JOIN| job_labels
+
+ component_report -->|Query| junit
+ component_report -->|Query| jobs
+ component_report -->|Query| job_variants
+ component_report -->|Query| component_mapping
+ component_report -->|Query| job_labels
+
+ test_details -->|Query| junit
+ test_details -->|Query| jobs
+ test_details -->|Query| job_variants
+ test_details -->|Query| component_mapping
+
+ component_report -->|Inject| test_regressions
+ test_details -->|Query| test_regressions
+
+ regression_tracker -->|Generate| component_report
+ regression_tracker -->|INSERT/UPDATE| test_regressions
+
+ test_regressions -->|M2M| triage_regressions
+ triages -->|M2M| triage_regressions
+ triages -->|FK| bugs
+
+ triage_api -->|CRUD| triages
+ triage_api -->|CRUD| bugs
+ triage_api -->|Query| audit_logs
+
+ triages -->|Auto-Create| audit_logs
+
+ regression_tracker -->|Auto-Resolve| triages
+```
+
+---
+
+## Data Relationships
+
+### BigQuery Relationships
+
+```mermaid
+erDiagram
+ junit ||--o{ jobs : "prowjob_build_id"
+ junit ||--o{ component_mapping : "suite+test_name"
+ junit ||--o{ job_labels : "prowjob_build_id"
+ jobs ||--o{ job_variants : "prowjob_job_name"
+
+ junit {
+ string prowjob_build_id PK
+ string test_name
+ string testsuite
+ int success_val
+ int flake_count
+ datetime modified_time
+ }
+
+ jobs {
+ string prowjob_build_id PK
+ string prowjob_job_name
+ datetime prowjob_start
+ string prowjob_url
+ }
+
+ job_variants {
+ string job_name FK
+ string variant_name
+ string variant_value
+ }
+
+ component_mapping {
+ string id PK
+ string suite
+ string name
+ string component
+ array capabilities
+ datetime created_at
+ }
+
+ job_labels {
+ string prowjob_build_id FK
+ string label
+ }
+```
+
+### PostgreSQL Relationships
+
+```mermaid
+erDiagram
+ test_regressions ||--o{ triage_regressions : "id"
+ triages ||--o{ triage_regressions : "id"
+ triages ||--o| bugs : "bug_id"
+ triages ||--o{ audit_logs : "id"
+
+ test_regressions {
+ serial id PK
+ varchar release
+ varchar test_id
+ varchar test_name
+ text_array variants
+ timestamp opened
+ timestamp closed
+ timestamp last_failure
+ int max_failures
+ }
+
+ triages {
+ serial id PK
+ varchar url UK
+ text description
+ varchar type
+ int bug_id FK
+ timestamp resolved
+ varchar resolution_reason
+ }
+
+ triage_regressions {
+ int triage_id FK
+ int test_regression_id FK
+ }
+
+ bugs {
+ serial id PK
+ varchar key UK
+ varchar status
+ text summary
+ varchar url
+ }
+
+ audit_logs {
+ serial id PK
+ varchar table_name
+ varchar operation
+ int row_id FK
+ jsonb old_data
+ jsonb new_data
+ varchar user
+ timestamp created_at
+ }
+```
+
+---
+
+## Key Data Patterns
+
+### 1. Variant Storage
+
+**BigQuery:**
+- Stored as separate columns via dynamic joins
+- Example: `jv_Platform.variant_value AS variant_Platform`
+- Allows flexible GROUP BY and filtering
+
+**PostgreSQL:**
+- Stored as text array
+- Format: `["Architecture:amd64", "Network:OVN"]`
+- Enables exact matching for regression tracking
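+
+The dynamic joins on the BigQuery side can be pictured as one aliased join on `job_variants` per variant name. The sketch below generates such fragments; `variantJoins`, the `LEFT JOIN` choice, and the join condition are assumptions for illustration and may differ from the real query generator.
+
+```go
+package main
+
+import (
+	"fmt"
+	"regexp"
+	"strings"
+)
+
+// identRe restricts variant names to safe SQL identifier characters, because
+// the names are interpolated into aliases.
+var identRe = regexp.MustCompile(`^[A-Za-z][A-Za-z0-9_]*$`)
+
+// variantJoins returns the SELECT columns and JOIN clauses for the given
+// variant names, one aliased join on job_variants per variant.
+func variantJoins(variants []string) (selects, joins string, err error) {
+	var cols, clauses []string
+	for _, v := range variants {
+		if !identRe.MatchString(v) {
+			return "", "", fmt.Errorf("invalid variant name %q", v)
+		}
+		cols = append(cols, fmt.Sprintf("jv_%s.variant_value AS variant_%s", v, v))
+		clauses = append(clauses, fmt.Sprintf(
+			"LEFT JOIN job_variants jv_%s ON jv_%s.job_name = jobs.prowjob_job_name AND jv_%s.variant_name = '%s'",
+			v, v, v, v))
+	}
+	return strings.Join(cols, ",\n"), strings.Join(clauses, "\n"), nil
+}
+
+func main() {
+	selects, joins, err := variantJoins([]string{"Platform", "Network"})
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println(selects)
+	fmt.Println(joins)
+}
+```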
+
+### 2. Time Windows
+
+**BigQuery:**
+- Uses `prowjob_start` for time-based filtering
+- Typically 7-14 days for base, 3-7 days for sample
+- Configurable via relative date ranges
+
+**PostgreSQL:**
+- Uses `opened`, `closed`, `last_failure` timestamps
+- 5-day hysteresis window for regression reuse
+- `last_failure` indicates staleness
+
+### 3. Test Identification
+
+**Cross-System:**
+- `test_id` is the join key that links BigQuery `component_mapping` to PostgreSQL `test_regressions`
+- Format: `{suite}:{test_name_hash}` or custom ID
+- Examples:
+ - `openshift-tests:abc123def456`
+ - `e2e-aws:789ghi012`
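+
+An ID in the `{suite}:{test_name_hash}` shape could be produced as sketched below. The hash function is not specified here, so the use of MD5 is purely an assumption, and `testID` is a hypothetical helper.
+
+```go
+package main
+
+import (
+	"crypto/md5"
+	"fmt"
+)
+
+// testID joins the suite with a hex digest of the test name.
+// MD5 is an assumption for this sketch; only the overall shape matters.
+func testID(suite, testName string) string {
+	return fmt.Sprintf("%s:%x", suite, md5.Sum([]byte(testName)))
+}
+
+func main() {
+	fmt.Println(testID("openshift-tests", "[sig-network] Services should provide secure connectivity"))
+}
+```
+
+Because the ID depends only on the suite and test name, the same test resolves to the same key in both systems.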
+
+### 4. Deduplication
+
+**BigQuery:**
+- CTE `deduped_testcases` handles:
+ - Test retries (multiple failures) → keep first
+ - Duplicate successes (OCPBUGS-16039) → count once
+ - Flake aggregation → SUM(flake_count)
+- Group by: file_path, test_name, testsuite, prowjob_build_id
+
+**PostgreSQL:**
+- Unique constraint: (release, test_id, variants)
+- Prevents duplicate regression records
+- Reuses closed regressions within 5-day window
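+
+The BigQuery-side de-duplication can be expressed in Go as a group-by over the same key. This is a sketch of the idea rather than a translation of the `deduped_testcases` CTE: the type names are hypothetical, and treating a group as passed when any row passed is an assumption.
+
+```go
+package main
+
+import "fmt"
+
+// junitRow is one raw test result row.
+type junitRow struct {
+	FilePath       string
+	TestName       string
+	TestSuite      string
+	ProwjobBuildID string
+	SuccessVal     int
+	FlakeCount     int
+}
+
+// dedupKey identifies one test case within one job run.
+type dedupKey struct {
+	FilePath, TestName, TestSuite, ProwjobBuildID string
+}
+
+// dedupedResult is the single outcome kept per key.
+type dedupedResult struct {
+	Success    int // 1 if any row passed; duplicate successes count once
+	FlakeCount int // sum of flake_count across the grouped rows
+}
+
+// dedupe collapses retries and duplicate rows into one result per key.
+func dedupe(rows []junitRow) map[dedupKey]dedupedResult {
+	out := make(map[dedupKey]dedupedResult)
+	for _, r := range rows {
+		k := dedupKey{r.FilePath, r.TestName, r.TestSuite, r.ProwjobBuildID}
+		res := out[k]
+		if r.SuccessVal > 0 {
+			res.Success = 1
+		}
+		res.FlakeCount += r.FlakeCount
+		out[k] = res
+	}
+	return out
+}
+
+func main() {
+	rows := []junitRow{
+		{"junit.xml", "test A", "openshift-tests", "100", 1, 0},
+		{"junit.xml", "test A", "openshift-tests", "100", 1, 0}, // duplicate success
+		{"junit.xml", "test B", "openshift-tests", "100", 0, 0},
+		{"junit.xml", "test B", "openshift-tests", "100", 0, 0}, // retried failure
+	}
+	for key, res := range dedupe(rows) {
+		fmt.Printf("%s: success=%d flakes=%d\n", key.TestName, res.Success, res.FlakeCount)
+	}
+}
+```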
+
+### 5. Regression Hysteresis
+
+**Purpose:** Prevents regression flapping
+
+**Implementation:**
+```go
+// File: pkg/api/componentreadiness/regressiontracker.go:55-64
+regressionHysteresisDays := 5 * 24 * time.Hour
+
+// Recently closed regressions are included via a SQL filter of the form:
+//   WHERE closed IS NULL OR closed > NOW() - @HysteresisDays
+```
+
+**Effect:**
+- Closed regressions can be re-opened without creating new records
+- Avoids noise from intermittent test failures
+- Preserves triage associations
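+
+The reuse rule can also be stated as a small predicate. The sketch below mirrors the SQL condition in Go; the `regression` struct and `reusable` function are hypothetical names, not the tracker's own types.
+
+```go
+package main
+
+import (
+	"fmt"
+	"time"
+)
+
+// regressionHysteresis is the 5-day reuse window.
+const regressionHysteresis = 5 * 24 * time.Hour
+
+// regression is a trimmed-down record for this sketch.
+type regression struct {
+	ID     int
+	Closed *time.Time // nil while the regression is open
+}
+
+// reusable reports whether an existing record should be reused (and re-opened
+// if it was closed) instead of creating a new one.
+func reusable(r regression, now time.Time) bool {
+	return r.Closed == nil || r.Closed.After(now.Add(-regressionHysteresis))
+}
+
+func main() {
+	now := time.Now()
+	twoDaysAgo := now.Add(-2 * 24 * time.Hour)
+	sixDaysAgo := now.Add(-6 * 24 * time.Hour)
+	fmt.Println(reusable(regression{ID: 1}, now))                      // open
+	fmt.Println(reusable(regression{ID: 2, Closed: &twoDaysAgo}, now)) // closed inside window
+	fmt.Println(reusable(regression{ID: 3, Closed: &sixDaysAgo}, now)) // closed outside window
+}
+```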
+
+---
+
+## Statistical Analysis: Fisher's Exact Test
+
+Component readiness uses Fisher's Exact Test to determine if the difference between base and sample pass rates is statistically significant.
+
+### Contingency Table
+
+```
+ Pass Fail Total
+Base: a b a+b
+Sample: c d c+d
+Total: a+c b+d n
+
+p-value = sum of the hypergeometric probabilities of every table no more likely than the observed one (two-tailed)
+```
+
+### Status Rules
+
+- **p < 0.02** → "Extreme" regression (red)
+- **p < 0.05** → "Significant" regression (yellow)
+- **Pass rate drop** (new test, no statistical power) → "Regression"
+- **Pass rate improvement** → "Improvement" (green)
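+
+For readers who want to reproduce a p-value by hand, the sketch below computes a two-tailed Fisher's Exact Test for a 2x2 table and applies the 0.02 and 0.05 thresholds. It is a standalone illustration using only the standard library, not the code path Sippy uses; `fisherExact`, `classify`, and the non-regression status strings are hypothetical.
+
+```go
+package main
+
+import (
+	"fmt"
+	"math"
+)
+
+// logChoose returns ln(C(n, k)) via the log-gamma function.
+func logChoose(n, k int) float64 {
+	a, _ := math.Lgamma(float64(n + 1))
+	b, _ := math.Lgamma(float64(k + 1))
+	c, _ := math.Lgamma(float64(n - k + 1))
+	return a - b - c
+}
+
+// fisherExact returns the two-tailed p-value for the table
+// [[a, b], [c, d]] = [[base pass, base fail], [sample pass, sample fail]].
+func fisherExact(a, b, c, d int) float64 {
+	row1, row2, col1 := a+b, c+d, a+c
+	n := row1 + row2
+	// prob is the hypergeometric probability of a table whose top-left cell is x.
+	prob := func(x int) float64 {
+		return math.Exp(logChoose(row1, x) + logChoose(row2, col1-x) - logChoose(n, col1))
+	}
+	lo, hi := col1-row2, col1
+	if lo < 0 {
+		lo = 0
+	}
+	if hi > row1 {
+		hi = row1
+	}
+	observed := prob(a)
+	p := 0.0
+	for x := lo; x <= hi; x++ {
+		// Sum every table that is no more likely than the observed one.
+		if px := prob(x); px <= observed*(1+1e-7) {
+			p += px
+		}
+	}
+	return math.Min(p, 1)
+}
+
+// classify maps counts to a status using the thresholds listed above.
+func classify(basePass, baseFail, samplePass, sampleFail int) string {
+	baseTotal, sampleTotal := basePass+baseFail, samplePass+sampleFail
+	if baseTotal == 0 || sampleTotal == 0 {
+		return "NoData"
+	}
+	if float64(samplePass)/float64(sampleTotal) >= float64(basePass)/float64(baseTotal) {
+		return "NotRegressed"
+	}
+	switch p := fisherExact(basePass, baseFail, samplePass, sampleFail); {
+	case p < 0.02:
+		return "Extreme"
+	case p < 0.05:
+		return "Significant"
+	default:
+		return "NotSignificant"
+	}
+}
+
+func main() {
+	// Base: 10 pass, 0 fail. Sample: 5 pass, 5 fail.
+	fmt.Printf("p=%.4f status=%s\n", fisherExact(10, 0, 5, 5), classify(10, 0, 5, 5))
+	// prints: p=0.0325 status=Significant
+}
+```
+
+In the example, the two most extreme tables each have probability 252/15504, so the two-tailed p-value is 504/15504 ≈ 0.0325, which falls between the two thresholds.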
+
+### Implementation
+**File:** `pkg/api/componentreadiness/componentreport/component_report.go`
+
+The Fisher's Exact Test is computed during the PreAnalysis middleware phase, which runs before the report is stored in the cache. The expensive statistical calculations are therefore cached along with the report and reused on later requests.
+
+---
+
+## Cache Strategy
+
+### Cache Keys
+
+```
+ComponentReport~{base_release}~{sample_release}~{variants}~{filters}~{config_version}
+TestDetails~{base_release}~{sample_release}~{test_id}~{variants}~{config_version}
+Variants~{release}~{config_version}
+```
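+
+A key in the `ComponentReport~...` shape can be assembled deterministically as sketched below. The `componentReportKey` helper and the comma used to separate variants are assumptions for illustration; the real key encoding may differ.
+
+```go
+package main
+
+import (
+	"fmt"
+	"sort"
+	"strings"
+)
+
+// componentReportKey builds a "~"-separated cache key. Variants are sorted so
+// that map iteration order cannot yield two keys for the same request.
+func componentReportKey(base, sample string, variants map[string]string, filters, configVersion string) string {
+	pairs := make([]string, 0, len(variants))
+	for name, value := range variants {
+		pairs = append(pairs, name+":"+value)
+	}
+	sort.Strings(pairs)
+	return strings.Join([]string{
+		"ComponentReport", base, sample, strings.Join(pairs, ","), filters, configVersion,
+	}, "~")
+}
+
+func main() {
+	key := componentReportKey("4.15", "4.16",
+		map[string]string{"Platform": "aws", "Network": "ovn"}, "", "v1")
+	fmt.Println(key)
+}
+```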
+
+### Cache Behavior
+
+**PreAnalysis Middleware:** Runs inside cache
+- Statistical analysis (Fisher's Exact Test)
+- Pass rate calculations
+- Test aggregation
+- Variant grouping
+
+**PostAnalysis Middleware:** Runs outside cache
+- Regression data injection (from PostgreSQL)
+- Triage matching and status updates
+- Test detail link generation
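+
+The split between cached and uncached work can be sketched with a small in-memory TTL cache. All names here (`reportCache`, `serve`, the `report` fields) are hypothetical and stand in for the Redis-backed cache and the middleware phases; the point is only that the expensive phase runs once per TTL while the PostgreSQL-derived data is refreshed on every request.
+
+```go
+package main
+
+import (
+	"fmt"
+	"time"
+)
+
+// report carries one cached field and one per-request field.
+type report struct {
+	Status      string // result of the cached statistical analysis
+	Regressions int    // injected on every request, never cached
+}
+
+type entry struct {
+	value   report
+	expires time.Time
+}
+
+type reportCache struct {
+	ttl     time.Duration
+	entries map[string]entry
+}
+
+// get returns the cached report, calling generate only on a miss or after expiry.
+func (c *reportCache) get(key string, now time.Time, generate func() report) report {
+	if e, ok := c.entries[key]; ok && now.Before(e.expires) {
+		return e.value
+	}
+	r := generate()
+	c.entries[key] = entry{value: r, expires: now.Add(c.ttl)}
+	return r
+}
+
+// serve runs the cached phase through the cache and the uncached phase on every call.
+func serve(c *reportCache, key string, now time.Time, generate func() report, openRegressions func() int) report {
+	r := c.get(key, now, generate)    // PreAnalysis output, cached
+	r.Regressions = openRegressions() // PostAnalysis, always fresh (r is a copy)
+	return r
+}
+
+func main() {
+	cache := &reportCache{ttl: time.Hour, entries: map[string]entry{}}
+	generated := 0
+	generate := func() report { generated++; return report{Status: "Significant"} }
+	now := time.Now()
+	first := serve(cache, "ComponentReport~4.15~4.16", now, generate, func() int { return 2 })
+	second := serve(cache, "ComponentReport~4.15~4.16", now.Add(time.Minute), generate, func() int { return 3 })
+	fmt.Println(generated, first.Regressions, second.Regressions) // prints: 1 2 3
+}
+```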
+
+### Cache TTL
+- Configurable per deployment (typically 1-4 hours)
+- Balances freshness vs. BigQuery query costs
+
+### Persistent Cache
+- Enabled via `--enable-persistent-caching`
+- Survives application restarts
+- Decorated via `f.CacheFlags.DecorateBiqQueryClientWithPersistentCache()`
+
+---
+
+## Performance Considerations
+
+### BigQuery
+
+| Pattern | Purpose | Impact |
+|---------|---------|--------|
+| Partition pruning | Filter by `prowjob_start` and `modified_time` | 10-100x speedup |
+| Latest component_mapping CTE | Avoid scanning all versions | 5-10x speedup |
+| Deduplication CTE | Single-pass aggregation | Correct results |
+| LEFT JOIN on job_labels | Exclude InfraFailure | Filter before join |
+| Persistent cache | Reuse query results across runs | 1000x speedup on cache hit |
+
+**Typical Query Time:** 30 seconds - 2 minutes (varies by release size)
+
+### PostgreSQL
+
+| Pattern | Purpose | Impact |
+|---------|---------|--------|
+| Index on `release` | Fast filtering for release-specific regressions | 100x speedup |
+| Index on `test_name` | Fuzzy matching for triage suggestions | 10x speedup |
+| GORM Preload | Avoid N+1 queries for associations | 10-100x speedup |
+| Transactions | Ensure consistency for triage updates | Correctness |
+| Cascade deletes | Automatic cleanup of associations | Simplicity |
+
+**Typical Query Time:** < 100ms for regression/triage queries
+
+---
+
+## Frontend Integration
+
+The frontend (`sippy-ng`) consumes these APIs:
+
+### 1. Component Report Grid (`/component_readiness`)
+- Displays component × capability matrix
+- Color-coded cells:
+ - **Red:** Extreme regression (p < 0.02)
+ - **Yellow:** Significant regression (p < 0.05)
+ - **Green:** Passing or improved
+- Click cell → drill down to test_details
+
+### 2. Test Details (`/component_readiness/test_details`)
+- Shows per-job test results
+- Links to Prow job artifacts
+- Displays regression status and triages
+- Grouped by job name with statistical analysis
+
+### 3. Triage Management
+- Create/edit triages
+- Associate with regressions
+- File Jira bugs
+- View audit history
+- Auto-resolution when all regressions close
+
+---
+
+## Code Reference Index
+
+### BigQuery Queries
+
+| Query Type | File | Lines | Purpose |
+|------------|------|-------|---------|
+| Dedup CTE | `pkg/api/componentreadiness/query/querygenerators.go` | 50-91 | De-duplicate junit results |
+| Latest mapping CTE | `pkg/api/componentreadiness/query/querygenerators.go` | 137-143 | Get latest component_mapping |
+| Component report | `pkg/api/componentreadiness/query/querygenerators.go` | 316-423 | Aggregate test stats |
+| Test details | `pkg/api/componentreadiness/query/querygenerators.go` | 429-523 | Job-level breakdown |
+| Job variants | `pkg/api/componentreadiness/component_report.go` | 303-354 | Variant enumeration |
+| Release dates | `pkg/api/componentreadiness/query/releasedates.go` | 28-44 | GA date lookup |
+
+### PostgreSQL Queries
+
+| Operation | File | Lines | Purpose |
+|-----------|------|-------|---------|
+| List triages | `pkg/db/query/triage_queries.go` | 9-16 | Get all triages |
+| List regressions | `pkg/db/query/triage_queries.go` | 32-45 | Get regressions by release |
+| List open regressions | `pkg/db/query/triage_queries.go` | 18-30 | Get active regressions |
+| Create triage | `pkg/api/componentreadiness/triage.go` | 73-110 | New triage record |
+| Update triage | `pkg/api/componentreadiness/triage.go` | 210-283 | Update triage + associations |
+| Delete triage | `pkg/api/componentreadiness/triage.go` | 285-292 | Remove triage |
+| Open regression | `pkg/api/componentreadiness/regressiontracker.go` | 66-98 | Create regression record |
+| Update regression | `pkg/api/componentreadiness/regressiontracker.go` | 319-427 | Update regression stats |
+| Resolve triages | `pkg/api/componentreadiness/regressiontracker.go` | 108-160 | Auto-resolve when fixed |
+| Audit logs | `pkg/api/componentreadiness/triage.go` | 565-572 | Get change history |
+
+### Model Definitions
+
+| Model | File | Lines | Purpose |
+|-------|------|-------|---------|
+| TestRegression | `pkg/db/models/triage.go` | 207-230 | Regression record |
+| Triage | `pkg/db/models/triage.go` | 14-60 | Triage record |
+| Bug | `pkg/db/models/prow.go` | 122-140 | Jira bug metadata |
+| AuditLog | `pkg/db/models/audit.go` | 7-16 | Audit trail |
+
+### GORM Hooks
+
+| Hook | File | Lines | Purpose |
+|------|------|-------|---------|
+| BeforeUpdate | `pkg/db/models/triage.go` | 77-91 | Capture old state |
+| BeforeDelete | `pkg/db/models/triage.go` | 62-75 | Capture old state |
+| AfterCreate | `pkg/db/models/triage.go` | 93-112 | Create audit log |
+| AfterUpdate | `pkg/db/models/triage.go` | 114-133 | Create audit log |
+| AfterDelete | `pkg/db/models/triage.go` | 135-148 | Create audit log |
+
+---
+
+## Summary
+
+### BigQuery Usage
+- **5 tables:** junit, jobs, job_variants, component_mapping, job_labels
+- **5 query types:** component report, test details, variants, column values, release dates
+- **Read-only:** All queries are SELECT statements
+- **Key pattern:** Dynamic variant joins with de-duplication
+
+### PostgreSQL Usage
+- **5 tables:** test_regressions, triages, triage_regressions, bugs, audit_logs
+- **10+ query patterns:** CRUD operations on triages and regressions
+- **Read-write:** Full CRUD support with audit logging
+- **Key pattern:** Many-to-many relationships with automatic audit trail
+
+### Cross-System Integration
+- **test_id** links BigQuery component_mapping to PostgreSQL test_regressions
+- **PostAnalysis middleware** injects PostgreSQL data into BigQuery results
+- **Regression tracker** uses BigQuery reports to update PostgreSQL regressions
+- **Cache layer** separates statistical analysis (cached) from dynamic data (not cached)
+
+### Key Metrics
+- **BigQuery Query Time**: 30 seconds - 2 minutes
+- **Cache Hit Response**: < 100ms
+- **Cache TTL**: 1-4 hours
+- **Regression Detection Latency**: 30 minutes (daemon interval)
+- **Hysteresis Period**: 5 days (prevents flapping)
+- **Statistical Confidence**: p < 0.05 threshold (95% confidence)