Initial SEO-INTEL documentation: architecture, scoring, code structure

Add comprehensive documentation for the dual-engine performance evaluation system:

- System architecture and data flow
- Score calculation methodology (0-100 approximation from CWV thresholds)
- Detailed metrics reference (LCP, FCP, CLS, TBT, TTFB)
- Testing engines comparison (Sitespeed vs PSI)
- Complete code structure map (file-by-file breakdown)
- Case study: rds.ink 77 score with actionable fixes
- Quick reference guides for interpreting results

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-14 05:56:49 +10:00
commit 335d9a76e1
6 changed files with 1385 additions and 0 deletions
--- a/docs/01-architecture.md
+++ b/docs/01-architecture.md
@@ -0,0 +1,329 @@
+# System Architecture
+
+## High-Level Overview
+
+SEO-INTEL is a performance measurement system with four main layers:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│ USER LAYER                                               │
+│ Web dashboard (HTMX-driven) on port 8765                │
+│ - Portfolio scorecard                                    │
+│ - Per-site detail (CWV, trend, opportunities)           │
+│ - On-demand test buttons                                │
+└────────────────────┬────────────────────────────────────┘
+                     │
+┌────────────────────▼────────────────────────────────────┐
+│ API LAYER (FastAPI)                                     │
+│ - GET /performance/              (portfolio view)       │
+│ - GET /performance/<site_id>     (per-site view)        │
+│ - POST /performance/api/perf/test    (trigger test)     │
+│ - POST /performance/api/perf/sweep   (portfolio sweep)  │
+└────────────────────┬────────────────────────────────────┘
+                     │
+┌────────────────────▼────────────────────────────────────┐
+│ TESTING LAYER (Dual Engines)                            │
+│ ┌──────────────────────────────────────────────────┐   │
+│ │ Sitespeed.io (Docker)                            │   │
+│ │ - Real browser via headless Chrome               │   │
+│ │ - 3 runs per test, median metrics                │   │
+│ │ - HAR export (resource waterfall)                │   │
+│ │ - CWV: LCP, FCP, CLS, TBT, TTFB                │   │
+│ │ - Duration: ~60s per device                      │   │
+│ └──────────────────────────────────────────────────┘   │
+│ ┌──────────────────────────────────────────────────┐   │
+│ │ Google PageSpeed Insights (API)                  │   │
+│ │ - Official Lighthouse audit                      │   │
+│ │ - Opportunities (what to fix)                    │   │
+│ │ - Official performance score (0-100)             │   │
+│ │ - Duration: ~30s per device                      │   │
+│ └──────────────────────────────────────────────────┘   │
+└────────────────────┬────────────────────────────────────┘
+                     │
+┌────────────────────▼────────────────────────────────────┐
+│ PERSISTENCE LAYER (SQLite)                              │
+│ - perf_runs (test execution records)                    │
+│ - perf_audits (Core Web Vitals metrics)                 │
+│ - perf_opportunities (Lighthouse opportunities)         │
+│ - perf_resources (HAR resource list)                    │
+└─────────────────────────────────────────────────────────┘
+```
+
+## Data Flow: User Clicks "Test Now"
+
+```
+1. User clicks "Test Now" button
+   ↓
+2. HTMX POST to /performance/api/perf/test
+   Body: { site_id: 3, url: "https://...", 
+           engines: ["sitespeed", "psi"], 
+           devices: ["mobile", "desktop"] }
+   ↓
+3. FastAPI endpoint (performance.py:api_perf_test)
+   ├─ Validate inputs
+   ├─ Spawn background task (ThreadPool)
+   ├─ Return 202 (Accepted) immediately
+   ↓
+4. Background task runs src/perf/runner.py:run_full_test()
+   ├─ For each engine in engines:
+   │  └─ For each device in devices:
+   │     ├─ If sitespeed:
+   │     │  └─ Call src/perf/sitespeed.py:run_sitespeed_test()
+   │     │     ├─ Docker: sitespeedio/sitespeed.io:40.4.0
+   │     │     ├─ 3 runs (N=3), median metrics
+   │     │     ├─ Parse HAR: /tmp/sitespeed-output/{run_id}/.../browsertime.har
+   │     │     ├─ Extract: lcp_ms, fcp_ms, cls, tbt_ms, ttfb_ms, page weight
+   │     │     └─ Approximate score from CWV thresholds
+   │     │
+   │     └─ If psi:
+   │        └─ Call src/perf/psi.py:run_psi_test()
+   │           ├─ HTTP GET to googleapis.com/pagespeedonline/v5/runPagespeed
+   │           ├─ Parse Lighthouse audits from response
+   │           ├─ Extract: opportunities (what to fix + savings)
+   │           └─ Return official performance_score
+   │
+   ├─ For each result: _persist_run() writes to database
+   │  ├─ perf_runs (engine, device, success, error_message)
+   │  ├─ perf_audits (performance_score, all CWV metrics)
+   │  ├─ perf_opportunities (opportunity_key, savings_ms, savings_bytes)
+   │  └─ perf_resources (url, type, size, load time)
+   │
+   └─ Log completion summary
+   
+   ↓
+5. User refreshes dashboard after ~90s
+   ↓
+6. FastAPI queries database
+   ├─ _portfolio_rows() — SELECT latest score per site
+   ├─ _site_url_rows() — SELECT latest score per URL
+   ├─ _site_latest_audit() — SELECT full metrics for latest run
+   ├─ _site_trend() — SELECT weekly AVG scores (12 weeks)
+   ├─ _site_opportunities() — SELECT top PSI opportunities
+   └─ _site_slow_resources() — SELECT top 10 slowest resources
+   
+   ↓
+7. Jinja2 templates render HTML with results
+   ├─ performance.html (portfolio scorecard)
+   └─ performance_site.html (per-site detail with CWV, trend, opps)
+   
+   ↓
+8. User sees updated scores, metrics, trend chart, opportunities
+```
+
+## Component Breakdown
+
+### 1. Sitespeed.io Testing (src/perf/sitespeed.py)
+
+**Purpose:** Capture real browser performance metrics via headless Chrome
+
+**Process:**
+```python
+run_sitespeed_test(url="https://rds.ink/endangered", device="mobile")
+├─ Generate unique run_id (UUID)
+├─ Create output dir: /tmp/sitespeed-output/{run_id}/
+├─ Build Docker command:
+│  docker run --rm \
+│  -v /tmp/sitespeed-output:/sitespeed.io \
+│  sitespeedio/sitespeed.io:40.4.0 \
+│  {url} \
+│  --mobile --connectivity 4g \      (if device=="mobile")
+│  --n 3 \                           (3 runs, median taken)
+│  --outputFolder /sitespeed.io/{run_id} \
+│  --summary --summary-detail
+│
+├─ Wait for Docker container to complete (~60s)
+├─ Parse HAR: /sitespeed.io/{run_id}/.../browsertime.har
+│  ├─ Extract pages[]._googleWebVitals (LCP, FCP, CLS, TTFB)
+│  ├─ Extract pages[]._cpu.longTasks.totalBlockingTime (TBT)
+│  ├─ Compute medians across N=3 runs
+│  ├─ Extract resource list (URL, type, size, timing)
+│  └─ Categorise resources (script, stylesheet, image, font, xhr, other)
+│
+├─ Calculate page weight breakdown:
+│  ├─ total_bytes = sum of all response bodySize
+│  ├─ image_bytes = sum where Content-Type contains "image"
+│  ├─ js_bytes = sum where Content-Type contains "javascript"
+│  ├─ css_bytes = sum where Content-Type contains "css"
+│  └─ font_bytes = sum where Content-Type contains "font"
+│
+├─ Approximate performance score:
+│  └─ _approx_score(lcp_ms, fcp_ms, cls, tbt_ms, ttfb_ms)
+│     (See 02-score-calculation.md)
+│
+└─ Return dict:
+   {
+     "success": true,
+     "device": "mobile",
+     "performance_score": 77,    ← Approximated, not Lighthouse
+     "metrics": {
+       "lcp_ms": null,
+       "fcp_ms": 2116,
+       "cls": 0.0,
+       "tbt_ms": 1807,
+       "ttfb_ms": 144,
+       ...
+     },
+     "resources": [
+       {"resource_url": "...", "size_bytes": 12345, ...},
+       ...
+     ]
+   }
+```
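+
+The page weight breakdown above can be sketched as follows. This is a minimal illustration, not the code in `sitespeed.py`: the helper name `page_weight` and the sample HAR are hypothetical, and it assumes the standard HAR layout (`log.entries[].response.bodySize` and `response.content.mimeType`).
+
+```python
+def page_weight(har: dict) -> dict:
+    """Sum response body sizes by content type (hypothetical helper)."""
+    totals = {"total_bytes": 0, "image_bytes": 0, "js_bytes": 0,
+              "css_bytes": 0, "font_bytes": 0}
+    # Substring of the MIME type -> bucket it contributes to.
+    buckets = {"image": "image_bytes", "javascript": "js_bytes",
+               "css": "css_bytes", "font": "font_bytes"}
+    for entry in har["log"]["entries"]:
+        response = entry["response"]
+        # HAR uses -1 when the body size is unknown; count that as 0.
+        size = max(response.get("bodySize", 0), 0)
+        mime = response.get("content", {}).get("mimeType", "").lower()
+        totals["total_bytes"] += size
+        for needle, bucket in buckets.items():
+            if needle in mime:
+                totals[bucket] += size
+                break
+    return totals
+
+
+# Made-up entries, for illustration only.
+sample_har = {"log": {"entries": [
+    {"response": {"bodySize": 1200, "content": {"mimeType": "text/css"}}},
+    {"response": {"bodySize": 5000, "content": {"mimeType": "application/javascript"}}},
+    {"response": {"bodySize": 9000, "content": {"mimeType": "image/webp"}}},
+    {"response": {"bodySize": -1, "content": {"mimeType": "font/woff2"}}},
+]}}
+weights = page_weight(sample_har)
+```
+
+Treating an unknown size (`-1`) as zero keeps the totals from going negative; the real parser may handle that case differently.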
+
+**Key note:** Sitespeed v40 does NOT run Lighthouse by default, so its performance score is approximated from CWV thresholds. For an official Lighthouse score, use PSI.
+
+### 2. PageSpeed Insights Testing (src/perf/psi.py)
+
+**Purpose:** Get official Google Lighthouse audit + opportunities
+
+**Process:**
+```python
+run_psi_test(url="https://rds.ink/endangered", device="mobile")
+├─ Build API request:
+│  GET https://www.googleapis.com/pagespeedonline/v5/runPagespeed
+│  ?url={url}&strategy=mobile&category=performance&key={api_key}
+│
+├─ Wait for Google to run Lighthouse (~30s)
+├─ Parse response.lighthouseResult:
+│  ├─ Extract categories.performance.score (0-1) → multiply by 100
+│  ├─ Extract audits[]:
+│  │  ├─ "largest-contentful-paint" → lcp_ms
+│  │  ├─ "first-contentful-paint" → fcp_ms
+│  │  ├─ "cumulative-layout-shift" → cls
+│  │  ├─ "total-blocking-time" → tbt_ms
+│  │  ├─ "interaction-to-next-paint" → inp_ms
+│  │  └─ "server-response-time" → ttfb_ms
+│  │
+│  └─ For each audit with details.type == "opportunity":
+│     ├─ Extract display title
+│     ├─ Extract overallSavingsMs (potential speed gain)
+│     ├─ Extract overallSavingsBytes (potential size reduction)
+│     └─ Store for recommendations
+│
+└─ Return dict:
+   {
+     "success": true,
+     "device": "mobile",
+     "performance_score": 95,    ← Official Lighthouse
+     "metrics": { ... },         ← Same structure as sitespeed
+     "opportunities": [
+       {
+         "opportunity_key": "unused-javascript",
+         "display_label": "Reduce unused JavaScript",
+         "savings_ms": 400,      ← potential gain
+         "savings_bytes": 150000
+       },
+       ...
+     ]
+   }
+```
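+
+The response-parsing steps above could look roughly like this. It is a sketch under assumptions, not the code in `psi.py`: the function name `parse_psi` and the sample payload are hypothetical, and only a few fields of the PSI v5 response are shown.
+
+```python
+def parse_psi(response: dict) -> dict:
+    """Pull the score and opportunities out of a PSI v5 response (sketch)."""
+    lhr = response["lighthouseResult"]
+    audits = lhr["audits"]
+    raw_score = lhr["categories"]["performance"]["score"]  # 0-1, may be None
+    opportunities = []
+    for key, audit in audits.items():
+        details = audit.get("details") or {}
+        if details.get("type") != "opportunity":
+            continue
+        opportunities.append({
+            "opportunity_key": key,
+            "display_label": audit.get("title"),
+            "savings_ms": details.get("overallSavingsMs", 0),
+            "savings_bytes": details.get("overallSavingsBytes", 0),
+        })
+    # Biggest potential time savings first.
+    opportunities.sort(key=lambda o: o["savings_ms"], reverse=True)
+    return {
+        "performance_score": None if raw_score is None else round(raw_score * 100),
+        "lcp_ms": audits.get("largest-contentful-paint", {}).get("numericValue"),
+        "opportunities": opportunities,
+    }
+
+
+# Trimmed, made-up payload in the shape PSI returns.
+sample_response = {"lighthouseResult": {
+    "categories": {"performance": {"score": 0.95}},
+    "audits": {
+        "largest-contentful-paint": {"title": "Largest Contentful Paint",
+                                     "numericValue": 1850.0},
+        "unused-javascript": {
+            "title": "Reduce unused JavaScript",
+            "details": {"type": "opportunity",
+                        "overallSavingsMs": 400,
+                        "overallSavingsBytes": 150000},
+        },
+    },
+}}
+parsed = parse_psi(sample_response)
+```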
+
+### 3. Test Orchestration (src/perf/runner.py)
+
+**Purpose:** Run all engines × devices combinations and persist results
+
+```python
+run_full_test(
+  site_id=3,
+  url="https://rds.ink/endangered",
+  engines=["sitespeed", "psi"],
+  devices=["mobile", "desktop"]
+)
+├─ For engine in ["sitespeed", "psi"]:
+│  └─ For device in ["mobile", "desktop"]:
+│     ├─ Run the appropriate test (sitespeed or psi)
+│     ├─ Call _persist_run() to write results:
+│     │  ├─ INSERT perf_runs (site_id, url, engine, device, ...)
+│     │  ├─ INSERT perf_audits (performance_score, all metrics)
+│     │  ├─ INSERT perf_opportunities (for each opportunity)
+│     │  └─ INSERT perf_resources (for each resource)
+│     │
+│     └─ Log result (success or error)
+│
+└─ Return summary:
+   {
+     "url": "https://...",
+     "results": {
+       "sitespeed_mobile":  { "run_id": 1, "score": 77, "success": true },
+       "sitespeed_desktop": { "run_id": 2, "score": 82, "success": true },
+       "psi_mobile":        { "run_id": 3, "score": 95, "success": true },
+       "psi_desktop":       { "run_id": 4, "score": 93, "success": true }
+     }
+   }
+```
+
+### 4. Portfolio Sweep (src/perf/batch.py)
+
+**Purpose:** Weekly automated test of all sites × top URLs
+
+**Scheduled:** Monday 04:00 AEST (hard-coded in template)
+
+```python
+run_weekly_perf_sweep(db)
+├─ For each site in SITES (13 sites):
+│  ├─ resolve_url_list(domain):
+│  │  ├─ Get homepage: https://{domain}/
+│  │  ├─ Query ranking_snapshots last 30 days
+│  │  ├─ Get top 5 URLs by impressions
+│  │  └─ Return: [homepage, url1, url2, url3, url4, url5]  (6 URLs max)
+│  │
+│  ├─ For each URL:
+│  │  ├─ Call run_full_test(site_id, url, engines=["sitespeed", "psi"], devices=["mobile", "desktop"])
+│  │  └─ Wait 5 seconds (inter-URL delay to avoid rate limits)
+│  │
+│  └─ Log site completion (e.g., "dayboro.au: 6 URLs × 4 runs = 24 tests complete")
+│
+└─ Total: ~13 sites × 6 URLs × 4 runs = ~312 tests, ~5 hours
+```
+
+### 5. Database Persistence (src/models/perf.py)
+
+Four tables, one per concept:
+
+| Table | Purpose | Key Fields |
+|-------|---------|-----------|
+| `perf_runs` | Test execution records | site_id, url, engine, device, completed_at, success |
+| `perf_audits` | Core Web Vitals metrics | perf_run_id, performance_score, lcp_ms, cls, tbt_ms, etc. |
+| `perf_opportunities` | Lighthouse audit opportunities | perf_run_id, opportunity_key, savings_ms, savings_bytes |
+| `perf_resources` | HAR resource list | perf_run_id, resource_url, type, size_bytes, duration_ms |
+
+Each perf_run can have:
+- 1 perf_audit (metrics)
+- 0+ perf_opportunities (if PSI ran)
+- 0+ perf_resources (if HAR captured)
+
+### 6. Web Interface (templates/performance.html, performance_site.html)
+
+**Portfolio view** (performance.html):
+- Table of all sites
+- Latest mobile + desktop scores per site
+- Slowest URL per site
+- Last tested timestamp
+- "Run portfolio sweep now" button (HTMX trigger)
+
+**Per-site view** (performance_site.html):
+- CWV metrics for latest run (mobile + desktop side-by-side)
+- 12-week trend sparkline chart (two bars per week)
+- Top 5 opportunities from PSI
+- Top 10 slowest resources from sitespeed
+- Per-URL breakdown table with test buttons
+
+## Why Two Engines?
+
+| Aspect | Sitespeed | PSI |
+|--------|-----------|-----|
+| **What it measures** | Real browser (Browsertime) + HAR waterfall | Official Lighthouse audit |
+| **Speed** | ~60s per device | ~30s per device |
+| **Score source** | Approximated from CWV thresholds | Official Google Lighthouse |
+| **Opportunities** | None (no Lighthouse) | Yes (full audit) |
+| **Resource list** | Yes (full HAR) | No (limited) |
+| **Use case** | Trend tracking, resource diagnosis | Official benchmarking, opportunities |
+
+**Strategy:** Run both engines on every test. Sitespeed provides the waterfall and trend data; PSI provides the official score and the list of what to fix.
+
+---
+
+See also:
+- [Score Calculation](02-score-calculation.md) — How the 0-100 score is derived
+- [Testing Engines](04-testing-engines.md) — Deep dive into each engine
+- [Database Schema](../code-refs/database-schema.md) — All fields, all relationships
--- a/docs/02-score-calculation.md
+++ b/docs/02-score-calculation.md
@@ -0,0 +1,268 @@
+# Performance Score Calculation
+
+## The Formula
+
+```
+Performance Score = Average of the available metric scores (up to five)
+
+Score = (LCP_score + FCP_score + CLS_score + TBT_score + TTFB_score) / N
+
+where N is the number of metrics with data (NULL metrics are skipped)
+and each metric_score is calculated from thresholds:
+  if metric ≤ good_threshold   → metric_score = 100
+  if metric ≥ poor_threshold   → metric_score = 30
+  if between                   → metric_score = 100 - ((metric - good) / (poor - good)) × 70
+```
+
+## Example: rds.ink/endangered = 77
+
+From the database (sitespeed mobile run on 2026-05-13):
+
+```
+LCP:  NULL           → skipped (no data)
+FCP:  2,116ms        → score calculation:
+      good=1,800  poor=3,000
+      2,116 is between good and poor
+      ratio = (2116 - 1800) / (3000 - 1800) = 316 / 1200 = 0.263
+      score = 100 - (0.263 × 70) = 100 - 18.4 = 81.6 → 81 points (truncated) ✓
+
+CLS:  0.0            → score = 100 (well below good threshold of 0.1) ✓
+
+TBT:  1,807ms        → score calculation:
+      good=200  poor=600
+      1,807 >> poor threshold
+      ratio = (1807 - 200) / (600 - 200) = 1607 / 400 = 4.02
+      Since ratio > 1: score = capped at 30 points ✗ CRITICAL
+
+TTFB: 144ms          → score = 100 (well below good threshold of 800ms) ✓
+
+Average = (81 + 100 + 30 + 100) / 4 = 77.75 → 77 (database value)
+                                              ↑ (int() truncates, it does not round)
+```
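+
+The arithmetic can be checked with a few lines of Python. This is a minimal sketch of the threshold rule from "The Formula"; the helper name `metric_score` is hypothetical, and truncation with `int()` is assumed to mirror `_approx_score`.
+
+```python
+def metric_score(value: float, good: float, poor: float) -> int:
+    """Score one metric: 100 at or below good, 30 at or above poor."""
+    if value <= good:
+        return 100
+    if value >= poor:
+        return 30
+    ratio = (value - good) / (poor - good)
+    return int(100 - ratio * 70)  # int() truncates toward zero
+
+
+# rds.ink/endangered, sitespeed mobile run (LCP is NULL, so it is skipped)
+fcp = metric_score(2116, 1800, 3000)   # 81
+cls = metric_score(0.0, 0.1, 0.25)     # 100
+tbt = metric_score(1807, 200, 600)     # 30
+ttfb = metric_score(144, 800, 1800)    # 100
+
+overall = int((fcp + cls + tbt + ttfb) / 4)  # 77.75 truncates to 77
+```
+
+Because both steps truncate, the result lands on 77 exactly, which matches the stored value.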
+
+**Bottom line:** TBT (Total Blocking Time) of 1,807ms is about **9 times** the 200ms "good" threshold and 3 times the 600ms "poor" threshold. This single metric drops the score from ~95 to 77.
+
+## Thresholds (Hard-Coded)
+
+**File:** `/home/help4bis/seo-intel/src/perf/sitespeed.py` lines 53–60
+
+```python
+_THRESHOLDS = {
+    # (good_max, poor_min)
+    "lcp":  (2500,  4000),  # ms
+    "fcp":  (1800,  3000),  # ms
+    "cls":  (0.1,   0.25),  # unitless
+    "tbt":  (200,   600),   # ms
+    "ttfb": (800,   1800),  # ms
+}
+```
+
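+These thresholds are not arbitrary: they match the good/poor boundaries Google publishes for each metric (the Core Web Vitals thresholds for LCP and CLS, plus the equivalent guidance for FCP, TBT, and TTFB). Lighthouse itself scores each metric on a weighted log-normal curve, so the linear interpolation used here only approximates it.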
+
+## Metric-by-Metric Breakdown
+
+### 1. LCP (Largest Contentful Paint)
+
+**What it measures:** How long before the largest visible element (image, heading, paragraph) appears on screen.
+
+**Why it matters:** Users need to see that something is happening.
+
+**Thresholds:**
+- **Good:** ≤ 2,500ms (2.5 seconds)
+- **Poor:** ≥ 4,000ms (4 seconds)
+
+**rds.ink status:** Not measured (NULL)
+
+**Typical fixes:**
+- Optimize server response time (TTFB)
+- Defer non-critical JavaScript
+- Lazy-load below-the-fold images (never the LCP element itself)
+- Use a CDN for images
+
+---
+
+### 2. FCP (First Contentful Paint)
+
+**What it measures:** How long before ANY content (text, image, non-white background) appears.
+
+**Why it matters:** The first visual indication that the page is loading.
+
+**Thresholds:**
+- **Good:** ≤ 1,800ms (1.8 seconds)
+- **Poor:** ≥ 3,000ms (3 seconds)
+
+**rds.ink status:** 2,116ms = AMBER (81/100)
+
+The page shows content after 2.1 seconds, which is acceptable but slower than ideal. The likely cause is render-blocking scripts delaying the first paint.
+
+**Typical fixes:**
+- Reduce server response time (TTFB)
+- Defer non-critical JavaScript
+- Inline critical CSS
+- Reduce DOM size
+
+---
+
+### 3. CLS (Cumulative Layout Shift)
+
+**What it measures:** How much the page layout jumps around after initial load.
+
+**Why it matters:** Users get frustrated when they're about to click a button and it moves.
+
+**Thresholds:**
+- **Good:** ≤ 0.1 (unitless layout-shift score)
+- **Poor:** ≥ 0.25
+
+**rds.ink status:** 0.0 = PERFECT ✓
+
+The page does NOT move after load. Great job. This metric is not the problem.
+
+**Typical fixes:**
+- Set explicit dimensions on images
+- Avoid inserting content above existing content
+- Use transform animations instead of position changes
+
+---
+
+### 4. TBT (Total Blocking Time) 🔴 **THE KILLER METRIC**
+
+**What it measures:** How long JavaScript blocks the main thread, preventing the browser from responding to user input (clicks, scrolls, etc.).
+
+**Why it matters:** A page with 1.8 seconds of TBT feels frozen to the user.
+
+**Thresholds:**
+- **Good:** ≤ 200ms (0.2 seconds)
+- **Poor:** ≥ 600ms (0.6 seconds)
+
+**rds.ink status:** 1,807ms = CRITICAL ❌
+
+After first paint, long JavaScript tasks block the main thread for a total of **1.8 seconds**. During this time:
+- User clicks "Add to cart" → Nothing happens
+- User tries to scroll → Page is frozen
+- User tries to open menu → Unresponsive
+
+**Impact on score:** 30/100 points (single worst metric)
+
+**Root cause:** Likely WooCommerce plugins, Elementor scripts, and gallery libraries (Lightbox, PhotoSwipe, Slick, etc.) all executing during page load.
+
+**Typical fixes (in priority order):**
+1. **Defer non-critical JavaScript** (add `defer` attribute to `<script>` tags)
+2. **Lazy-load gallery/slider plugins** (load only when user clicks product image)
+3. **Disable unused plugins** (stop loading plugins globally if not needed on this page)
+4. **Code-split heavy libraries** (load only what's visible above the fold)
+5. **Minify/combine JavaScript** (reduce parsing overhead)
+
+---
+
+### 5. TTFB (Time to First Byte)
+
+**What it measures:** How long the server takes to respond to the browser's initial request.
+
+**Why it matters:** Everything else depends on this. You can't optimize what you haven't received yet.
+
+**Thresholds:**
+- **Good:** ≤ 800ms
+- **Poor:** ≥ 1,800ms
+
+**rds.ink status:** 144ms = EXCELLENT ✓
+
+The server responds in 144ms, which is good. This is NOT the bottleneck.
+
+**Typical fixes:**
+- Optimise server-side code (database queries, etc.)
+- Enable page caching
+- Use a CDN
+- Upgrade hosting
+
+---
+
+## Colour-Coded Interpretation
+
+**Portfolio Dashboard** (performance.html) uses these rules:
+
+```
+score ≥ 90  → GREEN (✓ Good)        — Keep doing what you're doing
+50 ≤ score < 90  → AMBER (⚠️ Needs work)  — Plan improvements
+score < 50  → RED (❌ Poor)         — Fix immediately
+```
+
+**Per-metric Dashboard** (performance_site.html) uses thresholds:
+
+```
+Metric ≤ good_threshold  → GREEN  (good)
+good < metric < poor      → AMBER (needs work)
+Metric ≥ poor_threshold   → RED   (poor)
+```
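+
+Both colour rules are simple enough to express as two small functions. This is an illustrative sketch only: the templates implement the rules in their own way, and the names `score_colour` and `metric_colour` are hypothetical.
+
+```python
+def score_colour(score: int) -> str:
+    """Portfolio rule: colour band for an overall 0-100 score."""
+    if score >= 90:
+        return "green"
+    if score >= 50:
+        return "amber"
+    return "red"
+
+
+def metric_colour(value: float, good: float, poor: float) -> str:
+    """Per-metric rule: lower is better for every metric tracked here."""
+    if value <= good:
+        return "green"
+    if value >= poor:
+        return "red"
+    return "amber"
+
+
+overall_band = score_colour(77)               # "amber"
+tbt_band = metric_colour(1807, 200, 600)      # "red"
+fcp_band = metric_colour(2116, 1800, 3000)    # "amber"
+```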
+
+## Score Algorithm (Python)
+
+**File:** `/home/help4bis/seo-intel/src/perf/sitespeed.py` lines 63–96
+
+```python
+import statistics
+
+
+def _approx_score(lcp_ms, fcp_ms, cls_val, tbt_ms, ttfb_ms) -> int | None:
+    """Compute a rough 0–100 performance score from CWV values."""
+    vitals = {
+        "lcp":  lcp_ms,
+        "fcp":  fcp_ms,
+        # CLS is unitless; compare it directly against the 0.1 / 0.25 thresholds
+        "cls":  cls_val,
+        "tbt":  tbt_ms,
+        "ttfb": ttfb_ms,
+    }
+
+    scores = []
+    for key, val in vitals.items():
+        if val is None:
+            continue  # skip nulls (e.g., LCP if not measured)
+
+        good, poor = _THRESHOLDS[key]
+
+        if val <= good:
+            scores.append(100)
+        elif val >= poor:
+            scores.append(30)
+        else:
+            # linear interpolation between 100 (good) and 30 (poor), truncated
+            ratio = (val - good) / (poor - good)
+            scores.append(int(100 - ratio * 70))
+
+    # int() truncates the mean (77.75 becomes 77)
+    return int(statistics.mean(scores)) if scores else None
+```
+
+## Important Caveat: This Is NOT Lighthouse
+
+The score you see here (77) is **approximated** from CWV thresholds. It's **not** the official Google Lighthouse score.
+
+**Why the approximation?**
+- Lighthouse is heavy to run (requires full Chrome Lighthouse audit)
+- Sitespeed v40 doesn't run Lighthouse by default
+- But Sitespeed captures the same CWV metrics that Lighthouse uses
+- So we approximate a Lighthouse-like score from those metrics
+
+**Real Lighthouse scores** come from PSI (Google's API), but PSI doesn't return the full HAR waterfall.
+
+**Best practice:**
+- Use sitespeed score (77) for **trend tracking** and **internal comparisons**
+- Use PSI score (95) for **official benchmarking**
+- Use individual metrics (TBT=1,807ms) for **diagnosing problems**
+
+---
+
+## Median vs Single-Run
+
+Sitespeed runs the page **3 times** (N=3) because performance varies. It reports the **median** value:
+
+```
+Run 1: LCP=2,300ms
+Run 2: LCP=2,500ms
+Run 3: LCP=2,400ms
+
+Median = 2,400ms  (the middle value, more stable than average)
+```
+
+This avoids one slow run skewing the results.
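+
+A quick way to see the effect, using Python's standard `statistics` module. The second set of run values is made up to show an outlier; it is not real measurement data.
+
+```python
+import statistics
+
+runs = [2300, 2500, 2400]            # LCP in ms, from the example above
+median_lcp = statistics.median(runs)  # 2400
+
+# Hypothetical case: one run hits a slow network or a cold cache.
+runs_with_outlier = [2300, 9000, 2400]
+median_outlier = statistics.median(runs_with_outlier)  # still 2400
+mean_outlier = statistics.mean(runs_with_outlier)      # pulled up to about 4567
+```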
+
+---
+
+See also:
+- [Metrics Reference](03-metrics-reference.md) — Deeper dive into each metric
+- [Testing Engines](04-testing-engines.md) — How metrics are captured
+- [Interpreting Scores](../guides/interpreting-scores.md) — What to do with your score