help4bis 76523db177 Add comprehensive metrics and engines documentation

Complete the documentation suite with:
- Deep-dive metrics reference (LCP, FCP, CLS, TBT, TTFB)
- Detailed testing engines comparison (Sitespeed vs PSI)
- Why TBT is the killer metric for rds.ink
- How to fix each metric using Hummingbird
- Score differences and when to use each engine

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

2026-05-14 05:58:12 +10:00

Testing Engines: Sitespeed vs PSI

Detailed comparison of the two independent engines used to measure performance.

Overview

The seo-intel system uses two different testing engines in parallel:

Sitespeed.io — Real-browser testing with HAR waterfall
Google PageSpeed Insights (PSI) — Official Lighthouse audits

This dual approach captures:

Lab metrics from a real, instrumented browser (Sitespeed)
Official Google scores + recommendations (PSI)

Sitespeed.io

What It Is

An open-source performance testing framework that drives a headless Chrome browser to measure page performance under controlled, repeatable conditions.

Docker image: sitespeedio/sitespeed.io:40.4.0

How It Works

1. User clicks "Test" for https://rds.ink/endangered
    ↓
2. Sitespeed starts Docker container with headless Chrome
    ↓
3. Browser loads page (3 times, N=3)
    - Run 1: LCP=2300ms, FCP=2100ms, TBT=1800ms
    - Run 2: LCP=2500ms, FCP=2120ms, TBT=1850ms
    - Run 3: LCP=2400ms, FCP=2110ms, TBT=1780ms
    ↓
4. Sitespeed takes the MEDIAN of the three runs
    - LCP = 2400ms  (middle value)
    - FCP = 2110ms
    - TBT = 1800ms
    ↓
5. Browser exports HAR (HTTP Archive) file
    - Contains: every resource, timing, size
    - JSON file with full waterfall
    ↓
6. Sitespeed parses HAR
    - Extracts CWV metrics
    - Calculates page weight
    - Identifies resources
    ↓
7. Approximates 0-100 score from thresholds
    - NOT official Lighthouse (no Lighthouse plugin)
    - But uses same CWV thresholds as Lighthouse
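
The median step (step 4) can be sketched in a few lines. This is a minimal illustration, not the seo-intel code: the `median_metrics` helper is a hypothetical name, and the numbers are the three example runs listed above.

```python
from statistics import median

# Per-run metrics in milliseconds, copied from the three example runs above.
runs = [
    {"lcp": 2300, "fcp": 2100, "tbt": 1800},
    {"lcp": 2500, "fcp": 2120, "tbt": 1850},
    {"lcp": 2400, "fcp": 2110, "tbt": 1780},
]


def median_metrics(runs):
    """Return the per-metric median across all runs (hypothetical helper)."""
    return {name: median(run[name] for run in runs) for name in runs[0]}


summary = median_metrics(runs)
print(summary)  # {'lcp': 2400, 'fcp': 2110, 'tbt': 1800}
```

Taking the median per metric means the reported LCP, FCP, and TBT may come from different runs, which is why a single outlier run has little effect on the summary.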

Metrics Captured

Core Web Vitals and related lab metrics:

LCP (Largest Contentful Paint)
FCP (First Contentful Paint)
CLS (Cumulative Layout Shift)
TBT (Total Blocking Time)
TTFB (Time to First Byte)
INP (Interaction to Next Paint) — Not captured in v40

Page breakdown:

Total page size (bytes)
Image bytes
JavaScript bytes
CSS bytes
Font bytes
Request count

Resource list:

Every HTTP request made
URL, type (script/image/stylesheet/font/xhr)
Size, timing
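
The page breakdown above can be derived from the HAR entries alone. The sketch below assumes the standard HAR 1.2 layout (`log.entries[].response.bodySize` and `response.content.mimeType`); the sample data, the `classify` rules, and the `page_breakdown` helper are illustrative assumptions, not the seo-intel parser.

```python
from collections import defaultdict

# A tiny hand-written HAR fragment; a real file holds one entry per request.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://example.com/"},
             "response": {"bodySize": 12000,
                          "content": {"mimeType": "text/html"}}},
            {"request": {"url": "https://example.com/app.js"},
             "response": {"bodySize": 150000,
                          "content": {"mimeType": "application/javascript"}}},
            {"request": {"url": "https://example.com/hero.webp"},
             "response": {"bodySize": 80000,
                          "content": {"mimeType": "image/webp"}}},
            {"request": {"url": "https://example.com/site.css"},
             "response": {"bodySize": 20000,
                          "content": {"mimeType": "text/css; charset=utf-8"}}},
        ]
    }
}


def classify(mime_type):
    """Map a MIME type to a coarse resource type (assumed rules)."""
    mime = mime_type.split(";")[0].strip().lower()
    if mime.startswith("image/"):
        return "image"
    if "javascript" in mime:
        return "script"
    if mime == "text/css":
        return "stylesheet"
    if "font" in mime:
        return "font"
    return "other"


def page_breakdown(har):
    """Sum response bytes per resource type and count requests."""
    entries = har["log"]["entries"]
    bytes_by_type = defaultdict(int)
    for entry in entries:
        response = entry["response"]
        kind = classify(response["content"].get("mimeType", ""))
        # HAR uses -1 for an unknown body size, so clamp to zero.
        bytes_by_type[kind] += max(response.get("bodySize", 0), 0)
    return {
        "total_bytes": sum(bytes_by_type.values()),
        "request_count": len(entries),
        "by_type": dict(bytes_by_type),
    }


breakdown = page_breakdown(sample_har)
print(breakdown["total_bytes"], breakdown["request_count"])  # 262000 4
```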

Advantages

✅ Real browser — Chrome's actual instrumentation
✅ Full HAR — See every resource, identify bottlenecks
✅ Consistent — Can run anytime, same environment
✅ Resource timing — Measure individual script/image load times
✅ Median metrics — Run 3 times, use median (more stable than single run)

Disadvantages

❌ No Lighthouse — Score is approximated, not official
❌ Slower — 60 seconds per device
❌ No opportunities — Doesn't tell you "fix this"
❌ No INP metric — v40 doesn't capture Interaction to Next Paint
❌ Approximate score — Different algorithm than real Lighthouse

Device Modes

Mobile mode (default):

--mobile --connectivity 4g

Emulates Moto G4 device (412x732 viewport)
4G throttling (simulates real 4G speeds)
Mobile user agent
Duration: ~60s

Desktop mode:

--browsertime.connectivity native --browsertime.viewPort 1366x768 --browsertime.userAgent "Chrome Windows"

1366x768 viewport (typical laptop)
Native connectivity (no throttling)
Desktop Chrome user agent
Duration: ~60s
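
One way to keep the two flag sets in a single place is a small lookup table. The flag strings below are copied from this section; the `DEVICE_FLAGS` table and `device_args` helper are hypothetical names used only for illustration.

```python
import shlex

# Flag strings copied from the Device Modes section above.
DEVICE_FLAGS = {
    "mobile": "--mobile --connectivity 4g",
    "desktop": (
        "--browsertime.connectivity native "
        "--browsertime.viewPort 1366x768 "
        '--browsertime.userAgent "Chrome Windows"'
    ),
}


def device_args(device):
    """Return the flags for a device mode as an argv-style list."""
    try:
        # shlex.split keeps the quoted user agent together as one argument.
        return shlex.split(DEVICE_FLAGS[device])
    except KeyError:
        raise ValueError(f"unknown device mode: {device!r}") from None


mobile_args = device_args("mobile")
desktop_args = device_args("desktop")
print(mobile_args)       # ['--mobile', '--connectivity', '4g']
print(desktop_args[-1])  # Chrome Windows
```

Splitting with `shlex` rather than `str.split` matters for the desktop mode, where the user agent value contains a space.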

Output Files

Stored at: /tmp/sitespeed-output/{run_id}/pages/{domain}/data/

browsertime.har — Full HTTP Archive (JSON)
browsertime.json — Detailed metrics (also JSON)
screenShots/ — Video/screenshots of page load

Score Calculation (Sitespeed v40)

Since Sitespeed v40 doesn't run Lighthouse, it approximates the score:

# src/perf/sitespeed.py:_approx_score()

_THRESHOLDS = {
    "lcp":  (2500,  4000),
    "fcp":  (1800,  3000),
    "cls":  (0.1,   0.25),
    "tbt":  (200,   600),
    "ttfb": (800,   1800),
}

# For each metric:
if value <= good:
    score = 100
elif value >= poor:
    score = 30
else:
    ratio = (value - good) / (poor - good)
    score = 100 - (ratio * 70)

# Final score = average of all metric scores
performance_score = mean([lcp_score, fcp_score, cls_score, tbt_score, ttfb_score])

Important: This is NOT the real Lighthouse score. It's an approximation for trend tracking.
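
A runnable version of the interpolation, with a worked example. This is a sketch that follows the thresholds and formula shown above; the function names are placeholders, and every input except TBT=1807 is a made-up value chosen to keep the arithmetic easy to check.

```python
from statistics import mean

# (good, poor) thresholds: milliseconds for everything except CLS (unitless).
THRESHOLDS = {
    "lcp": (2500, 4000),
    "fcp": (1800, 3000),
    "cls": (0.1, 0.25),
    "tbt": (200, 600),
    "ttfb": (800, 1800),
}


def metric_score(value, good, poor):
    """Score one metric: 100 at or below good, 30 at or above poor, linear between."""
    if value <= good:
        return 100.0
    if value >= poor:
        return 30.0
    ratio = (value - good) / (poor - good)
    return 100.0 - ratio * 70.0


def approx_score(metrics):
    """Average the per-metric scores with equal weights."""
    return mean(
        metric_score(metrics[name], good, poor)
        for name, (good, poor) in THRESHOLDS.items()
    )


# Hypothetical page: LCP halfway between good and poor, TBT far past poor.
example = {"lcp": 3250, "fcp": 1500, "cls": 0.05, "tbt": 1807, "ttfb": 600}
score = approx_score(example)
print(score)  # 79.0  -> (65 + 100 + 100 + 30 + 100) / 5
```

Because each metric is floored at 30 and weighted equally, a single very poor metric can lower the average by at most 14 points, which is one reason this score moves differently from Lighthouse's.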

Google PageSpeed Insights (PSI)

What It Is

Google's official Lighthouse audit service. You submit a URL and Google runs Lighthouse against it.

API endpoint: https://www.googleapis.com/pagespeedonline/v5/runPagespeed

How It Works

1. seo-intel calls Google's API
    GET /pagespeedonline/v5/runPagespeed?url=...&strategy=mobile
    ↓
2. Google spins up Lighthouse
    - Runs the categories named in the request (performance, accessibility, best practices, SEO)
    - With no category parameter the API runs only "performance", which is the one we care about
    ↓
3. Lighthouse runs and scores 0-100
    - This is the OFFICIAL score
    - Uses Google's real Lighthouse algorithm
    ↓
4. Lighthouse audit results returned
    - Performance score (0-100)
    - All audit items (100+ audits)
    - Opportunities (what to fix + savings)
    ↓
5. seo-intel parses response
    - Extracts score
    - Extracts opportunities
    - Calculates potential savings (ms + bytes)
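
Step 5 might look roughly like the sketch below. It assumes the PSI v5 response carries the Lighthouse report under `lighthouseResult`, with the performance score as a 0-1 fraction and opportunity audits marked by `details.type == "opportunity"`; the sample payload and the `parse_psi` helper are illustrative, not the seo-intel parser.

```python
# Hand-written fragment shaped like a PSI v5 response (values are made up).
sample_response = {
    "lighthouseResult": {
        "categories": {"performance": {"score": 0.95}},
        "audits": {
            "unminified-css": {
                "title": "Minify CSS",
                "details": {"type": "opportunity",
                            "overallSavingsMs": 50,
                            "overallSavingsBytes": 20000},
            },
            "unused-javascript": {
                "title": "Reduce unused JavaScript",
                "details": {"type": "opportunity",
                            "overallSavingsMs": 400,
                            "overallSavingsBytes": 150000},
            },
            # Metric audits carry no opportunity details and are skipped.
            "first-contentful-paint": {
                "title": "First Contentful Paint",
                "numericValue": 2110,
            },
        },
    }
}


def parse_psi(response):
    """Extract the 0-100 score and the opportunities, largest savings first."""
    lighthouse = response["lighthouseResult"]
    score = round(lighthouse["categories"]["performance"]["score"] * 100)
    opportunities = []
    for audit_id, audit in lighthouse["audits"].items():
        details = audit.get("details", {})
        if details.get("type") != "opportunity":
            continue
        opportunities.append({
            "id": audit_id,
            "title": audit["title"],
            "savings_ms": details.get("overallSavingsMs", 0),
            "savings_bytes": details.get("overallSavingsBytes", 0),
        })
    opportunities.sort(key=lambda item: item["savings_ms"], reverse=True)
    return {
        "score": score,
        "opportunities": opportunities,
        "total_savings_ms": sum(o["savings_ms"] for o in opportunities),
        "total_savings_bytes": sum(o["savings_bytes"] for o in opportunities),
    }


result = parse_psi(sample_response)
print(result["score"], result["total_savings_ms"])  # 95 450
```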

Metrics Captured

Official performance score (0-100)

This is what Google reports
Different algorithm than Sitespeed approximation

Opportunities:

"Reduce unused JavaScript" → 400ms savings, 150KB reduction
"Minify CSS" → 50ms, 20KB
"Lazy load offscreen images" → 200ms, 500KB
"Eliminate render-blocking resources" → 300ms
(and ~20 more opportunities)

Same CWV metrics as Sitespeed:

LCP, FCP, CLS, TBT, TTFB

Advantages

✅ Official Lighthouse — What Google actually scores
✅ Opportunities — Specific recommendations on what to fix
✅ Savings estimates — How much you'd save per fix
✅ Comprehensive audit — 100+ checks across performance, UX, SEO
✅ Credibility — "Google says you score 95"

Disadvantages

❌ No HAR — Can't see individual resource timings
❌ Slower — 30-90 seconds per device (depends on Google's load)
❌ Rate-limited — ~25k tests/day without API key
❌ Slower infrastructure — Google's API is slower than local Sitespeed
❌ No resource breakdown — Can't identify which JS file is slow

API Key

Optional. Without it, you get ~25,000 tests/day. With it, you get much higher limits.

Where to set:

.env file: PSI_API_KEY=...
Or environment variable: export PSI_API_KEY=...

Where to get:

Google Cloud Console
Create project
Enable PageSpeed Insights API
Create API key
Set in .env

If not set, seo-intel still works but you might hit rate limits on very high-volume testing.
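
A minimal sketch of how the optional key could be applied when building the request URL. It assumes the `.env` file has already been loaded into the process environment; the `build_psi_url` helper and the `demo-key` value are hypothetical, while `url`, `strategy`, `category`, and `key` are query parameters of the PSI v5 endpoint.

```python
import os
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_psi_url(page_url, strategy="mobile", env=None):
    """Build the PSI request URL, adding the API key only when one is set."""
    env = os.environ if env is None else env
    params = {"url": page_url, "strategy": strategy, "category": "performance"}
    api_key = env.get("PSI_API_KEY")
    if api_key:
        params["key"] = api_key
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


# Passing env explicitly keeps the example independent of the real environment.
without_key = build_psi_url("https://rds.ink/endangered", env={})
with_key = build_psi_url("https://rds.ink/endangered",
                         env={"PSI_API_KEY": "demo-key"})
print(without_key)
print(with_key)
```

Leaving the `key` parameter out entirely when no key is configured matches the behaviour described above: the request still works, just under the lower anonymous limits.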

Score Differences from Sitespeed

PSI score ≠ Sitespeed score because they use different algorithms:

Aspect	Sitespeed Score	PSI Score
Source	Approximated from thresholds	Official Lighthouse
Algorithm	Linear interpolation	Complex weighting
Weightings	Equal (each metric = 1/5)	Weighted (some metrics matter more)
Audits	None	100+ audits
Opportunities	None	Yes (what to fix)
Example	77 (this page)	95 (estimated)

Sitespeed 77 means: directional score, TBT is the killer
PSI 95 means: official Google score, page is good but TBT hurts it slightly

Comparison Table

Feature	Sitespeed	PSI
Real browser	✅ Headless Chrome	✅ Lighthouse (Chrome)
Duration	60s	30-90s
HAR output	✅ Full	❌ Limited
Resource timing	✅ Per-resource	❌ Aggregate only
Official score	❌ Approximated	✅ Real Lighthouse
Opportunities	❌ None	✅ Full audit
Savings estimates	❌ No	✅ Yes (ms + bytes)
CWV metrics	✅ LCP, FCP, CLS, TBT, TTFB (no INP in v40)	✅ LCP, FCP, CLS, TBT, TTFB
Cost	Free (Docker)	Free (25k/day) or API key
Best for	Trend tracking, waterfall analysis	Official benchmarking, what to fix

Which Score Should You Use?

For trend tracking: Use the Sitespeed score (77). It runs locally in a pinned, consistent environment, so you can test weekly and see whether the score improves over time.

For official reporting: Use PSI score (95). It's what Google officially scores you. Client-friendly, credible.

For diagnosing problems: Use individual metrics (TBT=1,807ms). This tells you exactly what's broken. Focus on the worst metric first.

How They Work Together

The dual-engine approach gives you:

Sitespeed finds the bottleneck (TBT=1,807ms is the killer)
Sitespeed HAR shows you the resources causing TBT (JavaScript files)
PSI tells you how to fix it (opportunities: defer JS, lazy-load, etc.)
PSI score tells you the official Google score (95)
Trends show if your fixes actually work (score 77 → 88 → 95)

Docker Details (Sitespeed)

Image Details

Docker Hub: sitespeedio/sitespeed.io:40.4.0
Size: ~1.5 GB
Base: Node.js + Chrome
Updated: May 2026

Why v40.4.0?

Latest stable version (verified 2026-05-13)
Previous versions have bugs or missing metrics
Pinned version ensures reproducible results

How seo-intel Runs It

docker run --rm \
  --shm-size=1g \
  -v /tmp/sitespeed-output:/sitespeed.io \
  sitespeedio/sitespeed.io:40.4.0 \
  https://rds.ink/endangered \
  --mobile --connectivity 4g \
  --n 3 \
  --outputFolder /sitespeed.io/{run_id} \
  --summary --summary-detail

Key flags:

--rm — Delete container after run (clean up)
--shm-size=1g — Allocate 1GB shared memory for Chrome
-v — Mount output directory so we can read the HAR
--n 3 — Run 3 iterations (use median)
--summary — Print summary to stdout
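
The same command can be assembled as an argument list before being handed to `subprocess`. This is a sketch of one possible wrapper, not the seo-intel implementation: `build_docker_command` and the `run-001` identifier are hypothetical, and the flags are copied verbatim from the command above.

```python
import shlex

IMAGE = "sitespeedio/sitespeed.io:40.4.0"


def build_docker_command(url, run_id, host_output="/tmp/sitespeed-output"):
    """Return the docker argv for one mobile run (flags copied from above)."""
    return [
        "docker", "run", "--rm",
        "--shm-size=1g",
        "-v", f"{host_output}:/sitespeed.io",
        IMAGE,
        url,
        "--mobile", "--connectivity", "4g",
        "--n", "3",
        "--outputFolder", f"/sitespeed.io/{run_id}",
        "--summary", "--summary-detail",
    ]


command = build_docker_command("https://rds.ink/endangered", "run-001")
print(shlex.join(command))

# To execute it for real (requires Docker and takes about a minute):
#   import subprocess
#   subprocess.run(command, check=True)
```

Building a list instead of a shell string avoids quoting problems when the URL contains characters such as `&` or `?`. Because of the volume mount, output written to `/sitespeed.io/run-001` inside the container appears under `/tmp/sitespeed-output/run-001` on the host.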
