Code Structure Map
Complete file-by-file breakdown of the seo-intel repository.
Directory Layout
/home/help4bis/seo-intel/
├── README.md # Project overview (v1.1.0)
├── pyproject.toml # Python project config (dependencies, build)
├── requirements.txt # Python package list
├── run.sh # Launch script (runs main.py)
├── .env # Secrets: PSI_API_KEY, DB path, etc.
│
├── src/ # Python package
│ ├── __init__.py
│ ├── main.py # FastAPI app entry point
│ ├── config.py # Settings, site list (SITES config)
│ ├── db.py # SQLAlchemy setup, migrations, session factory
│ │
│ ├── models/
│ │ ├── __init__.py
│ │ ├── perf.py # ORM models: PerfRun, PerfAudit, PerfOpportunity, PerfResource
│ │ ├── site.py # Site model (name, domain, priority)
│ │ ├── ranking.py # Ranking snapshot model (SEO keyword rankings)
│ │ └── ... # Other models (not perf-related)
│ │
│ ├── routers/
│ │ ├── __init__.py
│ │ ├── performance.py # GET /performance/, /performance/<site_id>, POST /api/perf/test, /api/perf/sweep
│ │ ├── dashboard.py # GET / (main dashboard)
│ │ ├── keywords.py # Keyword ranking pages
│ │ └── ... # Other routers (not perf-related)
│ │
│ ├── perf/ # Performance testing engines
│ │ ├── __init__.py
│ │ ├── runner.py # Orchestrator: run_full_test() — runs engines × devices
│ │ ├── sitespeed.py # Sitespeed.io Docker wrapper + HAR parser
│ │ ├── psi.py # Google PageSpeed Insights API client
│ │ └── batch.py # Weekly sweep logic
│ │
│ ├── playbook/ # SEO playbook generation (not perf-related)
│ │ ├── __init__.py
│ │ ├── rules.py
│ │ └── llm.py
│ │
│ └── ... # Other modules (keyword analysis, etc.)
│
├── templates/ # Jinja2 HTML templates
│ ├── base.html # Base template (nav, styling)
│ ├── performance.html # Portfolio scorecard
│ ├── performance_site.html # Per-site detail dashboard
│ ├── dashboard.html # Main dashboard
│ └── ... # Other templates
│
├── data/
│ └── seo-intel.db # SQLite database (perf_runs, perf_audits, etc.)
│
├── docs/ # Documentation (this repo)
│
└── ops/ # Operations scripts
├── schema.sql # Database schema
└── ...
Performance System Files (Perf Tier)
src/routers/performance.py
Purpose: FastAPI routes for the performance dashboard
Key functions:
- performance_home(request, db): GET /performance/ → portfolio scorecard
- performance_site(site_id, request, db): GET /performance/<site_id> → per-site detail
- api_perf_test(body, background_tasks, db): POST /api/perf/test → trigger single URL test
- api_perf_sweep(background_tasks): POST /api/perf/sweep → trigger portfolio sweep
- _portfolio_rows(db): SQL, latest scores per site × device
- _site_url_rows(db, site_id): SQL, latest score per URL
- _site_latest_audit(db, site_id, device): SQL, full metrics for the latest run
- _site_trend(db, site_id, weeks): SQL, weekly AVG scores (12 weeks)
- _site_opportunities(db, site_id, device): SQL, top PSI opportunities
- _site_slow_resources(db, site_id): SQL, top 10 slowest resources
Key imports:
from fastapi import APIRouter, BackgroundTasks, Depends
from sqlalchemy import text
from fastapi.templating import Jinja2Templates
from ..perf.runner import run_full_test
from ..perf.batch import run_weekly_perf_sweep
Size: ~545 lines
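The "latest scores per site × device" query can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the query used in performance.py: the function name latest_scores is hypothetical, the tables carry only a subset of the columns listed under src/models/perf.py, and the sample rows are invented.

```python
import sqlite3

# Assumed query shape: pick, for each (site_id, device), the successful run
# with the newest completed_at and join its audit row for the score.
LATEST_SQL = """
SELECT r.site_id, r.device, a.performance_score
FROM perf_runs r
JOIN perf_audits a ON a.perf_run_id = r.id
WHERE r.success = 1
  AND r.completed_at = (
      SELECT MAX(r2.completed_at)
      FROM perf_runs r2
      WHERE r2.site_id = r.site_id
        AND r2.device = r.device
        AND r2.success = 1
  )
ORDER BY r.site_id, r.device
"""


def latest_scores(conn: sqlite3.Connection) -> list:
    """Return (site_id, device, performance_score) for the newest run of each pair."""
    return conn.execute(LATEST_SQL).fetchall()


# Throwaway in-memory database with invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE perf_runs (
    id INTEGER PRIMARY KEY, site_id INTEGER, url TEXT,
    engine TEXT, device TEXT, completed_at TEXT, success INTEGER
);
CREATE TABLE perf_audits (
    id INTEGER PRIMARY KEY, perf_run_id INTEGER, performance_score INTEGER
);
""")
conn.executemany(
    "INSERT INTO perf_runs VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 3, "https://example.com/", "psi", "mobile", "2024-01-01T10:00:00", 1),
        (2, 3, "https://example.com/", "psi", "mobile", "2024-01-08T10:00:00", 1),
        (3, 3, "https://example.com/", "psi", "desktop", "2024-01-08T10:05:00", 1),
    ],
)
conn.executemany(
    "INSERT INTO perf_audits VALUES (?, ?, ?)",
    [(1, 1, 70), (2, 2, 77), (3, 3, 93)],
)
rows = latest_scores(conn)
```

ISO-8601 timestamps stored as text sort chronologically, which is why MAX(completed_at) works here without a date conversion.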
src/perf/runner.py
Purpose: Orchestrates test runs across engines and devices
Key functions:
- run_full_test(site_id, url, db, engines, devices): main orchestrator
  - Loops: for engine in engines: for device in devices:
  - Calls the appropriate engine (sitespeed or psi)
  - Persists each result via _persist_run()
  - Returns a summary dict
- _persist_run(db, site_id, url, engine, result): writes one test result to the database
  - Inserts: perf_runs (1), perf_audits (1), perf_opportunities (0+), perf_resources (0+)
  - Commits the transaction
Key imports:
from sqlalchemy.orm import Session
from ..models.perf import PerfRun, PerfAudit, PerfOpportunity, PerfResource
from .sitespeed import run_sitespeed_test
from .psi import run_psi_test
Size: ~200 lines
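The engines × devices loop described above might look roughly like the following. It is a sketch under stated assumptions: the engine functions and the persistence callback are injected as plain callables so the example runs without Docker or a database, the name run_full_test_sketch and the summary keys are hypothetical, and turning an exception into a failed result is an assumed policy rather than confirmed behaviour of runner.py.

```python
from typing import Callable, Dict, List

# An engine takes (url, device) and returns a result dict with a "success" key.
EngineFn = Callable[[str, str], dict]


def run_full_test_sketch(
    url: str,
    engines: Dict[str, EngineFn],
    devices: List[str],
    persist: Callable[[str, str, dict], None],
) -> dict:
    """Run every engine against every device and persist each result."""
    summary = {"url": url, "runs": 0, "failed": 0}
    for engine_name, engine_fn in engines.items():
        for device in devices:
            try:
                result = engine_fn(url, device)
            except Exception as exc:  # assumed policy: record the failure, keep going
                result = {"success": False, "error_message": str(exc)}
            persist(engine_name, device, result)
            summary["runs"] += 1
            if not result.get("success"):
                summary["failed"] += 1
    return summary


# Usage with stand-in engines (no network or Docker involved).
def fake_psi(url: str, device: str) -> dict:
    return {"success": True, "performance_score": 77}


def fake_sitespeed(url: str, device: str) -> dict:
    raise RuntimeError("docker not available")


saved = []
summary = run_full_test_sketch(
    "https://example.com/",
    {"sitespeed": fake_sitespeed, "psi": fake_psi},
    ["mobile", "desktop"],
    lambda engine, device, result: saved.append((engine, device, result["success"])),
)
```

Persisting failed runs as well as successful ones matches the PerfRun fields success and error_message, which suggest that failures are recorded rather than dropped.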
src/perf/sitespeed.py
Purpose: Wraps sitespeed.io Docker container, parses HAR output
Key functions:
- run_sitespeed_test(url, device): executes sitespeed in Docker
  - Builds the Docker command with device-specific args (--mobile vs desktop UA)
  - Runs docker run sitespeedio/sitespeed.io:40.4.0 {url} --n 3 ...
  - Waits for output (60s)
  - Calls _parse_har() to extract metrics
  - Calls _approx_score() to calculate the performance score
  - Returns: success, performance_score, metrics, resources
- _parse_har(har_path): parses /tmp/sitespeed-output/{run_id}/.../browsertime.har
  - Extracts _googleWebVitals from pages[] (LCP, FCP, CLS, TTFB)
  - Extracts _cpu.longTasks.totalBlockingTime from pages[] (TBT)
  - Sums resource sizes by type (image, script, stylesheet, font)
  - Returns: metrics dict, resources list
- _approx_score(lcp_ms, fcp_ms, cls, tbt_ms, ttfb_ms): calculates the 0-100 score
  - Uses _THRESHOLDS (lines 53–60)
  - Linear interpolation between good/poor for each metric
  - Returns: int(mean(all_metric_scores))
- _guess_resource_type(url, content_type): classifies a resource (script, image, etc.)
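The "sums resource sizes by type" step can be illustrated with the standard HAR 1.2 fields log.entries[].request.url, response.content.mimeType and response.content.size. The classification rules and the function names below are assumptions for illustration; the actual _guess_resource_type() may use different rules.

```python
from collections import defaultdict


def guess_resource_type(url: str, content_type: str) -> str:
    """Classify a resource from its MIME type, falling back to the file extension."""
    ct = (content_type or "").lower()
    path = url.split("?", 1)[0].lower()  # ignore the query string
    if ct.startswith("image/"):
        return "image"
    if "javascript" in ct or path.endswith(".js"):
        return "script"
    if ct.startswith("text/css") or path.endswith(".css"):
        return "stylesheet"
    if ct.startswith("font/") or path.endswith((".woff", ".woff2", ".ttf")):
        return "font"
    return "other"


def sum_bytes_by_type(har: dict) -> dict:
    """Total the response content size per resource type across all HAR entries."""
    totals = defaultdict(int)
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        size = content.get("size", 0)
        if size < 0:  # defensive: treat a negative size as unknown
            size = 0
        url = entry.get("request", {}).get("url", "")
        totals[guess_resource_type(url, content.get("mimeType", ""))] += size
    return dict(totals)
```

A HAR file on disk would first be loaded with json.load() before being passed to sum_bytes_by_type().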
Key constants:
- SITESPEED_IMAGE = "sitespeedio/sitespeed.io:40.4.0" (pinned version)
- OUTPUT_BASE = Path("/tmp/sitespeed-output") (Docker output mount point)
- _THRESHOLDS dict (lines 53–60): (good, poor) for LCP, FCP, CLS, TBT, TTFB
Size: ~450 lines
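The scoring approach (linear interpolation between a good and a poor threshold per metric, then the integer mean) can be sketched as follows. The threshold values are assumptions taken from the published Core Web Vitals and Lighthouse boundaries, not copied from _THRESHOLDS. Mapping "good" to 100 and "poor" to 0, and skipping metrics that are None, are also assumptions.

```python
from statistics import mean
from typing import Optional

# Assumed (good, poor) boundaries; the real _THRESHOLDS values may differ.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "fcp_ms": (1800, 3000),
    "cls": (0.1, 0.25),
    "tbt_ms": (200, 600),
    "ttfb_ms": (800, 1800),
}


def metric_score(value: float, good: float, poor: float) -> float:
    """100 at or below `good`, 0 at or above `poor`, linear in between."""
    if value <= good:
        return 100.0
    if value >= poor:
        return 0.0
    return 100.0 * (poor - value) / (poor - good)


def approx_score(**metrics: Optional[float]) -> Optional[int]:
    """Average the per-metric scores; metrics passed as None are ignored."""
    scores = [
        metric_score(value, *THRESHOLDS[name])
        for name, value in metrics.items()
        if value is not None
    ]
    return int(mean(scores)) if scores else None


# An LCP halfway between good and poor scores 50; the other four metrics score 100.
example = approx_score(lcp_ms=3250, fcp_ms=1200, cls=0.05, tbt_ms=100, ttfb_ms=400)
```

Because every metric carries equal weight in this scheme, the result is only an approximation of a Lighthouse score, which weights metrics differently and uses log-normal curves rather than straight lines.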
src/perf/psi.py
Purpose: Calls Google PageSpeed Insights API, parses Lighthouse results
Key functions:
- run_psi_test(url, device): calls the PageSpeed Insights API
  - GET https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=...&strategy={device}
  - Parses response.lighthouseResult
  - Calls _parse_lighthouse_audits() (shared with sitespeed)
  - Returns: success, performance_score (official), metrics, opportunities
- _parse_lighthouse_audits(audits): extracts metrics and opportunities from Lighthouse JSON
  - Maps audit keys (largest-contentful-paint, etc.) to metric values
  - Extracts opportunities (audit.details.type == "opportunity")
  - Calculates savings_ms and savings_bytes for each opportunity
  - Returns: metrics dict, opportunities list
Key constants:
- PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
- PSI_TIMEOUT = 90 (Google's API can be slow)
Size: ~150 lines
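Extracting metrics and opportunities from a lighthouseResult object can be sketched as below. It relies on fields that appear in Lighthouse JSON (audits[...].numericValue, audits[...].title, details.type, details.overallSavingsMs, details.overallSavingsBytes, categories.performance.score). The function name parse_lighthouse, the metric field names, and the choice of server-response-time as the TTFB source are assumptions for illustration.

```python
# Lighthouse audit id → metric field name (field names follow PerfAudit).
METRIC_AUDITS = {
    "largest-contentful-paint": "lcp_ms",
    "first-contentful-paint": "fcp_ms",
    "cumulative-layout-shift": "cls",
    "total-blocking-time": "tbt_ms",
    "server-response-time": "ttfb_ms",  # assumed source for TTFB
}


def parse_lighthouse(lighthouse_result: dict):
    """Return (score_0_100, metrics dict, opportunities sorted by savings_ms)."""
    audits = lighthouse_result.get("audits", {})

    metrics = {
        field: audits[key]["numericValue"]
        for key, field in METRIC_AUDITS.items()
        if key in audits and "numericValue" in audits[key]
    }

    # Lighthouse reports the category score on a 0-1 scale.
    raw = lighthouse_result.get("categories", {}).get("performance", {}).get("score")
    score = round(raw * 100) if raw is not None else None

    opportunities = []
    for key, audit in audits.items():
        details = audit.get("details") or {}
        if details.get("type") != "opportunity":
            continue
        opportunities.append({
            "opportunity_key": key,
            "display_label": audit.get("title", key),
            "savings_ms": details.get("overallSavingsMs", 0),
            "savings_bytes": details.get("overallSavingsBytes", 0),
        })
    opportunities.sort(key=lambda o: o["savings_ms"], reverse=True)
    return score, metrics, opportunities
```

Sorting by savings_ms puts the largest estimated time savings first, which suits a "top opportunities" panel.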
src/perf/batch.py
Purpose: Weekly portfolio performance sweep
Key functions:
- run_weekly_perf_sweep(db): main sweep orchestrator
  - Loops: for each site in SITES:
  - Calls resolve_url_list() to get the top 6 URLs
  - For each URL: calls run_full_test() (sitespeed + psi, mobile + desktop)
  - Logs a completion summary
- resolve_url_list(db, domain): gets the URLs for a site
  - Always: homepage
  - Plus: top 5 URLs from ranking_snapshots (last 30 days, sorted by impressions)
  - Returns: a list of at most 6 URLs
- _get_top_urls(db, site_id, limit): queries ranking_snapshots for impressions
Size: ~150 lines
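The URL selection rule (homepage first, then the top-ranked URLs, capped at six) can be sketched as a pure function. The name resolve_url_list_sketch, the https:// homepage form, and the de-duplication of a homepage that also appears in the ranking data are assumptions; the database lookup is replaced by a plain list argument.

```python
from typing import List


def resolve_url_list_sketch(domain: str, top_urls: List[str], limit: int = 6) -> List[str]:
    """Homepage first, then ranked URLs in order, without duplicates, up to `limit`."""
    urls = [f"https://{domain}/"]  # assumed homepage form
    for url in top_urls:
        if len(urls) >= limit:
            break
        if url not in urls:  # assumed: skip the homepage if it is also ranked
            urls.append(url)
    return urls


# Usage with invented ranking data, already sorted by impressions.
ranked = [
    "https://example.com/",
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
    "https://example.com/e",
    "https://example.com/f",
]
selected = resolve_url_list_sketch("example.com", ranked)
```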
src/models/perf.py
Purpose: SQLAlchemy ORM models for performance data
Models:
- PerfRun: test execution record
  - Fields: id, site_id, url, engine, device, started_at, completed_at, success, error_message
  - Relations: audits (1-to-many), opportunities (1-to-many), resources (1-to-many)
- PerfAudit: Core Web Vitals metrics for one run
  - Fields: id, perf_run_id, performance_score, lcp_ms, cls, inp_ms, tbt_ms, fcp_ms, ttfb_ms, total_byte_weight, image_bytes, js_bytes, css_bytes, font_bytes, requests_count, dom_size
  - Relations: run (many-to-1)
- PerfOpportunity: Lighthouse audit opportunity
  - Fields: id, perf_run_id, opportunity_key, display_label, savings_ms, savings_bytes, details_json
  - Relations: run (many-to-1)
- PerfResource: HAR resource entry
  - Fields: id, perf_run_id, resource_url, resource_type, size_bytes, transfer_size_bytes, start_time_ms, end_time_ms, is_render_blocking
  - Relations: run (many-to-1)
Size: ~100 lines
Templates
templates/performance.html
Purpose: Portfolio performance scorecard
Features:
- Table of all sites (13 rows)
- Columns: domain, score_mobile, score_desktop, lcp_ms, cls, slowest_url, last_tested
- Colour-coded scores (green ≥90, amber ≥50, red <50)
- "Run portfolio sweep now" button (HTMX POST to /api/perf/sweep)
- Sweep status display (idle | running | ok | error)
Size: ~200 lines
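The colour bands used by the scorecard (green ≥90, amber ≥50, red <50) reduce to a small helper. The function name score_colour and the "grey" fallback for sites without a score are assumptions; the template may implement the same rule directly in Jinja2.

```python
from typing import Optional


def score_colour(score: Optional[int]) -> str:
    """Map a 0-100 performance score to the scorecard colour band."""
    if score is None:
        return "grey"  # assumed fallback for sites that have not been tested
    if score >= 90:
        return "green"
    if score >= 50:
        return "amber"
    return "red"


# The rds.ink case-study score of 77 falls in the amber band.
band = score_colour(77)
```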
templates/performance_site.html
Purpose: Per-site performance detail dashboard
Features:
- Latest CWV metrics (mobile + desktop side-by-side)
- 12-week trend sparkline chart (mobile + desktop bars per week)
- Top 5 optimisation opportunities (PSI)
- Top 10 slowest resources (sitespeed HAR)
- Per-URL breakdown table with test buttons
- Columns: URL, score, LCP, CLS, requests, tested_at, test_now_buttons
- Test buttons: Both (mobile+desktop), Mob, Dsk
Interactive elements:
- HTMX buttons that queue tests
- Coloured metric badges (green/amber/red)
- Tooltips for long URLs
Size: ~390 lines
Supporting Files
src/config.py
What it contains:
- Settings class (Pydantic)
- SITES: list of 13 sites to monitor
  - Each site: domain, priority (sorting order)
Size: ~50 lines
src/db.py
What it contains:
- SQLAlchemy engine + session factory
- Base (declarative base for all models)
- Database URI from .env
- Migration logic (auto-create tables on startup)
Size: ~60 lines
requirements.txt
Key dependencies for performance testing:
- fastapi, uvicorn (web framework)
- sqlalchemy (ORM)
- httpx (for PSI API calls)
- docker (for sitespeed execution)
- jinja2 (templates)
File Interaction Map
FastAPI Request
↓
performance.py (routers)
↓
[Query] perf_audits table via SQL
├─→ db.py (SQLAlchemy session)
│
[Create] templates (Jinja2)
├─→ performance_site.html
└─→ performance.html
[Background Task] api_perf_test()
↓
runner.py:run_full_test()
├─ For each engine:
│ ├─ sitespeed.py:run_sitespeed_test() → Docker
│ │ ├─ subprocess.run("docker run sitespeedio/...")
│ │ ├─ _parse_har(browsertime.har)
│ │ └─ _approx_score(metrics) → 0-100
│ │
│ └─ psi.py:run_psi_test() → Google API
│ ├─ httpx.get(googleapis.com/...)
│ ├─ _parse_lighthouse_audits(audits)
│ └─ opportunities + official_score
│
├─ runner.py:_persist_run() for each result
│ ├─ INSERT perf_runs
│ ├─ INSERT perf_audits
│ ├─ INSERT perf_opportunities
│ └─ INSERT perf_resources
│
└─ models/perf.py (ORM objects)
└─ db.py (commit to SQLAlchemy)
Deployment
All files live in /home/help4bis/seo-intel/ on george (192.168.0.117).
To start the service:
cd /home/help4bis/seo-intel
./run.sh
# or
uvicorn src.main:app --host 0.0.0.0 --port 8765 --reload
To run a performance test manually:
cd /home/help4bis/seo-intel
python -c "
from src.perf.runner import run_full_test
from src.db import SessionLocal

db = SessionLocal()
try:
    result = run_full_test(
        site_id=3,
        url='https://rds.ink/endangered',
        db=db,
        engines=['sitespeed', 'psi'],
        devices=['mobile', 'desktop'],
    )
    print(result)
finally:
    db.close()  # release the session even if the test run raises
"
See also:
- Database Schema — All tables and fields
- API Endpoints — HTTP routes and payloads