⚙️ Demo Controls
80% cached IDs + 20% fresh DB hits — reveals the bimodal distribution
How many requests run in parallel. 1 = sequential (requests are processed one at a time, so cache and DB times separate cleanly). 5–20 = realistic production load: the Cloudflare Worker handles requests simultaneously, Redis and PostgreSQL serve different requests at the same time, and latencies can overlap. Higher concurrency shows whether your stack degrades gracefully under pressure or collapses into a single slow blob on the histogram.
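The concurrency setting can be pictured as a cap on in-flight requests. The sketch below is only an illustration, not the demo's actual code: `fake_request`, `run_batch`, and `time_scale` are hypothetical names, the request is simulated with a sleep instead of a real Worker/Redis/PostgreSQL call, and the ~150 ms / ~450 ms figures and the 80/20 mix are taken from this page.

```python
import asyncio
import random


async def fake_request(rng, time_scale=0.01):
    """Hypothetical stand-in for one request (no real network or DB call).

    80% of calls behave like a ~150 ms cache hit, 20% like a ~450 ms DB hit.
    The sleep is shortened by time_scale so the sketch finishes quickly.
    """
    is_cache_hit = rng.random() < 0.8
    latency_ms = 150 if is_cache_hit else 450
    await asyncio.sleep(latency_ms / 1000 * time_scale)
    return is_cache_hit, latency_ms


async def run_batch(total, concurrency, seed=0):
    """Run `total` simulated requests with at most `concurrency` in flight."""
    rng = random.Random(seed)
    gate = asyncio.Semaphore(concurrency)  # caps the number of parallel requests
    in_flight = 0
    peak_in_flight = 0

    async def worker():
        nonlocal in_flight, peak_in_flight
        async with gate:
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
            try:
                return await fake_request(rng)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker() for _ in range(total)))
    return results, peak_in_flight


results, peak = asyncio.run(run_batch(total=50, concurrency=5))
hits = sum(1 for is_hit, _ in results if is_hit)
print(f"{len(results)} requests, {hits} cache hits, peak in flight: {peak}")
```

With `concurrency=1` the same function processes requests strictly one after another, which corresponds to the sequential setting described above.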
📊 Live Latency Distribution
📈 Stats
Mean = — — nobody actually experiences this latency.
Real peaks: —
💡 Analysis
📋 Run History
| Run | Mode | Count | Cache % | Mean | P50 | P99 | Shape |
|---|---|---|---|---|---|---|---|
🤔 Why Your Average Lies
If cache hits take ~150 ms and DB hits take ~450 ms at an 80/20 split, the mean is 0.8 × 150 + 0.2 × 450 = 210 ms.
80% of users get ~150ms, 20% get ~450ms. The mean is a phantom — it exists in the gap between the two real experiences.
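To make the phantom concrete, here is a minimal simulation sketch. It assumes each mode is roughly Gaussian with 15 ms of jitter, which is an illustrative choice and not something measured by this demo. The names `expected_mean` and `simulate` are hypothetical. The script computes the weighted mean and then counts how many simulated requests actually land near it.

```python
import random
import statistics

CACHE_MS = 150     # approximate cache-hit latency from the text
DB_MS = 450        # approximate DB-hit latency from the text
CACHE_RATIO = 0.8  # 80/20 split from the text


def expected_mean(cache_ms, db_ms, cache_ratio):
    """Weighted mean of a two-mode latency mix."""
    return cache_ratio * cache_ms + (1 - cache_ratio) * db_ms


def simulate(n, seed=0, jitter_ms=15.0):
    """Draw n latencies; jitter_ms is an assumed spread, not a measured one."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        center = CACHE_MS if rng.random() < CACHE_RATIO else DB_MS
        samples.append(rng.gauss(center, jitter_ms))
    return samples


samples = simulate(10_000)
mean = statistics.fmean(samples)
# Share of requests whose latency falls within 30 ms of the mean.
near_mean = sum(1 for s in samples if abs(s - mean) <= 30) / len(samples)

print(f"expected mean:  {expected_mean(CACHE_MS, DB_MS, CACHE_RATIO):.0f} ms")
print(f"simulated mean: {mean:.0f} ms")
print(f"requests within 30 ms of the mean: {near_mean:.1%}")
```

Under these assumptions only a small fraction of requests fall anywhere near the mean, because almost all of them cluster around one of the two modes.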
🔮 What to Monitor Instead
- ✓ P50: what a typical user experiences
- ✓ P95 / P99: what your worst-served users experience
- ✓ Cache hit rate: leading indicator; drops before latency spikes
- ✓ Histogram shape: bimodal = hidden slow tail; unimodal = healthy
- ✕ Mean: collapses the distribution into a single misleading number
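The metrics in this list can be computed from raw samples in a few lines. The sketch below is one possible approach, not the demo's own implementation: `percentile`, `summarize`, and `histogram_peaks` are hypothetical helpers, the percentile uses the nearest-rank method, and the 50 ms bucket width and 5% minimum share for a peak are arbitrary assumptions you would tune for your own data.

```python
import math
import statistics
from collections import Counter


def percentile(sorted_samples, p):
    """Nearest-rank percentile (p in 0..100) of an already sorted list."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p / 100 * len(sorted_samples)))
    return sorted_samples[rank - 1]


def summarize(latencies_ms, cache_hits):
    """Return the percentiles, cache hit rate, and (for contrast) the mean."""
    ordered = sorted(latencies_ms)
    return {
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "cache_hit_rate": sum(cache_hits) / len(cache_hits),
        "mean": statistics.fmean(ordered),
    }


def histogram_peaks(latencies_ms, bucket_ms=50, min_share=0.05):
    """Return the start (in ms) of each bucket that is a local maximum.

    A bucket counts as a peak if it holds at least min_share of all samples
    and is not smaller than its left neighbour and larger than its right one.
    Two or more peaks suggest a bimodal (or multimodal) distribution.
    """
    counts = Counter(int(v // bucket_ms) for v in latencies_ms)
    total = len(latencies_ms)
    peaks = []
    for bucket in sorted(counts):
        count = counts[bucket]
        if count < min_share * total:
            continue
        if count >= counts.get(bucket - 1, 0) and count > counts.get(bucket + 1, 0):
            peaks.append(bucket * bucket_ms)
    return peaks


# Toy input mirroring the 80/20 example: eight cache hits, two DB hits.
latencies = [150] * 8 + [450] * 2
hits = [True] * 8 + [False] * 2
print(summarize(latencies, hits))
print("peaks at:", histogram_peaks(latencies))
```

On this toy input P50 is 150 ms, P95 and P99 are 450 ms, and the peak detector reports two peaks (150 and 450), while the mean of 210 ms matches neither.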