System Design Case Study

How to push live cricket scores to 30M concurrent users within sub-second latency?

?? Design a live scoring platform: 30M concurrent, sub-second push, ball-by-ball events
Concepts Involved

Problem Statement

Build a live scoring platform where every ball event (runs, wicket, over) must reach 30M concurrent users globally within sub-second latency during World Cup finals, supporting both persistent connections and fallback polling.

Core challenge: 30M users watching the same match. A wicket falls. Every user must see it within 1 second. That's 30M push deliveries for a single event. You can't maintain 30M individual WebSocket connections on one server · you need a broadcast architecture.
30M
concurrent users
World Cup final peak
<1s
delivery latency
event ? all users
~1 event/s
ball frequency
burst: 5-10 events/s
Global
distribution
India, UK, Aus, etc.

Architecture · Broadcast at Scale

Event ingestion ? fan-out tree ? edge push servers ? clients

INGESTION LAYER Stadium Feed Ball-by-ball data ~1 event / 6 sec avg Burst: 5-10 events/sec multiple redundant sources Ingestion Service Normalize event format (JSON schema) Validate + assign sequence number Dedup multiple sources (idempotency key) Latency: ~50ms processing Event Store (Append-Only Log) Immutable event log per match Sequence guaranteed ordering Source of truth for replay/catch-up Clients reconnect ? replay from last seq# DISTRIBUTION LAYER · Regional Edge Push Event Bus (Kafka) 1 event ? all regions simultaneously Broadcast: same content for ALL users Latency: ~10ms to all consumers Edge India · 20 servers · 500K WebSocket conns 10M connections | Primary audience (70% traffic) Mumbai + Delhi PoPs | <5ms push to client Edge UK · 5 servers · 500K WebSocket conns 2.5M connections | London PoP Stateless: client reconnects to any server Edge Aus · 3 servers · 500K WebSocket conns 1.5M connections | Sydney PoP Auto-scale during World Cup peaks CDN Fallback 1s TTL polling endpoint For non-WS clients Absorbs 2.5M poll req/sec Max 1-2s stale CLIENT LAYER · 30M Concurrent Users WebSocket Clients Mobile apps (iOS/Android) Instant delivery (<100ms) Persistent connection, low overhead ~20M users via WebSocket SSE Clients Web browsers (modern) Auto-reconnect built-in HTTP-based, simpler than WS ~5M users via SSE Polling Clients Old browsers, firewalls CDN-cached JSON (1s TTL) Max 1-2s stale, acceptable ~5M users via polling 30M concurrent | <1s delivery | stateless edge servers BROADCAST NOT FAN-OUT: All 30M users get SAME event Unlike chat (personalized per user), sports scores are identical for everyone watching same match Total Latency Breakdown Stadium ? Ingestion (50ms) ? Event Bus (10ms) ? Edge Push (5ms) ? Client = <100ms for WebSocket CDN polling clients: max 1-2s stale | CDN absorbs 2.5M polling req/sec without hitting origin 1 event ? Kafka ? 28 edge servers ? 30M clients in parallel (each server pushes to its 500K connections)
LayerComponentScaleRole
IngestionScore API + manual operators1-10 events/secNormalize ball events from stadium feed, validate, sequence
Fan-outKafka/Redis Pub/Sub1 event ? all edge serversBroadcast single event to all regional push servers
Edge PushWebSocket/SSE servers (per region)~500K connections/server · 60 serversHold persistent connections, push events to connected clients
FallbackCDN-cached polling endpointMillions of req/secFor clients that can't hold WS (old browsers, firewalls)
CDNEdge-cached score JSON1s TTL, global PoPsAbsorb polling traffic, serve stale-but-recent scores
Key insight: This is a broadcast problem, not a fan-out problem. All 30M users receive the SAME event. Unlike chat (personalized per user), sports scores are identical for everyone watching the same match. This means: one event ? multicast to edge servers ? each server pushes to its connected clients. No per-user fan-out needed.
Connection strategy: WebSocket for mobile apps (persistent, low overhead). SSE (Server-Sent Events) for web (simpler, HTTP-based, auto-reconnect). Long-polling as fallback. CDN-cached JSON (1s TTL) for extreme scale · polling clients get scores within 1-2s.
Failure modes: Edge server crash ? clients auto-reconnect to another server (stateless). Event source delay ? show "updating..." indicator. CDN stale ? 1s TTL means max 1s stale. 30M simultaneous reconnect (thundering herd after outage) ? exponential backoff + jitter on client.
Real-world: Cricbuzz · 30M+ during IPL/World Cup. ESPN · CDN-first with WebSocket for premium. Hotstar · 25M concurrent during cricket (similar architecture). BBC Sport · SSE for live text commentary.

Scale Estimation

StepDerivationResultDesign Impact
1WebSocket servers: 30M · 500K conn/server60 edge serversDistributed across regions (India, UK, Aus)
2Events per match: ~300 balls · 3 events/ball~900 events/matchLow event rate · broadcast is the challenge, not ingestion
3Fan-out per event: 1 event ? 60 servers ? 30M clients30M pushes in <1sEach server pushes to its 500K connections in parallel
4Polling fallback: 5M pollers · every 2s2.5M req/sec to CDNCDN absorbs · 1s TTL means max 1s stale
5Bandwidth per client: ~200 bytes/event · 1 event/6s~33 bytes/sec per clientNegligible · WebSocket overhead dominates

Interview Cheat Sheet

The 6 things to say for live broadcast design

1. Broadcast, not fan-out · all users get same event (unlike chat which is personalized)
2. Fan-out tree · 1 event ? Kafka/Redis Pub/Sub ? 60 edge servers ? 500K clients each
3. WebSocket + SSE + CDN polling · tiered delivery for different client capabilities
4. CDN with 1s TTL · absorbs polling traffic, serves near-real-time for non-WS clients
5. Stateless edge servers · client reconnects to any server (no session affinity needed)
6. Exponential backoff + jitter on reconnect · prevents thundering herd after outage