System Design Case Study

How does a URL shortener generate globally unique short codes at scale?

?? Design a URL shortener: 1000+ URLs/sec, 7-char codes, no collisions, 100K redirects/sec
Concepts Involved

Problem Statement

How does a URL shortener generate globally unique short codes at 1000+ URLs/sec, ensuring no collisions across distributed nodes while keeping codes short (7 chars) and supporting custom aliases?

Core challenge: 7 chars (base62) = 627 = 3.5 trillion possible codes. But generating them without collisions across multiple servers · without a central coordinator bottleneck · requires careful ID generation strategy.
1000+
URLs shortened / sec
100K+
redirects / sec
7 chars
base62 (3.5T codes)
<10ms
redirect P99

Architecture

WRITE PATH · POST /shorten (1000 URLs/sec) Client POST long_url + optional alias API Server Validate URL format Check custom alias unique Safe Browsing API check ID Generator Pre-allocated counter ranges base62(counter) ? 7 chars Collision-free by construction DB Write code ? long_url DynamoDB/Cassandra Highly available Cache Warm Redis SET pre-warm hot path ZooKeeper: allocate counter ranges node1: 1-1M, node2: 1M-2M, node3: 2M-3M READ PATH · GET /abc123 ? 301 Redirect (100K req/sec) Browser GET /abc123 100K req/sec Redis Cache GET code ? long_url 99% hit rate | <1ms 170GB for top 20% URLs (Pareto) hit (99%) miss (1%) DB lookup ? cache fill 301/302 Redirect Location: long_url <10ms total P99 301=permanent | 302=track clicks async Kafka (click event) User lands on destination Browser follows redirect Total: <10ms redirect latency ANALYTICS LAYER · Real-time Click Stats Kafka Click event stream code, timestamp, IP user-agent, referrer 100K events/sec ingestion Flink / Spark Stream processing Aggregate by geo, referrer device type, time window 1-min tumbling windows Analytics DB ClickHouse / Druid Columnar storage Fast aggregation queries Billions of click records Dashboard Real-time stats Clicks per link Geo heatmap Top referrers KEY DESIGN DECISIONS Write:Read = 1:100 ? optimize for reads | Cache is the critical path, DB is fallback only base62(counter) = collision-free by construction | 3.5 trillion possible 7-char codes (won't exhaust for centuries) 301 (permanent, browser caches ? fewer server hits) vs 302 (temporary, track every click for analytics) 1000 URLs/sec write | 100K redirects/sec read | 7 chars base62 | <10ms redirect
ID generation: Pre-allocate counter ranges from a central sequence (ZooKeeper/DB). Each node gets a range (e.g., 1M IDs). Encode counter as base62 ? 7 chars. No coordination during normal operation · only when range exhausted.
Redirect path: GET /abc1234 ? Redis lookup (1ms) ? if miss, DB lookup ? 301 redirect. 301 (permanent) for SEO, 302 (temporary) if you need analytics on every click.
Anti-patterns: Random hash + check-for-collision · race conditions under load. Auto-increment single DB · bottleneck. No cache for redirects · DB melts at 100K req/sec.

Scale Estimation

StepDerivationResultDesign Impact
1New URLs: 1000/sec · 86400~86M URLs/dayWrite-light · DB handles easily
2Redirects: 100:1 read/write ratio100K redirects/secRead-heavy · must cache aggressively
3Storage: 86M/day · 365 · 5 years · 500 bytes~7.8 TB totalFits in single sharded DB cluster
4Cache: top 20% URLs = 80% traffic (Pareto)~170 GB RedisHot URLs cached, 99% hit rate
5ID space: 7 chars base62 = 6273.5 trillion codesWon't exhaust for centuries at 86M/day
6Bandwidth: 100K req/sec · 500 bytes response~50 MB/sec outboundTrivial for modern infrastructure

Resilience & Edge Cases

FailureImpactRecovery
Cache miss storm (viral link)1M req/sec to DB for same keyRequest coalescing: only 1 DB query, others wait for result. Or: cache-aside with SETNX lock.
Counter range exhaustedNode can't generate new IDsPre-fetch next range before current exhausts (watermark at 80%). Fallback: request new range from coordinator.
Custom alias collisionTwo users want same aliasUNIQUE constraint in DB. First writer wins. Return 409 Conflict to second.
Malicious URLsPhishing/malware behind short linkScan URLs against Google Safe Browsing API on creation. Flag + interstitial warning page.
Link expiryExpired link still cachedTTL on cache entries. Check expiry on redirect. Return 410 Gone for expired links.

Interview Cheat Sheet

1. Base62 encoding of counter · 7 chars = 3.5T unique codes, collision-free
2. Pre-allocated counter ranges · no coordination during normal operation
3. Read-heavy (100:1) · cache redirects aggressively (Redis, 99% hit rate)
4. 301 vs 302 · permanent (SEO, browser caches) vs temporary (track every click)
5. Custom aliases · check uniqueness, store separately, higher priority in lookup
6. Analytics async · log click event to Kafka, don't block redirect path