How does Meta classify 500M+ posts/day for policy violations?

Design a content moderation system: 500M posts/day, multi-modal ML, <1% false positive rate, human review routing

Concepts Involved

Kafka, Stream Processing, Message Queues, Caching, Load Balancer

Problem Statement

How does a content moderation platform classify 500M+ posts/day across text, images, and video for policy violations while keeping false positives below 1% and routing borderline cases to human reviewers within minutes?

Core challenge: Multi-modal content (text + image + video) requires different ML models working together. The system must balance aggressive detection (catch harmful content fast) against low false positives (don't silence legitimate speech). Borderline cases need human judgment in minutes, not hours.
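A quick back-of-envelope calculation helps put these targets in perspective. The sketch below converts the daily volume into a per-second rate and shows what a 1% error budget would mean in absolute terms. The peak factor of 3 is a hypothetical assumption, not a figure from the case study, and treating the 1% as a share of all posts is only one reading of the target.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

posts_per_day = 500_000_000

# Average sustained classification rate.
avg_posts_per_sec = posts_per_day / SECONDS_PER_DAY

# Hypothetical assumption: peak traffic is about 3x the daily average.
PEAK_FACTOR = 3
peak_posts_per_sec = avg_posts_per_sec * PEAK_FACTOR

# Upper bound on wrongly flagged posts per day if 1% of ALL posts
# were false positives (integer arithmetic avoids float rounding).
max_false_flags_per_day = posts_per_day // 100

print(f"average: {avg_posts_per_sec:,.0f} posts/sec")
print(f"peak (assumed 3x): {peak_posts_per_sec:,.0f} posts/sec")
print(f"1% of volume: {max_false_flags_per_day:,} posts/day")
```

Even at the average rate the classifiers must handle several thousand posts every second, and a 1% error budget still corresponds to millions of posts per day, which is why the human review path and the appeal process matter.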

500M+ posts classified / day

<1% false positive rate

Minutes to human review

Multi-modal text + image + video ML

Architecture

Multi-modal ML pipeline: Text goes through NLP transformers, images through CNN + CLIP embeddings, and video through keyframe extraction + per-frame classification. Ensemble scoring combines the modality signals, so a benign caption on a harmful image still gets flagged.
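One way to combine the per-modality scores is sketched below. The case study does not say how the ensemble is computed, so this uses a noisy-OR rule as an illustrative assumption: the post is considered benign only if every modality independently looks benign. `ModalityScores`, `video_score_from_frames`, and `ensemble_score` are hypothetical names, and the scores are assumed to be violation probabilities in [0, 1].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModalityScores:
    """Per-modality violation probabilities; None means the modality is absent."""
    text: Optional[float] = None
    image: Optional[float] = None
    video: Optional[float] = None


def video_score_from_frames(frame_scores: List[float]) -> float:
    """Assumed aggregation: a video is as risky as its worst keyframe."""
    return max(frame_scores) if frame_scores else 0.0


def ensemble_score(scores: ModalityScores) -> float:
    """Noisy-OR combination: 1 - prod(1 - p_i) over the modalities present."""
    present = [s for s in (scores.text, scores.image, scores.video) if s is not None]
    benign_prob = 1.0
    for p in present:
        benign_prob *= 1.0 - p
    return 1.0 - benign_prob


# A benign caption (0.02) attached to a harmful image (0.97).
post = ModalityScores(text=0.02, image=0.97)
combined = ensemble_score(post)
averaged = (0.02 + 0.97) / 2
print(f"noisy-OR: {combined:.4f}, plain average: {averaged:.4f}")
```

A plain average would pull this post down to about 0.5 because of the harmless caption, while the noisy-OR score stays above the image score. A learned fusion model is another option; the point here is only that the combination should not let one benign modality mask a harmful one.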

Threshold-based routing: High confidence (>0.95) → auto-remove. Medium (0.5-0.95) → human review queue, prioritized by severity. Low (<0.5) → allow but monitor. Child safety content has the lowest threshold: err on the side of removal.
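The routing rules above can be sketched as a small function plus a severity-ordered review queue. The default thresholds (0.5 and 0.95) come from the text; the lower child-safety thresholds and the severity ranking are hypothetical numbers chosen only for illustration, and `route` and `ReviewQueue` are assumed names.

```python
import heapq
import itertools
from enum import Enum


class Action(Enum):
    AUTO_REMOVE = "auto_remove"
    HUMAN_REVIEW = "human_review"
    ALLOW_AND_MONITOR = "allow_and_monitor"


# (review_threshold, remove_threshold). Defaults follow the text above.
DEFAULT_THRESHOLDS = (0.5, 0.95)
# Hypothetical per-category override: child safety errs toward removal.
CATEGORY_THRESHOLDS = {"child_safety": (0.2, 0.7)}
# Hypothetical severity ranking (lower rank is reviewed first).
SEVERITY_RANK = {"child_safety": 0, "hate_speech": 1, "spam": 2}


def route(category: str, score: float) -> Action:
    review_at, remove_at = CATEGORY_THRESHOLDS.get(category, DEFAULT_THRESHOLDS)
    if score > remove_at:
        return Action.AUTO_REMOVE
    if score >= review_at:
        return Action.HUMAN_REVIEW
    return Action.ALLOW_AND_MONITOR


class ReviewQueue:
    """Priority queue ordered by severity, then by score (highest first)."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps insertion order

    def push(self, post_id: str, category: str, score: float) -> None:
        rank = SEVERITY_RANK.get(category, len(SEVERITY_RANK))
        heapq.heappush(self._heap, (rank, -score, next(self._seq), post_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[3]


queue = ReviewQueue()
for post_id, category, score in [("p1", "spam", 0.80), ("p2", "child_safety", 0.40),
                                 ("p3", "hate_speech", 0.60)]:
    if route(category, score) is Action.HUMAN_REVIEW:
        queue.push(post_id, category, score)
first_reviewed = queue.pop()  # the child-safety post jumps the queue
```

Keeping the thresholds in a per-category table rather than in code makes it possible to tune one category without redeploying the router.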

Anti-patterns: Single threshold for all categories (hate speech, spam, and nudity call for different thresholds). No human-in-the-loop (ML alone can't handle context or satire). Batch retraining only (adversarial content evolves daily).

Feedback loop: Human review decisions feed back into model training. Track false positive/negative rates per category. A/B test new models on shadow traffic before promotion. Use active learning: prioritize uncertain samples for labeling.
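Two pieces of this loop are easy to illustrate: computing a per-category false positive rate from reviewer decisions, and picking the most uncertain samples for labeling. The sketch below is a minimal version under stated assumptions: the tuple layout of a review record is hypothetical, and "uncertain" is taken to mean a score close to 0.5, which is one common form of uncertainty sampling rather than anything the case study specifies.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (category, model_flagged, human_says_violation) -- hypothetical record layout
Review = Tuple[str, bool, bool]


def false_positive_rate_by_category(reviews: Iterable[Review]) -> Dict[str, float]:
    """FP rate = benign posts the model flagged / all benign posts, per category."""
    flagged_benign = defaultdict(int)
    benign = defaultdict(int)
    for category, model_flagged, is_violation in reviews:
        if not is_violation:
            benign[category] += 1
            if model_flagged:
                flagged_benign[category] += 1
    # Categories with no benign examples are omitted (rate is undefined).
    return {c: flagged_benign[c] / n for c, n in benign.items()}


def select_for_labeling(scored: List[Tuple[str, float]], budget: int) -> List[str]:
    """Uncertainty sampling: choose the posts whose score is closest to 0.5."""
    ranked = sorted(scored, key=lambda item: abs(item[1] - 0.5))
    return [post_id for post_id, _ in ranked[:budget]]


reviews = [
    ("spam", True, False),   # flagged, reviewer says benign: false positive
    ("spam", False, False),  # not flagged, benign: true negative
    ("spam", True, True),    # flagged, real violation: true positive
    ("hate_speech", True, True),
]
fp_rates = false_positive_rate_by_category(reviews)
to_label = select_for_labeling([("a", 0.90), ("b", 0.52), ("c", 0.10), ("d", 0.45)], budget=2)
```

Note that reviewers mostly see flagged or borderline content, so rates computed only from the review queue are biased; an unbiased estimate would need a random sample of allowed posts as well.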

Interview Cheat Sheet

1. Multi-modal classifiers: separate models for text/image/video, ensemble for final score
2. Confidence thresholds: auto-action at high confidence, human review for borderline
3. Severity-based routing: child safety immediate, hate speech priority, spam batch
4. Feedback loop: human decisions retrain models, active learning on uncertain samples
5. False positive minimization: per-category thresholds, appeal process, shadow testing
6. Adversarial robustness: text obfuscation detection, steganography checks, rapid model updates
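As an illustration of the text obfuscation point in item 6, the sketch below normalizes a few common evasion tricks before text reaches a classifier: fullwidth or other compatibility characters, invisible zero-width characters, and simple character substitutions. The substitution table and the function name `normalize_for_classification` are hypothetical, and real systems would need far broader coverage.

```python
import unicodedata

# Hypothetical, deliberately small substitution table.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalize_for_classification(text: str) -> str:
    # NFKC folds compatibility forms (e.g. fullwidth letters) into plain ones.
    text = unicodedata.normalize("NFKC", text)
    # Drop invisible format characters such as zero-width spaces (category Cf).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Lowercase, then undo simple digit/symbol substitutions.
    return text.lower().translate(LEET_MAP)


print(normalize_for_classification("FR33 M0N3Y"))        # free money
print(normalize_for_classification("sp\u200bam"))        # spam
print(normalize_for_classification("\uff53\uff50\uff41\uff4d"))  # spam
```

Because the digit mapping also rewrites legitimate numbers, the normalized string is probably best fed to the classifier as an additional view of the post alongside the original text, not as a replacement for it.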
