Back-of-Envelope Estimation: how to ballpark scale in 60 seconds and avoid being wrong by 100x
Frame the problem before drawing the system.
The bar for every number here is 'correct within an order of magnitude', not exact.
Almost every design decision starts with an estimate: 'how big is this?' You will not have time to look things up. The trick is a tiny mental toolkit — powers of ten, latency numbers, a few storage sizes — and a discipline of always doing the math out loud.
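A minimal sketch of that toolkit as constants, using widely quoted rough latency figures; exact values vary by hardware, so treat them as order-of-magnitude anchors only:

```python
# Rough order-of-magnitude anchors; exact values vary by hardware and year.
NS, US, MS = 1e-9, 1e-6, 1e-3

LATENCY_S = {
    "main memory reference":        100 * NS,
    "SSD random read":              100 * US,
    "round trip inside datacenter": 500 * US,
    "HDD seek":                      10 * MS,
    "cross-continent round trip":   150 * MS,
}

SECONDS_PER_DAY = 86_400     # handy to remember as ~1e5
ONE_GBPS_MB_PER_S = 125      # 1 Gbps ≈ 125 MB/s
```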
Anchor users. 100M MAU × ~30% daily-active ratio = 30M DAU. ~10% of DAU in the peak hour ≈ 3M users that hour. Multiply by actions per user in that hour and divide by 3,600 seconds to get peak QPS.
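A sketch of that arithmetic, where the actions-per-user figure is an assumed input (here, 5 actions during the peak hour):

```python
# Back-of-envelope user math; all ratios are assumptions, not measured values.
MAU = 100_000_000
dau = MAU * 0.30                      # ~30% of monthly users active daily -> 30M DAU
peak_hour_users = dau * 0.10          # ~10% of DAU show up in the busiest hour -> 3M
actions_per_user = 5                  # assumed actions per user during that hour
peak_qps = peak_hour_users * actions_per_user / 3_600
print(f"peak QPS ≈ {peak_qps:,.0f}")  # ≈ 4,167
```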
Anchor storage. Tweet-sized record ≈ 300 bytes. Image ≈ 200 KB. Minute of video ≈ 5–50 MB. Multiply record size by daily writes for raw storage, then ×3 for replication and ×2 for indexes.
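For example, a sketch under an assumed write volume of 50M tweet-sized records per day:

```python
# Storage math; the 50M writes/day figure is an assumption for illustration.
records_per_day = 50_000_000
record_bytes = 300
raw_per_day = records_per_day * record_bytes   # 15 GB/day raw
with_overhead = raw_per_day * 3 * 2            # ×3 replication, ×2 indexes -> 90 GB/day
per_year_tb = with_overhead * 365 / 1e12
print(f"raw ≈ {raw_per_day / 1e9:.0f} GB/day, "
      f"with overhead ≈ {with_overhead / 1e9:.0f} GB/day, "
      f"≈ {per_year_tb:.0f} TB/year")
```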
Anchor bandwidth and single-node limits. 1 Gbps ≈ 125 MB/s. A single Postgres on a commodity SSD handles roughly tens of thousands of simple writes/sec; sustained 100k writes/sec is a distributed-systems question.
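A quick check of that 1 Gbps number against an assumed read workload of 4,000 QPS at ~200 KB per image:

```python
# Bandwidth math; the QPS and payload size are assumed inputs.
qps = 4_000
payload_bytes = 200 * 1_000                    # ~200 KB per image
egress_mb_per_s = qps * payload_bytes / 1e6    # ≈ 800 MB/s
links_needed = egress_mb_per_s / 125           # 1 Gbps ≈ 125 MB/s
print(f"≈ {egress_mb_per_s:.0f} MB/s, ≈ {links_needed:.1f} × 1 Gbps links")
```

Needing several 1 Gbps links' worth of egress for static content is usually the signal to reach for a CDN rather than bigger app servers.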
Sanity-check with two independent paths. If user math gives 50k write QPS and storage math gives 2 TB/day, both should imply the same underlying write volume and roughly the same record size.
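A sketch of that cross-check, treating both numbers as given and asking what record size they jointly imply:

```python
# Consistency check between the QPS path and the storage path.
write_qps = 50_000
writes_per_day = write_qps * 86_400     # ≈ 4.3B writes/day
bytes_per_day = 2e12                    # 2 TB/day from the storage estimate
implied_record_bytes = bytes_per_day / writes_per_day
print(f"{writes_per_day / 1e9:.1f}B writes/day, "
      f"implied record ≈ {implied_record_bytes:.0f} bytes")
# ≈ 460 bytes per write is plausible for a tweet-sized record plus metadata,
# so the two estimates describe the same system. A 10x mismatch would mean
# one of the paths has a wrong assumption.
```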
Estimate Twitter timeline read load at 300M MAU.
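One possible answer sketch; the DAU ratio, refreshes per day, and peak multiplier are all assumptions for the exercise, not published Twitter numbers:

```python
# Timeline read load; every ratio here is an assumed input.
MAU = 300_000_000
dau = MAU * 0.50                  # assume ~50% DAU -> 150M
reads_per_day = dau * 10          # assume ~10 timeline loads per user per day
avg_read_qps = reads_per_day / 86_400
peak_read_qps = avg_read_qps * 2  # assume peak ≈ 2x the daily average
print(f"avg ≈ {avg_read_qps:,.0f} QPS, peak ≈ {peak_read_qps:,.0f} QPS")
# ≈ 17k average, ≈ 35k peak read QPS, before any fan-out amplification.
```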