Module 2 · Scale, Latency, Capacity · Day 017 · 20 min

QPS, RPS, IOPS, BPS

Different units — each names a different bottleneck.



Focus

QPS: service · IOPS: datastore · BPS: edge · PPS: edge

Memory hook

QPS, RPS, IOPS, BPS: different units

Mental model

make the invisible limits visible

Design lens

Bigger RAM raises page-cache hit rate.

Recall anchors

QPS/RPS: app · IOPS: storage · BPS: bandwidth

Why it matters

Throughput is measured in different units depending on which layer you mean: QPS/RPS at the app, IOPS at storage, BPS at the network. Scaling problems usually live in one unit but get described in another, so naming the right unit is the first step to finding the real bottleneck.

1. Use the right unit for each layer.
2. Convert between units when mixing layers.
3. Recognize IOPS as the often-hidden ceiling.

Deep dive

QPS/RPS: queries or requests per second at the app or DB. Limited by CPU, lock contention, or downstream dependencies.

IOPS: storage operations per second; commodity SSDs deliver roughly 10k–100k IOPS. Many DB problems are really IOPS problems.

BPS: bytes per second; matters for media, analytics, replication.

Always derive one unit from another. For example, 1k RPS × 5 reads/request × 10 I/Os per read ≈ 50k IOPS.
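The derivation above can be written as a small calculation. This is a minimal sketch: the function names are hypothetical, and the 20 KB response size used for the bytes-per-second conversion is an assumed figure for illustration, not a number from this lesson.

```python
def required_iops(rps, reads_per_request, ios_per_read):
    """Storage operations per second implied by an application request rate."""
    return rps * reads_per_request * ios_per_read


def required_bytes_per_second(rps, bytes_per_response):
    """Network throughput implied by a request rate and an average response size."""
    return rps * bytes_per_response


# Numbers from the text: 1k RPS, 5 reads per request, 10 I/Os per read.
iops = required_iops(rps=1_000, reads_per_request=5, ios_per_read=10)

# Assumption for illustration only: each response is about 20 KB.
bps = required_bytes_per_second(rps=1_000, bytes_per_response=20_000)

print(f"{iops:,} IOPS")          # 50,000 IOPS
print(f"{bps / 1e6:.0f} MB/s")   # 20 MB/s
```

The same request rate yields very different numbers depending on the layer, which is why the unit should always be stated alongside the figure.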

Demo / scenario

An API serves 5k RPS. DB CPU sits at 30%, yet queries are slow.

Each request reads 20 small rows → 100k DB reads/sec.
If most of those reads miss the cache, each one costs a storage I/O. Storage IOPS limit: 80k → bottleneck.
Fix: raise the page-cache hit rate, batch fetches, add a secondary index, or add a read replica.
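The scenario can be checked with a rough model. The sketch below assumes that every row read missing the page cache costs exactly one storage I/O, which is a simplification; the function names are hypothetical.

```python
def storage_iops(rps, rows_per_request, cache_hit_rate=0.0):
    """IOPS reaching storage, assuming one I/O per row read that misses the cache."""
    return rps * rows_per_request * (1.0 - cache_hit_rate)


def min_hit_rate(rps, rows_per_request, iops_limit):
    """Smallest cache hit rate that keeps storage demand at or below the limit."""
    demand = rps * rows_per_request
    if demand <= iops_limit:
        return 0.0
    return 1.0 - iops_limit / demand


# Numbers from the scenario: 5k RPS, 20 rows per request, 80k IOPS limit.
demand = storage_iops(5_000, 20)
needed = min_hit_rate(5_000, 20, 80_000)

print(f"demand with a cold cache: {demand:,.0f} IOPS")
print(f"hit rate needed to fit under the limit: {needed:.0%}")
```

Under this model, a cold cache pushes 100k IOPS at storage that can serve only 80k, and a hit rate of about 20% is enough to fit under the limit. Batching fetches attacks the same product from the other side by lowering the number of reads per request.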

Tradeoffs

Bigger RAM raises page-cache hit rate.
Composite indexes reduce reads per query.
Caching reduces DB IOPS at the cost of consistency.

Diagram

Units across the stack.


Sources & further reading

AWS — EBS volume types
