Module 1 · Foundations & Method · Day 001 · 25 min

What System Design Actually Is

Why interviews and real engineering both demand the same skill.


Memory hook

What System Design Actually Is: why interviews and real engineering both demand the same skill

Mental model

Frame the problem before drawing the system.

Recall anchors
Concerns · Inputs · Outputs

Why it matters

System design is the discipline of choosing the right shapes for a problem before you build it: which boxes exist, what the lines between them mean, where state lives, and how each part fails. It is the bridge between product requirements and code, and the place where most expensive mistakes get baked in.

Deep dive

Most engineering bugs are local: a wrong loop, a wrong type, a missing null check. System design bugs are global: a database that cannot shard, a queue with no backpressure, a write path that cannot survive a region failure. They are cheap to prevent on a whiteboard and expensive to fix in production, which is why interviews — and real launch reviews — keep returning to the same exercise.

A system design always answers four questions at once. How does it scale when traffic grows by 10x? How does it stay available when a node, zone, or region disappears? How much does it cost to run, and how does that change with growth? How easy is it to change — to ship a new feature, fix a bug, or migrate a datastore?

These four pull against each other. A design that scales infinitely is rarely the cheapest. A design that is most reliable is rarely the easiest to change. Good engineers learn to recognize the tradeoff curves and pick a point that matches the product, not the other way around.

Demo / scenario

A teammate says 'just use Postgres' for a feature that will see 50k writes per second of small events.

  1. Sketch the write path: client → API → Postgres single primary.
  2. Estimate: 50k writes/sec × ~500 bytes ≈ 25 MB/s sustained, ~2 TB/day.
  3. Recognize the bottleneck: a single primary will hit IOPS and WAL limits long before storage.
  4. Reframe the choice: append-mostly events fit a log (Kafka) or a wide-column store (Cassandra) better than a row store.
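The estimate in step 2 is worth doing explicitly. A minimal back-of-envelope sketch, using the same assumed numbers as the walkthrough (50k events/sec at ~500 bytes each):

```python
# Back-of-envelope estimate for the "just use Postgres" scenario.
# These numbers are the walkthrough's assumptions, not measurements.

WRITES_PER_SEC = 50_000   # small events per second
EVENT_BYTES = 500         # assumed ~500 bytes per event
SECONDS_PER_DAY = 86_400

# Sustained write bandwidth the primary must absorb.
sustained_mb_s = WRITES_PER_SEC * EVENT_BYTES / 1_000_000

# Raw event volume accumulated per day.
daily_tb = WRITES_PER_SEC * EVENT_BYTES * SECONDS_PER_DAY / 1_000_000_000_000

print(f"sustained: {sustained_mb_s:.0f} MB/s")  # 25 MB/s
print(f"per day:   {daily_tb:.2f} TB/day")      # 2.16 TB/day
```

The point of the exercise is not the exact figure but the order of magnitude: a single Postgres primary fsyncing 25 MB/s of small-row WAL traffic is stressed long before its disks fill up.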

Tradeoffs

  • Postgres is operationally simpler and the team knows it well.
  • A log/append store scales writes but adds query and join complexity.
  • Hybrid is common: events to a log, aggregates rolled into Postgres.
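The hybrid pattern in the last bullet can be sketched as a consumer that rolls raw events into coarse aggregates before they ever touch Postgres. Everything here is illustrative: the event shape, the per-minute bucket key, and the upsert SQL in the comment are assumptions, and the "log" is a plain list so the sketch stays self-contained.

```python
from collections import Counter

# Stand-in for a log consumer: in production these records would be read
# from Kafka; here they are a plain list so the sketch runs as-is.
raw_events = [
    {"user_id": 1, "ts": 1_700_000_000},
    {"user_id": 2, "ts": 1_700_000_010},
    {"user_id": 1, "ts": 1_700_000_065},
]

def minute_bucket(ts: int) -> int:
    """Truncate a unix timestamp to the start of its minute."""
    return ts - ts % 60

# Roll events up: one counter per minute instead of one row per event.
aggregates: Counter = Counter()
for event in raw_events:
    aggregates[minute_bucket(event["ts"])] += 1

# In production each entry would become a single Postgres upsert, e.g.:
#   INSERT INTO event_counts (minute, n) VALUES (%s, %s)
#   ON CONFLICT (minute) DO UPDATE SET n = event_counts.n + EXCLUDED.n;
print(dict(aggregates))
```

Postgres now sees one small upsert per minute per bucket rather than 50k inserts per second, while the full-fidelity events stay in the log for replay or batch analysis.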

Diagram

The four perennial concerns of any large system:

  • Scale: 10x traffic
  • Reliability: failure is normal
  • Cost: $ per request
  • Changeability: ship safely
