System Design Review

100 Days · Interactive Roadmap

Master system design in 100 days.

A focused roadmap inspired by donnemartin/system-design-primer. Each day: a concise lesson, a hand-drawn diagram, a mind map, and a 5-question quiz. Progress lives in your browser.

100-day atlas

One map, ten design instincts.

The curriculum moves from framing and estimation to data, async work, operations, and full design drills. Each module leaves a visual cue you can recall in an interview.

01-10  Frame first
11-20  Find limits
21-30  Pick tradeoffs
31-40  Route close
41-50  Contract edges
51-60  Shape data
61-70  Fit workload
71-80  Smooth bursts
81-90  Survive prod
91-100  Tell the story

Module 1

Foundations & Method

How to think about, scope, and discuss large systems.

  1. Day 001
    25m

    What System Design Actually Is

    Why interviews and real engineering both demand the same skill.

  2. Day 002
    30m

    The Interview Method: Clarify → Scope → Sketch → Scale → Summarize

    A repeatable five-step framework that beats raw cleverness.

  3. Day 003
    25m

    Functional vs Non-Functional Requirements

    What the system does vs how well it does it.

  4. Day 004
    25m

    SLA, SLO, SLI, and Error Budgets

    How you express 'reliable enough' in numbers — and govern by them.

  5. Day 005
    30m

    Back-of-Envelope Estimation

    How to ballpark scale in 60 seconds and avoid being wrong by 100x.

  6. Day 006
    25m

    Reading Existing Systems

    How to reverse-engineer a deployed product into its design.

  7. Day 007
    25m

    Tradeoff Frameworks

    Pick a corner of the triangle and own it.

  8. Day 008
    25m

    Communicating Designs Visually

    Boxes and arrows are not decoration — they are the load-bearing language.

Module 2

Scale, Latency, Capacity

Numbers every engineer should know — and how to estimate.

  1. Day 009
    20m

    Latency Numbers Every Engineer Should Know

    The physics ceiling on what your system can do in a millisecond.

  2. Day 010
    20m

    Throughput vs Latency

    Two different curves with different bottlenecks.

  3. Day 011
    20m

    Vertical vs Horizontal Scaling

    Bigger box vs more boxes.

  4. Day 012
    25m

    Capacity Planning

    Provision for peak — and prove it.

  5. Day 013
    20m

    Stateless vs Stateful

    Where state lives is the most important architectural choice.

  6. Day 014
    25m

    Identifying Bottlenecks

    There is exactly one bottleneck at any moment — find it.

  7. Day 015
    25m

    Tail Latency

    p99 is your real product.

  8. Day 016
    25m

    Workload Patterns

    Read-heavy, write-heavy, fan-out, fan-in — each picks a different stack.

  9. Day 017
    20m

    QPS, RPS, IOPS, BPS

    Different units — each names a different bottleneck.

  10. Day 018
    25m

    Performance Budgets

    Every component on a path gets a slice of the latency budget.

Module 3

Consistency & Replication

CAP, PACELC, replication, and the cost of correctness.

  1. Day 019
    25m

    CAP Theorem

    Under a network partition, you must pick consistency or availability.

  2. Day 020
    20m

    PACELC and Latency Tradeoffs

    Even without a partition, consistency costs latency.

  3. Day 021
    25m

    Strong Consistency Models

    Linearizable reads see the latest write — at a cost.

  4. Day 022
    20m

    Eventual Consistency

    All replicas converge — eventually.

  5. Day 023
    25m

    Read-Your-Writes, Monotonic, Causal

    Useful 'middle' consistency models that match user expectations cheaply.

  6. Day 024
    25m

    Replication Topologies

    Single-leader, multi-leader, leaderless — pick a coordinator strategy.

  7. Day 025
    25m

    Quorums and R+W>N

    Tune consistency vs availability per operation.

  8. Day 026
    25m

    Conflict Resolution

    When concurrent writes diverge, you need a deterministic merge.

  9. Day 027
    30m

    Consensus: Paxos and Raft

    How nodes agree on a single value despite failures.

  10. Day 028
    25m

    Two-Phase Commit and Sagas

    Distributed transactions: the strict way and the practical way.

Module 4

Edge: DNS, CDN, Load Balancers

How traffic finds your servers — and how to spread it.

  1. Day 029
    20m

    DNS Basics

    How a name becomes an IP — and where each step can fail.

  2. Day 030
    20m

    GeoDNS and DNS Load Balancing

    Use DNS to send users to the right region.

  3. Day 031
    25m

    CDN Architecture

    Move bytes close to users so origin only answers misses.

  4. Day 032
    25m

    CDN Caching and Invalidation

    Cache headers are a contract; invalidation is its escape hatch.

  5. Day 033
    25m

    Anycast and BGP

    One IP, many physical locations, the network picks the closest.

  6. Day 034
    20m

    Reverse Proxies vs Load Balancers

    Same picture, different verbs.

  7. Day 035
    20m

    L4 vs L7 Load Balancing

    Bytes and ports vs URLs and headers.

  8. Day 036
    25m

    Health Checks, Circuit Breakers, Drains

    Rotate broken backends out before users notice.

  9. Day 037
    25m

    Sticky Sessions and Consistent Hashing

    Route by key without thrashing the map when nodes change.

  10. Day 038
    25m

    TLS Termination and mTLS

    Encrypt the wire, identify the parties.

Module 5

APIs & Application Layer

Service boundaries, contracts, discovery, and the mesh.

  1. Day 039
    25m

    REST Design Principles

    Resources, verbs, and statelessness as a contract.

  2. Day 040
    25m

    GraphQL — Strengths and Pitfalls

    Ask for exactly what you need — and own the cost.

  3. Day 041
    25m

    gRPC and Protobuf

    Strongly typed, binary, fast — and the default for service-to-service.

  4. Day 042
    25m

    API Versioning Strategies

    How to evolve a contract without breaking clients.

  5. Day 043
    20m

    Pagination, Filtering, Sorting

    Cursors beat offsets at scale.

  6. Day 044
    25m

    Idempotency and Retries

    Retries are inevitable; design as if every call may run twice.

  7. Day 045
    25m

    Rate Limiting

    Cap the firehose before it floods the basement.

  8. Day 046
    25m

    API Gateways

    One entrypoint for cross-cutting concerns.

  9. Day 047
    25m

    Microservices vs Monolith

    One deployable vs many — both are valid; pick consciously.

  10. Day 048
    25m

    Service Discovery and Service Mesh

    How services find each other — and how the mesh helps.

Module 6

Relational Data at Scale

Schemas, indexes, replication, and sharding for SQL.

  1. Day 049
    25m

    Relational Basics and ACID

    Why SQL still wins for most transactional work.

  2. Day 050
    25m

    Schema Design and Normalization

    Normalize until it hurts; denormalize until it works.

  3. Day 051
    25m

    Indexing Fundamentals

    B-trees: the workhorse of SQL.

  4. Day 052
    25m

    Composite and Covering Indexes

    Order matters; covering avoids the heap.

  5. Day 053
    25m

    Query Plans and Execution

    Read the plan; the plan is the truth.

  6. Day 054
    25m

    SQL Replication

    Streaming and logical replication compared.

  7. Day 055
    20m

    Read Replicas and Replica Lag

    Cheap reads — at the cost of staleness.

  8. Day 056
    20m

    Connection Pooling

    Connections are expensive — share them.

  9. Day 057
    30m

    Sharding Strategies

    When one DB isn't enough — split by key.

  10. Day 058
    25m

    Online Schema Migrations

    Add columns, change types, split tables — without downtime.

Module 7

NoSQL, Search, Graph, Object

Picking the right datastore for the workload.

  1. Day 059
    20m

    Document Stores

    JSON-shaped data, flexible schemas, easy aggregations.

  2. Day 060
    20m

    Key-Value Stores

    Fastest possible lookups when keys are everything.

  3. Day 061
    25m

    Wide-Column Stores

    Cassandra/HBase: write-optimized, partitioned, big.

  4. Day 062
    20m

    Time-Series Databases

    Built for append-only timestamped data.

  5. Day 063
    25m

    Search Engines

    Inverted indexes turn 'find me X' into milliseconds.

  6. Day 064
    25m

    Inverted Indexes Up Close

    How a search engine actually finds matches.

  7. Day 065
    25m

    Graph Databases

    When relationships are the thing you query.

  8. Day 066
    25m

    Object Storage

    Cheap, durable, infinite — your bucket of bytes.

  9. Day 067
    25m

    Data Lake vs Warehouse vs Lakehouse

    Cheap raw bytes vs fast structured queries — and the middle.

  10. Day 068
    25m

    Choosing a Datastore

    There's no 'best DB' — only best fit for the workload.

Module 8

Caching, Queues, Async Work

Hide latency, smooth spikes, and decouple services.

  1. Day 069
    25m

    Caching Layers

    Cache at every layer that pays for itself.

  2. Day 070
    25m

    Cache Patterns

    Cache-aside, write-through, write-behind — different shapes, different tradeoffs.

  3. Day 071
    25m

    Cache Invalidation

    The hardest problem in computer science (only half-jokingly).

  4. Day 072
    20m

    Eviction Policies

    When the cache is full, who loses?

  5. Day 073
    25m

    Distributed Caches

    Redis and Memcached at scale.

  6. Day 074
    25m

    Hot Keys and Cache Stampedes

    When the cache misses, everyone misses at once.

  7. Day 075
    25m

    Message Brokers

    Decouple producers and consumers with durable queues.

  8. Day 076
    20m

    Pub/Sub vs Queues

    Multiple subscribers vs one consumer per message.

  9. Day 077
    25m

    Exactly-once vs At-least-once

    Pick at-least-once + idempotent — the only honest answer.

  10. Day 078
    25m

    Outbox Pattern and Idempotency Keys

    Atomic publish-with-state, deduped consumption.

  11. Day 079
    25m

    Backpressure and Flow Control

    Signal slowness upstream — don't drown.

  12. Day 080
    25m

    Stream Processing

    Treat data in motion as a first-class citizen.

Module 9

Protocols, Security, Observability

What prod really runs on — and how you survive it.

  1. Day 081
    25m

    HTTP/1.1 → HTTP/2 → HTTP/3

    How the web's transport got faster — and what it means for your design.

  2. Day 082
    25m

    WebSockets and Server-Sent Events

    Push, not poll.

  3. Day 083
    20m

    Polling, Long Polling, and Push Models

    Climb the realtime ladder by problem, not preference.

  4. Day 084
    25m

    Authentication: Passwords and MFA

    Hash with a slow function; add a second factor.

  5. Day 085
    25m

    Authorization: RBAC and ABAC

    Who can do what, decided where.

  6. Day 086
    25m

    OAuth 2.1 and OIDC

    Delegated authorization vs identity — and why both matter.

  7. Day 087
    20m

    Secrets Management

    Don't put secrets in env files; rotate them like passwords.

  8. Day 088
    25m

    Logs, Metrics, Traces

    The three pillars of observability.

  9. Day 089
    25m

    Alerting: Symptom vs Cause

    Page on user pain; chart on suspect causes.

  10. Day 090
    25m

    Chaos and Disaster Recovery

    Rehearse failures so you don't learn them at 3 AM.

Module 10

End-to-End Design Drills

Apply everything: ten classic system design problems.

  1. Day 091
    35m

    Drill: URL Shortener

    Cheap to build, instructive at scale.

  2. Day 092
    45m

    Drill: Twitter Feed

    Fan-out on write vs read — and the celebrity problem.

  3. Day 093
    45m

    Drill: Instagram

    Photo upload pipeline + feed + likes — visually heavy.

  4. Day 094
    45m

    Drill: Messenger / WhatsApp

    Realtime messaging with delivery guarantees and presence.

  5. Day 095
    45m

    Drill: Uber Dispatch

    Geo-indexing + matching + pricing — at city scale.

  6. Day 096
    45m

    Drill: Netflix-Style Streaming

    Video ingest, encoding, CDN, recommendations.

  7. Day 097
    45m

    Drill: Dropbox / File Sync

    Block-level sync with conflict resolution.

  8. Day 098
    45m

    Drill: Web Crawler

    Politeness + breadth + dedup at internet scale.

  9. Day 099
    35m

    Drill: Rate Limiter (Deep)

    Distributed token bucket with consistency hazards.

  10. Day 100
    50m

    Drill: Distributed Key-Value Store

    Build Dynamo from first principles.