Back to roadmap
Module 6 · Relational Data at ScaleDay 05125 min

Indexing Fundamentals

B-trees: the workhorse of SQL.

Day 051

Indexing Fundamentals

Root
datastore
Internal
datastore
Internal
datastore
Leaf
datastore
Leaf
datastore
Signal path
B-tree from root to leaf
Root
datastore
flow
Internal
datastore
Root
datastore
flow
Internal
datastore
Internal
datastore
flow
Leaf
datastore
Memory hook

Indexing Fundamentals: b-trees

Mental model

shape data so reads and writes stay honest

Design lens

Index size adds storage and write cost.

Recall anchors
B-treeCostTools

Why it matters

B-trees are balanced trees that keep keys sorted on disk and let the DB find a row in O(log N) page reads. Most relational indexes are B-trees; they accelerate equality and range queries.

Deep dive

Each index adds write cost (every insert/update touches the index).

Index-only scans are possible when the index covers all columns the query reads.

Selectivity matters: an index on (gender) is rarely useful.

Demo / scenario

Slow query: SELECT * FROM events WHERE user_id=$1 ORDER BY ts DESC LIMIT 50.

  1. EXPLAIN reveals seq scan.
  2. CREATE INDEX events_user_ts ON events(user_id, ts DESC).
  3. Query becomes index range scan.
  4. p99 from 800ms to 4ms.

Tradeoffs

  • Index size adds storage and write cost.
  • Too many indexes hurt writes.
  • Drop unused indexes — they're not free.

Diagram

Root
Internal
Internal
Leaf
Leaf
Leaf
B-tree from root to leaf.

Mind map

Check yourself

Loading quiz…

Sources & further reading