Vertical vs Horizontal Scaling: bigger box vs more boxes
Make the invisible limits visible.
Vertical: simple, finite ceiling.
Vertical scaling buys a bigger machine; horizontal scaling adds more machines. Vertical is fast and simple but caps at the largest single box available. Horizontal has no single-machine ceiling, but it only works when the work can be split across machines without a global coordinator, which would otherwise become the new bottleneck.
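Vertical limits: physics (only so many cores and so much memory fit in one machine), price curves (top-tier instances cost disproportionately more per unit of capacity), and blast radius (one machine is one failure domain, so its failure takes everything down).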
Horizontal demands: stateless app servers (any server can handle any request) or partitioned state (each machine owns a slice of the data). Achieving statelessness usually means moving session data, caches, and locks out of the process and into shared external services.
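The two horizontal options above can be sketched in a few lines. This is a minimal illustration, not a production design: `ExternalSessionStore` is a hypothetical in-memory stand-in for a shared networked store, and `AppServer` and `owner_of` are names invented for this example.

```python
import hashlib


class ExternalSessionStore:
    """Hypothetical stand-in for a shared store that lives outside the app process."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers never share in-process state with the store.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class AppServer:
    """Stateless server: keeps no per-user data between requests."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        # Load session state, update it, and write it back to the shared store.
        session = self.store.get(session_id)
        session["hits"] = session.get("hits", 0) + 1
        self.store.put(session_id, session)
        return session["hits"]


def owner_of(key, num_partitions):
    """Partitioned state: map a key to the index of the partition that owns it."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


store = ExternalSessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)

print(server_a.handle("session-1"))  # 1
print(server_b.handle("session-1"))  # 2: a different server continues the same session
print(owner_of("user-42", 4) == owner_of("user-42", 4))  # True: same key, same owner
```

Because the session lives in the store rather than in either server, a load balancer is free to send each request to any server. The modulo mapping in `owner_of` is the simplest partitioning rule; it reassigns most keys when the partition count changes, which is why consistent hashing is a common alternative.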
A common path: scale up first (cheap and easy), then scale out when the largest single instance is no longer enough.
Example scenario: a single API server hitting 80% CPU at peak.
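The scale-up-first path can be written as a small decision rule. This is a sketch under stated assumptions: the function name, the instance size names, and the 0.8 threshold (borrowed from the 80% CPU scenario) are all hypothetical, and a real decision would also weigh the cost and blast-radius limits described earlier.

```python
def next_scaling_step(peak_cpu, current_size, sizes, threshold=0.8):
    """Suggest an action following the scale-up-first path.

    peak_cpu is a fraction between 0 and 1. sizes is ordered from smallest
    to largest; the size names used below are hypothetical.
    """
    if peak_cpu < threshold:
        return "hold"
    index = sizes.index(current_size)
    if index + 1 < len(sizes):
        # A bigger box still exists: take the cheap and easy step first.
        return f"scale up to {sizes[index + 1]}"
    # Already on the largest box: the only remaining option is more boxes.
    return "scale out"


sizes = ["small", "medium", "large"]
print(next_scaling_step(0.80, "medium", sizes))  # scale up to large
print(next_scaling_step(0.80, "large", sizes))   # scale out
print(next_scaling_step(0.55, "large", sizes))   # hold
```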