Advanced System Design — Scale TinyURL
Discussing how to scale the barebones TinyURL system using a Load Balancer and Database Horizontal Partitioning, along with related topics such as partition key design strategy, partition vs. replica vs. backup, and SQL vs. NoSQL.

In the last couple of articles, we discussed how to design a barebones working solution for TinyURL, and how introducing a cache can absorb roughly 90% of the estimated workload.
The system diagram below shows what we currently have: a single web server handles all the HTTP requests from clients, and a single database stores all the data and serves all the read and write requests. Let's trace the workflow and see how we can scale each component, starting with the web server.

Web Server
HTTP is a stateless protocol, which means each request carries all the information a web server needs to understand and handle it. This allows us to run more than one web server to serve the HTTP traffic. Once multiple web servers are available, a user's HTTP requests can be routed to any web server in the system (there is a concept of sticky sessions, which we will cover later). To achieve this, we will need to add another component, the Load Balancer, which dispatches requests between the Users and the Web Servers.
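To make the statelessness concrete, here is a minimal sketch of a redirect handler in Flask. The `URL_STORE` dict and `lookup_long_url` helper are stand-ins invented for illustration; in a real deployment they would be the shared database and cache.

```python
from flask import Flask, abort, redirect

app = Flask(__name__)

# Stand-in for the shared datastore (database + cache). Every web server
# queries the same store, so no individual server holds state that
# another server would need in order to handle the next request.
URL_STORE = {"abc123": "https://example.com/some/very/long/path"}

def lookup_long_url(short_code):
    return URL_STORE.get(short_code)

@app.route("/<short_code>")
def resolve(short_code):
    # Everything needed to serve the request (the short code) arrives in
    # the request itself, so any identical server instance can handle it.
    long_url = lookup_long_url(short_code)
    if long_url is None:
        abort(404)
    return redirect(long_url, code=301)
```

Because the handler reads nothing but the incoming request and the shared store, cloning this server behind a Load Balancer requires no code changes.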

Introducing Load Balancer
The role of a Load Balancer is to distribute incoming requests across the Web Servers and relay the HTTP responses back to the users. There are various strategies a Load Balancer can use to route requests, such as Random, Least Busy, Round Robin, and Sticky Session. For TinyURL, there is no particular requirement on how a request should be dispatched, so we can use Random or Least Busy for simplicity; a sketch of both follows.
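As a rough illustration of those two strategies (the `LoadBalancerSketch` class, the server names, and the in-flight counter below are all invented for this sketch, not any real load balancer's API), the simplest policies fit in a few lines:

```python
import random

class LoadBalancerSketch:
    """Toy dispatcher illustrating the Random and Least Busy strategies."""

    def __init__(self, servers):
        self.servers = servers                      # e.g. ["web-1", "web-2", "web-3"]
        self.in_flight = {s: 0 for s in servers}    # open requests per server

    def pick_random(self):
        # Random: every server is equally likely; no bookkeeping needed.
        return random.choice(self.servers)

    def pick_least_busy(self):
        # Least Busy: route to the server with the fewest open requests.
        return min(self.servers, key=lambda s: self.in_flight[s])

    def dispatch(self, strategy="random"):
        server = self.pick_random() if strategy == "random" else self.pick_least_busy()
        self.in_flight[server] += 1                 # request starts
        return server

    def complete(self, server):
        self.in_flight[server] -= 1                 # response returned to the user
```

Round Robin would simply cycle an index over `self.servers`, and Sticky Session would add a mapping from a client identifier (e.g., a cookie) to a fixed server so that repeat requests from the same user land on the same machine.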