Design a proximity server like Yelp, Part 2

Salil Arora · Published in codeburst · Jul 6, 2020

Recapitulation

In our previous post, we covered the basic system design of a proximity server like Yelp with some of its core functional and non-functional requirements, APIs, and low-level design. We also discussed the approach of using a 2D grid for dividing the entire world map into small squares.

2D Grid Approach

The area of the Earth is roughly 500 million square km. With a fixed search radius of 10 km and a fixed grid size of 10 km × 10 km (100 km² per grid), we would need 500M/100 = 5 million grids.

This approach struggled with popular places like Dam Square (Amsterdam) or Las Vegas (USA): some grids become densely populated while others, such as coastal regions or islands, stay sparse. We also discussed the implementation complexity of dynamically adjusting grid sizes.

Overview

In this part, we will be discussing how we can solve the problem of grid imbalance along with the final system design and handling some important core design requirements like Data Partitioning and Data Replication.

How can we solve the problem of grid imbalance?

This is a common challenge that Tinder and Uber also faced as they grew and started receiving enormous traffic.

QuadTrees

A quadtree is a tree data structure in which each node has either zero or exactly four children. A quadtree's strength lies in how efficiently it divides a flat two-dimensional space and stores place data in its nodes.

In our case, each node can represent a grid and hold the details of the places in that grid. When a node exceeds its bucket size of X places, it is broken into four child nodes and its data is distributed among them recursively.

Initialization

We first keep all the places in a single root node. Since our five-year scale is 400 million places, the root node cannot hold all of them, so it is recursively broken into four child nodes until no node holds more than 500 locations. We then have a QuadTree over 400 million places, with all locations stored in the leaf nodes.
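The split-on-overflow behavior above can be sketched as follows. This is a minimal illustration, not the article's actual implementation: the class and field names (`QuadTreeNode`, `capacity`) are assumptions, and places are reduced to bare (latitude, longitude) pairs.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a quadtree node. A leaf holds places until it exceeds its
// bucket size, then splits into four children and redistributes recursively.
class QuadTreeNode {
    final int capacity;                          // bucket size "X"
    final double minLat, minLon, maxLat, maxLon; // bounding box of this grid
    List<double[]> places = new ArrayList<>();   // {lat, lon} pairs (leaves only)
    QuadTreeNode[] children = null;              // null while this is a leaf

    QuadTreeNode(int capacity, double minLat, double minLon,
                 double maxLat, double maxLon) {
        this.capacity = capacity;
        this.minLat = minLat; this.minLon = minLon;
        this.maxLat = maxLat; this.maxLon = maxLon;
    }

    boolean isLeaf() { return children == null; }

    void insert(double lat, double lon) {
        if (!isLeaf()) {                 // internal node: route to the child
            childFor(lat, lon).insert(lat, lon);
            return;
        }
        places.add(new double[] { lat, lon });
        if (places.size() > capacity) split();
    }

    // Break this leaf into four equal quadrants and push its places down.
    private void split() {
        double midLat = (minLat + maxLat) / 2, midLon = (minLon + maxLon) / 2;
        children = new QuadTreeNode[] {
            new QuadTreeNode(capacity, minLat, minLon, midLat, midLon), // SW
            new QuadTreeNode(capacity, minLat, midLon, midLat, maxLon), // SE
            new QuadTreeNode(capacity, midLat, minLon, maxLat, midLon), // NW
            new QuadTreeNode(capacity, midLat, midLon, maxLat, maxLon), // NE
        };
        for (double[] p : places) childFor(p[0], p[1]).insert(p[0], p[1]);
        places = null; // internal nodes store no places
    }

    private QuadTreeNode childFor(double lat, double lon) {
        double midLat = (minLat + maxLat) / 2, midLon = (minLon + maxLon) / 2;
        return children[(lat < midLat ? 0 : 2) + (lon < midLon ? 0 : 1)];
    }
}
```

In production the bucket size would be 500 as discussed above; a small capacity here just makes the split easy to observe.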

A quadtree representation with a bucket size of 1. Source: Wikipedia

Search Flow

We start our search at the root node and descend until we find the required node. As discussed above, places are stored only in leaf nodes, so the required node will always be a leaf.

Our quadtree construction algorithm ensures that neighboring nodes contain geographically close places. Thus, to find nearby locations, we also take the neighboring nodes into consideration.

To find all the places within a 10 km radius of a given latitude and longitude, e.g. Dam Square, Amsterdam (52.37, 4.89):

List<Place> places = getNearByPlaces(root, 52.37, 4.89, 10);
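One possible shape of `getNearByPlaces` is sketched below, assuming the node structure from the initialization step. The `Node` class, the haversine helper, and the bounding-box pruning bound are all illustrative assumptions, not the article's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the search flow: descend from the root, pruning subtrees whose
// bounding box cannot intersect the query circle, and collect matching
// places from the leaves. Great-circle distances use the haversine formula.
class NearbySearch {
    static class Node {
        double minLat, minLon, maxLat, maxLon;      // bounding box
        List<double[]> places = new ArrayList<>();  // {lat, lon}, leaves only
        Node[] children;                            // null for leaves
        Node(double a, double b, double c, double d) {
            minLat = a; minLon = b; maxLat = c; maxLon = d;
        }
    }

    static final double EARTH_RADIUS_KM = 6371.0;

    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1), dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Distance from the query point to the nearest point of the node's box,
    // computed on clamped coordinates -- a conservative pruning bound.
    static double boxDistanceKm(Node n, double lat, double lon) {
        double clampedLat = Math.max(n.minLat, Math.min(lat, n.maxLat));
        double clampedLon = Math.max(n.minLon, Math.min(lon, n.maxLon));
        return haversineKm(lat, lon, clampedLat, clampedLon);
    }

    static List<double[]> getNearByPlaces(Node node, double lat, double lon,
                                          double radiusKm) {
        List<double[]> out = new ArrayList<>();
        if (boxDistanceKm(node, lat, lon) > radiusKm) return out; // prune subtree
        if (node.children == null) {                              // leaf node
            for (double[] p : node.places)
                if (haversineKm(lat, lon, p[0], p[1]) <= radiusKm) out.add(p);
        } else {
            for (Node child : node.children)
                out.addAll(getNearByPlaces(child, lat, lon, radiusKm));
        }
        return out;
    }
}
```

Because neighboring quadrants are separate subtrees, descending every child whose box intersects the query circle is what brings the neighboring nodes into consideration.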

Data Caching

We can make search faster by caching the QuadTree. As discussed above, we will have 400M/500 = 0.8 million leaf nodes in total. Assume each node_id takes 6 bytes, and each node except the leaves holds 4 child pointers. Each place also has a location_id, latitude, and longitude of 8 bytes each. Thus, storing everything requires roughly:

(8 + 8 + 8) bytes × 400M + 0.8M × 4 × 6 bytes ≈ 9.6 GB + 19 MB ≈ 10 GB
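As a quick sanity check, the estimate works out as follows (the class name is just for illustration):

```java
// Back-of-the-envelope check of the cache-size estimate (sizes in bytes):
// 400M places x (location_id + latitude + longitude at 8 bytes each), plus
// 4 child pointers for each of the 0.8M nodes at 6 bytes per node_id.
class CacheSizeEstimate {
    static double estimateGb() {
        double placeBytes = 400e6 * (8 + 8 + 8); // 9.6e9 bytes
        double pointerBytes = 0.8e6 * 4 * 6;     // 1.92e7 bytes
        return (placeBytes + pointerBytes) / 1e9;
    }
}
```

The place data dominates; the pointer overhead is only about 19 MB of the ~10 GB total.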

Data Partitioning

Considering the scale, we cannot rely on a single server to handle all the traffic: it would be a single point of failure and would hurt availability, which is unacceptable when distributed systems make scaling out so accessible. The QuadTree should therefore be partitioned.

There are multiple ways to partition the data:

  1. Region-based sharding: The data can be partitioned by region, but as discussed above this results in non-uniform data distribution because some locations are densely populated while others are sparse. Our uniform-distribution problem would remain unsolved.
  2. Sharding based on Place ID: The data can be sharded on the basis of place_id, by either hashing or consistent hashing. While constructing the QuadTree, we iterate through all the places and compute the hash of each place_id. The hash function maps each place id to the server that will store that place's details.

The second approach is simpler. Although we end up with multiple QuadTrees, that is not much of a concern, because a uniform distribution of places is guaranteed.
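A minimal sketch of the place_id routing, assuming a simple modulo over the hash (the class name is hypothetical; consistent hashing, which reduces data movement when servers are added or removed, is omitted for brevity):

```java
// Sketch of place_id based sharding: a hash function maps each place_id
// to one of N QuadTree servers. Plain modulo shown; consistent hashing
// would be the production choice.
class PlaceShardRouter {
    private final int numServers;

    PlaceShardRouter(int numServers) { this.numServers = numServers; }

    // Math.floorMod keeps the result non-negative even when hashCode()
    // returns a negative value.
    int serverFor(String placeId) {
        return Math.floorMod(placeId.hashCode(), numServers);
    }
}
```

During QuadTree construction, each place is sent to `serverFor(place_id)`, so every server builds its own QuadTree over a roughly equal share of places.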

Final System Design

To find nearby places, we have to query all servers, and each server returns its set of nearby places. A centralized aggregation server then collates these results, sorts them, and returns them to the user.
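The scatter-gather step might look like the sketch below. `Partition`, `Place`, and `Aggregator` are illustrative names; a real system would issue the fan-out over the network rather than through an in-process interface.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of the aggregation server: fan the query out to every QuadTree
// partition, merge the partial results, and sort them by distance.
class Aggregator {
    record Place(String id, double distanceKm) {}

    // Each partition answers a nearby-places query with its own results.
    interface Partition {
        List<Place> nearby(double lat, double lon, double radiusKm);
    }

    static List<Place> search(List<Partition> partitions,
                              double lat, double lon, double radiusKm) {
        return partitions.stream()
                .flatMap(p -> p.nearby(lat, lon, radiusKm).stream())  // scatter
                .sorted(Comparator.comparingDouble(Place::distanceKm)) // gather + sort
                .collect(Collectors.toList());
    }
}
```

In practice the fan-out would run in parallel with a timeout per partition, so one slow shard cannot stall the whole query.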

Data Replication

One of the most important core requirements of any mammoth-scale distributed system is reliability, meaning the system should keep working correctly even in the face of adversity. We therefore cannot rely on a single machine and make it a single point of failure.

Master-Slave

A master-slave architecture suits our use case well. Writes happen only through masters, and reads happen through the slaves. Whenever a master server faces downtime, any one of the slave servers can take its position, become master, and serve writes. With this approach there can be a small delay of a few milliseconds in showing recently updated data, giving eventual consistency, but that is fine since this is not a banking application.

What will happen if every replica is down?

It’s an extremely rare fault, but our system should be smart enough to deal with such cases. We can solve this by building an inverted index: a QuadTree Index Server that stores the QuadTree server number as the key and a HashSet of the place ids present on that server as the value.

This approach is far less time-consuming. We should also keep a replica of the QuadTree Index Server for fault tolerance. If any QuadTree server dies, it can always be rebuilt by consulting the QuadTree Index Server instead of querying the database and increasing its load.
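A sketch of that inverted index, with hypothetical names (`QuadTreeIndex`, `recordPlacement`, `placesToRebuild`):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the QuadTree Index Server: an inverted index from QuadTree
// server number to the set of place ids stored on that server, so a dead
// server can be rebuilt without scanning the whole database.
class QuadTreeIndex {
    private final Map<Integer, Set<String>> serverToPlaces = new HashMap<>();

    // Called whenever a place is assigned to a QuadTree server.
    void recordPlacement(int serverId, String placeId) {
        serverToPlaces.computeIfAbsent(serverId, k -> new HashSet<>()).add(placeId);
    }

    // On failure, fetch exactly these place ids to rebuild the dead server.
    Set<String> placesToRebuild(int serverId) {
        return serverToPlaces.getOrDefault(serverId, Set.of());
    }
}
```

Rebuilding then touches only the records that lived on the failed server, rather than re-partitioning all 400 million places.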

Conclusion

We learned how to design a proximity server like NearBy or Yelp: we discussed both the functional and non-functional requirements, calculated the user scale, and figured out the mandatory APIs before creating the database tables.

After that, we tried to design the actual system and incrementally solved the challenging problems attached to storing place details with their location:

  1. Firstly, we tried storing place details along with latitude and longitude in an SQL-based storage system with an index on the location columns. We analyzed the downsides of this approach, including inefficient search queries at a mammoth scale of 400 million places.
  2. Secondly, we tried to solve that problem by dividing the entire world map into fixed-size 2D grids, which proved more efficient. But the downside of this approach is grid imbalance between densely and sparsely populated areas.
  3. Finally, we solved these problems by introducing the quadtree data structure, in which each node has either zero or exactly four children. Quadtrees efficiently divide the two-dimensional space and store place details in their nodes. They are the best fit for our use case because they neatly solve both grid imbalance and scalability, and they are heavily used by Tinder and Uber for their geo-sharding and spatial problems.

Along with the final system design, we also discussed some of the interesting problems related to Data Partitioning and Data replication.

Coming up in this series

There are many more topics we need to discuss in detail. In the next few pieces, I’ll write in detail about the following topics:

  • Design a video streaming system like Netflix or Youtube.
  • Design a notification system.
  • Design a movie ticketing system like BookMyShow or TicketMaster.

Thanks for reading.
