Hello readers,
I am writing a series of blogs on system designing where I decode every decision one takes while system designing. For the sake of simplicity, I am going to use few examples. In this blog, I am going to discuss decisions related to data stores like Database, Cache, Search Engine etc.
Requirement Gathering
We can break requirement gathering into 2 parts:
Data Based Requirements: In this part, we have to identify data that we will be dealing with in the system. Once the data is identified we need to classify data based on its type which in turn will help us to decide our data stores. Also, we can dive deeper into the properties of data like structure, orientation to make more accurate choices.
System Based Requirements: In this part, we have to discuss the system. We have to establish the system constraints, how we will scale the system and the growth of data expected. Also, we can predict user behaviour that helps to make better data store related decisions.
The Datastore Roadmap
Here is the roadmap showcasing various data store decision tree based on requirements.
The above roadmap tells us how to classify our data and what is the most favourable datastore. We can even have a system where we have an RDBMS, NoSQL and Column DB to store data for different subsystems/functions/services.
The above roadmap tells us how to choose databases/techniques based on system constraints/properties.
We have to prioritize between CAP theorem, data growth and scaling. With which of these we can relate most or which factor is more important while defining our system. We can consider multiple factors at once.
The Discussion
We are going to decide the most favourable data stores for COWIN system using the above roadmap. For those who are unaware of COWIN, it is the portal where one can book Covid-19 Vaccination Appointment based on slots available in a particular area. Also, I am going to discuss any tradeoffs that will be resulting in our choices.
COWIN Portal
Step 1: Lay basic System objectives / What system needs to achieve.
- System should maintain basic user information like name, password, phone no., age, Aadhar no., which vaccination he/she got and what round of vaccination is done to him/her. The user authentication is generally what I believe is outsourced to a 3rd party system that authenticates the user based on Aadhar details. It is critical to maintaining the type and the round of vaccination the user got so that he/she can get the same vaccination in the following rounds.
- User can search the vaccination centers available around him/her using pin code.
- User can access information regarding type and availability of vaccine. There can be many types of vaccination available in a particular quantity at a center.
- User can book a slot by selecting a vaccination center and the type of vaccination he/she wants.
- A Vaccination Center Admin/Government Officials can update the availability of vaccination in a center.
- A Vaccination Center Admin/Government Officials can update the status of vaccination for a user.
Step 2: Data based requirements and decisions.
As we said earlier, the goal of this part is to identify and classify the data based on its type.
Significant effort goes into creating Database Schema. We have to identify the entities, relationships and properties to design a schema.
For COWIN Portal we don't need to deal with files or images so we don't need blob storage. Since we have data that qualify to be represented in tables and databases, we will now discuss the data we need to store. (Assuming we are also developing user authentication)
- user table:
fields | Description |
---|---|
id | auto increment pk |
name | varchar |
age | int |
aadhar_no / UIN | varchar |
phone_no | int |
user_role | OFFICIAL/CITIZEN |
password | SHA256 |
- vaccination table:
fields | Description |
---|---|
id | auto inc. pk |
name | varchar |
description | varchar |
price | float |
- center table:
fields | Description |
---|---|
id | auto inc. pk |
address | varchar |
pin | int |
city | varchar |
- vaccine stock table:
fields | Description |
---|---|
id | auto inc. pk |
center_id | fk to center |
vaccination_id | fk to vaccination |
dose_no | int |
quantity | int |
- slot table:
fields | Description |
---|---|
id | auto inc. pk |
user_id | fk to user |
center_id | fk to center |
vaccination_id | fk to vaccination |
dose_no | int |
status | BOOKED/ABSENT/CANCELED/COMPLETED |
This is basic database schema. We might do some changes in future.
Orientation -> Row
Structure -> Structured
According to the road map + since we have proper entities and relations we can say, we should incline towards RDBMS.
We need to store logs for debugging and analysis of system health. For logs since we can say,
Orientation -> Column
Structure -> Structured
we should incline towards Column Database.
We also need to perform some analytics over data so we also need data warehousing.
Step 3: System based requirements and decisions.
Data Estimation
We have to maintain data for 1.36 billion users (population of India).
Let's dive into some calculations,
- user table: User table will have approx. 1.36 billion entries. We can calculate size of the tables using estimates from this link. Suppose one row of user table takes 100 Bytes of space.
1.36 × 10^9 × 100 × 10^-6 = 1.36 × 10^5 MB ~ 136 GB
vaccination table:
For Vaccination table, there are only handful type of vaccination (Covidsheild, Covaxin, Sputnik etc.) so, we don't need to concerned for this table.center table:
Suppose, we have maximum 1000 centers in each state/UT. There are total 35 state/UT. Roughly estimating, one row takes 80 Bytes.
35 × 1000 × 80 × 10^-9 ~ 0.003 GB
- vaccine stock table:
We have 35000 centers all over the country, each center providing approx. 10 types of vaccination and 10 dose cycles to followed (hypothetical situation). Suppose 30 Bytes is size of a row.
35000 × 10 × 10 × 40 × 10^-9 ~ 0.2 GB
- slot table:
This table will have maximum no. of entries. We have to maintain slots booked by user, per dose.
Suppose, 1.31 billion people take 10 rounds of dose and hence book 10 slots and each row occupies 50 MB of space.
1.36 × 10^9 × 10 × 50 × 10^-9 ~ 700 GB
So, total space we need with margin (indexes, meta data etc.) is 900 GB
Reading into System
Following the System based roadmap the first question we need to ask do we need ACID or some transactions are involved in the system. When we observe slot booking we think we would need ACID properties because:
If multiple users booking a slot at a time we would need the slot to be booked or not. If 10 users are reading no. of available slots as 1 and go ahead booking the slot then it will lead to a race condition. We would require Multiversion concurrency control to handle concurrent reads and writes.
Answer to if ACID needed is Yes so our inclination towards RDBMS/Acid Compliant database is now confirmed.
We can use MySQL or Postgres for this.
Scaling Journey
Now let's discuss scaling, we need to scale based on hardware available. We can have 1 TB of storage and it will be fine as according to our estimation we need to store 900 GB of data. If data even grows in future we can scale vertically adding more space to the database server.
But if we reach the max hardware limit we would need to scale the database horizontally. Since we are dealing with RDBMS, according to the roadmap we should go with database shading. However, scaling is difficult in RDBMS (dotted path in the roadmap) but since we need MCC we can't switch to NoSQL.
Database Sharding
Suppose we have n storage servers and we can then shard data based on zones, cities and even regions. Row-based sharding makes more sense here and I will explain why?
Considering APIs that we will be designing for this system, we will be using joins extensively and if have columns stored in different database servers queries will be expensive.
Now, suppose we shard the database based on states. We would be storing all the data for Delhi and dedicating a server where we would be maintaining all the users, centers, vaccinations available in Delhi. Most queries will not have cross-database server joins. If we want to accumulate data from all the database servers only then joins are required and it's going add the latency.
Caching
Now coming to the last branch of this roadmap, it's the behaviour of the user. What we think is, there is going to be more reads since the user will check the availability of vaccines around his/her area. Reading the roadmap, we should go with caching. We can store the following key-value pairs in the cache:
- 110009: ['Center_A', 'Center_B']
- 'Center_A': {'Adress': 'ABC Vihar', 'City': 'Delhi',...} (basically, object of center)
- 'Center_A_Vaccines': ['Covaxin':30, 'Covidshield': 75]
This will enable our server to serve read requests quickly.
Now we need to update the cache if a center is added or removed or when a person books a slot decrementing no. of vaccinations now available in stock at that center. We can use the most common techniques listed in this link. We would like to use Write-through since it ensures consistency. The tradeoff is that the user will experience latency as you have to write to two places every time. We can live with this tradeoff since we think the aim of the system is to be consistent over a better UX.
Step 4: Further Optimizations
To optimize our queries we would have to rely on indexing.
We can index center table based on pin and city. This will optimize our search queries.
We can index the vaccination table based on center_id so that we can reduce the time of queries that will extract vaccine stock based on center.
We can index slot table based on user_id, center_id and vaccination_id, we can think of many scenarios where we might have query slots based on users, centers and vaccinations.
That's the end of the blog, I hope you find it informative. I will do complete system designing of COWIN in the next blog and also explain new examples using the above road maps. Since System Designing is highly subjective, if you have suggestion, we could discuss in comments and I will accommodate them in the blog.
Thanks for reading.
Top comments (0)