Overview
I created a highly scalable API server that can handle 10 million users. It is an SNS (social networking service) like Twitter.
The implementation is published on GitHub.
The development environment is as follows.
- Node 16.14
- Express 4.17.3
- DynamoDB 2012-08-10
The functional requirements are as follows.
- Post a tweet
- Post a comment on a tweet
- Follow a user
- Get the timeline
Introduction
Services with hundreds of millions of users, such as Facebook, Amazon, and YouTube, need to handle a lot of traffic. A common approach to handling heavy traffic is scaling out rather than scaling up. Scaling up is expensive because it relies on high-performance servers, and a single server has a performance ceiling anyway.
Let's talk about scaling out. An application can be broadly divided into three layers.
- Client layer
- Server layer
- Database layer
When handling a large amount of traffic, the server layer only processes data; it does not store it. Therefore, it is easy to scale out.
The database layer, on the other hand, becomes harder to keep consistent and available as data is distributed across nodes by scaling out. You also need logic to decide which data is stored on which node, and data must be relocated when nodes are added or removed. Since RDBs do not provide these features, we will use NoSQL.
Typical databases that support scale-out include BigTable, HBase, DynamoDB, and Cassandra.
Database | Description |
---|---|
BigTable, HBase | Reads always return consistent, up-to-date data. On the other hand, data cannot be read while it is locked for an update. |
DynamoDB, Cassandra | Data is always accessible. On the other hand, stale data may be read while replicas are still synchronizing. |
This time, we are building an API server for an SNS, so availability matters more than consistency. Therefore, we use DynamoDB.
What is DynamoDB?
DynamoDB is a key-value database. You can create tables, and each table stores items. Each item has a key and a value.
You can specify a partition key and a sort key as the item key. The partition key determines which node in the DynamoDB cluster stores the item. The sort key acts like an index within the table and is used for sorting.
An item's value can hold any number of attribute/value pairs, and the attributes can differ from item to item.
DynamoDB queries are limited: you can basically only narrow down items by partition key and sort key. Querying by any other attribute requires checking every item, so it gets slower as the number of items grows.
When you want to treat another attribute as a partition key, use a GSI (Global Secondary Index). When you want to treat another attribute as a sort key, use an LSI (Local Secondary Index).
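For a concrete picture, here is a minimal sketch of creating a table with a partition key, a sort key, and a GSI, assuming the AWS SDK for JavaScript v2; the table name, region, and billing mode are placeholders, not necessarily what the published implementation uses.

import AWS from "aws-sdk";

// Minimal sketch: a table whose primary key is a partition key ("ID")
// plus a sort key ("DataType"), with one GSI ("GSI-1") keyed by another
// attribute ("DataValue"). Names and settings are placeholders.
const dynamodb = new AWS.DynamoDB({ region: "ap-northeast-1" });

const params: AWS.DynamoDB.CreateTableInput = {
  TableName: "SnsTable",
  AttributeDefinitions: [
    { AttributeName: "ID", AttributeType: "S" },
    { AttributeName: "DataType", AttributeType: "S" },
    { AttributeName: "DataValue", AttributeType: "S" },
  ],
  KeySchema: [
    { AttributeName: "ID", KeyType: "HASH" }, // partition key
    { AttributeName: "DataType", KeyType: "RANGE" }, // sort key
  ],
  GlobalSecondaryIndexes: [
    {
      IndexName: "GSI-1",
      KeySchema: [
        { AttributeName: "DataValue", KeyType: "HASH" },
        { AttributeName: "DataType", KeyType: "RANGE" },
      ],
      Projection: { ProjectionType: "ALL" },
    },
  ],
  BillingMode: "PAY_PER_REQUEST",
};

dynamodb
  .createTable(params)
  .promise()
  .catch((err) => console.error(err));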
Database Design
Database design for DynamoDB differs from that for an RDB. Because RDB queries are flexible, you can design a normalized table first without considering how the data will be accessed. DynamoDB, on the other hand, supports only limited query patterns, so you first determine the access patterns and then design the table around them. Specifically, we will proceed in the following order.
- Modeling
- Create use case list
- Design Table
- Create query definition
Modeling
The ER diagram is as follows.
The timeline shows tweets from the users you follow. In an SNS, how fast the timeline loads has a great influence on usability, so we want a database design that can display the timeline quickly.
Read Heavy / Write Light on the timeline
With a normalized table design, writing data at tweet time is light because data is written only to the Tweets table. On the other hand, reading the timeline is heavy. The main flow when reading the timeline is as follows.
- Get a list of IDs of users you are following
- Get tweets from each user you follow
- Merge the retrieved tweets
- Sort merged tweets
The SQL for getting the timeline is as follows.
SELECT
*
FROM
tweets
WHERE
userId IN (
SELECT followeeId FROM follows WHERE followerId = 'user id'
)
ORDER BY
postDate DESC
With this method, the more users you follow, the heavier the timeline read becomes. It can be called a Read Heavy / Write Light method.
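Against DynamoDB, which has no equivalent of the IN subquery, the same flow becomes one query per followee plus an in-memory merge and sort. A rough sketch (the table and attribute names here are hypothetical):

import AWS from "aws-sdk";

const documentClient = new AWS.DynamoDB.DocumentClient();

// Hypothetical sketch of the Read Heavy flow above:
// one query per followee, then merge and sort in memory.
async function getTimelineReadHeavy(followeeIds: string[]) {
  // Get tweets from each user you follow (one query per followee).
  const perUserTweets = await Promise.all(
    followeeIds.map((followeeId) =>
      documentClient
        .query({
          TableName: "tweets", // hypothetical table name
          KeyConditionExpression: "userId = :u",
          ExpressionAttributeValues: { ":u": followeeId },
        })
        .promise()
    )
  );

  // Merge the results and sort them by post date, newest first.
  return perUserTweets
    .flatMap((result) => result.Items ?? [])
    .sort((a, b) => (a.postDate < b.postDate ? 1 : -1));
}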
Read Light / Write Heavy on the timeline
Consider a Read Light / Write Heavy approach instead. Create a Timeline table; reading the timeline then becomes a single query against that table. In exchange, when a user tweets, the tweet must be written to the timeline of every one of that user's followers (fan-out on write; see the sketch at the end of this section).
The SQL for getting the timeline is as follows.
SELECT
*
FROM
timelines
WHERE
userId = 'user id'
ORDER BY
tweetPostDate
This time, we will use this Read Light / Write Heavy method.
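A rough sketch of the write side (fan-out on write) follows. The table, index, and attribute names are hypothetical and only illustrate writing one timeline item per follower.

import AWS from "aws-sdk";

const documentClient = new AWS.DynamoDB.DocumentClient();

// Hypothetical fan-out-on-write sketch: when a user tweets,
// copy the tweet into each follower's timeline.
async function fanOutTweet(tweet: {
  userId: string;
  tweetId: string;
  text: string;
  postDate: string;
}): Promise<void> {
  // 1. Look up everyone who follows the author.
  const followers = await documentClient
    .query({
      TableName: "follows",
      IndexName: "byFolloweeId", // hypothetical GSI keyed by followeeId
      KeyConditionExpression: "followeeId = :f",
      ExpressionAttributeValues: { ":f": tweet.userId },
    })
    .promise();

  // 2. Write a copy of the tweet into each follower's timeline.
  //    (A real implementation would batch these writes and retry failures.)
  await Promise.all(
    (followers.Items ?? []).map((follow) =>
      documentClient
        .put({
          TableName: "timelines",
          Item: {
            userId: follow.followerId,
            tweetPostDate: tweet.postDate,
            tweetId: tweet.tweetId,
            text: tweet.text,
          },
        })
        .promise()
    )
  );
}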
Create use case list
Create a data use case list based on functional requirements to find out how to access the data.
Entity | UseCase | Screen |
---|---|---|
Tweet | getTimelineByUserId | Home |
User | getUserByUserName | User Detail |
Follow | getFolloweesByUserId | User Detail |
Follow | getFollowersByUserId | User Detail |
Follow | getCountFolloweeByUserId | User Detail |
Follow | getCountFollowerByUserId | User Detail |
Tweet | getTweetsByUserId | User Detail |
Tweet | getTweetByTweetId | Tweet Detail |
Comment | getCommentsByTweetId | Tweet Detail |
Design Table
We will design the table and indexes based on the use case list. DynamoDB supports only limited query patterns, but a technique called GSI overloading allows for flexible queries.
Include the ID in the sort key, and make IDs sort in the same order as record creation time. Then posts can be sorted by date without using an LSI.
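To make this concrete, items in the single overloaded table might look like the following. The attribute names match the query definitions below; the ID format and separators are assumptions for illustration.

// Illustrative items only. The sort key (DataType) starts with the entity
// type and embeds an ID that increases with creation time, so
// begins_with() queries return items already sorted by date.
const exampleItems = [
  { ID: "user_001", DataType: "userProfile", DataValue: "alice" },
  { ID: "user_001", DataType: "tweet_7210000001", DataValue: "7210000001", text: "hello" },
  { ID: "user_001", DataType: "followee_user_002", DataValue: "user_002" },
  { ID: "user_001", DataType: "timeline_7210000002", DataValue: "7210000002", text: "hi!" },
];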
Create query definition
Finally, write out the query conditions. The database access code will be implemented based on this.
Entity | UseCase | Parameters | Table / Index | Key Condition |
---|---|---|---|---|
Tweet | getTimelineByUserId | { UserId } | Primary Key | Query (ID=UserId AND begins_with(DataType, timeline)) |
User | getUserByUserName | { Username } | GSI-1 | Query (DataValue=Username AND DataType=userProfile) |
Follow | getFolloweesByUserId | { UserId } | Primary Key | Query (ID=UserId AND begins_with(DataType, followee)) |
Follow | getFollowersByUserId | { UserId } | Primary Key | Query (ID=UserId AND begins_with(DataType, follower)) |
Follow | getCountFolloweeByUserId | { UserId } | Primary Key | Select COUNT / Query (ID=UserId AND begins_with(DataType, followee)) |
Follow | getCountFollowerByUserId | { UserId } | Primary Key | Select COUNT / Query (ID=UserId AND begins_with(DataType, follower)) |
Tweet | getTweetsByUserId | { UserId } | Primary Key | Query (ID=UserId AND begins_with(DataType, tweet)) |
Tweet | getTweetByTweetId | { TweetId } | GSI-1 | Query (DataValue=TweetId AND begins_with(DataType, tweet)) |
Comment | getCommentsByTweetId | { TweetId } | GSI-1 | Query (DataValue=TweetId AND begins_with(DataType, comment)) |
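For example, getTweetsByUserId from the table above could be implemented with the DocumentClient roughly as follows. The table name is a placeholder, and attribute-name placeholders are used to stay clear of DynamoDB reserved words.

import AWS from "aws-sdk";

const documentClient = new AWS.DynamoDB.DocumentClient();

// Sketch of getTweetsByUserId from the query definition above:
// Query (ID=UserId AND begins_with(DataType, tweet)), newest first.
async function getTweetsByUserId(userId: string) {
  const result = await documentClient
    .query({
      TableName: "SnsTable", // placeholder table name
      KeyConditionExpression: "#id = :id AND begins_with(#type, :prefix)",
      ExpressionAttributeNames: { "#id": "ID", "#type": "DataType" },
      ExpressionAttributeValues: { ":id": userId, ":prefix": "tweet" },
      ScanIndexForward: false, // descending sort-key order = newest tweets first
    })
    .promise();
  return result.Items ?? [];
}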
Design API Server
Software Design
The design is based on Domain-Driven Design (DDD). Directory names correspond to the DDD layers.
Directory Name | DDD Layer | Components |
---|---|---|
src/domain | Domain Layer | Entity / Value Object / Repository Interface |
src/application | Application Layer | Application Service / Serializer |
src/infrastructure | Infrastructure Layer | Repository / AWS Config |
src/presentation | Presentation Layer | API Server |
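As a simplified sketch of how these layers relate (not the actual code in the repository), the domain layer can define a repository interface that the infrastructure layer implements with DynamoDB:

// src/domain/tweetRepository.ts (simplified, hypothetical sketch)
export interface Tweet {
  tweetId: string;
  userId: string;
  text: string;
  postDate: string;
}

// The repository *interface* belongs to the domain layer...
export interface TweetRepository {
  findByUserId(userId: string): Promise<Tweet[]>;
  save(tweet: Tweet): Promise<void>;
}

// ...while a DynamoDB-backed class implementing TweetRepository lives in
// src/infrastructure, so the domain and application layers never depend
// on AWS directly.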
ID generation method
IDs must sort in the same order as record creation time. A numbering (sequence) table could handle this, but it does not scale. Instead, we use Snowflake, a scalable ID generation method.
This method splits a bit string into three parts, and the ID is the decimal value of that bit string.
Part | Description |
---|---|
Epoch time | The number of seconds elapsed since a chosen reference time. |
Sequence | Counts up each time an ID is generated and is reset to zero every second. |
Node number | A number assigned to each node. |
A Snowflake implementation in Node.js (TypeScript) looks like this:
import { config } from "@src/config";
import { dateToUnixTime } from "./time";

const workerIDBits = 10;
const sequenceBits = 12;

// Snowflake-style ID generator.
// See: https://blog.twitter.com/engineering/en_us/a/2010/announcing-snowflake
export class IdGenerator {
  private workerId: number;
  private lastGenerateAt: number;
  private sequence: number;

  constructor(workerId?: number) {
    // Fall back to the configured worker ID when none is passed in.
    this.workerId = workerId ?? config.snowflakeWorkerId;
    this.lastGenerateAt = dateToUnixTime(new Date());
    this.sequence = 0;
  }

  generate(): number {
    const now = dateToUnixTime(new Date());
    if (now === this.lastGenerateAt) {
      // Same second as the previous ID: count up the sequence.
      this.sequence++;
    } else {
      // A new second has started: reset the sequence.
      this.sequence = 0;
    }
    this.lastGenerateAt = now;
    // The bitwise operators ('<<' and '|') only handle numbers within
    // the range of a signed 32-bit integer, so build the ID with
    // multiplication and addition instead.
    return (
      now * 2 ** (workerIDBits + sequenceBits) +
      this.workerId * 2 ** sequenceBits +
      this.sequence
    );
  }
}
FAQ
Is the user's profile information duplicated?
Yes, it is duplicated. When a profile is updated, trigger a Lambda function via DynamoDB Streams to propagate the change asynchronously and keep the duplicated data consistent.
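A rough sketch of such a stream handler (types from the aws-lambda package; the item layout, table name, and where the profile is duplicated are assumptions for illustration):

import { DynamoDBStreamEvent } from "aws-lambda";
import AWS from "aws-sdk";

const documentClient = new AWS.DynamoDB.DocumentClient();

// Sketch only: when a userProfile item is modified, copy the new profile
// value onto the user's tweet items, where it is assumed to be duplicated.
export const handler = async (event: DynamoDBStreamEvent): Promise<void> => {
  for (const record of event.Records) {
    if (record.eventName !== "MODIFY" || !record.dynamodb?.NewImage) continue;

    const profile = AWS.DynamoDB.Converter.unmarshall(
      record.dynamodb.NewImage as AWS.DynamoDB.AttributeMap
    );
    if (profile.DataType !== "userProfile") continue;

    // Find the user's tweet items; a real handler would paginate and batch.
    const tweets = await documentClient
      .query({
        TableName: "SnsTable", // placeholder table name
        KeyConditionExpression: "#id = :id AND begins_with(#type, :prefix)",
        ExpressionAttributeNames: { "#id": "ID", "#type": "DataType" },
        ExpressionAttributeValues: { ":id": profile.ID, ":prefix": "tweet" },
      })
      .promise();

    // Overwrite the duplicated username on each tweet item.
    for (const tweet of tweets.Items ?? []) {
      await documentClient
        .update({
          TableName: "SnsTable",
          Key: { ID: tweet.ID, DataType: tweet.DataType },
          UpdateExpression: "SET username = :name",
          ExpressionAttributeValues: { ":name": profile.DataValue },
        })
        .promise();
    }
  }
};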
Isn't writing a tweet expensive when the user has many followers?
Yes, it is expensive. Only for users with very large follower counts do you need extra measures, such as skipping the write to followers' timelines and instead merging those users' tweets dynamically when the timeline is read.
What about caching?
We could. But it is not too late to add a cache after monitoring reveals an actual bottleneck.
Conclusion
In this article, I explained how to create a highly scalable API server. Keep in mind that excessive performance optimization can backfire when there is no actual performance problem.
The implementation is published on GitHub, so please take a look.