Ken Yip

Crafting a Scalable Node.js API: Insights from My RealWorld Project with Express, Knex, and AWS CDK

Introduction

Recently, I undertook a significant refactor of an example program I had previously built, introducing the repository design pattern. I also migrated the infrastructure and deployment setup from AWS SAM to AWS CDK. You can find the source code for the application here and the infrastructure code here. In this article, I will highlight the technologies and design patterns adopted in this repository, sharing insights into the structure and techniques that contribute to a robust and scalable application.

Project Structure

The project employs a monorepo pattern using Turborepo, which allows for efficient management of multiple services and shared resources.
Within this structure, the services are organized in the apps folder, while shared components and libraries reside in the packages folder. This organization allows the apps to depend on the packages, and the packages to depend on one another as needed. Turborepo handles dependency management and uses caching to skip redundant tasks when the underlying resources have not changed, which significantly speeds up development.
The core business logic is located in the core folder under packages. In this package, I implemented the repository design pattern, dividing the program into four distinct layers: Database, Repository, Service, and Provider (a minimal sketch of this layering follows the list below).

  • Database Layer: This layer manages data exchange between the database and the application, handling CRUD operations and database-specific logic.
  • Repository Layer: The repository layer acts as an intermediary, transforming inputs and outputs, implementing caching mechanisms, and ensuring that the application logic does not need to be tightly coupled with the database.
  • Service Layer: Here, the core business logic resides. This layer consists of multiple service classes that act as facades, receiving necessary dependencies and delegating tasks to the appropriate handlers located in the implementations folder.
  • Provider Layer: This layer handles interactions with third-party services, encapsulating their functionality and providing a clean interface for the rest of the application.
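
To make this concrete, here is a minimal sketch of how the layers compose, assuming Knex for data access. The class names, table, and methods are illustrative rather than taken from the repository:

```ts
import { Knex } from "knex";

// Database layer: owns the Knex connection and raw table access.
interface ArticleRow {
  id: number;
  title: string;
  body: string;
}

export class ArticleDatabase {
  constructor(private readonly knex: Knex) {}

  findById(id: number): Promise<ArticleRow | undefined> {
    return this.knex<ArticleRow>("articles").where({ id }).first();
  }
}

// Repository layer: maps rows to domain objects; caching could live here.
export class ArticleRepository {
  constructor(private readonly db: ArticleDatabase) {}

  async getArticle(id: number): Promise<{ id: number; title: string } | null> {
    const row = await this.db.findById(id);
    return row ? { id: row.id, title: row.title } : null;
  }
}

// Service layer: business logic, depending only on the repository.
export class ArticleService {
  constructor(private readonly articles: ArticleRepository) {}

  async getArticleOrThrow(id: number) {
    const article = await this.articles.getArticle(id);
    if (!article) {
      throw new Error(`Article ${id} not found`);
    }
    return article;
  }
}
```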

This architecture emphasizes separation of concerns: each layer depends only on the interface of the layer beneath it, so changes inside one layer do not ripple into the others. For example, if we decide to switch our database from RDS to MongoDB, only the database layer would require modifications.

By adhering to the principles of dependency injection, we can easily mock each layer during testing, enabling us to verify the functionality of other services in isolation. In fact, I have mocked all API requests and responses within the provider layer, which will be discussed further in the testing section.
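
As a rough illustration of that injection, a service that receives its repository through the constructor can be tested against a hand-rolled stub. The names follow the earlier sketch, the "@repo/core" package path is a placeholder, and Vitest is used only for concreteness; the repository may use a different runner:

```ts
import { describe, it, expect } from "vitest";
// Hypothetical entry point exposing the classes sketched above.
import { ArticleRepository, ArticleService } from "@repo/core";

// A hand-rolled stub satisfying the repository's shape; no database needed.
const stubRepository = {
  getArticle: async (id: number) =>
    id === 1 ? { id: 1, title: "Stubbed article" } : null
} as unknown as ArticleRepository;

describe("ArticleService", () => {
  it("returns the article when it exists", async () => {
    const service = new ArticleService(stubRepository);
    const article = await service.getArticleOrThrow(1);
    expect(article.title).toBe("Stubbed article");
  });

  it("throws when the article is missing", async () => {
    const service = new ArticleService(stubRepository);
    await expect(service.getArticleOrThrow(2)).rejects.toThrow();
  });
});
```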

In the apps folder, we establish the entry points for our services. This is where we extract request payloads from various endpoints, validate them, and process them using the core package as needed. The architecture is designed to ensure that even if we change the source of the payload (for instance, switching from SQS to RabbitMQ), we only need to modify the application layer, leaving the package layer untouched.
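
A hypothetical Express entry point illustrating that extract-validate-delegate flow might look like the following; the route, the zod schema, and the injected service shape are placeholders rather than the project's actual code:

```ts
import express from "express";
import { z } from "zod";

// Validation happens at the edge; the core package only ever sees typed input.
const createArticleSchema = z.object({
  title: z.string().min(1),
  body: z.string().min(1)
});

// The injected service shape is a stand-in for the real core service.
export function buildApp(articleService: {
  createArticle(input: { title: string; body: string }): Promise<{ id: number }>;
}) {
  const app = express();
  app.use(express.json());

  app.post("/api/articles", async (req, res) => {
    const parsed = createArticleSchema.safeParse(req.body);
    if (!parsed.success) {
      return res.status(422).json({ errors: parsed.error.flatten() });
    }
    const article = await articleService.createArticle(parsed.data);
    return res.status(201).json(article);
  });

  return app;
}
```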

API Deployment

This project is designed as a fully serverless application, with API endpoints deployed within AWS Lambda functions. To facilitate this, we utilized the @vendia/serverless-express plugin, which allows us to convert our Express application into a Lambda function seamlessly. Additionally, we integrated AWS API Gateway to handle incoming HTTP requests, providing a robust interface for our APIs.
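
Roughly speaking, the Lambda entry point only needs to wrap the Express app; a minimal sketch (with a trivial inline app standing in for the project's real one) looks like this:

```ts
import express from "express";
import serverlessExpress from "@vendia/serverless-express";

// A trivial app stands in for the project's real Express application here.
const app = express();
app.get("/health", (_req, res) => res.json({ ok: true }));

// serverlessExpress({ app }) returns a Lambda handler that translates
// API Gateway events into HTTP requests against the Express app and back.
export const handler = serverlessExpress({ app });
```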

Since we are using Cloudflare as our CDN provider, I configured the API Gateway as a regional resource. This setup allows us to create a CNAME record in Cloudflare that points to the API Gateway endpoint, ensuring efficient routing and caching of requests.
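
In CDK terms, a regional REST API with a custom domain can be declared roughly as follows; the domain name and certificate ARN are placeholders, and the resulting regional domain name is what the Cloudflare CNAME record would point to:

```ts
import { Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as apigateway from "aws-cdk-lib/aws-apigateway";
import * as acm from "aws-cdk-lib/aws-certificatemanager";

export class ApiStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // A regional (not edge-optimized) REST API, since Cloudflare sits in front.
    const api = new apigateway.RestApi(this, "Api", {
      endpointTypes: [apigateway.EndpointType.REGIONAL]
    });
    // Placeholder integration so the API has at least one method to deploy.
    api.root.addMethod("ANY", new apigateway.MockIntegration());

    // Custom domain backed by an ACM certificate (placeholder domain and ARN).
    const domain = new apigateway.DomainName(this, "Domain", {
      domainName: "api.example.com",
      certificate: acm.Certificate.fromCertificateArn(
        this,
        "Cert",
        "arn:aws:acm:us-east-1:123456789012:certificate/placeholder"
      ),
      endpointType: apigateway.EndpointType.REGIONAL
    });
    domain.addBasePathMapping(api);
    // The Cloudflare CNAME record targets domain.domainNameAliasDomainName.
  }
}
```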

One of the significant advantages of this architecture is the flexibility it offers. We can easily switch from a serverless deployment to a containerized solution if needed. The Dockerfile for this purpose is located in the apps/local folder. If you're interested in deploying the Docker container to AWS ECS with a load balancer and auto-scaling capabilities, you can refer to this repository for a comprehensive guide on the process: https://github.com/kenyipp/nextjs-cdk-example

By adopting this serverless architecture, we benefit from scalability, reduced operational overhead, and efficient resource management, allowing us to focus on developing features and improving the application.

Infrastructure Management

The infrastructure for this project is fully managed using AWS CDK, and I have created a dedicated repository specifically for managing it. This repository includes all necessary resources such as IAM roles, SQS queues, and CodePipeline configurations that support the application.
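
As an illustration of the kind of resources that repository declares (the construct names and values below are placeholders, not the actual definitions), a CDK stack with an SQS queue and a Lambda execution role might look like this:

```ts
import { Duration, Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as iam from "aws-cdk-lib/aws-iam";
import * as sqs from "aws-cdk-lib/aws-sqs";

export class AppInfraStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Queue consumed by the application (name and timeout are illustrative).
    const queue = new sqs.Queue(this, "TaskQueue", {
      visibilityTimeout: Duration.seconds(60)
    });

    // Execution role the API Lambda functions can assume.
    const lambdaRole = new iam.Role(this, "ApiLambdaRole", {
      assumedBy: new iam.ServicePrincipal("lambda.amazonaws.com"),
      managedPolicies: [
        iam.ManagedPolicy.fromAwsManagedPolicyName(
          "service-role/AWSLambdaBasicExecutionRole"
        )
      ]
    });
    queue.grantSendMessages(lambdaRole);
  }
}
```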

To streamline the deployment process, this setup includes two CI/CD pipelines:

  • Infrastructure CI/CD Pipeline: This pipeline is triggered by GitHub Actions and handles the deployment of the entire infrastructure stack. If the pipeline code is well-structured (e.g., with no circular dependencies between stacks) and has been thoroughly tested in lower environments, it can deploy to production seamlessly without requiring manual adjustments on a local machine. This pipeline updates the infrastructure components, ensuring consistency across all environments.
  • Lambda Deployment CI/CD Pipeline: This pipeline manages the deployment of the application code. It pulls the latest code from the application repository, builds the application, and uploads the resulting package to the Lambda function. This deployment pipeline is also triggered by GitHub Actions, allowing for a smooth and automated code deployment process.

Both CodePipeline processes are manually triggered through GitHub Actions. Each pipeline runs quality checks and test cases in GitHub Actions before triggering the corresponding AWS CodePipeline deployment for infrastructure or application code. This approach ensures that only tested and verified changes make it to production, providing a robust and automated deployment pipeline for the entire application stack.
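
One possible shape of that hand-off, sketched with the AWS SDK for JavaScript (the real workflow may use the AWS CLI instead, and the pipeline name below is a placeholder), is a small script the GitHub Actions job runs once its checks pass:

```ts
import {
  CodePipelineClient,
  StartPipelineExecutionCommand
} from "@aws-sdk/client-codepipeline";

// Kick off the AWS CodePipeline deployment once the GitHub Actions
// quality checks and test cases have passed.
async function triggerDeployment() {
  const client = new CodePipelineClient({ region: "us-east-1" });
  const result = await client.send(
    new StartPipelineExecutionCommand({ name: "realworld-api-deploy" })
  );
  console.log(`Started pipeline execution ${result.pipelineExecutionId}`);
}

triggerDeployment().catch((error) => {
  console.error(error);
  process.exit(1);
});
```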

Testing Strategies

This repository employs a comprehensive testing strategy, including unit tests, integration tests, and end-to-end tests, to ensure stability and reliability across all layers of the application.

  • Unit Tests: Unit tests are written for functions and core modules, targeting the smallest components in the application. By mocking dependencies within each layer - such as database, repository, and service layers - we can effectively test both happy paths and edge cases. This isolation allows us to focus on individual functions and modules, ensuring each component works as expected without dependencies interfering.
  • Integration Tests: Integration tests focus on validating the APIs in the apps layer. Using supertest in conjunction with an in-memory SQLite database, we test each API endpoint in a realistic environment (see the sketch after this list). This approach enables us to verify that modules interact correctly and that data flows as expected from one layer to another.
  • End-to-End Tests: For end-to-end testing, we set up a real MySQL database in GitHub Actions and launch the server to simulate actual deployment conditions. We use Newman (a tool for running Postman collections) to test the full user experience, from API requests to final responses. This allows us to validate that the application behaves as intended from a user's perspective.
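
A rough sketch of the integration-test setup mentioned above, assuming the hypothetical buildApp factory from earlier, an illustrative articles table, and the sqlite3 driver (the project's actual driver and paths may differ):

```ts
import knex from "knex";
import request from "supertest";
import { buildApp } from "../apps/api/src/app"; // hypothetical import path

async function runIntegrationTest() {
  // In-memory SQLite database so the test needs no external services.
  const db = knex({
    client: "sqlite3",
    connection: { filename: ":memory:" },
    useNullAsDefault: true
  });
  await db.schema.createTable("articles", (table) => {
    table.increments("id");
    table.string("title").notNullable();
    table.string("body").notNullable();
  });

  // Wire the hypothetical service shape from the earlier sketch against SQLite.
  const app = buildApp({
    createArticle: async (input) => {
      const [id] = await db("articles").insert(input);
      return { id };
    }
  });

  // Exercise the real route end to end through supertest.
  const response = await request(app)
    .post("/api/articles")
    .send({ title: "Hello", body: "World" })
    .expect(201);

  console.log("Created article", response.body);
  await db.destroy();
}

runIntegrationTest().catch((error) => {
  console.error(error);
  process.exit(1);
});
```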

Testing is a crucial part of our development workflow, supporting rapid iteration and refactoring. Well-written test cases allow us to confidently add features or make modifications without the risk of breaking other parts of the system. Good test coverage also provides a foundation for future refactoring, ensuring that as long as tests pass, the refactored code performs correctly.

In many cases, tight timelines may require us to prioritize feature rollouts, which can lead to sacrifices in code quality. However, I prioritize writing robust test cases first, as they serve as a safety net for future improvements. This practice enables us to refactor and enhance the codebase with confidence, knowing that as long as tests pass, the quality and functionality of the code remain intact.

Conclusion

Refactoring this repository has been a valuable learning experience, allowing me to apply new technologies and best practices along the way. I continuously explore and integrate improvements to keep the project up-to-date and efficient. If you find this repository or article helpful, consider giving it a star or clap - your support is appreciated! Thank you.
