Introduction
AWS RDS Data API is a service that lets you use relational databases without setting up a direct SQL connection. It is a useful option, especially for serverless applications, where connections are created inside short-lived Lambda containers.
With RDS Data API we don't need to open and close connections ourselves; AWS manages this process (AppSync uses the same mechanism inside its RDS resolvers).
Goal
I plan to build a simple HTTP API endpoint and add a Lambda function as an integration. The application code will be written in TypeScript. For the database, I will use an Aurora Serverless cluster.
The Lambda function will call the Aurora DB using the RDS Data API.
Project
Code is available in the GH repo
Infrastructure
Let's create the AWS resources. I am using AWS CDK to define the infrastructure:
//... imports
export class RdsDataNodejsStack extends cdk.Stack {
constructor(scope: Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);
const vpc = new cdk.aws_ec2.Vpc(this, "Vpc", {
maxAzs: 2,
});
const dbSecret = new secretsmanager.Secret(this, "Secret", {
generateSecretString: {
secretStringTemplate: JSON.stringify({ username: "master" }),
generateStringKey: "password",
excludePunctuation: true,
includeSpace: false,
},
});
const cluster = new rds.DatabaseCluster(this, "Database", {
engine: rds.DatabaseClusterEngine.auroraPostgres({
version: rds.AuroraPostgresEngineVersion.VER_16_4,
}),
writer: rds.ClusterInstance.serverlessV2("writerInstance"),
vpc,
credentials: rds.Credentials.fromSecret(dbSecret),
enableDataApi: true,
serverlessV2MaxCapacity: 6,
serverlessV2MinCapacity: 0.5,
defaultDatabaseName: "postgres",
});
const rdsAPIFunction = new nodeLambda.NodejsFunction(
this,
"RdsAPIFunction",
{
runtime: cdk.aws_lambda.Runtime.NODEJS_20_X,
entry: "lambda/handlers/getItem.ts", // Path to the Lambda function code
handler: "handler", // Exported handler function name
tracing: cdk.aws_lambda.Tracing.ACTIVE, // Enable X-Ray tracing
environment: {
DB_SECRET_ARN: dbSecret.secretArn,
DB_CLUSTER_ARN: cluster.clusterArn,
DB_NAME: "postgres",
POWERTOOLS_SERVICE_NAME: "getItemService",
},
bundling: {
minify: true,
sourceMap: true,
keepNames: true,
format: nodeLambda.OutputFormat.ESM,
sourcesContent: true,
mainFields: ["module", "main"],
externalModules: [], // we bundle all the dependencies
esbuildArgs: {
"--tree-shaking": "true",
},
// We include this polyfill to support `require` in ESM due to AWS X-Ray SDK for Node.js not being ESM compatible
banner:
'import { createRequire } from "module";const require = createRequire(import.meta.url);',
},
}
);
cluster.grantDataApiAccess(rdsAPIFunction);
dbSecret.grantRead(rdsAPIFunction);
const itemsIntegration = new HttpLambdaIntegration(
"ItemsIntegration",
rdsAPIFunction
);
const httpApi = new apigatewayv2.HttpApi(this, "ItemsApi");
httpApi.addRoutes({
path: "/items/{id}",
methods: [apigatewayv2.HttpMethod.GET],
integration: itemsIntegration,
});
}
}
Application code
Since we are not going to use a SQL connection directly, outside the handler we only need to initialize the SDK client for rds-data.
I use Lambda Powertools to create the tracer.
import {
RDSDataClient,
ExecuteStatementCommand,
ExecuteStatementCommandInput,
} from "@aws-sdk/client-rds-data";
import { APIGatewayProxyEventV2, APIGatewayProxyResultV2 } from "aws-lambda";
import { Tracer } from "@aws-lambda-powertools/tracer";
import { captureLambdaHandler } from "@aws-lambda-powertools/tracer/middleware";
import middy from "@middy/core";
const dbClusterArn = process.env.DB_CLUSTER_ARN;
const secretArn = process.env.DB_SECRET_ARN;
const databaseName = process.env.DB_NAME;
const TABLE = "items";
type Item = {
id: number;
name: string;
description: string;
price: number;
image: string;
};
const tracer = new Tracer({ serviceName: "getItemFunction" });
const rdsClient = tracer.captureAWSv3Client(new RDSDataClient());
//...
In the handler, I get the id from the path parameter and use it as a parameter for the SQL SELECT query.
RDS Data API has some limitations compared to a traditional SQL connection; however, it fits perfectly in a scenario with simple queries, a limited size of returned data, and the AWS Lambda context.
I don't use hardcoded DB credentials from Secrets Manager; instead, I send the secret's ARN with the rds-data call.
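Data API parameters are typed explicitly (longValue, stringValue, doubleValue, and so on) rather than inferred. As a small sketch of my own (not part of the article's repo; the types are simplified local stand-ins for the SDK's), a helper can pick the right typed field from a plain JavaScript value:

```typescript
// Simplified local stand-ins for the Field/SqlParameter shapes from
// @aws-sdk/client-rds-data, so the snippet is self-contained.
type Field = {
  longValue?: number;
  doubleValue?: number;
  stringValue?: string;
  booleanValue?: boolean;
  isNull?: boolean;
};
type SqlParameter = { name: string; value: Field };

// Pick the right typed field based on the JavaScript value.
function toSqlParameter(
  name: string,
  value: number | string | boolean | null
): SqlParameter {
  if (value === null) return { name, value: { isNull: true } };
  switch (typeof value) {
    case "number":
      // The Data API distinguishes integers from doubles.
      return Number.isInteger(value)
        ? { name, value: { longValue: value } }
        : { name, value: { doubleValue: value } };
    case "boolean":
      return { name, value: { booleanValue: value } };
    default:
      return { name, value: { stringValue: value } };
  }
}

console.log(toSqlParameter("id", 42)); // → { name: 'id', value: { longValue: 42 } }
console.log(toSqlParameter("price", 9.99)); // → { name: 'price', value: { doubleValue: 9.99 } }
```

With a helper like this, the parameters array below could be written as [toSqlParameter("id", Number(id))].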
// ...
export const lambdaHandler = async (
request: APIGatewayProxyEventV2
): Promise<APIGatewayProxyResultV2> => {
try {
const id = request.pathParameters?.id;
console.log(`id: ${id}`);
if (!id) {
return {
statusCode: 400,
body: JSON.stringify({ error: "Missing 'id' parameter" }),
};
}
const sql = `SELECT * FROM ${TABLE} WHERE id = :id`;
const parameters = [{ name: "id", value: { longValue: Number(id) } }];
const params: ExecuteStatementCommandInput = {
secretArn: secretArn,
resourceArn: dbClusterArn,
sql: sql,
database: databaseName,
parameters: parameters,
};
const command = new ExecuteStatementCommand(params);
const response = await rdsClient.send(command);
const items: Item[] = (response.records || []).map((record) => ({
id: record[0].longValue as number,
name: record[1].stringValue as string,
description: record[2].stringValue as string,
price: record[3].doubleValue as number,
image: record[4].stringValue as string,
}));
return {
statusCode: 200,
body: JSON.stringify(items),
};
} catch (error) {
console.error("Error executing query:", error);
return {
statusCode: 500,
body: "error",
};
}
};
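The positional mapping in the handler above relies on the column order of SELECT *. As an alternative sketch (the helper, types, and sample data below are my own, not from the article's repo), records can be mapped by column name instead: setting includeResultMetadata: true on the ExecuteStatementCommand makes the Data API return columnMetadata alongside the records:

```typescript
// Simplified local stand-ins for the Data API response types
// (the real ones live in @aws-sdk/client-rds-data).
type Field = {
  longValue?: number;
  doubleValue?: number;
  stringValue?: string;
  booleanValue?: boolean;
  isNull?: boolean;
};
type ColumnMetadata = { name?: string };

// Build plain objects from Data API records using column names,
// so the mapping no longer depends on SELECT column order.
function recordsToObjects(
  columns: ColumnMetadata[],
  records: Field[][]
): Record<string, unknown>[] {
  return records.map((record) =>
    Object.fromEntries(
      record.map((field, i): [string, unknown] => [
        columns[i]?.name ?? `col${i}`,
        field.isNull
          ? null
          : field.longValue ??
            field.doubleValue ??
            field.stringValue ??
            field.booleanValue ??
            null,
      ])
    )
  );
}

// Sample data shaped like a response for the items table.
const columns: ColumnMetadata[] = [{ name: "id" }, { name: "name" }, { name: "price" }];
const records: Field[][] = [
  [{ longValue: 1 }, { stringValue: "Widget" }, { doubleValue: 9.99 }],
];
console.log(recordsToObjects(columns, records));
// → [ { id: 1, name: 'Widget', price: 9.99 } ]
```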
Finally, I use middy to wrap the handler with the tracer:
//...
// Wrap the handler with middy
export const handler = middy(lambdaHandler)
// Use the middleware by passing the Tracer instance as a parameter
.use(captureLambdaHandler(tracer));
Testing
For testing purposes, I seed some data into the DB (just a few rows for the items table).
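The seed can also go through the Data API. Below is a minimal sketch of my own (the rows and column values are hypothetical, not the actual seed data): it builds one INSERT statement plus the parameterSets array that a BatchExecuteStatementCommand from @aws-sdk/client-rds-data would take; the AWS call itself is left as a comment so the snippet stays self-contained.

```typescript
// Field/parameter shapes mirrored locally so the snippet is self-contained;
// the real types come from @aws-sdk/client-rds-data.
type Field = { longValue?: number; doubleValue?: number; stringValue?: string };
type SqlParameter = { name: string; value: Field };

// Hypothetical seed rows for the items table (illustration only).
const seedItems = [
  { id: 1, name: "Keyboard", description: "Mechanical keyboard", price: 79.9, image: "keyboard.png" },
  { id: 2, name: "Mouse", description: "Wireless mouse", price: 29.9, image: "mouse.png" },
  { id: 3, name: "Monitor", description: "27-inch monitor", price: 199.0, image: "monitor.png" },
  { id: 4, name: "Headset", description: "Noise-cancelling headset", price: 99.5, image: "headset.png" },
];

// One INSERT statement, executed once per parameter set.
const sql =
  "INSERT INTO items (id, name, description, price, image) " +
  "VALUES (:id, :name, :description, :price, :image)";

// Shape expected by BatchExecuteStatementCommand's parameterSets.
const parameterSets: SqlParameter[][] = seedItems.map((item) => [
  { name: "id", value: { longValue: item.id } },
  { name: "name", value: { stringValue: item.name } },
  { name: "description", value: { stringValue: item.description } },
  { name: "price", value: { doubleValue: item.price } },
  { name: "image", value: { stringValue: item.image } },
]);

// In a real seed script this would be sent with something like:
// await rdsClient.send(new BatchExecuteStatementCommand({
//   resourceArn: dbClusterArn, secretArn, database: databaseName, sql, parameterSets,
// }));
console.log(`${parameterSets.length} rows prepared`); // → 4 rows prepared
```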
After deployment, the endpoint works for a single request.
But what is really interesting is how this solution would handle traffic spikes. The expectation is that all pieces (DB cluster, Lambda functions, and SQL connections via RDS Data API) would be able to scale up under some pressure.
I've created a simple scenario with k6:
import http from 'k6/http';
import { check, sleep } from 'k6';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';
export let options = {
stages: [
{ duration: '1m', target: 100 },
{ duration: '1m', target: 500 },
{ duration: '2m', target: 500 },
{ duration: '1m', target: 0 },
],
};
export default function () {
// Randomly generate an ID between 1 and 4 for each request
const id = randomIntBetween(1, 4);
// Make a GET request to the endpoint with the random ID
const res = http.get(`https://dcqnlv17sa.execute-api.us-east-1.amazonaws.com/items/${id}`);
// Check if the response status is 200
check(res, { 'status was 200': (r) => r.status === 200 });
// Optional: sleep for a short period between requests
sleep(1);
}
I will gradually increase the traffic from 0 to 100 users in a minute, and to 500 users in the next minute.
Then for 2 minutes, the traffic will remain at the same level, and eventually decrease to 0 users during the last minute.
Each user will send one request and sleep for 1 second before sending the next one.
It is quite a basic scenario but should be enough to put our endpoint under some pressure.
Results
During testing, around 78.5k requests were sent. All were successful.
The p95 request duration was ~310 ms, which is basically the time needed for networking (the endpoint is deployed in us-east-1 and I call it from Europe).
The max time is above 2 seconds, which is OK-ish, at least for the nodejs runtime. (If I need fast cold starts, I go for Rust or Go.)
From the Lambda perspective, I can confirm that there were no errors. At the peak, there were ~280 concurrent function executions.
Average function durations, as expected, were in most cases below 100 ms.
Let's check how our DB cluster reacted to the traffic spike.
It scaled up to 6 ACU, which is the maximum defined in our stack.
RDS Data API opened 123 connections. Interestingly, this is significantly less than the maximum number of concurrently running Lambda functions (280).
Summary
RDS Data API looks like a valid option to consider when building a serverless application that gets data from the AWS RDS service. It helps solve the potential issues of managing open connections and simplifies connecting AWS Lambda to the DB.
The function doesn't need to be inside a VPC, and it doesn't directly use the password and username stored in Secrets Manager.
It is worth remembering that RDS Data API has a cost associated with it. It also adds some overhead to getting data from the DB and might not be suitable for applications that require real-time data.