The new response stream invocation mode in Lambda gives us another great tool. With its help, we can return payloads larger than the limit from our Lambda functions.
1. The problem
Everything works well for Bob in his application, which consists of an API Gateway and some Lambda functions.
One day, a response to an endpoint request doesn't come, and Bob sees the following error in the logs:
```
LAMBDA_RUNTIME Failed to post handler success response. Http response code: 413.
```
It turned out that Lambda throws this error when the response size exceeds the maximum limit. API Gateway invokes the Lambda integration synchronously, so the response payload cannot exceed 6 MB. But the returned object's or array's size might exceed the limit. Some examples can include array responses from a database or a large object from S3 that the function needs to return to the client.
Either way, we need a solution!
2. Some options
We can solve this problem in multiple ways depending on our use case. Some options are the following:
- Classic pagination.
- The function compresses the response, and the client unzips it.
- The backend uploads the large object to S3, returns the object key, and the client downloads it.
- Replace Lambda with a different technology, e.g., ECS containers.
- Use Lambda stream response.
All options have advantages and disadvantages and are suitable for specific use cases.
3. Using streams in Lambda function response
Node.js streams are a great way to return response objects that are larger than the payload limit from a Lambda function. This way, the response won't be one single object. Instead, the client will receive it in multiple pieces.
Currently, we must configure a function URL that is a dedicated endpoint for the Lambda function if we want to return streamed responses.
The example shows logic that downloads an object (~ 8 MB in size) from S3, but the principle will be the same for responses from a database or 3rd party call.
Let's see how it works.
3.1. Function URL
First, we configure a URL for the function. We can control access to the endpoint with `AWS_IAM` authorization or choose to make it public.
Everything in AWS is about permissions, so don't forget to add a statement like this to the function's resource-based policy (not the execution role):
```json
{
  "Effect": "Allow",
  "Principal": "*",
  "Action": "lambda:InvokeFunctionUrl",
  "Resource": "arn:aws:lambda:eu-central-1:123456789012:function:FUNCTION_NAME",
  "Condition": {
    "StringEquals": {
      "lambda:FunctionUrlAuthType": "AWS_IAM"
    }
  }
}
```
If we make the endpoint public, we should replace `AWS_IAM` with `NONE` in the `Condition` block.
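For reference, the function URL and the permission can also be set up from the AWS CLI. This is a sketch that assumes a function called `FUNCTION_NAME`:

```shell
# Create a function URL with IAM authorization and streamed responses
aws lambda create-function-url-config \
  --function-name FUNCTION_NAME \
  --auth-type AWS_IAM \
  --invoke-mode RESPONSE_STREAM

# Add the resource-based policy statement that allows invoking the URL
aws lambda add-permission \
  --function-name FUNCTION_NAME \
  --statement-id AllowInvokeFunctionUrl \
  --action lambda:InvokeFunctionUrl \
  --principal '*' \
  --function-url-auth-type AWS_IAM
```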
3.2. Code
Enabling streamed responses requires some changes in the function's handler code. Lambda provides the global `awslambda` object, which contains the `streamifyResponse` wrapper method. We must add the original handler function as an argument to this method.
By wrapping the handler in `streamifyResponse`, a new `responseStream` argument becomes available as the second parameter, and the `context` object becomes the third input to the handler.
The sample code can look like this:
```javascript
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { pipeline } from 'stream/promises';
import { Readable } from 'stream';

const client = new S3Client();
const { BUCKET_NAME } = process.env;

async function streamHandler(event, responseStream, context) {
  const commonMetadata = {
    headers: {
      'x-custom-header': 'some value',
    },
  };

  const input = {
    Bucket: BUCKET_NAME,
    Key: 'large-response.json',
  };
  const command = new GetObjectCommand(input);

  let responseReadableStream;
  let metadata;

  try {
    const s3Response = await client.send(command);
    // (1) S3 response body is a readable stream
    responseReadableStream = s3Response.Body;
    metadata = {
      statusCode: 200,
      ...commonMetadata,
    };
    console.log('SUCCESS!');
  } catch (error) {
    // (2) Create a readable stream from the error object
    responseReadableStream = Readable.from(Buffer.from(JSON.stringify(error)));
    metadata = {
      // (3) Get the status code from the error object
      statusCode: error.$metadata.httpStatusCode,
      ...commonMetadata,
    };
    console.error(error.message);
  }

  // (4) Add metadata, e.g., custom headers to the response
  responseStream = awslambda.HttpResponseStream.from(responseStream, metadata);

  // (5) A promisified version of the pipeline method is also available
  await pipeline(responseReadableStream, responseStream);
}

// (6) Wrap the handler
export const lambdaHandler = awslambda.streamifyResponse(streamHandler);
```
Let's discuss some key points.
Using `pipeline`
The Node.js `stream` module provides the `pipeline` method, which automatically handles errors, deals with backpressure, and properly ends the stream. It takes streams as parameters and pipes each stream's output into the next one. It's a good idea to use `pipeline` because we'll have shorter code and less work to do.
We can use the promisified version of `pipeline` from `stream/promises` to avoid working with callbacks (5).
The `responseStream` argument
`responseStream` from `awslambda` is a writable stream, and we can add metadata like the status code or custom headers to the response. We will also add the actual return value to `responseStream`.
First, we add the metadata to the response object with the `HttpResponseStream.from` method (4). Next, we add the downloaded large object to the response using `pipeline` (5).
The `Body` property of the S3 response object is a readable stream (1), so we can pass it to `pipeline` as is.
Wrapping the handler
The last step is to wrap the `streamHandler` function in `streamifyResponse` (6).
If we now invoke the function URL in Postman, we'll see that the first bytes of the response arrive very quickly, and the entire object is available after a few seconds. The good news is that we can now return payloads of up to 20 MB, which is a soft limit.
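If we'd rather test from the command line, curl can sign the request with SigV4 and print the chunks as they arrive (the URL is a placeholder, and this assumes curl 7.75 or newer):

```shell
# --no-buffer prints the chunks as they arrive instead of waiting for the end
curl --no-buffer \
  --aws-sigv4 "aws:amz:eu-central-1:lambda" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  https://abcdefgh12345678.lambda-url.eu-central-1.on.aws/
```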
Error handling
It's still a good idea to wrap the S3, database, or 3rd party API calls inside a `try/catch` block to handle any errors.
Inside `catch`, we can create a readable stream from the `error` object (2) and get the error status code from it (3).
After we have made a stream from the error, we can use the same process described above to return it to the client.
3.3. With API Gateway
Bob's original problem includes an API Gateway. How can we incorporate it into the architecture without losing the ability to stream the Lambda response?
Reduced response size
API Gateway has a maximum response size of 10 MB, and this is a hard limit. So if we want to use an API Gateway, we might face the same size issue again. If it's not a cause for concern, the following solution can be a good workaround. Otherwise, we'll need to change the architecture and move the Lambda backend behind a bare function URL.
Using HTTP_PROXY integration
If we use an API Gateway, we have more flexibility in how we control access to the endpoint. It's a good idea to move authorization to the API Gateway.
We'll still need the function URL for the Lambda function, but we should configure an `HTTP_PROXY` integration instead of the classic `LAMBDA_PROXY`. It makes sense because the function's logic is now available via an HTTP endpoint (the function URL).
Protecting the function URL
We should also change the function URL's authorization type from `AWS_IAM` to `NONE`. But this way, we'll leave the endpoint unprotected, so we might want to implement something to filter out illegitimate requests.
A potential solution can be to check for the existence of a secret header:
```javascript
import { pipeline } from 'stream/promises';
import { Readable } from 'stream';

const { BUCKET_NAME, SECRET_HEADER } = process.env;

async function streamHandler(event, responseStream, context) {
  const commonMetadata = {
    headers: {
      'x-custom-header': 'some value',
    },
  };

  if (!(event.headers && event.headers['x-secret-header'] === SECRET_HEADER)) {
    const noHeaderReadableStream = Readable.from(
      Buffer.from(
        JSON.stringify({
          body: 'Unauthorized request',
        }),
      ),
    );
    const metadata = {
      statusCode: 403,
      ...commonMetadata,
    };
    responseStream = awslambda.HttpResponseStream.from(responseStream, metadata);
    await pipeline(noHeaderReadableStream, responseStream);
    console.error('Unauthorized request');
    return;
  }

  // code continues
}
```
If the value of the `x-secret-header` header is correct, we'll let the function complete its logic. If not, we can use the already discussed way to stream a custom `403` response.
But if we do so, we must instruct API Gateway to add the header and its value to each integration request it makes to the function URL. We can easily achieve this by adding parameter mapping and configuring the value of `x-secret-header` as a static value.
It's easy to miss wrapping the value inside single quotes, so don't forget to do so.
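In infrastructure code, the mapping might look like this hypothetical CloudFormation fragment for an HTTP API integration (the resource names, URL, and header value are made up; note the single quotes inside the mapping value):

```yaml
MyFunctionUrlIntegration:
  Type: AWS::ApiGatewayV2::Integration
  Properties:
    ApiId: !Ref MyHttpApi
    IntegrationType: HTTP_PROXY
    IntegrationMethod: GET
    IntegrationUri: https://abcdefgh12345678.lambda-url.eu-central-1.on.aws/
    PayloadFormatVersion: '1.0'
    RequestParameters:
      # static values must be wrapped in single quotes
      append:header.x-secret-header: "'super-secret-value'"
```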
4. Summary
Lambda functions have a 6 MB response payload size limit for synchronous invocations. Sometimes we need to return responses that are larger than that.
We can implement stream responses to return large objects from our functions, but this feature is only available with function URLs.
5. Further reading
Comparing Lambda invocation modes - The three ways we can call a Lambda function
Getting started with Lambda - How to create a Lambda function
Creating and managing Lambda function URLs - How to create Lambda function URLs
Configuring a Lambda function to stream responses - Short official documentation on stream responses