DEV Community

Cover image for Query DynamoDB with SQL using Athena - Leveraging DynamoDB Exports to S3 (1/2)
Marko - ServerlessLife.com
Marko - ServerlessLife.com

Posted on • Edited on • Originally published at serverlesslife.com

Query DynamoDB with SQL using Athena - Leveraging DynamoDB Exports to S3 (1/2)

Export DynamoDB to S3 and query with Athena using SQL, unlocking powerful, scalable, and serverless data analytics

This is the first part of a two-part series covering how to copy data from DynamoDB to S3 and query it with Athena.

We will build two solutions using AWS CDK and TypeScript, covering both single-table and multiple-table designs.

Unlocking New Possibilities

The DynamoDB is a NoSQL database, and that fact comes with both the good and the bad. Among others, the good is that it is ultimately scalable; the bad is that it has limited query capabilities. You can squeeze out some additional query capabilities with a single-table approach, but it is still not comparable to any SQL database. However, for reporting, the SQL query language is extremely useful or almost essential.

One option is to copy data to an SQL database and execute queries there. However, if the database is large, this can be impractical or too expensive. A more scalable solution is to transfer data to S3 and query it with Athena using SQL, which we will explore in this article.

This approach is ideal for serverless solutions where the amount of data or cost makes an SQL database impractical. Athena is a powerful alternative and can be very cost-effective. At $5 per terabyte of queried data, it is significantly cheaper than provisioning an SQL instance capable of handling that volume. However, if you're running many queries, Athena may become expensive. Athena is best for creating offline reports. It's usually too slow for direct synchronous responses to users.

There are several ways to copy data from DynamoDB to S3, but the most useful and serverless-friendly are:

  • Using native full or incremental exports from DynamoDB to S3
  • Using EventBridge Pipes and Firehose for a near real-time copy of the data

This article focuses on the first approach. The second approach will be covered in Part 2. We will look at both solutions in the context of single-table and multiple-table designs.

...continue reading

Top comments (0)