DEV Community

Cover image for Example how to trigger a Dynamodb export and create an Athena saved query with CDK
Johannes Konings for AWS Community Builders

Posted on • Originally published at johanneskonings.dev on

Example how to trigger a Dynamodb export and create an Athena saved query with CDK

In this post is described how to get the data to analyze the changes in the dynamodb data. This post describes how to (semi) automate the export of the dynamodb table data and analyze it with Athena. This post describes how you can do that manually.

One approach is with a lambda and another approach is with step functions. Both approaches implement the steps for triggering the export to a S3 bucket, create an athena table for that exported data and prepare a namend query for analyzing.

The data for this example looks like this.

ddb export ddb data

With lambda

This lambda triggers the export with via the sdk and create or update a named query.

The query creates the athena table. The export id will be set by the lambda by replacing the “s3location” with something like s3://<<bucket name>>/ddb-exports/AWSDynamoDB/<<ddb-export-id>>/data/.

CREATE EXTERNAL TABLE ddb_exported_table (
 Item struct<pk:struct<S:string>,
             person:struct<M:struct<
                jobArea:struct<S:string>,
                firstname:struct<S:string>,
                gender:struct<S:string>,
                jobType:struct<S:string>,
                jobDescriptor:struct<S:string>,
                lastname:struct<S:string>
                >>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3Location'
TBLPROPERTIES ( 'has_encrypted_data'='true');

Enter fullscreen mode Exit fullscreen mode

https://github.com/JohannesKonings/test-aws-dynamodb-athena-cdk/blob/main/cdk/lib/ddb-export/readTable.sql

SELECT 
item.pk.S as pk,
item.person.M.firstname.S as firstname,
item.person.M.lastname.S as lastname,
item.person.M.jobArea.S as jobArea,
item.person.M.gender.S as gender, 
item.person.M.jobType.S as jobType, 
item.person.M.jobDescriptor.S as jobDescriptor
FROM "db_name"."table_name";

Enter fullscreen mode Exit fullscreen mode

After you started the lambda you have to wait until the export is finished. Then you can run the query for creating the athena table. The lambda has already deleted the old table. After that you can use the prepared query for analyzing.

A more orchestrated approach is with step function. That’s better for waiting for the results :)

With step functions

This are the steps, which are orchestrated by the step function.

ddb export sfn

It’s definend here

The step function could be startet with the default values.

ddb export sfn start 1

ddb export sfn start 2

It takes some minutes to complete.

ddb export sfn run

The “recent queries” section list the steps for dropping the old table and create the new one.

ddb export sfn athena recent queries

After it’s finished you can choose the saved query with the name sfn-ddb-export-read-table. It can be used to query all the data from the dynamodb table and could be adapted to more “complex” queries.

ddb export sfn athena query

Code

GitHub logo JohannesKonings / examples-aws-dynamodb-analytics-cdk

Example how to Analyze DynamoDB data with Athena in different ways created with AWS CDK

examples-aws-dynamodb-analytics-cdk

diff over all: npm run cdk -- diff deploye over all: npm run cdk -- deploy

scenarios

ddb-kinesis-athena: DynamoDb -> DynamoDb Streams -> Kinesis Data Streams -> Data Firehose -> S3 -> Glue Crawler -> Athena

diff: npm run cdk:ddb-kinesis-athena -- diff deploy: npm run cdk:ddb-kinesis-athena -- deploy

ddb-kinesis-athena architecture

ddb-kinesis-athena: DynamoDb -> EventBridge Pipe (DynamoDb Streams -> Data Firehose) -> S3 -> Glue Crawler -> Athena

diff: npm run cdk:ddb-pipe-athena -- diff deploy: npm run cdk:ddb-pipe-athena -- deploy

ddb-pipe-athena architecture

archive

overview

deploy options

The config file controls the deplyoement options.

Firehose

The formats JSON and Parquet can be choosen

export const config: Config = {
    ...
    kinesisFormat: 'JSON',
    ...
}
Enter fullscreen mode Exit fullscreen mode
export const config: Config = {
    ...
    kinesisFormat: 'PARQUET',
    ...
}
Enter fullscreen mode Exit fullscreen mode

Quicksight

export const config: Config = {
    ...
    isQuicksight: true,
    ...
}
Enter fullscreen mode Exit fullscreen mode

cd cdk

QUICKSIGHT_USERNAME=<<Quicksight user name>> npx cdk deploy

desription

see…




(https://github.com/JohannesKonings/test-aws-dynamodb-athena-cdk)

Top comments (0)