If you're looking for a solution to handle file management of S3 objects, but listing a bucket doesn't work for you, maybe this approach will.
Problem and solution:
You want to log S3 object metadata and expose it through an API backed by superfast DynamoDB. I've chosen to write this simple example without using any Lambda functions. We'll be using the built-in Step Functions integration with DynamoDB. So no code at all.
In this first step of building that kind of solution, I'll describe how to log newly uploaded files to a DynamoDB table in an event-driven fashion. In the next step, I'll describe how to expose that data through a Lambda-backed API.
What will we build:
An event-driven architecture deployed with AWS SAM.
Resources used (might incur costs):
Serverless StateMachine
EventBridge Rule
S3 Bucket
DynamoDB table
Requirements:
An AWS Account obviously
AWS SAM CLI installed and configured
Let's start! You can either run the console command to initialize a SAM app using one of the Quick Start Templates or choose the Custom Template that we'll write here:
sam init
I've chosen the first option and provided Python as my runtime, even though I won't need any code in this example. Then just clear the template.yaml and remove the existing Lambda code, or else get started with a clean slate.
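If you'd rather skip the interactive prompts, a roughly equivalent non-interactive invocation looks like this (the project name is just an example, and the runtime is whatever you picked):

# Non-interactive init; --name is an arbitrary example project name
sam init --name s3-file-management --runtime python3.9 --app-template hello-world --no-interactive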
We'll start by deploying our bucket and setting up notifications to be sent to EventBridge:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Event-driven file management

Resources:
  FilesBucket:
    Type: AWS::S3::Bucket
    Properties:
      NotificationConfiguration:
        EventBridgeConfiguration:
          EventBridgeEnabled: true
Step into your App folder and run:
sam deploy --guided
Now that our bucket is deployed with the EventBridge configuration, all S3 event notifications will be sent to the default EventBridge bus.
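To sanity-check this before moving on, you can inspect the bucket's notification configuration and push a test object through it (replace the placeholder with the bucket name CloudFormation generated):

# Confirm EventBridge notifications are enabled on the bucket
aws s3api get-bucket-notification-configuration --bucket <your-files-bucket-name>
# Upload a test file; this emits an 'Object Created' event to the default bus
aws s3 cp test.txt s3://<your-files-bucket-name>/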
In the next step, we will deploy a DynamoDB table. We'll key it on the file name alone and store the last-modified timestamp as a plain attribute; DynamoDB doesn't allow updating key attributes, and our state machine will be updating lastModified in place.
  FilesTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: fileName
          AttributeType: S
      KeySchema:
        - AttributeName: fileName
          KeyType: HASH
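Once the state machine we build below starts writing to this table, a logged item will look roughly like this (the key and timestamp are illustrative):

{
  "fileName": { "S": "invoices/2023/march.pdf" },
  "lastModified": { "S": "2023-03-01T12:00:00Z" }
}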
Now let’s write the definition of our StateMachine in ASL (Amazon States Language) that we’ll be deploying when creating the StateMachine resource.
{
  "Comment": "Triggered on EventBridge S3 Object Created notification",
  "StartAt": "Query",
  "States": {
    "Query": {
      "Type": "Task",
      "Next": "ObjectExists",
      "Parameters": {
        "TableName": "${TableName}",
        "KeyConditionExpression": "fileName = :fileName",
        "ExpressionAttributeValues": {
          ":fileName": {
            "S.$": "$.detail.object.key"
          }
        }
      },
      "Resource": "arn:aws:states:::aws-sdk:dynamodb:query",
      "ResultPath": "$.dynamoDbResult"
    },
    "ObjectExists": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.dynamoDbResult.Count",
          "NumericGreaterThan": 0,
          "Next": "UpdateObject"
        }
      ],
      "Default": "PutObject"
    },
    "UpdateObject": {
      "Type": "Task",
      "Resource": "arn:aws:states:::dynamodb:updateItem",
      "Parameters": {
        "TableName": "${TableName}",
        "Key": {
          "fileName": {
            "S.$": "$.detail.object.key"
          }
        },
        "UpdateExpression": "SET lastModified = :lastModified",
        "ExpressionAttributeValues": {
          ":lastModified": {
            "S.$": "$.time"
          }
        }
      },
      "Next": "ObjectUpdated"
    },
    "PutObject": {
      "Type": "Task",
      "Resource": "arn:aws:states:::dynamodb:putItem",
      "Parameters": {
        "TableName": "${TableName}",
        "Item": {
          "fileName": {
            "S.$": "$.detail.object.key"
          },
          "lastModified": {
            "S.$": "$.time"
          }
        }
      },
      "Next": "ObjectAdded"
    },
    "ObjectUpdated": {
      "Type": "Succeed"
    },
    "ObjectAdded": {
      "Type": "Succeed"
    }
  }
}
The visual workflow looks like this: Query → ObjectExists → PutObject or UpdateObject → Succeed.
The state machine is triggered by the S3 'Object Created' notification. The 'Query' step checks whether the file already exists in the DynamoDB table, and the 'ObjectExists' Choice state inspects the query output. If no item matches the newly created object's key, the 'PutObject' operation runs; if one already exists, 'UpdateObject' refreshes the 'lastModified' attribute with the event time.
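For reference, here is an abridged example of the 'Object Created' event EventBridge delivers (names, sizes, and timestamps are illustrative); the state machine reads $.detail.object.key and $.time from it:

{
  "version": "0",
  "detail-type": "Object Created",
  "source": "aws.s3",
  "time": "2023-03-01T12:00:00Z",
  "detail": {
    "bucket": { "name": "my-files-bucket" },
    "object": { "key": "invoices/2023/march.pdf", "size": 1024 },
    "reason": "PutObject"
  }
}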
Now let's create our StateMachine resource in the SAM template and point the DefinitionUri property at the JSON file above (mine is at 'definitions/object_created.asl.json'):
  ObjectCreatedStateMachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: definitions/object_created.asl.json
      Policies:
        - CloudWatchLogsFullAccess
        - DynamoDBCrudPolicy:
            TableName: !Ref FilesTable
      Type: STANDARD
      DefinitionSubstitutions:
        TableName: !Ref FilesTable
      Events:
        EBPutRule:
          Type: EventBridgeRule
          Properties:
            Pattern:
              source:
                - aws.s3
              detail-type:
                - Object Created
              detail:
                bucket:
                  name:
                    - !Ref FilesBucket
The above snippet creates the StateMachine for us, along with an IAM role that has permissions to perform CRUD operations on the DynamoDB table. It even creates the EventBridge rule with a pattern that triggers the StateMachine on S3 'Object Created' events in the FilesBucket.
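After another sam deploy, you can exercise the whole flow from the CLI (angle brackets mark placeholders for your deployed resource names):

# First upload of a key takes the PutObject branch
aws s3 cp hello.txt s3://<your-files-bucket-name>/
# A second upload of the same key should take the UpdateObject branch
aws s3 cp hello.txt s3://<your-files-bucket-name>/
# Check the logged metadata
aws dynamodb scan --table-name <your-files-table-name>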
This is just the beginning of a potentially powerful event-driven architecture. Thanks to the power of EventBridge and the flexibility of its rules and patterns, you can filter events granularly and route them to the corresponding processes; a sketch of a more selective pattern follows below.
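For example, a pattern like this one (the uploads/ prefix is hypothetical) would use EventBridge content filtering to match only objects created under a given key prefix:

Pattern:
  source:
    - aws.s3
  detail-type:
    - Object Created
  detail:
    bucket:
      name:
        - !Ref FilesBucket
    object:
      # hypothetical prefix filter; matches keys starting with uploads/
      key:
        - prefix: uploads/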
This simple app would definitely need some error handling (a minimal sketch follows below), and maybe integration with Lambda functions to get more tailored data out of the events and into DynamoDB. In another part, I'll describe how one might integrate the data stored in the DynamoDB table with a serverless HTTP API.
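As a starting point, each Task state in the ASL definition could carry a Retry block along these lines (the interval, attempt count, and backoff values are just reasonable defaults, not prescriptive):

"Retry": [
  {
    "ErrorEquals": ["States.TaskFailed"],
    "IntervalSeconds": 2,
    "MaxAttempts": 3,
    "BackoffRate": 2.0
  }
]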
Project's repo: