Why Replication is Important
S3 replication is a key feature of the AWS storage service. It lets us set up a disaster recovery strategy, reduce latency for users in other regions, or keep a subset of production data available in lower environments.
It's a very common use case, so it's worth knowing how to implement it quickly.
Let's get our hands dirty 🧑🏻‍💻
Code template
Steps:
- Copy the template.
- Deploy the resources using CloudFormation.
- Validate by placing an object in the source bucket.
- If desired, remove the BucketName property to avoid name collisions.
- Reap the benefits of learning something new and use it wisely.
- Delete the stack.
AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Resources:
  # Role that S3 assumes to read from the source bucket and write to the replica
  ReplicationRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: 'Allow'
            Principal:
              Service: 's3.amazonaws.com'
            Action: 'sts:AssumeRole'
      Policies:
        - PolicyName: 'ReplicationPolicy'
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              # Read permissions used on the source side
              - Effect: 'Allow'
                Action:
                  - 's3:GetBucketVersioning'
                  - 's3:ListBucket'
                  - s3:GetReplicationConfiguration
                  - s3:GetObjectVersionForReplication
                  - s3:GetObjectVersionAcl
                  - s3:GetObjectVersionTagging
                  - s3:GetObjectRetention
                  - s3:GetObjectLegalHold
                Resource: '*'
              # Write permissions used on the replica side
              - Effect: 'Allow'
                Action:
                  - 's3:ReplicateObject'
                  - 's3:ReplicateDelete'
                  - 's3:ReplicateTags'
                  - 's3:GetObjectVersionTagging'
                  - 's3:ObjectOwnerOverrideToBucketOwner'
                Resource: '*'
  BucketSource:
    Type: 'AWS::S3::Bucket'
    Properties:
      # Replication requires versioning on both buckets
      VersioningConfiguration:
        Status: 'Enabled'
      ReplicationConfiguration:
        Role: !GetAtt ReplicationRole.Arn
        Rules:
          - Destination:
              Bucket: !GetAtt BucketReplica.Arn
            # An empty prefix matches every object in the bucket
            Prefix: ''
            Status: 'Enabled'
      BucketName: 'aws-community-builders-source'
  BucketReplica:
    Type: 'AWS::S3::Bucket'
    Properties:
      VersioningConfiguration:
        Status: 'Enabled'
      BucketName: 'aws-community-builders-replica'
  # Allows the replication role to write into the replica bucket
  BucketReplicaPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref BucketReplica
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Sid: "Object Level Permissions"
            Effect: "Allow"
            Principal:
              AWS: !GetAtt ReplicationRole.Arn
            Action:
              - "s3:ReplicateObject"
              - "s3:ReplicateDelete"
            Resource: !Sub "arn:aws:s3:::${BucketReplica}/*"
          - Sid: "Bucket Level Permissions"
            Effect: "Allow"
            Principal:
              AWS: !GetAtt ReplicationRole.Arn
            Action:
              - "s3:List*"
              - "s3:GetBucketVersioning"
              - "s3:PutBucketVersioning"
            Resource: !Sub "arn:aws:s3:::${BucketReplica}"
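If you prefer to script the deployment rather than use the console, the step could look roughly like the sketch below. It assumes boto3 is installed and credentials are configured; the function name `deploy_stack` and the stack name are hypothetical, not part of the original walkthrough. The two capabilities are passed because the template creates an IAM role and declares a Transform.

```python
from typing import Any


def deploy_stack(cfn: Any, stack_name: str, template_body: str) -> str:
    """Create the stack and block until CloudFormation reports completion.

    `cfn` is expected to behave like boto3.client("cloudformation").
    """
    response = cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        # CAPABILITY_IAM: the template creates an IAM role.
        # CAPABILITY_AUTO_EXPAND: the template declares a Transform.
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_AUTO_EXPAND"],
    )
    # The waiter polls the stack until creation succeeds, and raises otherwise.
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return response["StackId"]


# Example usage (needs AWS credentials; names are illustrative only):
#   import boto3
#   with open("crr-template.yml", encoding="utf-8") as handle:
#       deploy_stack(boto3.client("cloudformation"), "s3-replication-demo", handle.read())
```

Taking the client as a parameter keeps the function easy to exercise with a stub before pointing it at a real account.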
Validate deployment
Put some objects in the source bucket.
You will see the replication rules under the bucket's Management tab.
It takes some time to replicate the object, depending on the size and regions used. However, after a few seconds or minutes, you will see the object in the replica bucket.
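Instead of refreshing the console, you could poll the replication status that S3 reports on the source object. The following is a minimal sketch, assuming a boto3 S3 client; `wait_for_replication` and its parameters are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable

# Status values treated as "done" on the source object.
DONE_STATUSES = {"COMPLETED", "COMPLETE"}


def wait_for_replication(
    s3: Any,
    bucket: str,
    key: str,
    attempts: int = 30,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll head_object on the source object and return the last status seen.

    `s3` is expected to behave like boto3.client("s3").
    """
    status = "UNKNOWN"
    for _ in range(attempts):
        head = s3.head_object(Bucket=bucket, Key=key)
        # ReplicationStatus is typically missing when no rule covers the object.
        status = head.get("ReplicationStatus", "UNKNOWN")
        if status in DONE_STATUSES or status == "FAILED":
            return status
        sleep(delay_seconds)
    return status


# Example usage (needs AWS credentials; the key is illustrative only):
#   import boto3
#   s3 = boto3.client("s3")
#   s3.put_object(Bucket="aws-community-builders-source", Key="hello.txt", Body=b"hi")
#   print(wait_for_replication(s3, "aws-community-builders-source", "hello.txt"))
```

A status that stays at PENDING for a long time, or turns to FAILED, usually points at the role or the replica bucket policy.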
Delete the stack
Make sure you empty both buckets before deleting the stack; otherwise the deletion will fail. Because versioning is enabled, that includes every object version and delete marker.
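Emptying a versioned bucket means removing every object version and every delete marker, not just the current objects. Here is one way that could be scripted, assuming a boto3 S3 client; `empty_versioned_bucket` is a hypothetical helper name.

```python
from typing import Any


def empty_versioned_bucket(s3: Any, bucket: str) -> int:
    """Submit deletes for all versions and delete markers; return how many were submitted.

    `s3` is expected to behave like boto3.client("s3").
    """
    submitted = 0
    paginator = s3.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket):
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        targets = [{"Key": e["Key"], "VersionId": e["VersionId"]} for e in entries]
        # delete_objects accepts at most 1000 keys per request.
        for start in range(0, len(targets), 1000):
            batch = targets[start:start + 1000]
            s3.delete_objects(Bucket=bucket, Delete={"Objects": batch, "Quiet": True})
            submitted += len(batch)
    return submitted


# Example usage (needs AWS credentials; this permanently deletes data):
#   import boto3
#   s3 = boto3.client("s3")
#   for name in ("aws-community-builders-source", "aws-community-builders-replica"):
#       empty_versioned_bucket(s3, name)
```

Run it against both buckets, then delete the stack.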
Explanation
We created two buckets (in the same region, for simplicity), but the replica could just as well live in another region or even another account. The idea remains the same: we need to replicate our data.
We have a role with the necessary privileges to read and replicate the data.
We added the required policy on the replica bucket to allow the role to write the replicated objects.
Notes:
- We can be more granular in the privileges and policies, for example by scoping each statement's Resource to the specific bucket instead of '*'.
- We can use S3 Batch Operations to replicate data that is already present in the bucket; batch operations will leverage the replication and role configurations in place.
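To illustrate the first note, here is a sketch of what a bucket-scoped version of the role policy could look like, expressed as a Python dictionary that can be dumped to JSON. It reuses the actions from the template and splits them by the resource they apply to. The function name is hypothetical, and the `arn:aws` partition is an assumption that does not hold for GovCloud or China regions.

```python
import json


def build_replication_policy(source_bucket: str, replica_bucket: str) -> dict:
    """Return an IAM policy document scoped to the two buckets instead of '*'."""
    source_arn = f"arn:aws:s3:::{source_bucket}"
    replica_arn = f"arn:aws:s3:::{replica_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level reads on the source bucket
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketVersioning",
                    "s3:ListBucket",
                    "s3:GetReplicationConfiguration",
                ],
                "Resource": source_arn,
            },
            {
                # Object-level reads on the source objects
                "Effect": "Allow",
                "Action": [
                    "s3:GetObjectVersionForReplication",
                    "s3:GetObjectVersionAcl",
                    "s3:GetObjectVersionTagging",
                    "s3:GetObjectRetention",
                    "s3:GetObjectLegalHold",
                ],
                "Resource": f"{source_arn}/*",
            },
            {
                # Object-level writes on the replica objects
                "Effect": "Allow",
                "Action": [
                    "s3:ReplicateObject",
                    "s3:ReplicateDelete",
                    "s3:ReplicateTags",
                    "s3:ObjectOwnerOverrideToBucketOwner",
                ],
                "Resource": f"{replica_arn}/*",
            },
        ],
    }


policy = build_replication_policy(
    "aws-community-builders-source", "aws-community-builders-replica"
)
print(json.dumps(policy, indent=2))
```

The same split carries over to the template: each statement's Resource would point at the bucket ARN or at the bucket ARN followed by `/*`.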
Additional Tools
I always try to automate and validate many things before pushing to main, and tools like this can save you a lot of time.
https://github.com/aws-cloudformation/cfn-lint
You can use cfn-lint to check the CloudFormation template "crr-template.yml" for syntax errors, best practices, and potential issues.
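As a rough idea of how this could be wired into a pre-push check, the sketch below shells out to the `cfn-lint` command and treats a non-zero exit code as a failure. It assumes cfn-lint is installed and on the PATH; `lint_template` is a hypothetical wrapper, not part of cfn-lint itself.

```python
import subprocess
from typing import Any, Callable


def lint_template(
    template_path: str,
    run: Callable[..., Any] = subprocess.run,
) -> bool:
    """Return True when cfn-lint exits with status 0 for the given template."""
    result = run(["cfn-lint", template_path], capture_output=True, text=True)
    if result.returncode != 0:
        # Show whatever cfn-lint reported so the problem can be fixed.
        print(result.stdout or result.stderr)
    return result.returncode == 0


# Example usage (needs cfn-lint installed, e.g. via pip):
#   if not lint_template("crr-template.yml"):
#       raise SystemExit("Template has lint findings; fix them before pushing.")
```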
Infrastructure as Code
Infrastructure as Code (IaC) is crucial for automating and managing IT infrastructure efficiently, enabling version control, repeatability, and collaboration in a consistent and scalable manner.
Conclusion
Cross-Region Replication, or even Cross-Account Replication, is a useful feature for keeping an application highly available, providing redundancy and fault tolerance.
A replication configuration is also the foundation for batch operations, which are a fast way to move terabytes of existing data.
S3 remains one of the most useful, cheapest, and powerful services of AWS.
Happy coding! 🎉
If you enjoyed the articles, visit my blog at jorgetovar.dev.