Prashant Bhatasana for AWS Community Builders

Posted on Sep 13, 2024

AWS Database Migration Service — Incremental Migration from RDS To S3

#aws #rds #s3 #migration

Originally Posted on medium.com

Migrating data from relational databases to cloud-based data lakes is a common task in modern data architectures. Amazon Web Services (AWS) offers a robust solution called AWS Database Migration Service (DMS) that simplifies this process. This service is particularly useful for incremental migrations, allowing you to replicate ongoing changes in your source databases to a target destination like Amazon S3.

In this article, we’ll set up an incremental migration from an Amazon RDS database to an S3 bucket using Terraform.

Why Use AWS DMS?

AWS DMS provides several advantages:

Minimal Downtime: Supports continuous data replication, which is essential for reducing downtime during migrations.
Versatility: Supports various source and target database engines, making it flexible for different use cases.
Ease of Use: Simple to set up and monitor, with a pay-as-you-go pricing model.

Architecture Overview

The migration setup involves:

Source Database: An Amazon RDS instance (e.g., MySQL, PostgreSQL).
Target Data Store: An Amazon S3 bucket where the data will be stored.
AWS DMS: The service used to perform the migration, including replication instances, endpoints, and tasks.

Setting Up the Environment with Terraform

Terraform is an Infrastructure as Code (IaC) tool that allows you to define and provision infrastructure using a high-level configuration language. Here’s how to set up the environment:

Step 1: Define Provider and Variables

First, set up your provider and variables in a main.tf file:

provider "aws" {
  region = "us-west-2"
}

variable "rds_instance_id" {
  description = "The RDS instance identifier"
  type        = string
}

variable "s3_bucket_name" {
  description = "The name of the S3 bucket"
  type        = string
}
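
Pinning the provider version keeps the resource arguments used in the later steps stable from one run to the next. The block below is a minimal sketch; both version constraints are assumptions for illustration, not values from the original setup:

```hcl
terraform {
  required_version = ">= 1.3" # assumed minimum; adjust to your installed Terraform

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; pin to the version you have tested
    }
  }
}
```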

Step 2: Create the S3 Bucket

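resource "aws_s3_bucket" "target_bucket" {
  # New buckets are private by default, so no ACL is set here.
  bucket = var.s3_bucket_name
}

# AWS provider v4 and later configure versioning and encryption
# through separate resources rather than inline blocks.
resource "aws_s3_bucket_versioning" "target_bucket" {
  bucket = aws_s3_bucket.target_bucket.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "target_bucket" {
  bucket = aws_s3_bucket.target_bucket.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}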
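
Since the bucket is meant to stay private, it may also be worth blocking public access explicitly. This is an optional sketch that was not part of the original configuration; the resource label target_bucket simply mirrors the bucket defined above:

```hcl
# Optional hardening: reject public ACLs and public bucket policies
resource "aws_s3_bucket_public_access_block" "target_bucket" {
  bucket = aws_s3_bucket.target_bucket.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```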

Step 3: Define IAM Role for DMS

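AWS DMS needs an IAM role that it can assume to write to the S3 bucket. Access to the RDS instance itself goes through the database credentials on the source endpoint, not through IAM, so the policy only needs S3 permissions scoped to the target bucket:

resource "aws_iam_role" "dms_role" {
  name = "dms-access-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Action    = "sts:AssumeRole",
      Effect    = "Allow",
      Principal = {
        Service = "dms.amazonaws.com"
      }
    }]
  })
}

resource "aws_iam_policy" "dms_policy" {
  name   = "dms-access-policy"
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        # Object-level access, limited to the target bucket
        Action   = ["s3:PutObject", "s3:DeleteObject", "s3:PutObjectTagging"],
        Effect   = "Allow",
        Resource = "${aws_s3_bucket.target_bucket.arn}/*"
      },
      {
        # Bucket-level access needed to list the target bucket
        Action   = ["s3:ListBucket"],
        Effect   = "Allow",
        Resource = aws_s3_bucket.target_bucket.arn
      }
    ]
  })
}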

resource "aws_iam_role_policy_attachment" "dms_attach_policy" {
  role       = aws_iam_role.dms_role.name
  policy_arn = aws_iam_policy.dms_policy.arn
}

Step 4: Define DMS Endpoints

Create the source and target endpoints for DMS:

resource "aws_dms_endpoint" "source_endpoint" {
  endpoint_id   = "rds-source-endpoint"
  endpoint_type = "source"
  engine_name   = "mysql"
  # Placeholder values: in practice, pass credentials in through sensitive
  # variables or a secrets manager instead of hard-coding them.
  username      = "your-db-username"
  password      = "your-db-password"
  server_name   = "your-db-endpoint"
  port          = 3306
  database_name = "your-database-name"
}

resource "aws_dms_endpoint" "target_endpoint" {
  endpoint_id   = "s3-target-endpoint"
  endpoint_type = "target"
  engine_name   = "s3"

  s3_settings {
    # Role that DMS assumes to write to the bucket (defined in Step 3)
    service_access_role_arn = aws_iam_role.dms_role.arn
    bucket_name             = aws_s3_bucket.target_bucket.bucket
    bucket_folder           = "dms-data"
    compression_type        = "GZIP"
  }
}
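
The rds_instance_id variable from Step 1 can be used to look up the connection details rather than typing them in. The sketch below is one possible variant of the source endpoint and would replace the hard-coded version, not sit alongside it. The variables db_username, db_password, and db_name are hypothetical names introduced here for illustration:

```hcl
# Hypothetical variables, not part of the original configuration
variable "db_username" {
  type      = string
  sensitive = true
}

variable "db_password" {
  type      = string
  sensitive = true
}

variable "db_name" {
  type = string
}

# Look up the existing RDS instance by the identifier from Step 1
data "aws_db_instance" "source" {
  db_instance_identifier = var.rds_instance_id
}

resource "aws_dms_endpoint" "source_endpoint" {
  endpoint_id   = "rds-source-endpoint"
  endpoint_type = "source"
  engine_name   = "mysql"
  server_name   = data.aws_db_instance.source.address
  port          = data.aws_db_instance.source.port
  database_name = var.db_name
  username      = var.db_username
  password      = var.db_password
}
```

Marking the credentials as sensitive keeps them out of plan output, although they are still written to the Terraform state, so the state file needs to be protected as well.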

Step 5: Create a DMS Replication Instance

The replication instance handles the migration process:

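resource "aws_dms_replication_instance" "replication_instance" {
  replication_instance_id    = "dms-replication-instance"
  replication_instance_class = "dms.t3.micro" # current-generation burstable class
  allocated_storage          = 100
  # Exposes the instance to the internet; set to false when the source
  # database and S3 are both reachable from inside the VPC.
  publicly_accessible        = true
  apply_immediately          = true
}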
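
When a replication instance is created through the API or Terraform rather than the console, DMS expects an IAM role named dms-vpc-role to already exist in the account. If your account does not have one yet, a sketch along these lines may help; the Terraform resource labels are arbitrary, but the role name and the managed policy ARN are the ones DMS looks for:

```hcl
data "aws_iam_policy_document" "dms_assume_role" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["dms.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "dms_vpc_role" {
  name               = "dms-vpc-role" # DMS looks for this exact name
  assume_role_policy = data.aws_iam_policy_document.dms_assume_role.json
}

resource "aws_iam_role_policy_attachment" "dms_vpc_role" {
  role       = aws_iam_role.dms_vpc_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole"
}
```

If you use this, adding depends_on = [aws_iam_role_policy_attachment.dms_vpc_role] to the replication instance makes Terraform create the role first. Skip the block entirely if the role already exists, since a second role with the same name would fail to create.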

Step 6: Create a DMS Replication Task

Define the task that will perform the migration:

resource "aws_dms_replication_task" "replication_task" {
  replication_task_id          = "rds-to-s3-task"
  migration_type               = "full-load-and-cdc"
  table_mappings               = file("table-mappings.json")
  replication_task_settings    = file("task-settings.json")
  source_endpoint_arn          = aws_dms_endpoint.source_endpoint.endpoint_arn
  target_endpoint_arn          = aws_dms_endpoint.target_endpoint.endpoint_arn
  replication_instance_arn     = aws_dms_replication_instance.replication_instance.replication_instance_arn
}

The table-mappings.json file defines which tables to migrate, and the task-settings.json file contains task-specific settings like logging and error handling.
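
As an alternative to separate JSON files, the same documents can be built inline with jsonencode. The following is a minimal sketch rather than the author's actual files: it assumes you want every table in every schema and only basic logging, and the rule name is made up for illustration:

```hcl
locals {
  # Selection rule that includes every table in every schema;
  # narrow "schema-name" to your database name in practice.
  table_mappings = jsonencode({
    rules = [
      {
        "rule-type"   = "selection"
        "rule-id"     = "1"
        "rule-name"   = "include-all-tables" # hypothetical name
        "rule-action" = "include"
        "object-locator" = {
          "schema-name" = "%"
          "table-name"  = "%"
        }
      }
    ]
  })

  # Minimal task settings: turn on CloudWatch logging for the task
  replication_task_settings = jsonencode({
    Logging = {
      EnableLogging = true
    }
  })
}
```

With these locals in place, the task could use table_mappings = local.table_mappings and replication_task_settings = local.replication_task_settings in place of the file() calls. Task logging typically also requires the dms-cloudwatch-logs-role IAM role to exist in the account.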

Configuring the Incremental Migration

To enable incremental migration (also known as Change Data Capture or CDC), the DMS task must be configured to continuously capture changes from the source database after the initial full load. This is controlled by the migration_type parameter, set to "full-load-and-cdc" in the DMS replication task.

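Ensure that the source database is configured to support CDC. For MySQL on RDS, this means enabling automated backups (which turns on binary logging) and setting binlog_format to ROW. For PostgreSQL, it means enabling logical replication. Other engines have similar mechanisms.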
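
For a MySQL source, the binary log settings can be managed in Terraform through a parameter group. This is a sketch under stated assumptions: the group name is hypothetical, and the family assumes a MySQL 8.0 instance, so adjust it to match your engine version:

```hcl
resource "aws_db_parameter_group" "cdc" {
  name   = "mysql-cdc-params" # hypothetical name
  family = "mysql8.0"         # assumed engine family

  # Row-based logging is what DMS reads for change capture
  parameter {
    name  = "binlog_format"
    value = "ROW"
  }

  # Log complete row images so every column is available to DMS
  parameter {
    name  = "binlog_row_image"
    value = "full"
  }
}
```

The group takes effect only once it is attached to the RDS instance through its parameter_group_name argument. It is also worth extending binary log retention on the instance, for example with call mysql.rds_set_configuration('binlog retention hours', 24); so that DMS can catch up after a pause.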

Monitoring and Managing the Migration

AWS DMS provides several metrics and logs that you can use to monitor the migration process. You can access these metrics in the AWS Management Console or set up CloudWatch alarms to notify you of any issues.
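
As an illustration, a CloudWatch alarm on target-side CDC latency could look like the sketch below. The alarm name, threshold, and the dms_task_resource_id variable are assumptions introduced here. In particular, check in the CloudWatch console which value DMS reports in the ReplicationTaskIdentifier dimension for your task before relying on it:

```hcl
variable "dms_task_resource_id" {
  description = "Hypothetical: the value DMS reports in the ReplicationTaskIdentifier dimension"
  type        = string
}

resource "aws_cloudwatch_metric_alarm" "cdc_latency" {
  alarm_name          = "dms-cdc-target-latency" # hypothetical name
  namespace           = "AWS/DMS"
  metric_name         = "CDCLatencyTarget"
  statistic           = "Average"
  period              = 300 # seconds per evaluation window
  evaluation_periods  = 3
  comparison_operator = "GreaterThanThreshold"
  threshold           = 600 # seconds of latency; assumed threshold

  dimensions = {
    ReplicationInstanceIdentifier = aws_dms_replication_instance.replication_instance.replication_instance_id
    ReplicationTaskIdentifier     = var.dms_task_resource_id
  }
}
```

As written, the alarm only changes state. To get notified, add an alarm_actions list that points at an SNS topic ARN.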

Thank you for reading! If you have anything to add, please send a response or add a note.

Happy migrating! 🚀
