Recently, I had the opportunity to build a CI/CD workflow for ECS on Fargate using GitHub Actions. In this process, I incorporated various improvements regarding operational maintenance and security. Here's a summary of the key points.
Workflow Overview
Here is the complete CI/CD workflow for ECS on Fargate. This workflow includes steps for checking out code, logging into the ECR repository, building and pushing Docker images, running security scans, and deploying to ECS. For simplicity, I've omitted steps for tests and linting, but it's recommended to include them in your actual workflow.
name: ECS Fargate CI/CD
on:
  push:
    branches: [main, develop]
    paths:
      - "backend/**"
  workflow_dispatch:
permissions:
  id-token: write
  contents: read
jobs:
  build-and-push:
    if: github.ref == 'refs/heads/develop' || github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Set ECR repository URI based on branch
        run: |
          if [[ "${{ github.ref }}" == "refs/heads/develop" ]]; then
            echo "REPOSITORY_URI=************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-dev" >> $GITHUB_ENV
          else
            echo "REPOSITORY_URI=************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-prod" >> $GITHUB_ENV
          fi
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-region: ap-northeast-1
          role-to-assume: ${{ secrets.AWS_IAM_ROLE_TO_ASSUME }}
          role-session-name: GitHubActions
          role-duration-seconds: 3600
      - name: Login to ECR
        uses: aws-actions/amazon-ecr-login@v2
      - name: Build Docker Image
        run: docker build -t ${{ env.REPOSITORY_URI }}:${{ github.sha }} -f ./backend/Dockerfile.ecs ./backend
      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
          format: "table"
          severity: "CRITICAL,HIGH"
          exit-code: 1
      - name: Check Docker best practices with Dockle
        uses: erzz/dockle-action@v1
        with:
          image: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
          failure-threshold: fatal
          exit-code: 1
      - name: Push to ECR
        if: success()
        run: docker push ${{ env.REPOSITORY_URI }}:${{ github.sha }}
      - name: Notify Slack on success
        if: success()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
          SLACK_COLOR: "#36A64F"
          SLACK_MESSAGE: "Security scans have completed successfully. All checks passed."
          SLACK_TITLE: "Security Scan Completed"
      - name: Notify Slack on failure
        if: failure()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
          SLACK_COLOR: "danger"
          SLACK_MESSAGE: "A critical error has occurred in the build or security scan process. Please check the GitHub Actions logs for more details."
          SLACK_TITLE: "Build or Security Scan Failed"
  deploy:
    if: github.ref == 'refs/heads/develop' || github.ref == 'refs/heads/main'
    needs: build-and-push
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-region: ap-northeast-1
          role-to-assume: ${{ secrets.AWS_IAM_ROLE_TO_ASSUME }}
          role-session-name: GitHubActions
          role-duration-seconds: 3600
      - name: Download ecspresso
        uses: kayac/ecspresso@v2
        with:
          version: v2.3.3
      - name: Setup environment
        run: |
          echo "IMAGE_TAG=${{ github.sha }}" >> $GITHUB_ENV
          if [[ ${{ github.ref }} == 'refs/heads/develop' ]]; then
            echo "working_directory=./backend/ecspresso/dev" >> $GITHUB_ENV
            echo "ENV=${{ secrets.ENV_DEV }}" >> $GITHUB_ENV
            echo "COGNITO_USER_POOL_ID=${{ secrets.COGNITO_USER_POOL_ID_DEV }}" >> $GITHUB_ENV
            echo "S3_URL=${{ secrets.S3_URL_DEV }}" >> $GITHUB_ENV
            echo "SLACK_MENTIONS=" >> $GITHUB_ENV
            echo "SLACK_TITLE_PREFIX=Develop" >> $GITHUB_ENV
          else
            echo "working_directory=./backend/ecspresso/prod" >> $GITHUB_ENV
            echo "ENV=${{ secrets.ENV_PROD }}" >> $GITHUB_ENV
            echo "COGNITO_USER_POOL_ID=${{ secrets.COGNITO_USER_POOL_ID_PROD }}" >> $GITHUB_ENV
            echo "S3_URL=${{ secrets.S3_URL_PROD }}" >> $GITHUB_ENV
            echo "SLACK_MENTIONS=<@***********>" >> $GITHUB_ENV
            echo "SLACK_TITLE_PREFIX=Production" >> $GITHUB_ENV
          fi
      - name: Deploy to ECS service
        run: ecspresso deploy --config ecspresso.yml
        working-directory: ${{ env.working_directory }}
        env:
          ENV: ${{ env.ENV }}
          COGNITO_USER_POOL_ID: ${{ env.COGNITO_USER_POOL_ID }}
          S3_URL: ${{ env.S3_URL }}
          IMAGE_TAG: ${{ env.IMAGE_TAG }}
      - name: Set Slack message and title on success
        if: success()
        run: |
          echo "SLACK_COLOR=good" >> $GITHUB_ENV
          echo "SLACK_TITLE_SUFFIX=(${{ github.ref_name }}) on ECS Fargate Deployment Success" >> $GITHUB_ENV
      - name: Set Slack message and title on failure
        if: failure()
        run: |
          echo "SLACK_COLOR=danger" >> $GITHUB_ENV
          echo "SLACK_TITLE_SUFFIX=(${{ github.ref_name }}) on ECS Fargate Deployment Failure" >> $GITHUB_ENV
      - name: Notify Slack about deployment status
        if: always()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
          SLACK_COLOR: ${{ env.SLACK_COLOR }}
          SLACK_MESSAGE: ${{ env.SLACK_MENTIONS }}
          SLACK_TITLE: ${{ env.SLACK_TITLE_PREFIX }} ${{ env.SLACK_TITLE_SUFFIX }}
Improvements for Operational Maintenance
Consolidating Workflows for All Environments
Using GitFlow, the workflow is triggered by PR merges to the develop and main branches (for simplicity, the sample code omits the staging branch). Previously, there were separate files for each environment, but I consolidated them into a single file for easier maintenance. The following code sets the ECR repository URI based on the branch.
- name: Set ECR repository URI based on branch
  run: |
    if [[ "${{ github.ref }}" == "refs/heads/develop" ]]; then
      echo "REPOSITORY_URI=************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-dev" >> $GITHUB_ENV
    else
      echo "REPOSITORY_URI=************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-prod" >> $GITHUB_ENV
    fi
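The job-level if already limits the workflow to develop and main, but the step on its own treats every ref other than develop as production. As a minimal sketch, the same mapping can be written as a shell function that rejects unexpected refs. The function name repository_for_ref and the account ID 123456789012 are placeholders for illustration and are not part of the original workflow.

```shell
#!/bin/sh
# Map a Git ref to an ECR repository URI; fail on refs we do not deploy from.
# The registry host uses a placeholder account ID (123456789012).
repository_for_ref() {
  registry="123456789012.dkr.ecr.ap-northeast-1.amazonaws.com"
  case "$1" in
    refs/heads/develop) echo "${registry}/ecr-dev" ;;
    refs/heads/main)    echo "${registry}/ecr-prod" ;;
    *)
      echo "unsupported ref: $1" >&2
      return 1
      ;;
  esac
}
```

In a workflow step, this could be called as echo "REPOSITORY_URI=$(repository_for_ref "$GITHUB_REF")" >> "$GITHUB_ENV", where GITHUB_REF is the ref that GitHub Actions exposes to each step.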
Using Commit Hashes for Container Image Tags
Instead of using the latest tag for ECR, I used commit hashes for tagging container images. This makes it easier to track which commit the image was built from.
- name: Push to ECR
  if: success()
  run: docker push ${{ env.REPOSITORY_URI }}:${{ github.sha }}
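To illustrate the idea, a small helper can compose the image reference and refuse anything that is not a full commit SHA, which guards against accidentally pushing a mutable tag such as latest. This is only a sketch: the helper name image_ref and the repository name in the usage example are hypothetical.

```shell
#!/bin/sh
# Compose "<repository>:<commit-sha>" and reject tags that are not a
# 40-character lowercase hex SHA (for example "latest").
image_ref() {
  repo="$1"
  sha="$2"
  if ! printf '%s' "$sha" | grep -Eq '^[0-9a-f]{40}$'; then
    echo "not a full commit SHA: $sha" >&2
    return 1
  fi
  echo "${repo}:${sha}"
}
```

Inside a step, image_ref "$REPOSITORY_URI" "$GITHUB_SHA" would print the same reference that the build and push steps use.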
Managing Task Definitions and Services with ecspresso
For managing ECS task definitions and services, I used ecspresso, a deployment tool built specifically for Amazon ECS. Initially, I considered using Terraform, but the frequent task definition updates and the diff management they require during deployments made it cumbersome. ecspresso simplifies this process and is widely used by major Japanese companies.
The following step deploys the task definition and service with ecspresso. The working-directory setting selects the directory that contains the environment's ecspresso.yml file, and the env block passes the environment variables needed for deployment.
- name: Deploy to ECS service
  run: ecspresso deploy --config ecspresso.yml
  working-directory: ${{ env.working_directory }}
  env:
    ENV: ${{ env.ENV }}
    COGNITO_USER_POOL_ID: ${{ env.COGNITO_USER_POOL_ID }}
    S3_URL: ${{ env.S3_URL }}
    IMAGE_TAG: ${{ env.IMAGE_TAG }}
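In the task definition file ecs-task-def.json, environment variables are loaded with must_env as follows. must_env makes ecspresso fail when a variable is not defined, so every variable referenced this way, including REGION, has to be present in the environment of the deploy step.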
{
  "containerDefinitions": [
    {
      "cpu": 256,
      "environment": [
        {
          "name": "TZ",
          "value": "Asia/Tokyo"
        },
        {
          "name": "ENV",
          "value": "{{ must_env `ENV` }}"
        },
        {
          "name": "COGNITO_USER_POOL_ID",
          "value": "{{ must_env `COGNITO_USER_POOL_ID` }}"
        },
        {
          "name": "S3_URL",
          "value": "{{ must_env `S3_URL` }}"
        },
        {
          "name": "REGION",
          "value": "{{ must_env `REGION` }}"
        }
      ],
      "essential": true,
      "image": "************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-dev:{{ must_env `IMAGE_TAG` }}",
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-create-group": "true",
          "awslogs-group": "/ecs/task-dev",
          "awslogs-region": "ap-northeast-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "memory": 512,
      "memoryReservation": 512,
      "name": "container",
      "portMappings": [
        {
          "appProtocol": "http",
          "containerPort": 8080,
          "hostPort": 8080,
          "name": "container-8080-tcp",
          "protocol": "tcp"
        }
      ]
    }
  ],
  "cpu": "256",
  "executionRoleArn": "arn:aws:iam::************:role/ecsTaskExecutionRole",
  "family": "task-dev",
  "ipcMode": "",
  "memory": "512",
  "networkMode": "awsvpc",
  "pidMode": "",
  "requiresCompatibilities": ["FARGATE"],
  "tags": [
    {
      "key": "Environment",
      "value": "dev"
    }
  ],
  "taskRoleArn": "arn:aws:iam::************:role/ecsTaskRole"
}
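Because must_env stops ecspresso with an error when a variable is undefined, it can help to check the variables before calling ecspresso deploy so that the failure message names exactly what is missing. The following is a minimal sketch of such a check. The function name require_env is hypothetical, and the variable list in the usage note is taken from the task definition above.

```shell
#!/bin/sh
# Fail early if any of the named environment variables is unset or empty.
# Variables must be exported, as they are under a step's `env:` block.
require_env() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A deploy step could then run require_env ENV COGNITO_USER_POOL_ID S3_URL REGION IMAGE_TAG && ecspresso deploy --config ecspresso.yml.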
Deployment Notifications
I used rtCamp/action-slack-notify for notifications of security scans and deployment completion.
rtCamp/action-slack-notify GitHub Repository
During production deployments, I set the Slack member IDs of the operators in an environment variable (echo "SLACK_MENTIONS=<@***********>" >> $GITHUB_ENV) so that they are mentioned in the notification. This helps the team quickly grasp the deployment status.
You can copy a member ID from the member's profile in Slack (open the profile's "More" menu and choose "Copy member ID").
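When several operators need to be mentioned, the <@MEMBER_ID> strings can be assembled by a small helper instead of being typed by hand. This is a sketch only: the function name slack_mentions is hypothetical, and the member IDs in the usage note are made up.

```shell
#!/bin/sh
# Turn a list of Slack member IDs into a mention string,
# for example "<@U0000000001> <@U0000000002>".
slack_mentions() {
  mentions=""
  for id in "$@"; do
    # Add a separating space only when something is already in the string.
    mentions="${mentions:+$mentions }<@${id}>"
  done
  echo "$mentions"
}
```

In the setup step, this could be used as echo "SLACK_MENTIONS=$(slack_mentions U0000000001 U0000000002)" >> "$GITHUB_ENV". Calling it without arguments yields an empty string, which matches the develop branch behavior.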
Security Improvements
Keyless AssumeRole with OpenID Connect
I used OpenID Connect to assume roles without credentials (access key, secret access key) when accessing ECR and other AWS resources from GitHub Actions. This significantly improves security by eliminating the need for long-lived static credentials.
For detailed implementation, please refer to my previous article.
Strengthening Security with IAM Roles and OpenID Connect in GitHub Actions Deploy Workflows
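One detail worth illustrating: with OpenID Connect, the IAM role's trust policy usually restricts who may assume the role by matching the sub claim of the GitHub token (condition key token.actions.githubusercontent.com:sub). For a push to a branch, that claim has the form repo:OWNER/REPO:ref:refs/heads/BRANCH. The sketch below composes the expected value. The function name oidc_subject and the repository name are hypothetical, and the trust policy of the role used in this article is not shown here.

```shell
#!/bin/sh
# Build the OIDC subject that GitHub issues for a push to a branch.
# $1: repository as OWNER/REPO, $2: branch name.
oidc_subject() {
  echo "repo:$1:ref:refs/heads/$2"
}
```

For example, oidc_subject example-org/example-repo main prints repo:example-org/example-repo:ref:refs/heads/main. Jobs that reference a GitHub Environment or run for pull requests receive a different subject format, so the trust policy would need to account for that.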
Vulnerability Scanning with Trivy and Dockle
After building the container image, I used Trivy and Dockle to perform security scans according to best practices.
Trivy
Trivy checks for known vulnerabilities in the container image, scanning OS packages and application libraries. The workflow fails if HIGH or CRITICAL vulnerabilities are detected.
Dockle
Dockle lints the built container image against Docker best practices. For example, it checks that the container does not run as root and that the image contains no unnecessary files or directories.
The following steps perform vulnerability scanning with Trivy and Dockle, pushing to ECR only if the scans are successful, and sending notifications to Slack.
- name: Scan image with Trivy
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
    format: "table"
    severity: "CRITICAL,HIGH"
    exit-code: 1
- name: Check Docker best practices with Dockle
  uses: erzz/dockle-action@v1
  with:
    image: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
    failure-threshold: fatal
    exit-code: 1
- name: Push to ECR
  if: success()
  run: docker push ${{ env.REPOSITORY_URI }}:${{ github.sha }}
- name: Notify Slack on success
  if: success()
  uses: rtCamp/action-slack-notify@v2
  env:
    SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
    SLACK_COLOR: "#36A64F"
    SLACK_MESSAGE: "Security scans have completed successfully. All checks passed."
    SLACK_TITLE: "Security Scan Completed"
- name: Notify Slack on failure
  if: failure()
  uses: rtCamp/action-slack-notify@v2
  env:
    SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
    SLACK_COLOR: "danger"
    SLACK_MESSAGE: "A critical error has occurred in the build or security scan process. Please check the GitHub Actions logs for more details."
    SLACK_TITLE: "Build or Security Scan Failed"
The Trivy and Dockle actions come from the aquasecurity/trivy-action and erzz/dockle-action repositories.
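To reproduce the same gate locally or outside the two actions, a rough command-line equivalent is sketched below. It assumes that the trivy, dockle, and docker commands are installed, and the function name scan_and_push is hypothetical. The thresholds mirror the workflow: HIGH and CRITICAL for Trivy, and the fatal level for Dockle.

```shell
#!/bin/sh
# Scan an image with Trivy and Dockle, and push it only if both pass.
scan_and_push() {
  image="$1"
  # Exit non-zero when HIGH or CRITICAL vulnerabilities are found.
  trivy image --severity CRITICAL,HIGH --exit-code 1 "$image" || return 1
  # Exit non-zero when Dockle reports findings at the fatal level.
  dockle --exit-code 1 --exit-level fatal "$image" || return 1
  docker push "$image"
}
```

Because each scanner returns a non-zero status on findings, the push is skipped as soon as one of them fails, which is the same ordering the workflow steps rely on.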
Multi-Stage Builds
To ensure that the deployment image does not include unnecessary libraries, I used multi-stage builds. This approach separates the build environment from the runtime environment, keeping the final image small. From a security perspective, excluding unnecessary tools and libraries reduces the attack surface and lowers the risk of vulnerabilities. I chose the slim version for the final image base.
FROM node:18.20.2 AS builder
WORKDIR /app
COPY package*.json ./
COPY yarn.lock ./
RUN yarn install --frozen-lockfile --ignore-scripts
COPY . .
RUN yarn build
FROM node:18.20.2-slim
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json
EXPOSE 8080
CMD ["node", "dist/main"]
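To check how much the slim final stage actually saves, the size of a built image can be read with docker image inspect. The following is a minimal sketch that assumes Docker is available locally. The helper name image_size_mib is hypothetical, and no measured sizes are claimed here.

```shell
#!/bin/sh
# Print the size of a local image in MiB (integer, rounded down).
image_size_mib() {
  # .Size is reported in bytes.
  bytes="$(docker image inspect --format '{{.Size}}' "$1")" || return 1
  echo $((bytes / 1024 / 1024))
}
```

Running it once against an image built only from the builder stage and once against the final image gives a direct comparison of the two.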
For more details on multi-stage builds, please refer to my previous article.
Optimizing Docker Images with Multi-Stage Builds and Distroless Approach
Conclusion
In this article, I introduced a secure and efficient CI/CD workflow for ECS on Fargate using GitHub Actions. Moving forward, I plan to add test steps and integrate performance monitoring tools. I hope this article helps you in setting up a secure and efficient CI/CD pipeline for your projects.