Atsushi Suzuki

Posted on May 18

Mastering Secure CI/CD for ECS with GitHub Actions

#devops #docker #cicd #security

Recently, I built a CI/CD workflow for ECS on Fargate using GitHub Actions. Along the way, I made several improvements to operational maintenance and security. Here's a summary of the key points.

Workflow Overview

Here is the complete CI/CD workflow for ECS on Fargate. It checks out the code, logs in to ECR, builds the Docker image, runs security scans, pushes the image to ECR, and deploys to ECS. For simplicity, I've omitted the test and lint steps, but you should include them in a real workflow.



name: ECS Fargate CI/CD

on:
  push:
    branches: [main, develop]
    paths:
      - "backend/**"
  workflow_dispatch:

permissions:
  id-token: write
  contents: read

jobs:
  build-and-push:
    if: github.ref == 'refs/heads/develop' || github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set ECR repository URI based on branch
        run: |
          if [[ "${{ github.ref }}" == "refs/heads/develop" ]]; then
            echo "REPOSITORY_URI=************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-dev" >> $GITHUB_ENV
          else
            echo "REPOSITORY_URI=************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-prod" >> $GITHUB_ENV
          fi

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-region: ap-northeast-1
          role-to-assume: ${{ secrets.AWS_IAM_ROLE_TO_ASSUME }}
          role-session-name: GitHubActions
          role-duration-seconds: 3600

      - name: Login to ECR
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build Docker Image
        run: docker build -t ${{ env.REPOSITORY_URI }}:${{ github.sha }} -f ./backend/Dockerfile.ecs ./backend

      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
          format: "table"
          severity: "CRITICAL,HIGH"
          exit-code: 1

      - name: Check Docker best practices with Dockle
        uses: erzz/dockle-action@v1
        with:
          image: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
          failure-threshold: fatal
          exit-code: 1

      - name: Push to ECR
        if: success()
        run: docker push ${{ env.REPOSITORY_URI }}:${{ github.sha }}

      - name: Notify Slack on success
        if: success()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
          SLACK_COLOR: "#36A64F"
          SLACK_MESSAGE: "Security scans have completed successfully. All checks passed."
          SLACK_TITLE: "Security Scan Completed"

      - name: Notify Slack on failure
        if: failure()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
          SLACK_COLOR: "danger"
          SLACK_MESSAGE: "A critical error has occurred in the build or security scan process. Please check the GitHub Actions logs for more details."
          SLACK_TITLE: "Build or Security Scan Failed"

  deploy:
    if: github.ref == 'refs/heads/develop' || github.ref == 'refs/heads/main'
    needs: build-and-push
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-region: ap-northeast-1
          role-to-assume: ${{ secrets.AWS_IAM_ROLE_TO_ASSUME }}
          role-session-name: GitHubActions
          role-duration-seconds: 3600

      - name: Download ecspresso
        uses: kayac/ecspresso@v2
        with:
          version: v2.3.3

      - name: Setup environment
        run: |
          echo "IMAGE_TAG=${{ github.sha }}" >> $GITHUB_ENV
          if [[ "${{ github.ref }}" == "refs/heads/develop" ]]; then
            echo "working_directory=./backend/ecspresso/dev" >> $GITHUB_ENV
            echo "ENV=${{ secrets.ENV_DEV }}" >> $GITHUB_ENV
            echo "COGNITO_USER_POOL_ID=${{ secrets.COGNITO_USER_POOL_ID_DEV }}" >> $GITHUB_ENV
            echo "S3_URL=${{ secrets.S3_URL_DEV }}" >> $GITHUB_ENV
            echo "SLACK_MENTIONS=" >> $GITHUB_ENV
            echo "SLACK_TITLE_PREFIX=Develop" >> $GITHUB_ENV
          else
            echo "working_directory=./backend/ecspresso/prod" >> $GITHUB_ENV
            echo "ENV=${{ secrets.ENV_PROD }}" >> $GITHUB_ENV
            echo "COGNITO_USER_POOL_ID=${{ secrets.COGNITO_USER_POOL_ID_PROD }}" >> $GITHUB_ENV
            echo "S3_URL=${{ secrets.S3_URL_PROD }}" >> $GITHUB_ENV
            echo "SLACK_MENTIONS=<@***********>" >> $GITHUB_ENV
            echo "SLACK_TITLE_PREFIX=Production" >> $GITHUB_ENV
          fi

      - name: Deploy to ECS service
        run: ecspresso deploy --config ecspresso.yml
        working-directory: ${{ env.working_directory }}
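        env:
          ENV: ${{ env.ENV }}
          COGNITO_USER_POOL_ID: ${{ env.COGNITO_USER_POOL_ID }}
          S3_URL: ${{ env.S3_URL }}
          IMAGE_TAG: ${{ env.IMAGE_TAG }}
          # ecs-task-def.json also reads REGION via must_env, so it has to be set here.
          REGION: ap-northeast-1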

      - name: Set Slack message and title on success
        if: success()
        run: |
          echo "SLACK_COLOR=good" >> $GITHUB_ENV
          echo "SLACK_TITLE_SUFFIX=(${{ github.ref_name }}) on ECS Fargate Deployment Success" >> $GITHUB_ENV

      - name: Set Slack message and title on failure
        if: failure()
        run: |
          echo "SLACK_COLOR=danger" >> $GITHUB_ENV
          echo "SLACK_TITLE_SUFFIX=(${{ github.ref_name }}) on ECS Fargate Deployment Failure" >> $GITHUB_ENV

      - name: Notify Slack about deployment status
        if: always()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
          SLACK_COLOR: ${{ env.SLACK_COLOR }}
          SLACK_MESSAGE: ${{ env.SLACK_MENTIONS }}
          SLACK_TITLE: ${{ env.SLACK_TITLE_PREFIX }} ${{ env.SLACK_TITLE_SUFFIX }}

Improvements for Operational Maintenance

Consolidating Workflows for All Environments

Following GitFlow, I trigger the workflow when a PR is merged into the develop or main branch (for simplicity, the sample code omits the staging branch). Previously, each environment had its own workflow file, but I consolidated them into a single file for easier maintenance. The following step sets the ECR repository URI based on the branch.



- name: Set ECR repository URI based on branch
  run: |
    if [[ "${{ github.ref }}" == "refs/heads/develop" ]]; then
      echo "REPOSITORY_URI=************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-dev" >> $GITHUB_ENV
    else
      echo "REPOSITORY_URI=************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-prod" >> $GITHUB_ENV
    fi
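
The sample omits the staging branch. As a rough sketch only, the same step could cover three branches with a case statement. The ecr-stg repository name and the <ACCOUNT_ID> placeholder are assumptions for illustration, not values from the original workflow.

```yaml
- name: Set ECR repository URI based on branch
  run: |
    # <ACCOUNT_ID> is a placeholder; substitute your own AWS account ID.
    REGISTRY="<ACCOUNT_ID>.dkr.ecr.ap-northeast-1.amazonaws.com"
    # GITHUB_REF_NAME is the short branch name provided by GitHub Actions.
    case "${GITHUB_REF_NAME}" in
      develop) REPO="ecr-dev" ;;
      staging) REPO="ecr-stg" ;;   # hypothetical repository name
      main)    REPO="ecr-prod" ;;
      *) echo "Unsupported branch: ${GITHUB_REF_NAME}" >&2; exit 1 ;;
    esac
    echo "REPOSITORY_URI=${REGISTRY}/${REPO}" >> "$GITHUB_ENV"
```

Failing on an unknown branch, rather than falling through to production, is a deliberate choice here. The branch would also need to be added to the push trigger and to each job's if condition.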

Using Commit Hashes for Container Image Tags

Instead of pushing images to ECR under the latest tag, I tag each container image with its commit hash (github.sha). This makes it easy to trace an image back to the commit it was built from.



- name: Push to ECR
  if: success()
  run: docker push ${{ env.REPOSITORY_URI }}:${{ github.sha }}
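
One practical consequence of SHA-based tags is that the deploy job can check for the exact image before rolling out. The step below is a minimal sketch and not part of the original workflow. ECR_REPOSITORY_NAME is a hypothetical variable, and the assumed IAM role would need the ecr:DescribeImages permission.

```yaml
- name: Verify the image for this commit exists in ECR
  env:
    ECR_REPOSITORY_NAME: ecr-dev   # hypothetical; set per branch in a real workflow
  run: |
    # Exits non-zero if no image in the repository is tagged with this commit SHA.
    aws ecr describe-images \
      --repository-name "$ECR_REPOSITORY_NAME" \
      --image-ids imageTag="$GITHUB_SHA"
```

This would run after the Configure AWS credentials step, which supplies the region and temporary credentials that the AWS CLI picks up.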

Managing Task Definitions and Services with ecspresso

To manage ECS task definitions and services, I used ecspresso, a deployment tool built specifically for ECS. I initially considered Terraform, but task definitions change with every deployment, and handling those frequent updates and the resulting diffs there was cumbersome. ecspresso simplifies this process and is widely used by major Japanese companies.

ecspresso GitHub Repository

The following code shows how to deploy task definitions and services using ecspresso. It specifies the directory for the ecspresso.yml file and passes environment variables for deployment.



- name: Deploy to ECS service
  run: ecspresso deploy --config ecspresso.yml
  working-directory: ${{ env.working_directory }}
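  env:
    ENV: ${{ env.ENV }}
    COGNITO_USER_POOL_ID: ${{ env.COGNITO_USER_POOL_ID }}
    S3_URL: ${{ env.S3_URL }}
    IMAGE_TAG: ${{ env.IMAGE_TAG }}
    # ecs-task-def.json also reads REGION via must_env, so it has to be set here.
    REGION: ap-northeast-1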
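
The ecspresso.yml passed to --config is not shown in the article. As a sketch under stated assumptions, a minimal version might look like the following. The cluster and service names and the service definition file name are hypothetical, and the task definition file name is the one used in this article.

```yaml
# ecspresso.yml (illustrative sketch, placed in backend/ecspresso/dev)
region: ap-northeast-1
cluster: example-cluster                   # hypothetical cluster name
service: example-service                   # hypothetical service name
service_definition: ecs-service-def.json   # hypothetical file name
task_definition: ecs-task-def.json
timeout: 10m0s
```

Keeping one such file per environment directory is what lets the workflow switch targets by changing only working-directory.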

In the task definition file ecs-task-def.json, environment variables are loaded with ecspresso's must_env template function, which aborts the deployment if a variable is not set:



{
  "containerDefinitions": [
    {
      "cpu": 256,
      "environment": [
        {
          "name": "TZ",
          "value": "Asia/Tokyo"
        },
        {
          "name": "ENV",
          "value": "{{ must_env `ENV` }}"
        },
        {
          "name": "COGNITO_USER_POOL_ID",
          "value": "{{ must_env `COGNITO_USER_POOL_ID` }}"
        },
        {
          "name": "S3_URL",
          "value": "{{ must_env `S3_URL` }}"
        },
        {
          "name": "REGION",
          "value": "{{ must_env `REGION` }}"
        }
      ],
      "essential": true,
      "image": "************.dkr.ecr.ap-northeast-1.amazonaws.com/ecr-dev:{{ must_env `IMAGE_TAG` }}",
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-create-group": "true",
          "awslogs-group": "/ecs/task-dev",
          "awslogs-region": "ap-northeast-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "memory": 512,
      "memoryReservation": 512,
      "name": "container",
      "portMappings": [
        {
          "appProtocol": "http",
          "containerPort": 8080,
          "hostPort": 8080,
          "name": "container-8080-tcp",
          "protocol": "tcp"
        }
      ]
    }
  ],
  "cpu": "256",
  "executionRoleArn": "arn:aws:iam::************:role/ecsTaskExecutionRole",
  "family": "task-dev",
  "ipcMode": "",
  "memory": "512",
  "networkMode": "awsvpc",
  "pidMode": "",
  "requiresCompatibilities": ["FARGATE"],
  "tags": [
    {
      "key": "Environment",
      "value": "dev"
    }
  ],
  "taskRoleArn": "arn:aws:iam::************:role/ecsTaskRole"
}

Deployment Notifications

I used rtCamp/action-slack-notify for notifications of security scans and deployment completion.

rtCamp/action-slack-notify GitHub Repository

During production deployments, I set the Slack member ID of the operators in the environment variables (echo "SLACK_MENTIONS=<@***********>" >> $GITHUB_ENV) to ensure mentions in notifications. This helps the team quickly grasp the deployment status.

You can copy the member ID from the member's Slack profile (open the profile, click the More menu, and choose Copy member ID).

Security Improvements

Keyless AssumeRole with OpenID Connect

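I used OpenID Connect (OIDC) to assume an IAM role without storing credentials (access key, secret access key) when accessing ECR and other AWS resources from GitHub Actions. The id-token: write permission in the workflow lets the job request the OIDC token that aws-actions/configure-aws-credentials exchanges for temporary credentials. This significantly improves security by eliminating the need for long-lived static credentials.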

For detailed implementation, please refer to my previous article.
Strengthening Security with IAM Roles and OpenID Connect in GitHub Actions Deploy Workflows
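
On the AWS side, the role in AWS_IAM_ROLE_TO_ASSUME needs a trust policy that accepts GitHub's OIDC tokens. The fragment below expresses one as CloudFormation purely for illustration; the article does not say how the role was actually created. It assumes the GitHub OIDC provider is already registered in the account, and example-org/example-repo is a hypothetical repository.

```yaml
Resources:
  GitHubActionsRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              # Assumes the OIDC provider already exists in this account.
              Federated: !Sub "arn:aws:iam::${AWS::AccountId}:oidc-provider/token.actions.githubusercontent.com"
            Action: sts:AssumeRoleWithWebIdentity
            Condition:
              StringEquals:
                "token.actions.githubusercontent.com:aud": sts.amazonaws.com
              StringLike:
                # Only workflows running on these branches may assume the role.
                "token.actions.githubusercontent.com:sub":
                  - "repo:example-org/example-repo:ref:refs/heads/main"
                  - "repo:example-org/example-repo:ref:refs/heads/develop"
```

Restricting the sub claim to specific branches means a workflow run from any other branch or repository cannot obtain credentials, even if it knows the role ARN.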

Vulnerability Scanning with Trivy and Dockle

After building the container image, I used Trivy and Dockle to perform security scans according to best practices.

Trivy

Trivy checks for known vulnerabilities in the container image, scanning OS packages and application libraries. The workflow fails if HIGH or CRITICAL vulnerabilities are detected.
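
As an optional variation rather than the configuration used in this article, the scan scope can be made explicit and limited to vulnerabilities that already have a fix, which tends to reduce failures nobody can act on. This sketch assumes the vuln-type and ignore-unfixed inputs of aquasecurity/trivy-action.

```yaml
- name: Scan image with Trivy (variation)
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
    format: "table"
    vuln-type: "os,library"    # OS packages and application libraries
    severity: "CRITICAL,HIGH"
    ignore-unfixed: true       # skip findings with no fixed version available yet
    exit-code: 1
```

Whether to ignore unfixed findings is a policy decision: it keeps the pipeline unblocked, at the cost of not surfacing issues that currently have no patch.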

Dockle

Dockle lints the built image against Dockerfile best practices. For example, it checks whether the container runs as root and whether the image contains unnecessary files or directories.

The following steps perform vulnerability scanning with Trivy and Dockle, pushing to ECR only if the scans are successful, and sending notifications to Slack.



- name: Scan image with Trivy
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
    format: "table"
    severity: "CRITICAL,HIGH"
    exit-code: 1

- name: Check Docker best practices with Dockle
  uses: erzz/dockle-action@v1
  with:
    image: ${{ env.REPOSITORY_URI }}:${{ github.sha }}
    failure-threshold: fatal
    exit-code: 1

- name: Push to ECR
  if: success()
  run: docker push ${{ env.REPOSITORY_URI }}:${{ github.sha }}

- name: Notify Slack on success
  if: success()
  uses: rtCamp/action-slack-notify@v2
  env:
    SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
    SLACK_COLOR: "#36A64F"
    SLACK_MESSAGE: "Security scans have completed successfully. All checks passed."
    SLACK_TITLE: "Security Scan Completed"

- name: Notify Slack on failure
  if: failure()
  uses: rtCamp/action-slack-notify@v2
  env:
    SLACK_WEBHOOK: ${{ secrets.SLACK_INCOMING_WEBHOOK_URL }}
    SLACK_COLOR: "danger"
    SLACK_MESSAGE: "A critical error has occurred in the build or security scan process. Please check the GitHub Actions logs for more details."
    SLACK_TITLE: "Build or Security Scan Failed"

The Trivy and Dockle actions used here are aquasecurity/trivy-action and erzz/dockle-action.

Multi-Stage Builds

To ensure that the deployment image does not include unnecessary libraries, I used multi-stage builds. This approach separates the build environment from the runtime environment, keeping the final image small. From a security perspective, excluding unnecessary tools and libraries reduces the attack surface and lowers the risk of vulnerabilities. I chose the slim version for the final image base.



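# Build stage: the full Node.js image provides the toolchain needed to build the app
FROM node:18.20.2 AS builder

WORKDIR /app

# Copy the dependency manifests first so the install layer is cached until they change
COPY package*.json ./
COPY yarn.lock ./

RUN yarn install --frozen-lockfile --ignore-scripts

COPY . .

RUN yarn build

# Runtime stage: the slim image receives only the build output and its dependencies
FROM node:18.20.2-slim

WORKDIR /app

COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json

EXPOSE 8080

CMD ["node", "dist/main"]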

For more details on multi-stage builds, please refer to my previous article.
Optimizing Docker Images with Multi-Stage Builds and Distroless Approach

Conclusion

In this article, I introduced a secure and efficient CI/CD workflow for ECS on Fargate using GitHub Actions. Moving forward, I plan to add test steps and integrate performance monitoring tools. I hope this article helps you in setting up a secure and efficient CI/CD pipeline for your projects.

Top comments (12)

João Angelo • May 28

Hi Atsushi Suzuki,
Top, very nice !
Thanks for sharing

Atsushi Suzuki • May 29

Thanks!

Damian • May 18

Thanks for all

Atsushi Suzuki • May 21

Thanks!!

Aidan • May 21

wow, this is a good article, I'm following your channel

Atsushi Suzuki • May 21

Thanks!!

Steve Yonkeu • May 20

What about something with docker compose with 4 to 5 services as containers?

Atsushi Suzuki • May 21

Absolutely! Docker Compose is super handy for managing multiple services. It makes defining and orchestrating containers a breeze, especially for development and local testing environments.

My current article focuses on deploying to production with ECS, but I actually wrote about using Docker Compose before. Feel free to check it out here:
dev.to/suzuki0430/docker-for-begin...

Konadu Akwasi Akuoko • May 19

This is awesome. I recently did something like this, but I didn't include the Slack messages. I'll definitely look into this and add it to my workflow.

Atsushi Suzuki • May 21

Thanks!!

Jerry • May 23 • Edited

Thank you for the writeup! What about github environments tho ?!

Atsushi Suzuki • May 29

Thanks! GitHub Environments are pretty handy for managing settings like secrets and rules differently for dev, staging, or prod environments. Makes it super easy to keep things secure for each setup!
