Atsushi Suzuki

Enhancing ECR Security: Scheduled Automated Container Scans and Slack Notifications

When deploying workloads on AWS ECS with Fargate, it is essential to integrate container image vulnerability scanning into your CI pipeline. We initially integrated Trivy and Dockle for this purpose. However, new vulnerabilities can be disclosed between releases, so we also built a system that periodically checks ECR image scan results and posts them to Slack.

Screenshot 2024-06-08 11.43.11.png

This setup leverages basic AWS resources such as Lambda, EventBridge, S3, and IAM roles, making it easily replicable for anyone with basic AWS experience. We also provide the Terraform code needed for implementation.

For those interested in the initial integration of Trivy and Dockle into the CI, please refer to my previous articles:

Implementation

ECR provides a feature to scan images at the time of push, but we've configured a system using Lambda and EventBridge to automate this scan weekly (every Monday at 10:00 AM JST).

In addition, we decided to utilize ZIP deployment for Lambda due to its cost-effectiveness compared to Docker deployment using ECR. The .zip archive necessary for Lambda deployment is stored in an S3 bucket.

The documentation for deploying Lambda functions from a Docker deployment and ZIP deployment can be found here:

Terraform Configuration

The following structure outlines the Terraform setup for this project:



.
├── environments
│   └── dev
│       ├── main.tf       # Main settings for the Dev environment
│       └── backend.tf    # Terraform backend configuration
└── modules
    ├── s3
    │   ├── main.tf       # S3 bucket configuration
    │   ├── outputs.tf    # Outputs definition for the S3 module
    │   └── provider.tf   # Provider settings for the S3 module
    ├── iam_roles
    │   ├── main.tf       # IAM roles configuration
    │   ├── outputs.tf    # Outputs definition for the IAM roles module
    │   └── provider.tf   # Provider settings for the IAM roles module
    ├── eventbridge
    │   ├── main.tf       # EventBridge configuration
    │   └── provider.tf   # Provider settings for the EventBridge module
    └── lambda
        ├── main.tf       # Lambda configuration
        ├── outputs.tf    # Outputs definition for the Lambda module
        ├── provider.tf   # Provider settings for the Lambda module
        ├── variables.tf  # Variable definitions for the Lambda module
        └── ecr_weekly_security_scan
            ├── app.py          # Python script for the Lambda function
            ├── Dockerfile      # Dockerfile for the Lambda environment
            ├── requirements.txt # List of Python dependencies
            └── build.sh        # Script to build the Docker image and create the ZIP archive



Lambda Function

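The Lambda function retrieves the scan findings for the most recently pushed tagged image in the ECR repository and posts them to Slack. Vulnerabilities rated CRITICAL or HIGH are rendered as links in the Slack message, giving direct access to the CVE details. Note that the code below reads the findings ECR already holds via describe_image_scan_findings; it does not itself start a new scan.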



import os
import boto3
import requests
from botocore.exceptions import ClientError

def lambda_handler(event, context):
    ecr_client = boto3.client('ecr')
    repository_name = os.environ['REPOSITORY_NAME']
    slack_webhook_url = os.environ['SLACK_WEBHOOK_URL_ECR_WEEKLY_SECURITY_SCAN']

    # Retrieve the latest image. describe_images is paginated, so collect
    # every page before picking the most recently pushed tagged image.
    images = []
    try:
        paginator = ecr_client.get_paginator('describe_images')
        for page in paginator.paginate(
            repositoryName=repository_name,
            filter={'tagStatus': 'TAGGED'}
        ):
            images.extend(page.get('imageDetails', []))
    except ClientError as e:
        print(f"Error retrieving images: {e}")
        raise

    if not images:
        print("No images found.")
        return {'statusCode': 200, 'body': 'No images found.'}

    latest_image = max(images, key=lambda x: x['imagePushedAt'])
    image_digest = latest_image['imageDigest']

    # Get scan results
    try:
        scan_results = ecr_client.describe_image_scan_findings(
            repositoryName=repository_name,
            imageId={'imageDigest': image_digest}
        )
    except ClientError as e:
        print(f"Error retrieving scan findings: {e}")
        raise e

    # Use .get so a response without a findings list yields an empty list
    findings = scan_results.get('imageScanFindings', {}).get('findings', [])

    # Format the message
    if not findings:
        message = f"No findings for image {repository_name}@{image_digest}"
    else:
        message = f"*Findings for image {repository_name}@{image_digest}:*\n\n"
        max_len_cve_id = max(len(finding['name']) for finding in findings) + 2
        max_len_severity = max(len(finding['severity'])
                               for finding in findings) + 2

        for finding in findings:
            cve_id = finding['name']
            severity = finding['severity']
            if severity in ['CRITICAL', 'HIGH']:
                cve_id = f"<{cve_url(cve_id)}|{cve_id}>"
                severity = f"*{severity}*"
            message += f"{cve_id.ljust(max_len_cve_id)}  {severity.ljust(max_len_severity)}\n"

    # Send message to Slack
    response = requests.post(slack_webhook_url, json={"text": message}, timeout=10)
    if response.status_code != 200:
        raise ValueError(
            f"Request to Slack returned an error {response.status_code}, the response is:\n{response.text}")

    return {
        'statusCode': 200,
        'body': 'Security scan completed successfully'
    }

def cve_url(cve_id):
    return f"https://nvd.nist.gov/vuln/detail/{cve_id}"
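If a fresh scan is wanted on each weekly run rather than reusing stored findings, one option is to call `start_image_scan` and wait for it before reading the results. The following is a minimal sketch under some assumptions: the repository uses basic scanning, the execution role additionally allows `ecr:StartImageScan`, and `rescan_image` is a hypothetical helper that is not part of the original function. ECR limits how often a single image can be rescanned manually, so the limit error is treated as "fall back to the existing findings".

```python
def rescan_image(ecr_client, repository_name, image_digest):
    """Start a basic scan for one image and wait until it finishes.

    Hypothetical helper: `ecr_client` is expected to be boto3.client('ecr').
    Returns True if a new scan ran, False if ECR refused because of its
    rescan limit (the existing findings can still be read in that case).
    """
    image_id = {'imageDigest': image_digest}
    try:
        ecr_client.start_image_scan(
            repositoryName=repository_name,
            imageId=image_id,
        )
    except ecr_client.exceptions.LimitExceededException:
        return False

    # Built-in boto3 waiter that polls until the scan status is final.
    waiter = ecr_client.get_waiter('image_scan_complete')
    waiter.wait(repositoryName=repository_name, imageId=image_id)
    return True
```

In the handler, this could be called as `rescan_image(ecr_client, repository_name, image_digest)` just before `describe_image_scan_findings`. Passing the client in as an argument keeps the helper easy to exercise with a stub.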



Automatic ZIP Archive Creation

The following script builds a Docker image, copies the contents of /var/task out of a container created from it, and packages them into a ZIP file named ecr-weekly-security-scan.zip. The script stops at creating the archive; the ZIP must then be uploaded to the S3 bucket mentioned above.



#!/bin/bash
# Stop on the first error, on unset variables, and on failed pipeline stages
set -euo pipefail

# Build the Docker image
docker build -t ecr-weekly-security-scan-build .

# Create a container from the image
container_id=$(docker create ecr-weekly-security-scan-build)

# Copy the contents of the container to a local directory
docker cp "$container_id":/var/task ./package

# Clean up the container
docker rm "$container_id"

# Zip the contents of the local directory
cd package
zip -r ../ecr-weekly-security-scan.zip .
cd ..

# Clean up
rm -rf package
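
Because the handler is configured as `app.lambda_handler`, Lambda expects `app.py` to sit at the root of the archive rather than inside a subdirectory. A quick sanity check before uploading might look like the sketch below; `archive_has_root_module` is a hypothetical helper name, and the check only inspects file names, not the file contents.

```python
import zipfile


def archive_has_root_module(zip_path, handler="app.lambda_handler"):
    """Return True if the module named in `handler` is at the archive root."""
    module_name = handler.split(".")[0]
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
    # An entry such as "package/app.py" would not satisfy this check.
    return f"{module_name}.py" in names
```

For example, `archive_has_root_module("ecr-weekly-security-scan.zip")` should return True for an archive produced by zipping from inside the package directory, as the script does.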



The requirements.txt and Dockerfile are defined as follows:



boto3
requests



FROM public.ecr.aws/lambda/python:3.12

# Install Python dependencies
COPY requirements.txt /var/task/
RUN pip install -r /var/task/requirements.txt --target /var/task

# Copy the Lambda function code
COPY app.py /var/task/

# Set the working directory
WORKDIR /var/task

# Set the CMD to your handler
CMD ["app.lambda_handler"]





Function Deployment

This Terraform code deploys the Lambda function from the ZIP archive in S3. The repository name and the Slack webhook URL are passed to the function as environment variables, which the handler reads at runtime.



resource "aws_lambda_function" "ecr_weekly_security_scan" {
  function_name = "ecr-weekly-security-scan"
  s3_bucket     = var.s3_bucket_lambda_functions_storage_bucket
  s3_key        = "ecr-weekly-security-scan.zip"
  handler       = "app.lambda_handler"
  runtime       = "python3.12"
  role          = var.iam_role_ecr_weekly_security_scan_lambda_exec_role_arn
  timeout       = 300 # 5 minutes

  environment {
    variables = {
      REPOSITORY_NAME                            = "example-ecr-dev"
      SLACK_WEBHOOK_URL_ECR_WEEKLY_SECURITY_SCAN = var.slack_webhook_url_ecr_weekly_security_scan
    }
  }
}





Variable and Output Definitions

The project settings are managed with variables.tf and outputs.tf, outlined as follows:



variable "s3_bucket_lambda_functions_storage_bucket" {
  description = "The S3 bucket containing the Lambda function code"
  type        = string
}

variable "iam_role_ecr_weekly_security_scan_lambda_exec_role_arn" {
  description = "The ARN of the Lambda execution role"
  type        = string
}

variable "slack_webhook_url_ecr_weekly_security_scan" {
  description = "The URL of the Slack webhook to post messages to"
  type        = string
  sensitive   = true
}



output "lambda_function_ecr_weekly_security_scan_arn" {
  value = aws_lambda_function.ecr_weekly_security_scan.arn
}

output "lambda_function_ecr_weekly_security_scan_name" {
  value = aws_lambda_function.ecr_weekly_security_scan.function_name
}

# The two outputs below refer to a separate Lambda function (ecs_task_scheduler)
# that is not defined in this article; omit them unless that resource exists.
output "lambda_function_ecs_task_scheduler_arn" {
  value = aws_lambda_function.ecs_task_scheduler.arn
}

output "lambda_function_ecs_task_scheduler_name" {
  value = aws_lambda_function.ecs_task_scheduler.function_name
}





S3 Creation

An S3 bucket is created to store the Lambda function's ZIP archive. Since the bucket name must be globally unique, it should be appropriately named.



resource "aws_s3_bucket" "lambda_functions_storage" {
  bucket = "unique-lambda-functions-storage" # Ensure the name is unique
}





Output Definition

The bucket name is defined in the outputs.tf to facilitate references from within the Lambda function's main.tf.



output "s3_bucket_lambda_functions_storage_bucket" {
  value = aws_s3_bucket.lambda_functions_storage.bucket
}





IAM Role

An execution role for the Lambda function is created to allow access to ECR for retrieving image and scan result details. A custom policy granting the ecr:DescribeImages and ecr:DescribeImageScanFindings actions is attached to this role, along with the managed AWSLambdaBasicExecutionRole policy for writing logs.



resource "aws_iam_role" "ecr_weekly_security_scan_lambda_exec_role" {
  name = "ecr_weekly_security_scan_lambda_exec_role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "lambda.amazonaws.com"
      }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ecr_weekly_security_scan_lambda_basic_execution" {
  role       = aws_iam_role.ecr_weekly_security_scan_lambda_exec_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

resource "aws_iam_policy" "ecr_weekly_security_scan_ecr_policy" {
  name        = "ecr_weekly_security_scan_ecr_policy"
  description = "Policy to allow Lambda to access ECR for scanning images"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ecr:DescribeImages",
          "ecr:DescribeImageScanFindings"
        ]
        Resource = "*" # It is recommended to limit this to specific resource ARNs if possible
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "ecr_weekly_security_scan_lambda_ecr_policy" {
  role       = aws_iam_role.ecr_weekly_security_scan_lambda_exec_role.name
  policy_arn = aws_iam_policy.ecr_weekly_security_scan_ecr_policy.arn
}
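
The inline comment on the policy recommends narrowing `Resource` to specific ARNs. ECR repository ARNs take the form `arn:aws:ecr:<region>:<account-id>:repository/<name>`, so a scoped version of the same policy document could be sketched as follows. This is written in Python purely to show the resulting JSON shape; the region and account ID are placeholder assumptions, and `scoped_ecr_policy` is a hypothetical helper.

```python
import json


def scoped_ecr_policy(region, account_id, repository_name):
    """Build the read-only ECR policy limited to a single repository."""
    repository_arn = (
        f"arn:aws:ecr:{region}:{account_id}:repository/{repository_name}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ecr:DescribeImages",
                    "ecr:DescribeImageScanFindings",
                ],
                "Resource": repository_arn,
            }
        ],
    }


# Placeholder region and account ID; the repository name is the one used above.
policy = scoped_ecr_policy("ap-northeast-1", "123456789012", "example-ecr-dev")
print(json.dumps(policy, indent=2))
```

In Terraform, the same effect would come from replacing the `"*"` in the `Resource` field with the repository's ARN.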





Output Definition

The ARN of the IAM role is set as an output to facilitate references from other Terraform configurations or external systems.



output "iam_role_ecr_weekly_security_scan_lambda_exec_role_arn" {
  value = aws_iam_role.ecr_weekly_security_scan_lambda_exec_role.arn
}





EventBridge Configuration

Using EventBridge, we set up a schedule to periodically scan ECR container images. Since the schedule is set in UTC, adjustments must be made for local time zones, such as JST.

Setting Up the EventBridge Rule

Using a cron expression, we configure a rule to trigger the Lambda function every Monday at 1 AM UTC (10 AM JST).



resource "aws_cloudwatch_event_rule" "ecr_weekly_security_scan_schedule" {
  name                = "ECRWeeklySecurityScanSchedule"
  schedule_expression = "cron(0 1 ? * MON *)" # 1 AM UTC, which is 10 AM JST
}
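
The UTC-to-JST offset is easy to get wrong, especially near midnight, so it can be worth checking with a few lines of Python. The sketch below takes one firing time of the rule (a Monday chosen only as an example date) and converts it to JST using a fixed +9 hour offset, which is an assumption that holds because JST has no daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# JST is a fixed UTC+9 offset with no daylight saving time.
JST = timezone(timedelta(hours=9), name="JST")


def utc_to_jst(utc_dt):
    """Convert a timezone-aware UTC datetime to JST."""
    return utc_dt.astimezone(JST)


# Monday 2024-06-10 01:00 UTC is one firing of cron(0 1 ? * MON *).
fired_at = datetime(2024, 6, 10, 1, 0, tzinfo=timezone.utc)
local = utc_to_jst(fired_at)
print(local.strftime("%A %H:%M %Z"))  # Monday 10:00 JST
```

Note that the day of the week can shift as well: a schedule intended for Monday 08:00 JST corresponds to Sunday 23:00 UTC, so the cron expression would need SUN rather than MON.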





Configuring the EventBridge Target

The Lambda function is registered as the target of the rule, so it is invoked each time the schedule fires.



resource "aws_cloudwatch_event_target" "ecr_weekly_security_scan_target" {
  rule      = aws_cloudwatch_event_rule.ecr_weekly_security_scan_schedule.name
  target_id = "ecrWeeklySecurityScan"
  arn       = var.lambda_function_ecr_weekly_security_scan_arn
}
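
For local testing, the handler can be invoked with a payload shaped like the one EventBridge sends for scheduled rules. The sketch below builds such a payload; every field value (account ID, region, event ID, timestamp) is a placeholder assumption, and `sample_scheduled_event` is a hypothetical helper. The handler in this article does not read the event, so the payload mainly serves to mirror the real invocation.

```python
def sample_scheduled_event(rule_arn):
    """Return a dict shaped like an EventBridge scheduled-rule event."""
    return {
        "version": "0",
        "id": "00000000-0000-0000-0000-000000000000",  # placeholder
        "detail-type": "Scheduled Event",
        "source": "aws.events",
        "account": "123456789012",  # placeholder account ID
        "time": "2024-06-10T01:00:00Z",
        "region": "ap-northeast-1",  # placeholder region
        "resources": [rule_arn],
        "detail": {},
    }


event = sample_scheduled_event(
    "arn:aws:events:ap-northeast-1:123456789012:rule/ECRWeeklySecurityScanSchedule"
)
```

With REPOSITORY_NAME and SLACK_WEBHOOK_URL_ECR_WEEKLY_SECURITY_SCAN set in the environment and AWS credentials available, `lambda_handler(event, None)` would then exercise the same code path as the scheduled run.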





Granting Invocation Permissions to Lambda

A resource-based permission allows EventBridge to invoke the Lambda function. Without it, the rule still fires but the invocation is denied. Restricting source_arn to this rule means that only this schedule can trigger the function.



resource "aws_lambda_permission" "ecr_weekly_security_scan_allow_eventbridge" {
  statement_id  = "AllowExecutionFromEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = var.lambda_function_ecr_weekly_security_scan_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.ecr_weekly_security_scan_schedule.arn
}





Additional Information: Slack Webhook URL Reference

To set up automated notifications on Slack, you need to create an application through the Slack API and enable Incoming Webhooks. The Webhook URL can be found in the Incoming Webhooks section of the Slack app configuration page. This URL is used in the Lambda function to send scan results to the designated Slack channel.

Screenshot 2024-06-08 11.52.29.png
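
An Incoming Webhook accepts a JSON body with a `text` field sent via HTTP POST, which is what the Lambda function does with `requests`. As an alternative sketch that relies only on the Python standard library, the request could be built with `urllib`; the helper names here are hypothetical, and any webhook URL used with them must be the real one from the Slack app configuration page.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build the POST request for a Slack Incoming Webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_slack(webhook_url, text, timeout=10):
    """Send the message; urlopen raises HTTPError on an error status."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Separating the request construction from the network call makes it possible to inspect the payload without contacting Slack. The webhook URL is a secret, which is why the Terraform variable holding it is marked as sensitive.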

Conclusion

This guide has demonstrated how to automate periodic checks of ECR image scan results for workloads running on ECS Fargate and to notify the team via Slack, using AWS Lambda, EventBridge, S3, and IAM, all provisioned with Terraform. Running the check on a schedule surfaces vulnerabilities disclosed between releases, not only those caught at build time, which supports continuous vulnerability management throughout the development lifecycle.
