Use Amazon Q Developer and AWS Infrastructure Composer to automate the monitoring of available IP addresses in subnets

I want to begin by saying that Amazon Q Developer and AWS Infrastructure Composer helped me design this solution in a matter of minutes.

Amazon Q: https://aws.amazon.com/q/
AWS Infrastructure Composer: https://aws.amazon.com/infrastructure-composer/

Problem:

Let's discuss the problem I'm attempting to tackle: IP exhaustion, which occurs when subnets run out of available IP addresses. This can easily happen if you are running Amazon EKS and your workload is growing, since with the default VPC CNI each pod consumes an IP address from the subnet.

Unless you use Amazon VPC IP Address Manager (IPAM), CloudWatch does not expose subnet IP availability metrics at the time I am writing this blog. What I'm attempting to accomplish here is monitoring available IP addresses in subnets without IPAM.

Solution:

Use Amazon Q Developer to generate a Lambda function that publishes subnet IP availability metrics, and AWS Infrastructure Composer to design and deploy the supporting infrastructure.

AWS Services involved in this solution:

  • AWS Lambda
  • Amazon EventBridge Scheduler
  • Amazon CloudWatch metrics
  • Amazon CloudWatch alarms
  • Amazon SNS

Lambda Function

I was able to create this in a matter of minutes with the help of Amazon Q Developer, though I obviously needed to make a few small adjustments. This kind of assistance is most beneficial if you understand the basics of what you are doing, so rather than configuring AWS services blindly, I recommend taking the time to understand the services involved.

Full Python Script here:

import boto3
import os
from botocore.exceptions import ClientError

def lambda_handler(event, context):
    vpc_id = os.environ['VPC_ID']
    subnet_ids = os.environ['SUBNET_IDS'].split(',')
    namespace = os.environ['NAMESPACE']

    ec2 = boto3.client('ec2')
    cloudwatch = boto3.client('cloudwatch')

    try:
        response = ec2.describe_subnets(
            Filters=[
                {'Name': 'vpc-id', 'Values': [vpc_id]},
                {'Name': 'subnet-id', 'Values': subnet_ids}
            ]
        )

        for subnet in response['Subnets']:
            subnet_id = subnet['SubnetId']
            available_ip_count = subnet['AvailableIpAddressCount']
            cidr_block = subnet['CidrBlock']
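            # AWS reserves 5 addresses in every subnet: the network address, the VPC
            # router, the DNS server, one address reserved for future use, and the
            # broadcast address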
            total_ip_count = 2 ** (32 - int(cidr_block.split('/')[1])) - 5  # Subtract 5 for reserved IPs

            subnet_name = subnet_id  # Default to subnet ID if no name tag
            for tag in subnet.get('Tags', []):
                if tag['Key'] == 'Name':
                    subnet_name = tag['Value']
                    break

            utilization_percentage = ((total_ip_count - available_ip_count) / total_ip_count) * 100

            # Send metrics to CloudWatch
            cloudwatch.put_metric_data(
                Namespace=namespace,
                MetricData=[
                    {
                        'MetricName': 'AvailableIPAddresses',
                        'Dimensions': [
                            {'Name': 'SubnetName', 'Value': subnet_name},
                            {'Name': 'SubnetId', 'Value': subnet_id}
                        ],
                        'Value': available_ip_count,
                        'Unit': 'Count'
                    },
                    {
                        'MetricName': 'IPUtilizationPercentage',
                        'Dimensions': [
                            {'Name': 'SubnetName', 'Value': subnet_name},
                            {'Name': 'SubnetId', 'Value': subnet_id}
                        ],
                        'Value': utilization_percentage,
                        'Unit': 'Percent'
                    }
                ]
            )

            print(f"Metrics sent for Subnet: {subnet_name} (ID: {subnet_id})")

    except ClientError as e:
        print(f"An error occurred: {e}")
        return {
            'statusCode': 500,
            'body': str(e)
        }

    return {
        'statusCode': 200,
        'body': 'Subnet monitoring completed'
    }
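
To try the handler outside Lambda, you can set the three environment variables the function expects and call it directly. This is only a rough local test sketch, assuming you have AWS credentials configured; the IDs and namespace below are placeholders:

import os

# Placeholder values; the handler reads these three variables at runtime
os.environ['VPC_ID'] = 'vpc-0123456789abcdef0'
os.environ['SUBNET_IDS'] = 'subnet-0123456789abcdef0,subnet-0fedcba9876543210'
os.environ['NAMESPACE'] = 'SubnetMonitoring'

from lambda_function import lambda_handler

# Expects {'statusCode': 200, 'body': 'Subnet monitoring completed'} on success
print(lambda_handler({}, None))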

Get IP address utilization:

The function reads AvailableIpAddressCount for each subnet from describe_subnets, derives the total number of usable addresses from the CIDR block (minus the 5 addresses AWS reserves per subnet), and calculates the utilization percentage.

Send metrics to CloudWatch:

It then publishes two custom metrics per subnet, AvailableIPAddresses and IPUtilizationPercentage, via put_metric_data, dimensioned by SubnetName and SubnetId.
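
As a quick sanity check on that arithmetic, here is the same calculation worked through for a hypothetical /24 subnet with 50 addresses still free:

cidr_block = '10.0.1.0/24'      # hypothetical subnet CIDR
available_ip_count = 50         # hypothetical value from describe_subnets

total_ip_count = 2 ** (32 - int(cidr_block.split('/')[1])) - 5   # 256 - 5 = 251
utilization_percentage = ((total_ip_count - available_ip_count) / total_ip_count) * 100
print(round(utilization_percentage, 1))                          # prints 80.1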

Use AWS Infrastructure Composer to design the infrastructure.

This lets you design your infrastructure visually, generate Infrastructure as Code, and deploy it using AWS SAM (AWS Serverless Application Model): https://aws.amazon.com/serverless/sam/.

How to Deploy

Prerequisites

  • AWS CLI installed and configured with appropriate permissions
  • AWS Toolkit for Visual Studio Code installed and configured
  • AWS SAM CLI installed

Deployment Steps

Repository for entire code and instructions on how to deploy: https://github.com/awsfanboy/aws-subnet-ip-address-utilization-monitor

  • Modify the template.yaml file to adjust default parameter values or add/remove resources as needed, e.g. VPC ID, subnet names, subnet IDs, and the CloudWatch metric namespace.
  • (Optional) Update the lambda_function.py file in the src directory.
  • Build the SAM application: sam build
  • Deploy the SAM application: sam deploy --guided
  • This will start an interactive deployment process. You'll be prompted to provide values for the parameters defined in the template. You can accept the default values or provide your own.
  • During the deployment, you'll be asked to confirm the creation of IAM roles and the changes to be applied. Review and confirm these.
  • SAM will output the ARNs of the created Lambda function and SNS topic once the deployment is complete.
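
If you want to verify the function without waiting for the schedule, you can invoke it once yourself. A minimal boto3 sketch, assuming the function name below is a placeholder for the ARN shown in the SAM stack outputs:

import json

import boto3

lambda_client = boto3.client('lambda')

# Invoke the monitoring function once, synchronously
response = lambda_client.invoke(
    FunctionName='subnet-ip-monitor',   # placeholder; use the function ARN from the stack outputs
    InvocationType='RequestResponse'
)

# Expect {'statusCode': 200, 'body': 'Subnet monitoring completed'} on success
print(json.loads(response['Payload'].read()))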

Parameters

    VpcId: The ID of the VPC to monitor
    SubnetIds: Comma-separated list of subnet IDs to monitor
    SubnetName1: Name of the first subnet
    SubnetName2: Name of the second subnet
    CWMetericNamespace: The CloudWatch metric namespace
    AlertEmail: Email address to receive alerts

Resources Created

  • Lambda function for monitoring subnets
  • EventBridge rule to trigger the Lambda function every minute
  • SNS topic for sending alerts
  • CloudWatch alarms for each monitored subnet
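
For a sense of what each alarm does, here is an illustrative boto3 sketch only; the actual alarm definitions (and their thresholds) live in template.yaml, so the 80% threshold, alarm name, dimension values, and SNS topic ARN below are assumptions:

import boto3

cloudwatch = boto3.client('cloudwatch')

# Illustrative only: alarm when IP utilization for one subnet stays above 80%
cloudwatch.put_metric_alarm(
    AlarmName='subnet-1-ip-utilization-high',                      # hypothetical name
    Namespace='SubnetMonitoring',                                  # hypothetical; match your CWMetericNamespace parameter
    MetricName='IPUtilizationPercentage',
    Dimensions=[
        {'Name': 'SubnetName', 'Value': 'eks-subnet-1'},           # hypothetical
        {'Name': 'SubnetId', 'Value': 'subnet-0123456789abcdef0'}  # hypothetical
    ],
    Statistic='Average',
    Period=60,
    EvaluationPeriods=5,
    Threshold=80,
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:subnet-ip-alerts']  # hypothetical SNS topic ARN
)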

Customization

  • To monitor more than two subnets, duplicate the SubnetUtilizationAlarm resource in the template and adjust the SubnetIds parameter.
  • Modify the Lambda function code in src/lambda_function.py to implement your specific monitoring logic.
  • Adjust the alarm thresholds and evaluation periods in the SubnetUtilizationAlarm resources as needed.

Cleanup

  • To remove all resources created by this stack: sam delete
  • Follow the prompts to confirm the deletion of resources.

Demo

I have an Amazon EKS cluster running a deployment with 6 replicas. The worker nodes are running in 2 subnets, and IP address utilization is looking good.

The alarm state is OK.

Okay! Let's increase the number of replicas from 6 to 600.

Let's check the metrics in CloudWatch and... oops! Now we can see that IP utilization is high.
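
If you prefer pulling the numbers over browsing the console, a small boto3 sketch like the following would read the utilization metric back; the namespace and dimension values are assumptions, so use whatever you deployed with:

from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client('cloudwatch')

# Read the last hour of utilization for one subnet (namespace and dimensions are placeholders)
stats = cloudwatch.get_metric_statistics(
    Namespace='SubnetMonitoring',
    MetricName='IPUtilizationPercentage',
    Dimensions=[
        {'Name': 'SubnetName', 'Value': 'eks-subnet-1'},
        {'Name': 'SubnetId', 'Value': 'subnet-0123456789abcdef0'}
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=300,
    Statistics=['Average']
)
for point in sorted(stats['Datapoints'], key=lambda p: p['Timestamp']):
    print(point['Timestamp'], round(point['Average'], 1))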

Now, let's check the alarms in CloudWatch. The state has changed from OK to ALARM.

Let's check my emails

I can see there are 2 emails in my inbox.

Cost

I calculated the cost using calculator.aws, and it doesn't look bad at all.

What Next?

These notifications can be sent to Slack, PagerDuty, and other platforms.
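
One possible approach, sketched below: subscribe a second Lambda function to the SNS topic and have it forward the alarm message to a Slack incoming webhook. The SLACK_WEBHOOK_URL environment variable is an assumption here; PagerDuty and other tools offer similar HTTP or native SNS integrations.

import json
import os
import urllib.request

def lambda_handler(event, context):
    # SNS delivers the CloudWatch alarm notification as a JSON string in the message body
    message = event['Records'][0]['Sns']['Message']

    # Forward it to a Slack incoming webhook (SLACK_WEBHOOK_URL is a hypothetical env var)
    payload = json.dumps({'text': f"Subnet IP utilization alert:\n{message}"}).encode('utf-8')
    request = urllib.request.Request(
        os.environ['SLACK_WEBHOOK_URL'],
        data=payload,
        headers={'Content-Type': 'application/json'}
    )
    with urllib.request.urlopen(request) as response:
        return {'statusCode': response.status}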

Conclusion

I hope my automation will help someone who doesn't want to use IPAM to monitor IP address utilization in subnets, and I truly wish we could access these metrics straight from CloudWatch.

If you have any suggestions for improvement, or if you already do something similar in a different way, please feel free to share.

Top comments (4)

Pawel Zubkiewicz

This is a useful solution. A few years back I did a very similar thing to monitor available addresses in database subnets. My client had a problem with failing Glue jobs; after investigation it became clear that the Glue jobs were running in parallel and using up all the free IPs in the subnets (not my design). A metric like the one described in this article was very helpful for configuring the maximum Glue job concurrency.

Arshad Zackeriya πŸ‡³πŸ‡Ώ ☁️

Thanks @pzubkiewicz, yeah mate, 100%. Thanks for sharing another use case.

Jones Zachariah Noel

That's a brilliant round-up of how serverless can be used for ops automation and metrics!!! 😜

Arshad Zackeriya πŸ‡³πŸ‡Ώ ☁️

Aye aye! Serverless FTW :P