Introduction:
AWS collects many CloudWatch metrics for every EC2 instance by default; here is the list of available metrics. One important metric is missing, however: disk space usage. The reason is that the AWS hypervisor has no visibility into how much of the instance store or of the EBS volumes attached to the EC2 instance is actually in use. Luckily, there is a simple solution to that problem: the CloudWatch agent. The AWS documentation doesn’t contain all the necessary information about using the Disk Space metrics. More information about the metric names and units can be found in this document. There we see that Disk Space can be measured with “LogicalDisk Free Megabytes” and “LogicalDisk % Free Space”. All available counters related to LogicalDisk can be listed on a Windows server with the command Get-Counter -ListSet 'LogicalDisk' | Select-Object -ExpandProperty Paths.
About the project:
The main requirement in my case is collecting information about the Free, Used, and Total Space of all attached disks in Bytes. This post presents a solution that gathers the necessary data on a Windows EC2 instance, uses a scheduled task to collect the disk status at regular intervals, and sends this data to a CloudWatch Logs group. Based on these logs, the Metric filters (DiskspaceFree_C, DiskspaceUsed_C, DiskspaceTotal_C) create metrics, and these metrics feed a CloudWatch Dashboard and CloudWatch Alarms (DiskSpaceWarning and DiskSpaceCritical, both based on the DiskspaceFree_C metric). The alarms send notifications through an SNS topic. The Instance ID is used as the metric dimension and as part of the Log group and Metric Namespace names.
The main idea of the project is to monitor Windows EC2 instance Disk Space in Bytes with only the CloudWatch agent. Measuring free Disk Space in Bytes (or Gigabytes) instead of using a percentage can be useful for setting Disk Space thresholds for monitoring and alerting because using Bytes allows us to set absolute limits that are meaningful and consistent across systems. This ensures that alerts are triggered when the Disk Space reaches a certain level that is known to be critical for the specific use case.
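As a small illustration of working with absolute thresholds, the sketch below converts a threshold into Bytes. The function names are hypothetical and not part of the repository. It also shows that a decimal gigabyte (10^9 Bytes, which matches the 5000000000 threshold used later in this post) differs from the binary unit that Windows displays as "GB" (2^30 Bytes), so it is worth deciding which one an alarm threshold is meant to use.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for this post; not part of the repository.

# Convert decimal gigabytes (1 GB = 10^9 bytes) to bytes.
gb_to_bytes() {
  echo $(( $1 * 1000000000 ))
}

# Convert binary gibibytes (1 GiB = 2^30 bytes) to bytes.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

gb_to_bytes 5    # prints 5000000000
gib_to_bytes 5   # prints 5368709120
```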
All infrastructure in AWS (except the S3 bucket) is created with CloudFormation. The S3 bucket is needed to store the nested stack templates; more information about nested stacks is here. The EC2 instance has the CloudWatch agent preinstalled and preconfigured. User data is configured as a CloudFormation Parameter and defined when the CloudFormation stack is created. The nestedCloudwatch.yaml template also contains a custom resource that invokes a Lambda function; it is needed to create the Log group before the Metric filters are created. The GitLab repository with all nested stack templates is here. See also the information about Amazon CloudWatch resources pricing.
DiskspaceFree_C Metric filter, DiskSpaceWarning Alarm and DiskSpaceUsageAlarmsDashboard Dashboard resources from the nestedCloudwatch.yaml template:
DiskspaceFreeC:
  DependsOn:
    - CreateLogGroupCustomResource
  Type: AWS::Logs::MetricFilter
  Properties:
    LogGroupName: !Sub 'monitoring/ec2/${InstanceId}'
    FilterName: DiskspaceFree_C
    # Match only the log events that describe drive C:
    FilterPattern: '{ $.DeviceID = "C:" }'
    MetricTransformations:
      - MetricName: DiskspaceFree_C
        MetricNamespace: !Sub 'monitoring/ec2/${InstanceId}'
        MetricValue: $.FreeSpace
        # The alarm thresholds below are expressed in bytes, so the unit is Bytes
        Unit: Bytes
        Dimensions:
          - Key: InstanceId
            Value: $.InstanceId
DiskSpaceWarning:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: !Sub '${InstanceId}.DiskSpaceWarning'
    AlarmDescription: !Sub '${InstanceId} - less than 5 GB of free drive space'
    AlarmActions:
      - !Ref SNSTopicArn
    OKActions:
      - !Ref SNSTopicArn
    MetricName: 'DiskspaceFree_C'
    Namespace: !Sub 'monitoring/ec2/${InstanceId}'
    Statistic: Average
    Period: '300'
    EvaluationPeriods: '3'
    # 5 GB expressed in bytes
    Threshold: '5000000000'
    ComparisonOperator: LessThanOrEqualToThreshold
    Dimensions:
      - Name: InstanceId
        Value: !Ref InstanceId
DiskSpaceDashboard:
  DependsOn:
    - DiskSpaceWarning
    - DiskSpaceCritical
  Type: AWS::CloudWatch::Dashboard
  Properties:
    DashboardName: DiskSpaceUsageAlarmsDashboard
    DashboardBody: !Sub
      - |
        {
          "widgets": [
            {
              "type": "metric",
              "x": 0,
              "y": 0,
              "width": 16,
              "height": 6,
              "properties": {
                "metrics": [
                  ["monitoring/ec2/${InstanceId}", "DiskspaceFree_C", "InstanceId", "${InstanceId}"],
                  ["monitoring/ec2/${InstanceId}", "DiskspaceUsed_C", "InstanceId", "${InstanceId}"],
                  ["monitoring/ec2/${InstanceId}", "DiskspaceTotal_C", "InstanceId", "${InstanceId}"]
                ],
                "view": "timeSeries",
                "stacked": false,
                "region": "${Region}",
                "title": "Disk Space Metrics",
                "period": 300,
                "stat": "Average",
                "yAxis": {
                  "left": {
                    "label": "Bytes"
                  }
                },
                "annotations": {
                  "horizontal": [
                    {
                      "value": 5000000000,
                      "label": "Warning Alarm Threshold in Bytes",
                      "color": "#FFA500"
                    },
                    {
                      "value": 2500000000,
                      "label": "Critical Alarm Threshold in Bytes",
                      "color": "#FF0000"
                    }
                  ]
                }
              }
            },
            {
              "type": "alarm",
              "x": 0,
              "y": 10,
              "width": 16,
              "height": 3,
              "properties": {
                "title": "Alarm Status of ${DiskSpaceWarning}",
                "markdown": "Alarm Name: ${DiskSpaceWarning}",
                "alarms": [ "${DiskSpaceWarningArn}"],
                "fontSize": "18",
                "region": "${Region}"
              }
            },
            {
              "type": "alarm",
              "x": 0,
              "y": 10,
              "width": 16,
              "height": 3,
              "properties": {
                "title": "Alarm Status of ${DiskSpaceCritical}",
                "markdown": "Alarm Name: ${DiskSpaceCritical}",
                "alarms": [ "${DiskSpaceCriticalArn}"],
                "fontSize": "18",
                "region": "${Region}"
              }
            }
          ]
        }
      - Region: !Ref Region
        InstanceId: !Ref InstanceId
        DiskSpaceCritical: !Ref DiskSpaceCritical
        DiskSpaceWarning: !Ref DiskSpaceWarning
        DiskSpaceCriticalArn: !GetAtt DiskSpaceCritical.Arn
        DiskSpaceWarningArn: !GetAtt DiskSpaceWarning.Arn
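Before deploying, the DiskspaceFree_C filter pattern can be tried against a sample log event with the AWS CLI command aws logs test-metric-filter, which only evaluates the pattern and writes nothing to a Log group. The sketch below is an assumption-laden example: the sample event and the function name are hypothetical, only the field names DeviceID, FreeSpace and InstanceId are taken from the template, and the real events produced by CheckDiskspaceScript.ps1 may contain more fields.

```shell
#!/usr/bin/env bash
# Hypothetical sample event. Only the field names come from the metric
# filter in the template; the values are made up for illustration.
sample_event='{"DeviceID":"C:","FreeSpace":4200000000,"InstanceId":"i-0123456789abcdef0"}'

# Ask CloudWatch Logs whether the filter pattern matches the given event.
# Requires AWS credentials and a configured region.
test_disk_filter() {
  aws logs test-metric-filter \
    --filter-pattern '{ $.DeviceID = "C:" }' \
    --log-event-messages "$1"
}

# Usage:
#   test_disk_filter "$sample_event"
```

If the pattern matches, the response lists the event under "matches"; an empty list suggests that the field names in the log events and in the filter differ.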
Prerequisites:
Before you start, make sure the following requirements are met:
- An AWS account with permissions to create resources.
- AWS CLI installed on your local machine.
Deployment:
1. Clone the repository
git clone https://gitlab.com/Andr1500/ec2_cloudwatch_sns
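2. Create an S3 bucket with a unique name for the nested stack templates. The bucket naming rules are described here. If YOUR_REGION is us-east-1, omit the --create-bucket-configuration option, because that region does not accept a LocationConstraint.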
on Linux:
date=$(date +%Y%m%d%H%M%S)
on Windows PowerShell:
$date = Get-Date -Format "yyyyMMddHHmmss"
aws s3api create-bucket --bucket cloudformation-templates-${date} --region YOUR_REGION \
--create-bucket-configuration LocationConstraint=YOUR_REGION
3. Add a policy to the S3 bucket that allows access from the EC2 instance
aws s3api put-bucket-policy --bucket cloudformation-templates-${date} \
  --policy '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"s3:GetObject","Resource":"arn:aws:s3:::'"cloudformation-templates-${date}"'/*"}]}'
4. Fill in all necessary Parameters in root.yaml and the bucket name in user_data.txt, then upload all nested stack files to the S3 bucket
aws s3 cp . s3://cloudformation-templates-${date} --recursive --exclude ".git/*" \
--exclude README.md --exclude user_data.txt --exclude ".images/*"
5. Create a CloudFormation stack
aws cloudformation create-stack \
  --stack-name monitoring-ws-cloudwatch \
  --template-body file://root.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameters ParameterKey=UserData,ParameterValue="$(base64 -i user_data.txt)" \
  --disable-rollback
Note: the encoded user data script can’t be longer than 4096 characters, because it is passed as a CloudFormation parameter value and those are limited to 4,096 bytes. The full user data script (creating CheckDiskspaceScript.ps1, creating the scheduled task, and installing and configuring the CloudWatch agent) exceeded this limit, so it was split into two scripts: the main script is stored in the S3 bucket, and the user data script downloads the main script and runs it while the EC2 instance is being created.
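The length limit from the note above can be checked locally before creating the stack. The following is a minimal sketch under the assumption that the limit applies to the base64-encoded text; the function name is hypothetical and the 4096 default simply mirrors the number in the note.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the base64-encoded content of a file
# fits within a limit (default 4096 characters).
fits_user_data_limit() {
  local file="$1" limit="${2:-4096}"
  local encoded_len
  # Remove the line breaks that base64 may insert so they are not counted.
  encoded_len=$(( $(base64 < "$file" | tr -d '\n' | wc -c) ))
  [ "$encoded_len" -le "$limit" ]
}

# Usage:
#   fits_user_data_limit user_data.txt && echo "OK" || echo "Too long"
```

Base64 turns every 3 input bytes into 4 output characters, so a script larger than roughly 3 KB will not fit once it is encoded.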
6. Open your mailbox and confirm the subscription to the SNS topic.
Go to AWS Console -> CloudWatch and check that the Log group, Metric filters, Alarms, and Dashboard exist. You need to wait a while for new log data to arrive from the EC2 instance. For the test, I created a simple PowerShell script that makes a copy of AmazonCloudWatchAgent.msi in the same folder every second. When the free Disk Space dropped below 5 GB, the DiskSpaceWarning alarm changed its state to “In alarm” and the SNS topic sent a notification.
Access to the EC2 instance is possible through the Systems Manager. Go to AWS console -> AWS Systems Manager -> Fleet Manager -> choose the created EC2 instance -> Node actions -> Connect -> Start terminal session. Here you can check if everything was created and configured correctly.
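The alarm state can also be read from the command line instead of the console. The sketch below uses aws cloudwatch describe-alarms; the function names and the instance ID in the usage line are placeholders, and the alarm name follows the '${InstanceId}.DiskSpaceWarning' pattern from the template (it is an assumption that DiskSpaceCritical is named the same way).

```shell
#!/usr/bin/env bash
# Build the alarm name the way the template does: <InstanceId>.<AlarmSuffix>
alarm_name() {
  printf '%s.%s\n' "$1" "$2"
}

# Print the current state (OK, ALARM or INSUFFICIENT_DATA) of one alarm.
# Requires AWS credentials and a configured region.
alarm_state() {
  aws cloudwatch describe-alarms \
    --alarm-names "$(alarm_name "$1" "$2")" \
    --query 'MetricAlarms[0].StateValue' \
    --output text
}

# Usage (the instance ID is a placeholder):
#   alarm_state i-0123456789abcdef0 DiskSpaceWarning
```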
7. Delete the CloudFormation stack
aws cloudformation delete-stack --stack-name monitoring-ws-cloudwatch
8. Delete all files from the S3 bucket, then delete the bucket itself
aws s3 rm s3://cloudformation-templates-${date} --recursive
aws s3 rb s3://cloudformation-templates-${date} --force
Conclusion:
In this post, we presented a solution for gathering data on Windows Disk Space, visualizing it, and sending notifications when the Disk Space reaches a critical level.
If you found this post helpful and interesting, please click the reaction button to show your support.
You can also support me with a virtual coffee https://www.buymeacoffee.com/andrworld1500 .