In this post I'm looking at how Software as a Service (SaaS) providers running on AWS can use a few AWS services to build a mechanism for collecting billing/metering metrics from their software and processing them in order to bill customers based on usage.
The main services I will cover are the Amazon CloudWatch embedded metric format (EMF) and CloudWatch Metric Streams.
What is EMF?
The CloudWatch embedded metric format allows you to generate custom metrics asynchronously in the form of logs written to CloudWatch Logs. You can embed custom metrics alongside detailed log event data, and CloudWatch automatically extracts the custom metrics so that you can visualize and alarm on them.
What is CloudWatch Metric Streams?
You can use metric streams to continually stream CloudWatch metrics to a destination of your choice, with near-real-time delivery and low latency. Supported destinations include AWS destinations such as Amazon Simple Storage Service (Amazon S3) and several third-party service provider destinations.
Using EMF in your Application
Imagine a sample Python application that returns "hello world" to simulate a successful call. Each call to the application is captured for billing purposes using EMF, and Powertools for AWS Lambda (Python) is used to reduce the amount of code we need to write.
metrics.add_metric(name="SuccessfulGet", unit=MetricUnit.Count, value=1)
metrics.add_dimension(name="Customer", value="MattHoughton")
These two lines output the required billing metrics.
The SuccessfulGet name can be customised for your application; the value should be a meaningful identifier for the chargeable action. For example, in the world of insurance you may have actions such as CreatePolicy, CreateQuote and UpdateCar.
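If your application has several chargeable actions, it can help to wrap the two Powertools calls in one place. A minimal sketch (the helper name and the hard-coded customer are just for illustration):

from aws_lambda_powertools import Metrics
from aws_lambda_powertools.metrics import MetricUnit

metrics = Metrics()

# Hypothetical helper - one place to record any chargeable action
def record_billable_action(action: str, customer: str) -> None:
    metrics.add_metric(name=action, unit=MetricUnit.Count, value=1)
    metrics.add_dimension(name="Customer", value=customer)

# e.g. record_billable_action("CreatePolicy", "MattHoughton")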
On the Lambda function configuration, the following environment variables also need to be set:
POWERTOOLS_SERVICE_NAME: SuggestTheNameOfYourSoftware
POWERTOOLS_METRICS_NAMESPACE: SuggestSomethingLikeBilling
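If I recall correctly, the Powertools Metrics constructor also accepts namespace and service arguments, so the same values can be set in code instead of environment variables:

from aws_lambda_powertools import Metrics

# Equivalent to setting the two environment variables above
metrics = Metrics(namespace="DemoBilling", service="DemoProductName")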
Here is the sample Lambda function.
import json

from aws_lambda_powertools import Metrics
from aws_lambda_powertools.metrics import MetricUnit

metrics = Metrics()

@metrics.log_metrics
def lambda_handler(event, context):
    # do something of value
    metrics.add_metric(name="CreatePolicy", unit=MetricUnit.Count, value=1)
    # Just an example - don't hard-code the customer for real; source it from the request payload
    metrics.add_dimension(name="Customer", value="MattHoughton")
    return {
        "statusCode": 200,
        "body": json.dumps({
            "message": "hello world"
        }),
    }
Testing the function in the console, you should get this response:
{
"statusCode": 200,
"body": "{\"message\": \"hello world\"}"
}
In the function's log output you will also see the EMF payload that Powertools emitted:
START RequestId: xxxx Version: $LATEST
{
"_aws": {
"Timestamp": 1709911806737,
"CloudWatchMetrics": [
{
"Namespace": "DemoBilling",
"Dimensions": [
[
"Customer",
"service"
]
],
"Metrics": [
{
"Name": "CreatePolicy",
"Unit": "Count"
}
]
}
]
},
"Customer": "MattHoughton",
"service": "DemoProductName",
"CreatePolicy": [
1
]
}
END RequestId: xxxx
REPORT RequestId: xxxx Duration: 1.42 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 37 MB Init Duration: 177.72 ms
When the Lambda is executed the billing metrics are stored in CloudWatch Logs and become visible in CloudWatch Metrics.
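As noted in the code comment, the Customer dimension should not be hard-coded in a real system. Here is a minimal sketch of sourcing it from the request instead, assuming an API Gateway proxy event where an authorizer has added a customerId to the request context (the key names are just an assumption for illustration):

def get_customer_id(event: dict) -> str:
    # Assumption: an API Gateway authorizer has placed a customerId
    # in the request context; fall back to "unknown" otherwise
    return (
        event.get("requestContext", {})
        .get("authorizer", {})
        .get("customerId", "unknown")
    )

# Inside lambda_handler:
# metrics.add_dimension(name="Customer", value=get_customer_id(event))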
Costs for EMF are based on CloudWatch Logs ingestion, which in eu-west-1 is $0.57 per GB. Testing with the example 624-byte payload generated by Powertools, the costs came out as:
- Each metric above stored costs $0.0000003
- One million metrics stored costs: $0.33
- Ten million metrics stored costs: $3.31
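For reference, here is the rough arithmetic behind those figures. It assumes ingestion is billed per GiB (1024³ bytes), which reproduces the numbers above:

# Rough EMF ingestion cost model (eu-west-1 log ingestion at $0.57 per GB)
PRICE_PER_GB = 0.57
PAYLOAD_BYTES = 624  # example payload size generated by Powertools

def ingestion_cost(metric_count: int) -> float:
    return metric_count * PAYLOAD_BYTES / 1024**3 * PRICE_PER_GB

print(f"{ingestion_cost(1):.7f}")           # ~0.0000003
print(f"{ingestion_cost(1_000_000):.2f}")   # ~0.33
print(f"{ingestion_cost(10_000_000):.2f}")  # ~3.31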
Collecting and Processing the Billing Metrics
To pull out all of the EMF metrics relating to billing, we will set up a Metric Stream to send them to an S3 bucket.
Under CloudWatch in the console, select Metric Streams and then Create a metric stream. We will walk through the Quick setup for S3.
Under the metrics to be streamed, limit the selection to only the metrics related to billing.
Looking at the metric stream that is created, you will see details of the other components created for you:
- Amazon Data Firehose
- IAM Roles
- S3 Bucket
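If you would rather script the metric stream than use the console quick setup, here is a minimal sketch using boto3. It assumes the Firehose delivery stream and IAM role already exist; the ARNs below are placeholders:

import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_stream(
    Name="DemoBillingMetricStream",
    # Placeholder ARNs - point these at your own Firehose delivery stream and role
    FirehoseArn="arn:aws:firehose:eu-west-1:111111111111:deliverystream/DemoBillingDelivery",
    RoleArn="arn:aws:iam::111111111111:role/DemoBillingMetricStreamRole",
    OutputFormat="json",
    # Stream only the billing namespace rather than every CloudWatch metric
    IncludeFilters=[{"Namespace": "DemoBilling"}],
)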
If you run the Lambda function a few more times and then view the Data Firehose, you will see the metrics being delivered.
Now if you look in the S3 bucket you will find objects created. By default they are partitioned by Year/Month/Day/Hour.
Here is sample content published to S3. Note that each record contains aggregated statistics (max, min, sum, count) for the metric over the stream's delivery interval, so the sum is the number of billable calls in that period.
{"metric_stream_name":"DemoBillingMetricStream","account_id":"xxxx","region":"us-east-1","namespace":"DemoBilling","metric_name":"CreatePolicy","dimensions":{"Customer":"MattHoughton","service":"DemoProductName"},"timestamp":1709913900000,"value":{"max":1.0,"min":1.0,"sum":9.0,"count":9.0},"unit":"Count"}
{"metric_stream_name":"DemoBillingMetricStream","account_id":"xxxx","region":"us-east-1","namespace":"DemoBilling","metric_name":"CreatePolicy","dimensions":{"Customer":"MattHoughton","service":"DemoProductName"},"timestamp":1709913960000,"value":{"max":1.0,"min":1.0,"sum":30.0,"count":30.0},"unit":"Count"}
Further Processing
From this point we have a lot of flexibility in how we can choose to process this data.
We can trigger a Lambda function that sends these metric payloads to an accounting / invoicing system.
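As a sketch of that idea, the function below could be triggered by S3 object-created events and sum up billable calls per customer and action. It assumes the objects contain newline-delimited JSON records like the samples above; handing the totals to an invoicing system is left out:

import json
from collections import defaultdict

import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    usage = defaultdict(float)
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        for line in body.splitlines():
            if not line.strip():
                continue
            metric = json.loads(line)
            # "sum" is the number of billable calls in the delivery interval
            usage[(metric["dimensions"]["Customer"], metric["metric_name"])] += metric["value"]["sum"]
    # Hand the aggregated usage off to your accounting / invoicing system here
    print(dict(usage))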
We can also continue to use AWS Services. As the data is in S3 we can easily add this to a Glue data catalog and query it using Athena. We could even start to build dashboards and reports using QuickSight.
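As an example of the Athena route, a query like the one below could produce a usage summary per customer and action. This is only a sketch: it assumes a Glue crawler has already catalogued the metric stream output as a table called billing_metrics in a billing database, and the results bucket is a placeholder:

import boto3

athena = boto3.client("athena")

athena.start_query_execution(
    QueryString="""
        SELECT dimensions.customer AS customer,
               metric_name,
               SUM("value".sum) AS billable_calls
        FROM billing_metrics
        GROUP BY 1, 2
    """,
    QueryExecutionContext={"Database": "billing"},
    ResultConfiguration={"OutputLocation": "s3://placeholder-athena-results/"},
)

Either way, once the metrics are landing in S3, the raw usage data is in your hands and the billing process can be shaped around it.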