This post describes how you can implement a simple URL shortener using native low code AWS services. Some previous knowledge of API Gateway and StepFunctions is assumed.
Motivation
At Mathem, we use a home built URL shortener in SMS communication with our customers. Until now, this has been hosted on an on-prem legacy environment which is soon to be turned off, so we had to move it to our serverless architecture in AWS one way or another.
There are plenty of serverless URL shorteners publicly available on GitHub. However, I wanted to explore whether there's a way to achieve this without having to write any Lambda function code at all.
I'm a huge fan of Lambda, but it does come with (for this use case, arguably insignificant) cold starts and the responsibility of keeping the runtime version up to date.
The StepFunctions team made a couple of releases during 2022 that have enabled state machines to do more than just orchestrate Lambda function invocations. This solution would not have been possible a few months ago, before the new intrinsic functions were released.
The full solution is available on GitHub
Creating a short URL
State machine design
The state machine consists of three states, Initialise, Create hash and Store URL as well as some logic to handle duplicate hashes.
Let's go through each of them from top down:
Initialise
Initialise:
  Type: Pass
  OutputPath: $
  Parameters:
    Splitter: "-"
    Attempts: 0
  Next: Create hash
This is a Pass state that initialises the execution. It passes on two parameters: Splitter, which is used to split the UUID in the next step, and Attempts, which is used to avoid an infinite loop if all hashes are already taken.
Create hash
To get a short but unique hash to hide the long URL behind, we'll make use of three new intrinsic functions: States.UUID, States.StringSplit and States.ArrayGetItem.
This is also a Pass state and looks like this:
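Create hash:
  Type: Pass
  OutputPath: $
  Parameters:
    Hash.$: States.ArrayGetItem(States.StringSplit(States.UUID(), $.Splitter), 1)
    Attempts.$: States.MathAdd($.Attempts, 1)
    # Carry the splitter forward; otherwise it is dropped from the state
    # output and a retry after a duplicate hash cannot resolve $.Splitter.
    Splitter.$: $.Splitter
  Next: Store URL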
Note how the output from the first state, $.Splitter, is used here. Ideally we'd like to just use States.StringSplit(States.UUID(), "-"), but the StringSplit function expects a valid JSON path as the second argument.
The UUID is formatted like this: ca4c1140-dcc1-40cd-ad05-7b4aa23df4a8. Splitting it on the dash ('-') character gives us this array:
["ca4c1140", "dcc1", "40cd", "ad05", "7b4aa23df4a8"]
The segments are always lower-case hexadecimal sequences of 8, 4, 4, 4 and 12 characters.
Next, we have to decide how long a hash we want, which comes down to how many different combinations of URLs we need. Each character is a hexadecimal digit (0-9, a-f), so a segment of n characters gives 16^n combinations:
4 characters: 65,536 combinations
8 characters: 4,294,967,296 (about 4.29×10^9) combinations
12 characters: 281,474,976,710,656 (about 2.81×10^14) combinations
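The arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes each character of a UUID segment is one of the 16 hexadecimal digits, and for comparison it also prints what a 36-character lower-case alphanumeric alphabet would give. The helper name combinations is hypothetical.

```python
def combinations(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of `length` characters over an alphabet."""
    return alphabet_size ** length


if __name__ == "__main__":
    for length in (4, 8, 12):
        hex_count = combinations(16, length)    # UUID segments: 0-9 and a-f
        alnum_count = combinations(36, length)  # 0-9 and a-z, for comparison
        print(f"{length} characters: {hex_count:,} (hex) vs {alnum_count:,} (alphanumeric)")
```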
We are fine with 4 for our use case, so we'll access it using index 1: States.ArrayGetItem(splitArray, 1).
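To make the chain of intrinsic functions concrete, here is a minimal Python sketch that mimics it outside Step Functions. The function name create_hash is hypothetical, and uuid.uuid4() merely stands in for States.UUID(); it illustrates the logic and is not how the state machine itself runs.

```python
import uuid


def create_hash(splitter: str = "-", index: int = 1) -> str:
    """Mimic States.ArrayGetItem(States.StringSplit(States.UUID(), splitter), index)."""
    # uuid.uuid4() stands in for States.UUID(); str() gives the lower-case, dashed form.
    segments = str(uuid.uuid4()).split(splitter)
    return segments[index]


if __name__ == "__main__":
    print(create_hash())  # a 4-character hexadecimal string
```

Changing index to 0 or 4 would pick the 8- or 12-character segment instead.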
Note that we are limited to lower case characters. To make short hashes with a mix of casing, we'd need a Lambda function in the mix.
Store URL
This state takes the output and stores it in DynamoDB using a native service integration:
Store URL:
  Type: Task
  Resource: arn:aws:states:::aws-sdk:dynamodb:putItem
  ResultPath: null
  Parameters:
    TableName: ${UrlTable}
    ConditionExpression: attribute_not_exists(Id)
    Item:
      Id:
        S.$: $.Hash
      Url:
        S.$: $$.Execution.Input.Url
      HitCount:
        N: "0"
  Catch:
    - ErrorEquals:
        - DynamoDb.ConditionalCheckFailedException
      Next: Continue trying?
      ResultPath: null
  End: true
Continue trying?:
  Type: Choice
  Choices:
    - Variable: $.Attempts
      NumericLessThan: 10
      Next: Create hash
  Default: Fail
Fail:
  Type: Fail
Note the ConditionExpression and the error handling in the Catch clause. This handles duplicate hashes by simply generating a new one until it finds one that is available. As a safety guard it will bail out after 10 attempts. In a production environment you'd want an alarm for when that happens, as it's an indication that the number of available permutations is running out.
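The retry flow can be sketched in plain Python to show how the attempt counter bounds the loop. This is an illustration under assumptions: a dict stands in for the DynamoDB table, the names store_url and MAX_ATTEMPTS are hypothetical, and the check-then-write on a dict is not atomic the way a conditional putItem is.

```python
import uuid

MAX_ATTEMPTS = 10  # mirrors NumericLessThan: 10 in the Choice state


def store_url(table: dict, url: str, new_hash=lambda: str(uuid.uuid4()).split("-")[1]) -> str:
    """Store `url` under a fresh hash, retrying on collisions like the state machine."""
    attempts = 0  # Initialise
    while True:
        candidate = new_hash()  # Create hash
        attempts += 1
        if candidate not in table:  # attribute_not_exists(Id)
            table[candidate] = {"Url": url, "HitCount": 0}
            return candidate
        if attempts >= MAX_ATTEMPTS:  # "Continue trying?" falls through to Fail
            raise RuntimeError(f"no free hash after {attempts} attempts")
```

Passing a custom new_hash makes it easy to force collisions and see the loop give up after the tenth attempt.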
Accessing a short URL
This state machine is much simpler and contains only a single state that does two things: it increments a hit counter and returns the long URL.
The ASL looks like this and uses an SDK integration to DynamoDB:
StartAt: Do redirect
States:
  Do redirect:
    Type: Task
    Resource: arn:aws:states:::aws-sdk:dynamodb:updateItem
    Parameters:
      TableName: ${UrlTable}
      ConditionExpression: attribute_exists(Id)
      ReturnValues: ALL_NEW
      UpdateExpression: SET HitCount = HitCount + :incr
      ExpressionAttributeValues:
        :incr:
          N: "1"
      Key:
        Id:
          S.$: $.hash
    ResultSelector:
      Url.$: $.Attributes.Url.S
    End: true
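In plain Python, the effect of this single state could be sketched roughly as follows. A dict again stands in for the DynamoDB table, and the function name redirect is hypothetical; the real update is performed atomically by DynamoDB.

```python
def redirect(table: dict, hash_id: str) -> dict:
    """Increment the hit counter and return the long URL, or fail if the hash is unknown."""
    item = table.get(hash_id)
    if item is None:  # attribute_exists(Id) would fail the update
        raise KeyError(f"unknown hash: {hash_id}")
    item["HitCount"] += 1  # SET HitCount = HitCount + :incr
    return {"Url": item["Url"]}  # ResultSelector keeps only the URL
```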
Hooking the state machines up with API Gateway
At first I wanted to use a HttpApi to enjoy lower latency and lower cost, but it proved really hard to get the request and response mapping working. The main issue was that the output from the state machine comes as stringified JSON, and when using HttpApi, the $util.parseJson() function wasn't available. Shoutout to all Community Builders, and in particular @jimmydahlqvist, who got engaged in the problem <3
After much frustration I swapped to use a RestApi, which made my life easier. I will not go into details here, but let's zoom in on the request and response mapping. The full OpenAPI
Create URL (POST /)
responses:
  200:
    statusCode: 200
    responseTemplates:
      application/json:
        Fn::Sub: "#set($response = $input.path('$'))\n { \"ShortUrl\": \"https://${DomainName}/$util.parseJson($response.output).Hash\" }"
requestTemplates:
  application/json:
    Fn::Sub: "#set($data = $input.json('$')) { \"input\": \"$util.escapeJavaScript($data)\", \"stateMachineArn\": \"${CreateUrl}\" }"
Redirect to URL (GET /{id})
responses:
  200:
    statusCode: 301
    responseTemplates:
      text/html: "#set($response = $input.path('$'))\n#set($context.responseOverride.header.Location = $util.parseJson($response.output).Url)"
requestTemplates:
  application/json:
    Fn::Sub: "#set($data = $util.escapeJavaScript($input.params('id'))) { \"input\": \"{ \\\"hash\\\": \\\"$data\\\" }\", \"stateMachineArn\": \"${RedirectToUrl}\" }"
To see the above mappings in context, visit the OpenAPI spec here
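What makes these templates fiddly is the double encoding: the state machine input travels as a JSON string inside the request body, and the result comes back as a JSON string in the output field. The Python sketch below mirrors that wrapping and unwrapping for the create flow. The function names, the ARN and the domain are placeholders chosen for illustration, not values from the project.

```python
import json


def build_request(body: dict, state_machine_arn: str) -> str:
    """Wrap the client body the way the request template does: input is a JSON string."""
    return json.dumps({"input": json.dumps(body), "stateMachineArn": state_machine_arn})


def build_response(execution_result: dict, domain_name: str) -> dict:
    """Unwrap the stringified `output`, as $util.parseJson does in the response template."""
    output = json.loads(execution_result["output"])
    return {"ShortUrl": f"https://{domain_name}/{output['Hash']}"}


if __name__ == "__main__":
    # Placeholder ARN and domain, for illustration only.
    print(build_request({"Url": "https://example.com/a/long/path"}, "arn:example"))
    print(build_response({"output": json.dumps({"Hash": "dcc1"})}, "short.example"))
```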
Conclusion
This article showed how we can use StepFunctions' new intrinsic functions together with its native SDK service integrations to create a fully functional, yet simple, URL shortener. It certainly comes with some limitations that Lambda can solve, and if you hit them, feel free to extend the workflow with a function.
If you have any improvements, such as converting to HttpApi or introducing better hashing, please submit a pull request
Building this project, I spent 5% of the time on creating the state machines and 95% on the API Gateway mappings. I'm hoping to see improved SAM support for connecting API Gateway and synchronous StepFunctions Express state machines. Please upvote this issue if you agree.