I recently made a switch from Postgres to Dynamodb and have been learning some pretty basic stuff. I’m working with a single table with a single kind of item so none of the tough NoSQL relational paradigm stuff, just some basic putting and updating. In using DynamoDB's built-in update
function, I came across what I imagine is a pretty ordinary case that was not covered in documentation: how to update records allowing any possible combination of properties.
Update Expressions from the Amazon DynamoDB docs, for reference
The Case
Here’s the case:
- I have an item that I want to insert into my table, that has an arbitrary amount of properties;
- I expect it will be updated, adding to those initial properties another, different set of arbitrary properties.
In other words: I’ll create an item with x
and y
properties, and later on will update it with property z
.
I read that the default behavior for DynamoDB’s update
is like Postgres’ UPSERT
, ie it will create the record in the database if it cannot find it by its primary key. So it sounded like I could do my insert and update logic in the same function, which would be great.
By its nature, DynamoDB (and NoSQL databases in general, as far as I understand) have the flexibility to create records with any properties. That’s why it was surprising to me that the problem I kept running into was specifically of that nature: DynamoDB would not let me update a record with a property it didn’t already have if I didn’t supply all its other already-existing properties.
In the most simplified example, this is what was happening:
- I create an item with
x
andy
properties; - I then update that item with
z
property; - The update fails because I did not supply
x
andy
properties. The error message was: "ExpressionAttributeValues cannot be NULL" (if you're new to DynamoDB I cover ExpressionAttributeValues below).
Here’s a good Stack Overflow post with a similar problem.
Coding the Problem
In my project, I’m building a database of video games. I want to save some info about each one: title
, developer
, publisher
, releaseDate
, and a lot more. But let’s start simple, with just one attribute.
In the example below, let’s imagine we have a table where the primary key (Key
below) is the title, and we already have a record in the database that exists with that Key
. This is how we would update a single attribute, developer
.
const data = {title: 'Uncharted', developer: 'Naughty Dog'}
const params = {
TableName: process.env.DYNAMODB_TABLE,
Key: { title: data.title },
UpdateExpression: 'SET developer = :dev',
ExpressionAttributeValues: {
':dev': data.developer
}
};
await dynamodb.update(params).promise()
}
So this works, but with this approach, we’ve totally hamstrung the flexibility of DynamoDB. We are expecting a specific bit of data to be sent in with the request, data.developer
.
What happens if we want to add any other properties to this request? Well, if we don’t change the code at all and send a request with, say, a publisher
property in the data with our developer
and title
, it simply won’t get written to the record. To accommodate new data to add, we'd have to change the UpdateExpression
and the ExpressionAttributeValues
like this:
const data = {title: 'Uncharted', developer: 'Naughty Dog', publisher: 'Sony'}
const params = {
TableName: process.env.DYNAMODB_TABLE,
Key: { title: data.title },
UpdateExpression: 'SET developer = :dev', publisher = :pub,
ExpressionAttributeValues: {
':dev': data.developer,
':pub': data.publisher
}
};
await dynamodb.update(params).promise()
}
Again, this works, but if you ever update your object without every single attribute, the request will fail with the ExpressionAttributeValues cannot be NULL error. You're not allowed to update just the developer, or just the publisher with this approach.
This completely goes against what I’d read and learned about as one of the main selling points of NoSQL: flexibility.
How We Fix It
So looking at the Stack Overflow article I posted above gave me a few ideas that got me to a solution I feel good about.
If you look at the structure of the params that you actually supply to the dynamodb.update()
function, it’s just a simple object with UpperCamelCased keys. So rather than hardcode the attributes we want to update, we can dynamically build our update function based on whatever attributes have been provided.
So the key things we need to build are the UpdateExpression
and the ExpressionAttributeValues
.
Programming UpdateExpression
UpdateExpression
uses an SQL-like syntax to SET, ADD (increment), or REMOVE certain attributes on the item you’re updating. If you want to SET multiple attributes in a single expression it will look like what we did above:
SET latestUpdate = :updateTime, developer = :developer, publisher = :publisher
Those colon-prefixed keys like :updateTime
will be found in the ExpressionAttributeValues
. They're arbitrary placeholders, so name them anything you want as long as they match a value in the ExpressionAttributeValues
. You can think of it as the value substitution (aka "Parameterized Queries") you find in SQL.
You can probably see how we go about solving this then:
- we take all the keys found in data object we're updating,
- map them to properties we want to update,
- create a colon-prepended key for them, then
- add those values to the
ExpressionAttributeValues
list.
The final result looks like this:
const data = {title: 'Uncharted', developer: 'Naughty Dog', publisher: 'Sony'}
const params = {
TableName: process.env.DYNAMODB_TABLE,
Key: { title: data.title },
};
params.UpdateExpression = "SET latestUpdate = :updateTime"
params.ExpressionAttributeValues = {}
for (let k in data) {
if (k !== 'title') {
params.UpdateExpression += `, ${k} = :${k}`
params.ExpressionAttributeValues[`:${k}`] = data[k]
}
}
params.ExpressionAttributeValues[':updateTime'] = Date.now()
await dynamodb.update(params).promise()
You’ll notice a few things about the above code:
- we start our
UpdateExpression
withSET latestUpdate = :updateTime
; the purpose of that is so we don’t need to deal with the edge case of the first item in our iteration through the keys. With that starting off our expression, we can simply concatenate the string with all the keys, starting with a colon to separate each value. Makes life easier, plus it's good practice to save that information. Two birds, one stone. - We have to set the initial value of
ExpressionAttributeValues
to an empty object{}
, so we can write the keys and values to it. - In our
for
loop, there’s anif
statement because we want to get every attribute excepttitle
, which is theKey
to our table. Even if it hasn’t changed, we can’t put that in theUpdateExpression
or DynamoDB will error. You're not allowed to put theKey
in theUpdateExpression
.
Now we should see no errors when we send two separate requests to the same endpoint, the first which will create the game, the second which will update it with a new attribute.
Personal Takeaway
I’m new to DynamoDB so I’m not sure if it’s crazy or clever that this was the way to achieve this type of behavior in the update
function. I have my complaints about the default implementation of the update
function, but at the same time it wasn't a huge leap of imagination to get the desired behavior, so perhaps in a sense this is as flexible as marketed.
In only a few days of playing with it, I'm starting to see the patterns one can utilize to fill in the gaps where desired built-ins might be missing. Let me know if this was helpful for you or if you've faced any similar blockers with DynamoDB!
Top comments (2)
If this is the way to update Dynamodb, it seems like race conditions are going to be a problem. Imagine multiple editors of a document. Each editor will have its own state, slightly out of sync with the other editor. If editor 1 changes attribute A1, and editor 2 changes attribute A2, the result won't be what one of the editors expects.
Good topic. Still reading. I think you meant to close the quote for this string:
UpdateExpression: 'SET developer = :dev', publisher = :pub,