In this tutorial, we'll dive deep into Amazon DynamoDB, a fast and fully managed NoSQL database service designed for seamless scalability and low-latency performance. You'll gain a solid understanding of both the theoretical concepts and practical aspects of working with DynamoDB, allowing you to leverage its power for your data storage needs.
DynamoDB is a common choice for applications that need low-latency data access. Let's learn and practice some of the fundamentals of this fully managed NoSQL database service from AWS. If you're into building scalable, serverless, and high-performance applications, this tutorial is for you!
Table of Contents
- What is Amazon DynamoDB
- Main Components
- Creating DynamoDB Tables
- Working with Data
- Data Model and Schema
- Why DynamoDB?
- DynamoDB vs. other DB Services
- Conclusion
- Additional Resources
1. What is Amazon DynamoDB
Amazon DynamoDB is a fully managed NoSQL database service offered by Amazon Web Services (AWS). It is designed for developers who need a fast, scalable, and highly available database for modern applications. In this section, we'll explore why DynamoDB is a popular choice and its key features.
For those of you who are new to this terminology, NoSQL databases are non-tabular databases that store data differently from relational tables. They come in a variety of types based on their data model; DynamoDB supports both the key-value and the document data models.
DynamoDB only requires a primary key to create a table; it doesn't require a schema for the rest of the attributes. Tables can grow to virtually any amount of data and serve high levels of traffic, so you can expect consistent performance as your application scales up. It's also pretty simple to learn: a small API follows a key-value approach to store, access, and retrieve data.
2. Main Components
DynamoDB comprises three fundamental units:
Attributes: This is the simplest element in DynamoDB that stores data without any further division. Each attribute has a name and a value. DynamoDB supports various data types for attributes, including strings, numbers, binary data, lists, and maps. Attributes are used to store the actual data in your items.
Items: Items are individual data records within a DynamoDB table. Each item is uniquely identified by a primary key, which can consist of one or two attributes: a partition key (mandatory) and an optional sort key. Items can also have additional attributes that provide data for each record.
Tables: Tables are the highest-level data structures in DynamoDB. They are where you store your data. Each table consists of items, and each item represents an individual data record. Tables are schema-less, meaning that items within a table do not need to have the same attributes, allowing flexibility in your data modeling.
Key Features
Managed Service: DynamoDB is a fully managed service, which means AWS takes care of the operational aspects like server provisioning, scaling, and maintenance. This allows developers to focus on building applications instead of managing infrastructure.
Scalability: DynamoDB can automatically scale to handle high-traffic workloads without the need for manual intervention. It can handle millions of requests per second, making it suitable for applications with variable workloads.
Performance: It offers single-digit millisecond latency for read and write operations, making it an excellent choice for applications that require low-latency access to data.
High Availability: DynamoDB is designed for high availability with built-in data replication and automatic failover across multiple Availability Zones. Your data remains accessible even in the event of hardware failures.
Security: It provides robust security features, including encryption at rest and in transit, fine-grained access control with AWS Identity and Access Management (IAM), and VPC (Virtual Private Cloud) endpoints for private network access.
Serverless Triggers: You can integrate DynamoDB with AWS Lambda to create serverless workflows that respond to changes in your data, enabling real-time processing and automation.
Global Reach: With global tables, DynamoDB offers multi-Region, multi-active replication, allowing you to deploy databases globally with low-latency access for users worldwide.
Pay-as-You-Go Pricing: DynamoDB uses a pay-as-you-go pricing model, where you only pay for the read and write capacity you consume and the storage you use, with no upfront costs or long-term commitments.
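To make the Serverless Triggers feature above a bit more concrete, here is a minimal sketch of a Lambda handler that receives DynamoDB Streams records. The function name `summarize_changes` and the sample event are made up for illustration. The record fields used (`eventName`, `dynamodb.Keys`, `dynamodb.NewImage`) follow the stream record format, and `NewImage` is only present when the stream is configured to include new images.

```python
def summarize_changes(event):
    """Return one short line per record in a DynamoDB Streams event."""
    lines = []
    for record in event.get("Records", []):
        action = record["eventName"]        # INSERT, MODIFY or REMOVE
        keys = record["dynamodb"]["Keys"]   # primary key in typed JSON form
        new_image = record["dynamodb"].get("NewImage")  # absent on REMOVE
        attributes = sorted(new_image) if new_image else []
        lines.append(f"{action} keys={keys} attributes={attributes}")
    return lines


def lambda_handler(event, context):
    # Entry point that Lambda would call; here it only logs each change.
    for line in summarize_changes(event):
        print(line)
    return {"processed": len(event.get("Records", []))}


# Hypothetical event with a single INSERT, shaped like a stream batch.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {
                    "ProductID": {"N": "101"},
                    "Category": {"S": "Electronics"},
                },
                "NewImage": {
                    "ProductID": {"N": "101"},
                    "Category": {"S": "Electronics"},
                    "ProductName": {"S": "Smartphone"},
                },
            },
        }
    ]
}
result = lambda_handler(sample_event, None)
```

Keeping the record parsing in its own function makes it easy to test locally with a hand-written event before wiring the handler to a real stream.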
Use Cases
DynamoDB is well-suited for a wide range of use cases, including:
Real-time applications that require low-latency access to data.
Internet of Things (IoT) applications for managing device data.
Gaming applications for user profiles, leaderboards, and in-game items.
Session management and user authentication in web and mobile apps.
Content management systems and catalogs.
Ad tech platforms for tracking user behavior and ad impressions.
3. Creating DynamoDB Tables: (Practical Exercise) Creating a Sample Data Model
Let's go through the steps to create a DynamoDB table using the AWS Management Console:
Log in to the AWS Management Console.
Navigate to the DynamoDB Dashboard.
Click "Create table" and specify the table name and primary key attributes.
Configure the provisioned throughput or choose on-demand capacity.
Create the table.
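If you choose provisioned throughput in the steps above, it helps to know roughly what a capacity unit buys you. The sketch below estimates read and write capacity units from item size and request rate. It assumes the documented unit sizes (one read unit covers a strongly consistent read of up to 4 KB per second, an eventually consistent read costs half, and one write unit covers a standard write of up to 1 KB per second) and ignores transactional requests. The function names are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # one read unit covers up to 4 KB
WRITE_UNIT_BYTES = 1024     # one write unit covers up to 1 KB


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the read capacity units needed for a steady read rate."""
    units_per_read = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate the write capacity units needed for standard writes."""
    units_per_write = max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES))
    return units_per_write * writes_per_second


# A 6,000-byte item spans two 4 KB read blocks and six... well, two 1 KB+ blocks
# would not be enough for writes, so sizes are always rounded up per request.
rcu = read_capacity_units(6000, 10)
rcu_eventual = read_capacity_units(6000, 10, strongly_consistent=False)
wcu = write_capacity_units(1500, 5)
```

With these assumptions, 5 read capacity units are enough for five strongly consistent reads per second of items up to 4 KB each.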
Let's create a sample data model for a hypothetical e-commerce application using DynamoDB.
{
"TableName": "EcommerceProducts",
"KeySchema": [
{ "AttributeName": "ProductID", "KeyType": "HASH" },
{ "AttributeName": "Category", "KeyType": "RANGE" }
],
"AttributeDefinitions": [
{ "AttributeName": "ProductID", "AttributeType": "N" },
{ "AttributeName": "Category", "AttributeType": "S" }
],
"ProvisionedThroughput": {
"ReadCapacityUnits": 5,
"WriteCapacityUnits": 5
}
}
This schema defines a DynamoDB table for storing e-commerce product data.
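The same definition can be sent from code instead of the console. Below is a minimal sketch using boto3, the AWS SDK for Python. It assumes boto3 is installed and that credentials and a region are already configured; the helper names are hypothetical. The small check reflects a DynamoDB rule: every attribute used in `KeySchema` must also appear in `AttributeDefinitions`.

```python
def build_create_table_params():
    """Return the table definition from this section as a Python dict."""
    return {
        "TableName": "EcommerceProducts",
        "KeySchema": [
            {"AttributeName": "ProductID", "KeyType": "HASH"},   # partition key
            {"AttributeName": "Category", "KeyType": "RANGE"},   # sort key
        ],
        "AttributeDefinitions": [
            {"AttributeName": "ProductID", "AttributeType": "N"},
            {"AttributeName": "Category", "AttributeType": "S"},
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


def undefined_key_attributes(params):
    """Return key attributes that are missing from AttributeDefinitions."""
    defined = {d["AttributeName"] for d in params["AttributeDefinitions"]}
    used = {k["AttributeName"] for k in params["KeySchema"]}
    return used - defined


def create_table(params):
    """Create the table and wait until it exists (calls AWS, may incur cost)."""
    import boto3  # imported lazily so the helpers above run without the SDK

    client = boto3.client("dynamodb")
    client.create_table(**params)
    client.get_waiter("table_exists").wait(TableName=params["TableName"])


params = build_create_table_params()
missing = undefined_key_attributes(params)
# create_table(params)  # uncomment to actually create the table in your account
```

Non-key attributes such as ProductName or Price are deliberately absent from the definition, since only key attributes are declared up front.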
4. Working with Data
Now let's populate our EcommerceProducts table with some sample data. Once the table is created, click on the table, then the Explore table items button in the top right, and then the Create item button. The easy way is to add the attributes manually by assigning a Name, Value, and Type to each one, but I prefer the JSON view: you can paste these values into that field and then click Create item.
{
"ProductID": {
"N": "101"
},
"Category": {
"S": "Electronics"
},
"Price": {
"N": "699.99"
},
"ProductName": {
"S": "Smartphone"
}
}
Easy peasy, isn't it?
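The item above is written in DynamoDB's typed JSON, where every value is wrapped in a type descriptor such as `S` (string) or `N` (number, sent as a string). As a rough illustration of that format, here is a hand-rolled converter from plain Python values. It is a simplified stand-in, not the SDK's own serializer (boto3 ships a `TypeSerializer` for real use), and it only covers a handful of types.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Wrap a plain Python value in DynamoDB's typed JSON form."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, Decimal)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(plain):
    """Convert a flat dict of attributes into a typed DynamoDB item."""
    return {name: to_attribute_value(value) for name, value in plain.items()}


item = to_item(
    {
        "ProductID": 101,
        "Category": "Electronics",
        "Price": Decimal("699.99"),
        "ProductName": "Smartphone",
    }
)
```

Using `Decimal` for the price avoids binary floating-point rounding, which is one reason the sketch does not accept `float`.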
You can use the AWS SDKs or AWS CLI to interact with DynamoDB programmatically. Here's an example of adding data to our "EcommerceProducts" table:
aws dynamodb put-item --table-name EcommerceProducts --item '{
"ProductID": {"N": "101"},
"Category": {"S": "Electronics"},
"Price": {"N": "699.99"},
"ProductName": {"S": "Smartphone"}
}'
This command writes the same item to the table as we did before in the AWS Console. Note that put-item replaces any existing item that has the same primary key.
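For the SDK route, here is a minimal sketch of the same write, plus a read, using the boto3 low-level client. The helper names `put_product` and `get_product` are hypothetical, and the client is passed in as a parameter so the functions can be exercised with a stub. Because the table has a composite primary key, `get_item` needs both ProductID and Category.

```python
TABLE_NAME = "EcommerceProducts"


def put_product(client, product_id, category, name, price):
    """Write one product; an existing item with the same key is replaced."""
    client.put_item(
        TableName=TABLE_NAME,
        Item={
            "ProductID": {"N": str(product_id)},
            "Category": {"S": category},
            "ProductName": {"S": name},
            "Price": {"N": str(price)},
        },
    )


def get_product(client, product_id, category):
    """Read one product by its full primary key, or return None."""
    response = client.get_item(
        TableName=TABLE_NAME,
        Key={
            "ProductID": {"N": str(product_id)},
            "Category": {"S": category},
        },
    )
    return response.get("Item")  # the "Item" key is missing when nothing matches


# Usage against a real table (assumes boto3 and configured credentials):
#   import boto3
#   client = boto3.client("dynamodb")
#   put_product(client, 101, "Electronics", "Smartphone", "699.99")
#   print(get_product(client, 101, "Electronics"))
```

The price is passed as a string here so that it reaches DynamoDB exactly as written.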
5. Data Model and Schema
DynamoDB is schema-less. What does that mean? Apart from the primary key, DynamoDB doesn't require a schema to create a table, so you can define your data structure dynamically. It uses the concept of items (records) and attributes (fields) to store and retrieve data.
The structure of a DynamoDB table also involves primary keys, partition keys, sort keys, and partitions; you can dive deeper into these in the official AWS documentation. Attributes are name-value pairs, much like the keys and values in a JSON document. But wait... so is a DynamoDB table basically made up of JSON documents? Pretty much: each item looks a lot like a JSON document.
The following JSON-like structure shows a sample DynamoDB table called Employees. In it you can find items and attributes. Items are similar to rows or records in relational database systems, and each item represents a different employee in the table. Each item is composed of one or more attributes, so in this sample table each employee has attributes such as EmployeeID, Company, Username, and so on.
Employees table
{
'EmployeeID': 101,
'Company': 'Rose-Hill',
'Username': 'ahenry',
'Name': 'Antonio Henry',
'Sex': 'M'
},
{
'EmployeeID': 102,
'Company': 'Lowe, Johnson and Flynn',
'Username': 'fwashington',
'Name': 'Frank Washington',
'Sex': 'M',
'Address': {
'Street': '470 David Ports Apt.281',
'City': 'Chapmanchester',
'ZIPCode': 'MP 12525'
}
},
{
'EmployeeID': 103,
'Company': 'Cook-Crawford',
'Username': 'lhawkins',
'Name': 'Lisa Hawkins',
'Sex': 'F',
'Mail': 'hyoung@hotmail.com',
'Birthdate': '2002-11-21'
}
Note the following about the Employees table:
- Each item in the table has a primary key, just as in the relational model, that uniquely identifies it among all of the other items in the table. In the Employees table, the primary key consists of one attribute (EmployeeID).
- Some of the items have a nested attribute (Address). DynamoDB supports nested attributes up to 32 levels deep.
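To see the schema-less idea in action, the sketch below loads a trimmed copy of the Employees items as plain Python dicts and works out which attributes every item shares and which are optional. The helper names are hypothetical, and the birthdate is stored as an ISO 8601 string purely for illustration.

```python
employees = [
    {"EmployeeID": 101, "Company": "Rose-Hill", "Username": "ahenry",
     "Name": "Antonio Henry", "Sex": "M"},
    {"EmployeeID": 102, "Company": "Lowe, Johnson and Flynn",
     "Username": "fwashington", "Name": "Frank Washington", "Sex": "M",
     "Address": {"Street": "470 David Ports Apt.281",
                 "City": "Chapmanchester", "ZIPCode": "MP 12525"}},
    {"EmployeeID": 103, "Company": "Cook-Crawford", "Username": "lhawkins",
     "Name": "Lisa Hawkins", "Sex": "F", "Mail": "hyoung@hotmail.com",
     "Birthdate": "2002-11-21"},
]


def shared_attributes(items):
    """Attribute names that appear in every item."""
    name_sets = [set(item) for item in items]
    return set.intersection(*name_sets) if name_sets else set()


def optional_attributes(items):
    """Attribute names that appear in some items but not all."""
    all_names = set().union(*(set(item) for item in items))
    return all_names - shared_attributes(items)


shared = shared_attributes(employees)
optional = optional_attributes(employees)
city = employees[1]["Address"]["City"]  # reaching into a nested attribute
```

Only EmployeeID is required by the table itself; the other shared attributes just happen to be present in every sample item.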
Now let's understand a little more about how partition keys and sort keys work in DynamoDB by loading a table called Music that has some of my favourite artists in it:
Music table
{
"ArtistID": {"N": "50"},
"ArtistName": {"S": "Metallica"},
"AlbumID": {"N": "1"},
"AlbumName": {"S": "...And Justice For All"},
"SongTitle": {"S": "Dyers Eve"},
"Genre": {"S": "Metal"}
}
{
"ArtistID": {"N": "1"},
"ArtistName": {"S": "AC/DC"},
"AlbumID": {"N": "4"},
"AlbumName": {"S": "Let There Be Rock"},
"SongTitle": {"S": "Whole Lotta Rosie"},
"Genre": {"S": "Rock"}
}
{
"ArtistID": {"N": "132"},
"ArtistName": {"S": "Soundgarden"},
"AlbumID": {"N": "7"},
"AlbumName": {"S": "Superunknown"},
"SongTitle": {"S": "Black Hole Sun"},
"Genre": {"S": "Rock"}
}
Note that the primary key for the Music table consists of two attributes (ArtistID and SongTitle). Each item in the table must have these two attributes. The combination of ArtistID and SongTitle distinguishes each item in the table from all of the others.
DynamoDB supports two different kinds of primary keys:
- The first is the simple primary key, composed of a single attribute known as the partition key. In a table that has only a partition key, no two items can have the same partition key value.
- The other is the composite primary key, which combines a partition key and a sort key. This type of key is composed of two attributes: the first is the partition key, and the second is the sort key.
DynamoDB uses the partition key value as input to an internal hash function. The output from the hash function determines the partition (physical storage internal to DynamoDB) in which the item will be stored. All items with the same partition key value are stored together, in sorted order by sort key value.
In a table that has a partition key and a sort key, it's possible for multiple items to have the same partition key value. However, those items must have different sort key values.
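The storage behaviour described above can be sketched with a toy in-memory table. This is only an illustration: DynamoDB's real hash function and partition count are internal, so SHA-256 and the fixed `NUM_PARTITIONS` below are stand-ins, and the `ToyTable` class is hypothetical. What the sketch does show is that items with the same partition key land in the same partition, are kept in sort key order, and must differ in their sort key.

```python
import hashlib

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages the real partition count


def partition_for(partition_key):
    """Map a partition key value to a partition number (stand-in hash)."""
    digest = hashlib.sha256(str(partition_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


class ToyTable:
    """In-memory model of a table with a composite primary key."""

    def __init__(self):
        self.partitions = {n: [] for n in range(NUM_PARTITIONS)}

    def put(self, partition_key, sort_key, attributes):
        bucket = self.partitions[partition_for(partition_key)]
        # An item with the same full primary key replaces the old one.
        bucket[:] = [row for row in bucket if row[:2] != (partition_key, sort_key)]
        bucket.append((partition_key, sort_key, attributes))
        # Keep each partition ordered by partition key, then sort key.
        bucket.sort(key=lambda row: (row[0], row[1]))

    def query(self, partition_key):
        """Return all items for one partition key, in sort key order."""
        bucket = self.partitions[partition_for(partition_key)]
        return [row for row in bucket if row[0] == partition_key]


music = ToyTable()
music.put(132, "Face Pollution", {"AlbumName": "Badmotorfinger"})
music.put(132, "Black Hole Sun", {"AlbumName": "Superunknown"})
music.put(50, "Dyers Eve", {"AlbumName": "...And Justice For All"})
soundgarden_titles = [row[1] for row in music.query(132)]
```

Here the partition key is ArtistID and the sort key is SongTitle, so both songs stored under ArtistID 132 come back together and in title order.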
In our Music table, two items share a partition key value if we add another Soundgarden item with the same ArtistID. In this case the composite primary key is made up of ArtistID and SongTitle, so both items with ArtistID 132 are stored in the same partition and are told apart by their SongTitle.
{
"ArtistID": {"N": "132"},
"AlbumID": {"N": "5"},
"AlbumName": {"S": "Badmotorfinger"},
"ArtistName": {"S": "Soundgarden"},
"Genre": {"S": "Rock"},
"SongTitle": {"S": "Face Pollution"}
}
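Once several items share a partition key, you can fetch them together with a Query. The sketch below uses the boto3 low-level client's `query` call and assumes the Music table was created with ArtistID as the partition key and SongTitle as the sort key. The helper name `songs_by_artist` is hypothetical, and for brevity it ignores pagination (a real result set may continue via `LastEvaluatedKey`).

```python
def songs_by_artist(client, artist_id):
    """Return the song titles stored under one ArtistID."""
    response = client.query(
        TableName="Music",
        KeyConditionExpression="ArtistID = :artist",
        ExpressionAttributeValues={":artist": {"N": str(artist_id)}},
    )
    # Items come back ordered by sort key (ascending by default).
    return [item["SongTitle"]["S"] for item in response["Items"]]


# Usage against a real table (assumes boto3 and configured credentials):
#   import boto3
#   print(songs_by_artist(boto3.client("dynamodb"), 132))
```

Passing the client in as an argument keeps the function easy to try out with a stub before pointing it at a real table.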
You can read and learn more from the DynamoDB Core Components in AWS official documentation.
6. Why DynamoDB?
When data no longer fits on a single MySQL server or when a single machine can no longer handle the query load, some strategy for sharding and replication is required. The pitch behind most NoSQL databases is that because they were designed from the ground up to be distributed and to handle large data volumes, they provide some combination of benefits that a simple relational database can't easily offer.
So NoSQL databases allow us to model data close to what the application requires. The relational database management system's way of thinking is to usually force fit every domain model into a structure of tables and columns, and this has led to a plethora of artifacts that try to solve the impedance mismatch problem.
What you need in this case is something that can scale with that load and accept a variety of data types. DynamoDB is easy to set up and scales smoothly, which is why companies such as Expedia and Nexon use Amazon DynamoDB as their primary database, leveraging its features to deliver steady latency and a stable application experience to their customers.
7. DynamoDB vs other DB Services
Moving on, let's compare DynamoDB with its sister AWS database services, such as Amazon RDS. Each of these services differs slightly in terms of storage, maintenance, and scaling, and it can sometimes be pretty difficult to determine which one best fits your needs.
In fact, in many situations, multiple databases are a part of a single solution. So let's go through an overview of each of them to help inform your decision further.
8. Conclusion
In this tutorial, we've delved into the world of Amazon DynamoDB, from its data model and schema to creating tables, and working with data.
As you continue your journey with DynamoDB, consider exploring advanced topics such as best practices for optimizing query performance and features like Global Tables and security controls.
Now equipped with a solid foundation, you can confidently harness the scalability and speed of Amazon DynamoDB for your data storage needs.
9. Additional Resources
[AWS DynamoDB Documentation](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html)
[AWS DynamoDB Tutorial](https://www.youtube.com/watch?v=k0fcbRj_pZE)
[AWS DynamoDB Schema Design](https://www.youtube.com/watch?v=XvD2FrS5yYM)
Happy data modeling and querying!