I've been learning about Data Structures and it all seems very rocket science. 🙃 To help me understand it better (learning to learn) I've made a small Zine about one of them - Hash Tables ⚡️
At the bottom, you can check the full written version with resources, but for now, check out the illustrative version!
Feel free to download the printable version. Here's also the original handmade version.
... and now the written version...
The Magic of Hash Tables
What it is
The most efficient data structure to 🔍Search, ➕Insert and 🗑Delete.
Does all of it in O(1) complexity on average (i.e. Takes the same time to do it, no matter the size of data available)
It's used for Caching, Database Searching, and Sets
How it works
An unordered collection of unique keys is mapped to values through the process of hashing.
Take a key --> Run it in a hash function --> that generates a hash
Portuguese --> ✨ --> 2
French --------> ✨ --> 0
English -------> ✨ --> 3
... and add it to the matched index on the table (also known as "bucket"):
hash | value |
---|---|
0 | Salut |
1 | - |
2 | Olá |
3 | Hello |
Hashing is one-way!
A hash can’t be converted back to its original key.
ex: Hashing "English" returns 3, but with 3 you can't get "English"
A hash function must:
- 🏎 Be fast to run.
- 🧩 Uniform distribution.
- 💥 Resolve hash collisions.
How to handle Collisions?
A collision happens when a key corresponds to a hash (index) already taken by another key.
Open Addressing
When a bucket is already taken, other buckets are examined until an unused bucket is found.
- 💚 Fast when the number of collisions is small.
- 🚨 Tables can’t be smaller than the number of keys.
Chaining
A bucket can take multiple keys with the same index.
- 💚 Good for a large number of collisions.
- 🚨 Wastage of space (some buckets are never used).
💡FAQ
What is the ideal Load Factor?
Some say it’s around 0.7. This means a table with 70 items should have around 100 buckets. But it’s a balance. The emptier the buckets, the faster it is, you have fewer collisions but more memory is used.
What is Dynamic Resize?
It happens when a table is getting out of balance - it’s too full or too empty. It resizes the table and rehashes the keys.
What’s the best Hash Function?
There isn’t a one-fits-all solution. It’s a trade-off between speed and distribution.
A good hash function is the secret to an efficient hash table.
TIP: When the keys are static use a Perfect Hash Function. * No collisions * No empty buckets *
When should I avoid Hash Tables?
- Search is not the main operation.
- Need to iterate over the elements.
- There are duplicated keys.
🎁 Bonus: Applying Simple Hash Table Concept in Javascript!
😰 When using If statements...
You end in an if hell (or switch hell) when dealing with different scenarios to assign a variable.
if(viewport === 'sm') {
fontSize = 1.0;
} else if (viewport === 'md') {
fontSize = 1.2;
} // else if ... else if ...
⚡️ But using key:value
mapping...
You get a direct lockup - O(1) complexity - no matter the number of options, with a cleaner code. How cool is that?
const sizes = {
sm: 1,
md: 1.2,
lg: 1.2
};
const fontSize = sizes[viewport];
Resources to get started:
Thank you for reading!
Top comments (3)
Understood it better from this article than I did from my semester course.
Also I believe it is else if not if else after the initial if statement.
Wow, that's great feedback, thank you! That
if else
typo was so 🤦♀️, fixed!Thanks for making it super simple!