Intro 🌐
Last time, we learned what a Hash Table is and why we want to use it.
Today, we'll learn how to write a simple Hash Function.
Problem: WHY do we need a Hash Function?
To access a value in an array, we need its index.
But we want to use our Hash Table with a key instead of an index, e.g. user.name
instead of user[0]
.
To convert the index to a key, we need a helper function that does this task for us.
This is where the hash functions comes into play.
What is a Hash Function? ▶️
For our Hash Table, we need a function, that converts a key into an array index.
Let's use the example from the last post.
I want to build something like this:
const user = {
name: 'miku86',
homeCountry: 'Germany',
age: 33,
}
Because we want to use an array under the hood, the implementation could look like this:
const user = [33, 'miku86', 'Germany']
So when we want to get the correct index of:
-
name
, we want to runmyHashFunction("name")
and get back1
. -
homeCountry
, we want to runmyHashFunction("homeCountry")
and get back2
. -
age
, we want to runmyHashFunction("age")
and get back0
.
As you can see, there is no order in the array, the array index is solely bound to the key.
Let's think about the hash function's constraints:
Deterministic: Anytime we input the same key, we should get back the same array index, otherwise we won't find our value, because our data doesn't change its place in the array.
Fast: We need to use the hash function every time we read, create, update, delete data, therefore it should be fast and shouldn't be connected to the length of the existing data (O(1)
).
Uniform Distribution: Think about an array of length 2. If we want to add 3 values to an array with 2 places to store data, we would have to store 2 values in 1 place (or lose data (we don't want to do that)). This would be a collision, meaning two values live at one place. Because collisions lead to additional computational work (we have to find the desired value), we want an uniform distribution of our array indexes.
How do we build a Hash Function? 🛠
There are a lot of hash functions out there. Because we want to learn about the bigger concepts, our hash function will be far (really far) away from being good.
Version 1: length of the key
Let's be creative and think about a simple hash function, that could probably work, let's take the length of our key.
The length of:
-
name
is 4 -
homeCountry
is 11 -
age
is
This works for this small example. But as you can guess, there is a high probability that this will lead to collisions very quickly. If we increase the amount of our key-value pairs to 100 instead of 3, we would end up with an array that has a lot of collisions between index ~3 and 10, because most (english) words are fairly short in length, so many keys would get the same hash.
Version 2: sum of character codes of the key
Every character has a UTF-16 code. JavaScript has a function to get this value, charCodeAt
.
Example: name
has the charcodes:
- n:
110
- a:
97
- m:
109
- e:
101
If we sum these charcodes, we get 417
.
Implementation of our simple (but not-so-good) hash function
Our tasks:
- split the key into its characters:
name
=>n
,a
,m
,e
- convert every character into its character code: n:
110
, a:97
, m:109
, e:101
- sum all character codes: 110 + 97 + 109 + 101 => 417
const hash = key => {
// split the key into its characters
const chars = key.split('')
// convert every character into its character code
const charCodes = chars.map(char => char.charCodeAt())
// sum all character codes
const charCodeSum = charCodes.reduce((acc, cur) => acc + cur)
// return the sum
return charCodeSum
}
Result:
hash('name')
// 417
Thoughts 💭
We created a simple hash function to grasp the concept.
The hash function takes a key as input and returns the array index where it should get saved.
As you can see, hash functions are a very complex topic. If you want to dive deeper into it, I invite you to read the Further Reading
section.
Next Part ➡️
We will learn how to handle collisions.
Don't miss interesting stuff, go to!
Further Reading 📖
- Simple Wiki: Hash Function
- Wikipedia: Hash Function
- Wikibooks: Hash Function
- Wikipedia: List of Hash Functions
Questions ❔
- What's the easiest method to generate a collision with our hash function?
- How can we improve our hash function?
- What are the advantages of our implemented hash function?
- What are the disadvantages of our implemented hash function?
Top comments (0)