This is the second part of my rambling discussion on stuff I want to build. You can find the first part here.
Simple Structures
I’m going to start from some first principles. The odds are good that you’re already familiar with everything I’m about to say, and that’s great. But maybe it's new, or you want a refresher. Either way, I want to make sure we’re on the same page before I move to the next step.
Most data today is in the form of a record. This is just a fancy term for grouping data together. For instance, I could group names and ages.
(Mary, 32)
(Paula, 21)
(Andrew, 35)
(Oh, you didn’t think I’d put my real age there, did you? 😉)
Within a record, there are either simple data types, such as strings and numbers, or other record types. This ought to look familiar to anyone who has used any programming languages. The parts of each record will usually have labels to refer to each specific part, or else the developer is left to remember which part is in which position (such as in a tuple for languages like Python). So the first parts here are called name and the second parts are called age.
In modern languages, we have a choice of designing a record specifically for the data or using ad hoc map structures that use the part labels as keys. For instance, in Javascript, we can have:
a_person = {name: "Mary", age: 32}
class Person {
constructor(name, age) {
this.name = name;
this.age = age;
}
}
a_person = new Person('Mary', 32);
The latter looks more complex, but it is especially useful when we want to guarantee that the structure contains required information, or otherwise validate the data being provided.
Once we have this simple record type, how do we go about storing more than one? Most languages provide several ways. In Javascript, you would probably use an array. But now that we have some basic elements of records I’m going to look at the Linked List.
Linked Lists
Linked Lists are one of the first data structures taught in CS courses. Well, they were one of the first that I was taught (confession: I skipped class that day). They are often overlooked, due to their limitations, but there are occasions when they are quite valuable. They’re also illustrative of some of the things that I want to explore, so I’m going to work with them here.
Say I want to store a series of numbers: 1, 1, 2, 3, 5, 8, 13, 21, 34
The basic idea of a linked list is to create a daisy chain of simple elements which each hold a single value, and then refer to the next element.
Each element then needs to have the value being stored there, and also refer to the next one along. The trick is that the last one refers to nothing when it points to the next one.
Let’s try that, using Javascript again:
class Element {
constructor(value, next = null) {
this.value = value;
this.next = next;
}
}
Worth noting is that when the next
parameter is not provided, then it gets set to null
. This just means that a single element list is being created, with nothing else following it.
A trick to building a linked list is to build them from the back. That way when you want the “next” element in the chain, you’ve already made it. You can do it from the front, but then you need to use up extra memory (usually on the call stack) as you run towards the end. The computer still builds the chain from the back, even if your code doesn’t look that way.
But in this case, I’m going to build the chain, starting from 34 and moving back to the front:
let list = new Element(34);
list = new Element(21, list);
list = new Element(13, list);
list = new Element(8, list);
list = new Element(5, list);
list = new Element(3, list);
list = new Element(2, list);
list = new Element(1, list);
list = new Element(1, list);
This can (and should be!) written in a friendlier way, but that won’t get us any closer to where we’re going, so I’m going to leave all the raw parts here for you to see.
First of all, you can see that the final element in the chain has nothing following it, so this is left as the default, which is just null
. You’ll also see that I’m using a list
variable to hold the start of the list all the time. So the list
variable keeps shifting to the most recent start of the list. Each time a new element is added to the start of the list, then the existing list
value is used for the remainder of the list, and then the list
variable is updated to refer to the new start.
I keep referring to "the list" and "the rest of the list". Traditionally, the beginning of a linked list is called the head, and the remaining list is called the tail. Note that both the head and the tail are both valid lists, and that the tail list is always 1 shorter than the list that starts at the associated head.
What does the resulting structure look like? Is it as we expect?
Element {
value: 1,
next: Element { value: 1, next: Element { value: 2, next: [Object] } } }
It seems to be, but the nested data gets cut off quickly, so let’s build a function that creates a more readable string.
Linked lists are ideal for being processed with recursive functions. I remember when recursion was first shown to me. I followed what was being said easily enough, but every time I tried to write something recursive I always struggled. I just wasn’t getting it. But these days I take recursive functions for granted. So it is possible to learn them! Really.
Meanwhile, if I gloss over concepts like that, then I apologize. Ask me for clarification.
For linked lists, a recursive function generally falls into a standard pattern:
- Do something with the first element of the list.
- If there is a tail, then call the function again on that tail.
In this case, the operations are short enough:
- Look at the start of the list and return what’s there.
- If there is no more list, then that’s it.
- But if there is more list, then add a comma and then add on... the same thing again for the rest of the list:
function listString(element) {
if (element.next === null) return element.value;
else return element.value + ', ' + listString(element.next);
}
listString(list)
'1, 1, 2, 3, 5, 8, 13, 21, 34'
Which is what we expected, so we know the structure is working.
There are lots of things that can be done with linked lists. We can fill them from arrays, or pull them back out into arrays. We can insert extra numbers into the middle, we can remove elements, we can reverse them, we can process each element in turn… the possibilities go on. Many of these operations are quite important in contexts where we want to update structures, but this discussion will be focusing on structures that are rarely updated, so we can leave that for another time.
Instead, we are looking at how these structures may be represented in storage. To do that, the next article will consider how this linked list structure might look in an older language, like C.
On to Part 3.
Top comments (2)
This is similar to how they suggest you insert data into lists in Elixir as well. By adding an element as the head and the rest of the list as a tail, you save up on precious time trying to insert new elements.
took me a minute to realise that was a picture of an actual chain of daisies. Good joke