Array is the term most commonly associated with [] across programming languages.
In Python we have lists, which are roughly what Go calls slices. So why do both arrays and slices exist when we could have just one?
If you are already familiar with arrays and slices, or are only interested in the performance part, feel free to jump to the Performance section.
How to use arrays and slices
Before comparing which is better, it is important to know when arrays and slices should be used and how they are structured. The Go team has published a nice in-depth tutorial on arrays and slices on their main site.
But for beginners and the tl;dr crowd: slices are built on top of arrays to give developers flexibility. An array must be declared with a fixed length, and in a typical program we might not know the exact size we will need (perhaps when generating a dynamic tree, reading rows from a database, etc.). To simplify it further:
var arraySpace [10]string // 1. Declare
arraySpace[1] = "personA" // 2. Assign
arraySpace[20] = "personX" // 3. Assign invalid
An array would say: I have prepared a room for 10 items made of strings.
Then it goes to the second space (array indexes start from 0) and puts "personA" there.
But now someone asks the array to put "personX" in position 20. The room only has 10 spaces, so the array has no clue what to do: with a constant index like this one the compiler rejects the program, and with an index that is only known at run time the program panics.
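As a minimal sketch of what happens at the edge of the room, the program below writes into a 10-element array through an index that is only known at run time. The helper `storeAt` is hypothetical (it is not part of the original example) and recovers the panic so the result can be printed.

```go
package main

import "fmt"

// storeAt is a hypothetical helper: it writes name into room[index] and
// reports whether the write succeeded instead of letting the panic escape.
func storeAt(room *[10]string, index int, name string) (ok bool) {
	defer func() {
		// An out-of-range index raises a run-time panic; recover turns it
		// into a plain "false" result for this demonstration.
		if r := recover(); r != nil {
			ok = false
		}
	}()
	room[index] = name
	return true
}

func main() {
	var arraySpace [10]string

	// arraySpace[20] = "personX" // would not compile: constant index out of range

	fmt.Println(storeAt(&arraySpace, 1, "personA"))  // true
	fmt.Println(storeAt(&arraySpace, 20, "personX")) // false
	fmt.Println(len(arraySpace))                     // 10
}
```

Recovering from the panic is only done here to keep the demonstration running; in normal code an out-of-range index is a bug to fix rather than an error to handle.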
For slice:
sliceSpace := []string{} // 1. Declare
sliceSpace = append(sliceSpace, "personA") // 2. Add personA
sliceSpace = append(sliceSpace, "personB") // 3. Add personB
A slice would say: I am not sure how many string items will be added, so I will start with a room that does not exist yet.
What happens next is that the slice first has to create a room for one person and then put "personA" in it. When we then try to append "personB", the slice will say: the room only has a capacity of 1 person, but that is not an issue. I just have to move to a wider room with space for one more person and put "personB" in.
Declare a slice length?
Despite being meant to be dynamic, a slice can be declared with a length and a capacity. Doing so creates a slice whose elements, up to the declared length, are initialized to the zero value, and we can access those initialized values just like we access an array!
dynamicSlice := make([]int, 2, 5) // make(type, length, capacity)
fmt.Println(dynamicSlice) // [0 0]
fmt.Println(len(dynamicSlice)) // 2
fmt.Println(cap(dynamicSlice)) // 5
fmt.Println(dynamicSlice[0]) // 0
What is the difference between the length and the capacity of a slice? Simply put, capacity is the size of the room and length is the number of people in it. In the example above, we create a slice with a room that can hold 5 people and initialize 2 people in it by default. This means we can append to dynamicSlice 3 more times (because 2 people are already in the room) before the slice needs to find a bigger room.
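The room analogy can be checked with a small sketch. The helper `appendReusesRoom` is a hypothetical name introduced for this illustration; it compares the address of the first element before and after appending to see whether the slice is still using the same backing array.

```go
package main

import "fmt"

// appendReusesRoom appends n values to s and reports whether the first
// element still lives at the same address afterwards, i.e. whether append
// kept using the original backing array. s must not be empty.
func appendReusesRoom(s []int, n int) bool {
	before := &s[0]
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return before == &s[0]
}

func main() {
	// 2 people already inside, 3 more fit into a room of 5.
	fmt.Println(appendReusesRoom(make([]int, 2, 5), 3)) // true

	// The 6th person does not fit, so append moves everyone to a new room.
	fmt.Println(appendReusesRoom(make([]int, 2, 5), 4)) // false
}
```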
Performance
Now that we understand how arrays and slices come about, we will compare a declared-length slice, which writes into its backing array directly, with a dynamic-length slice.
Setup
For the setup, we will fill a struct rather than a string or an int to "up" the complexity. The idea is to loop over a range (the iterations), fill in the fields of the struct below, and add it to a slice.
type DemoStruct struct {
	iterationIndex int
	stringValue    string
	used           bool
}
const someStringValue = "AbCdEfG123"
The function for a declared-length slice is below. Rather than using append, we assign directly to the already initialized DemoStruct elements.
func LengthDefined(iterations int) []DemoStruct {
	DemoStructs := make([]DemoStruct, iterations)
	for i := 0; i < iterations; i++ {
		DemoStructs[i].iterationIndex = i
		DemoStructs[i].stringValue = someStringValue
		DemoStructs[i].used = true
	}
	return DemoStructs
}
The function below is used for the dynamic-length slice. A DemoStruct is created at every iteration and then appended to the DemoStructs slice.
func DynamicLength(iterations int) []DemoStruct {
	var DemoStructs []DemoStruct
	for i := 0; i < iterations; i++ {
		DemoStructs = append(DemoStructs, DemoStruct{
			iterationIndex: i,
			stringValue:    someStringValue,
			used:           true,
		})
	}
	return DemoStructs
}
Next, I used a benchmark setup similar to the one from the previous episode of this series.
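The benchmark harness itself is not shown here, so the following is only a sketch of one way such a measurement could be set up, not the setup actually used for the numbers below. It calls `testing.Benchmark` so that it runs as a plain program; with `go test` you would normally write `BenchmarkXxx(b *testing.B)` functions in a `_test.go` file instead. The names `measure` and `sink` are hypothetical.

```go
package main

import (
	"fmt"
	"testing"
)

type DemoStruct struct {
	iterationIndex int
	stringValue    string
	used           bool
}

const someStringValue = "AbCdEfG123"

// DynamicLength is copied from the article so the sketch is self-contained.
func DynamicLength(iterations int) []DemoStruct {
	var DemoStructs []DemoStruct
	for i := 0; i < iterations; i++ {
		DemoStructs = append(DemoStructs, DemoStruct{
			iterationIndex: i,
			stringValue:    someStringValue,
			used:           true,
		})
	}
	return DemoStructs
}

// sink keeps the last result reachable so the work cannot be discarded.
var sink []DemoStruct

// measure is a hypothetical helper that benchmarks fill(iterations) and
// returns the raw result (loop count, time, allocations).
func measure(fill func(int) []DemoStruct, iterations int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = fill(iterations)
		}
	})
}

func main() {
	for _, n := range []int{10, 1000} {
		r := measure(DynamicLength, n)
		// Timings vary by machine and Go version, so no output is shown.
		fmt.Printf("DynamicLength/%d\t%s\t%s\n", n, r.String(), r.MemString())
	}
}
```

Passing the function under test as a parameter keeps the measuring loop identical for every variant, so any difference in the numbers comes from the slice handling rather than from the harness.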
Results
Running Go's built-in benchmark tool:
go test -bench=. -benchmem
The table below shows the results for a declared length slice.
Type | TestingIterationsSeq | Loops | ns/op | B/op | allocs/op |
---|---|---|---|---|---|
LengthDefined | 1 | 21062053 | 57.2 | 32 | 1 |
LengthDefined | 10 | 6153226 | 176 | 320 | 1 |
LengthDefined | 100 | 1000000 | 1157 | 3200 | 1 |
LengthDefined | 1000 | 120656 | 19127 | 32768 | 1 |
LengthDefined | 10000 | 6002 | 207156 | 327681 | 1 |
The table below shows the results for a dynamic length slice.
Type | TestingIterationsSeq | Loops | ns/op | B/op | allocs/op |
---|---|---|---|---|---|
DynamicLength | 1 | 14973589 | 67.3 | 32 | 1 |
DynamicLength | 10 | 1745436 | 683 | 992 | 5 |
DynamicLength | 100 | 376430 | 5371 | 8160 | 8 |
DynamicLength | 1000 | 38653 | 38165 | 65504 | 11 |
DynamicLength | 10000 | 1012 | 1034081 | 1761253 | 21 |
Clearly, declaring the length of a slice and filling its initialized values is quicker: the benchmark completes more loops in the same time, and each call takes less time (smaller ns/op). Looking at allocs/op, we can clearly see how append affects memory: the dynamic slice keeps allocating new, bigger rooms as it grows, while the declared-length slice allocates exactly once.
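The allocs/op column can be reproduced quickly without a full benchmark run. The sketch below uses `testing.AllocsPerRun` from the standard library; to stay short it fills plain `int` slices instead of `DemoStruct`, and the names `growing`, `sized`, `allocs`, and `sink` are hypothetical.

```go
package main

import (
	"fmt"
	"testing"
)

// growing appends n ints to a nil slice, so append has to keep finding
// bigger rooms as the slice fills up.
func growing(n int) []int {
	var s []int
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

// sized allocates all n slots up front and fills them by index.
func sized(n int) []int {
	s := make([]int, n)
	for i := range s {
		s[i] = i
	}
	return s
}

// sink keeps the result reachable so the allocation cannot be optimized away.
var sink []int

// allocs reports the average number of heap allocations per call of fill(n).
func allocs(fill func(int) []int, n int) float64 {
	return testing.AllocsPerRun(100, func() { sink = fill(n) })
}

func main() {
	// Exact counts depend on the Go version, so no output is shown here.
	fmt.Println("sized:  ", allocs(sized, 1000))
	fmt.Println("growing:", allocs(growing, 1000))
}
```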
Now some of you may wonder how the code below would perform:
func CapDefined(iterations int) []DemoStruct {
	// Notice that we are allocating a room for 100 without initializing a single value,
	// therefore we need to append.
	DemoStructs := make([]DemoStruct, 0, 100)
	for i := 0; i < iterations; i++ {
		DemoStructs = append(DemoStructs, DemoStruct{
			iterationIndex: i,
			stringValue:    someStringValue,
			used:           true,
		})
	}
	return DemoStructs
}
Results:
Type | TestingIterationsSeq | Loops | ns/op | B/op | allocs/op |
---|---|---|---|---|---|
CapDefined | 1 | 1000000 | 1038 | 3200 | 1 |
CapDefined | 10 | 1237651 | 1032 | 3200 | 1 |
CapDefined | 100 | 857613 | 1244 | 3200 | 1 |
CapDefined | 1000 | 37993 | 31771 | 107904 | 5 |
CapDefined | 10000 | 2194 | 571945 | 1705351 | 13 |
What we notice immediately is that at 1 and 10 iterations the performance is far worse than both the LengthDefined and DynamicLength versions. This is because it always allocates a room for 100 people (3200 bytes) regardless of whether the space is used. At 100 iterations the room is exactly full, and the numbers are close to LengthDefined. Beyond 100 iterations, allocs/op starts increasing, because we are trying to fit 1000 people into a room with a capacity of 100: every time the room fills up, append has to allocate a larger room and copy everyone over. The new room is not just 100 spaces bigger; append grows the capacity by a multiplicative factor, which is why 1000 iterations cost 5 allocations rather than 10. Apart from that, we pay the overhead of building a new DemoStruct for every append. At scale it is only somewhat better than a dynamic-length slice.
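One way to see how the room actually grows once the initial 100 slots are full is to record every capacity the slice passes through. This is a sketch with a hypothetical helper `capacitySteps`; it uses `int` elements, and the exact capacities depend on the Go version and element type.

```go
package main

import "fmt"

// capacitySteps appends n ints to a slice created with the given initial
// capacity and records every capacity the slice passes through.
func capacitySteps(initialCap, n int) []int {
	s := make([]int, 0, initialCap)
	steps := []int{cap(s)}
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != steps[len(steps)-1] {
			// The capacity changed, so append moved to a new backing array.
			steps = append(steps, cap(s))
		}
	}
	return steps
}

func main() {
	// Under-allocated: the room has to grow several times.
	fmt.Println(capacitySteps(100, 1000))

	// Right-sized: the capacity never changes.
	fmt.Println(capacitySteps(1000, 1000)) // [1000]
}
```

When the final size is known, passing it as the capacity avoids both the wasted space at small sizes and the repeated moves at large ones.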
Summary
What we can deduce from these benchmarks is that a slice with a declared length almost always performs better, while over-allocating or under-allocating the capacity does affect the performance of the system you are building. That said, below 100 iterations, the difference between declaring the slice with a length and growing it dynamically would be unnoticeable in a real-world application.
In a real-world application, certain factors prevent us from declaring a fixed length because we may not know the number of items, such as when we SELECT all the "products" from a database. However, if the query has a TOP or LIMIT of 100, we can always use a fixed-size slice.
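For a query capped by a LIMIT, the idea could look like the sketch below. Everything in it is hypothetical and only for illustration: `Product` is a made-up row type, `collect` is a made-up helper, and `fakeRows` stands in for a database cursor so that the example runs without a database.

```go
package main

import "fmt"

// Product is a hypothetical row type used only for this illustration.
type Product struct {
	ID   int
	Name string
}

// collect reads at most limit products from next, which stands in for a
// database cursor: it returns false once there are no more rows.
func collect(next func() (Product, bool), limit int) []Product {
	// The LIMIT gives an upper bound, so reserve that much room once.
	products := make([]Product, 0, limit)
	for len(products) < limit {
		p, ok := next()
		if !ok {
			break // fewer rows than the limit: len stays below cap
		}
		products = append(products, p)
	}
	return products
}

// fakeRows is a hypothetical in-memory source that yields total products.
func fakeRows(total int) func() (Product, bool) {
	i := 0
	return func() (Product, bool) {
		if i >= total {
			return Product{}, false
		}
		i++
		return Product{ID: i, Name: fmt.Sprintf("product-%d", i)}, true
	}
}

func main() {
	got := collect(fakeRows(250), 100)
	fmt.Println(len(got), cap(got)) // 100 100

	got = collect(fakeRows(7), 100)
	fmt.Println(len(got), cap(got)) // 7 100
}
```

Using a zero length with the limit as capacity, rather than a length of 100, means a short result set does not leave empty placeholder products at the end of the slice.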
The takeaway is not to spend too much time over-optimizing unless there is a visible performance issue. Personally, knowing when to declare a slice with a length and when to grow it dynamically has become a habit when writing slices rather than a checklist for optimizing, and I hope you benefit from the information here.
Top comments (1)
Another key difference between them is that arrays are passed by value, so the whole array is copied, whereas passing a slice copies only its small header, which still refers to the same backing array. That is one reason slices can perform better.
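The difference described in this comment can be observed with a short sketch (the function names `renameInArray` and `renameInSlice` are hypothetical): a function that modifies an array parameter changes only its own copy, while a function that modifies a slice parameter changes the caller's data.

```go
package main

import "fmt"

// renameInArray receives its own copy of the whole array.
func renameInArray(room [3]string) {
	room[0] = "changed"
}

// renameInSlice receives a copy of the slice header, which still points
// at the caller's backing array.
func renameInSlice(room []string) {
	room[0] = "changed"
}

func main() {
	arr := [3]string{"personA", "personB", "personC"}
	renameInArray(arr)
	fmt.Println(arr[0]) // personA

	sl := []string{"personA", "personB", "personC"}
	renameInSlice(sl)
	fmt.Println(sl[0]) // changed
}
```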