In this article we will be taking a look at one of the worst sorting algorithms in existence. Bogosort is an algorithm that has a best case big O of O(n)
which might make you think it's not that bad but when you see the average case big O which is O((n + 1)!)
then you begin to realise the problem.
If you don't know much about big O notation it is basically just a way to calculate and measure the time and space complexity of our implementations. In this case the best case performance is dependant on the size of the collection being sorted but the average case is the size of the collection plus 1 factorial.
So, how does Bogosort actually work? Well, it takes a collection and while the collection is not in order it shuffles the whole collection on each iteration until it is sorted. Imagine this like taking a pack of cards, throwing them into the air, picking up each card randomly until all are collected and then checking if they are in order and if they aren't then repeat the process until they are in order. 😅
Clearly this is highly inneficient and actually I have found that collections over 15-20 items tend to take so long to sort that you might aswell give in. To show why this is, let's do the math on how many comparisons bogosort has to make on average for collections of 0 - 20 items in size:
Collection size | Comparisons run |
---|---|
0 | 1 |
1 | 2 |
2 | 6 |
3 | 24 |
4 | 120 |
5 | 720 |
6 | 5040 |
7 | 40320 |
8 | 362880 |
9 | 3628800 |
10 | 39916800 |
11 | 479001600 |
12 | 6227020800 |
13 | 87178291200 |
14 | 1307674368000 |
15 | 20922789888000 |
16 | 355687428096000 |
17 | 6402373705728000 |
18 | 121645100408832000 |
19 | 2432902008176640000 |
20 | 51090942171709440000 |
Ok, so, let's take a second to comprehend these numbers, take a deep breath and repeat the immortal words of Ron Burgundy:
Tests
<?php
declare(strict_types=1);
use PHPUnit\Framework\TestCase;
final class BogoSortTests extends TestCase {
public function testSort(): void {
$expected = [0, 1, 2, 3, 4];
$actual = bogosort([3, 1, 2, 4, 0]);
$this->assertEquals($expected, $actual);
}
public function testExceptionForInvalidInput(): void {
$this->expectException(TypeError::class);
bogosort("oops!");
}
}
final class InOrderTests extends TestCase {
public function testFalsePath(): void {
$expected = false;
$actual = in_order([3, 1, 2, 4, 0]);
$this->assertEquals($expected, $actual);
}
public function testTruePath(): void {
$expected = true;
$actual = in_order([0, 1, 2, 3, 4]);
$this->assertEquals($expected, $actual);
}
public function testExceptionForInvalidInput(): void {
$this->expectException(TypeError::class);
in_order("oops!");
}
}
Since I want the tests to run without too much runtime required I have kept the array to be tested at only 5 items in size. The tests are written in PHPUnit which is an awesome testing framework for PHP so if you haven't used it before I recommend it.
Anyway, effectively we just test that the sorting actually works when the bogosort
function is run and that invalid input throws as expected. We also test the true
, false
and invalid input paths of the in_order
function.
Implementation
<?php
declare(strict_types=1);
function bogosort(array $array) {
while (!in_order($array)) {
shuffle($array);
}
return $array;
}
function in_order(array $array) {
for ($i = 1; $i < count($array); $i++) {
if ($array[$i] < $array[$i - 1]) {
return false;
}
}
return true;
}
When the bogosort
function is ran it expects an array as input and while that array is not in order it uses PHPs default array shuffle function to pseudo-randomly shuffle the collection. This is inkeeping with the throwing a deck of cards in the air and randomly collecting them again analogy that was used in the introduction to this article. When the in_order
function eventually returns true
the sorted array is returned.
The in_order
function itself loops over the array and checks if the current item is less than the item to the left of it and if it is returns false
since the array cannot be in ascending order. If all items are in the right order then it will return true
.
Conclusions
Although bogosort is a fun algorithm to ponder and see in action, it is of course one that shouldn't be used unless your aim is to seize up your system or production environment the second a middling collection comes along.
If I were to change one thing about my implementation though, it would be to use something like the Fisher-Yates shuffle algorithm over the PHP default shuffle function since the default shuffle mutates the input collection and this is something we would want to generally avoid.
The implementation of Fisher-Yates I would implement would look something like this:
<?php
declare(strict_types=1);
function fisherYates(array $array) {
$output = $array;
for($index = count($output) - 1; $index >= 0; $index--) {
$randomIndex = rand(0, $index + 1);
if(array_key_exists($randomIndex, $output)) {
$tmp = $output[$index];
$output[$index] = $output[$randomIndex];
$output[$randomIndex] = $tmp;
}
}
return $output;
}
If we did use this implementation then the bogosort
function would keep the original input array unaffected and thus preserve immutability. The new implementation of the bogosort
function would end up looking like this if we were to use the fisherYates
function to handle the shuffling:
function bogosort(array $array) {
$output = $array;
while (!in_order($output)) {
$output = fisherYates($output);
}
return $output;
}
I hope you found some value in this article and that you go forth and use your new knowledge wisely but remember:
Top comments (6)
Hi James, thanks for your article.
Can you explain why you need to check if the generated random index is in the output's indexes in your Fisher Yates implementation?
Good question Amin!
Imagine the following code:
There is a chance that we get a notice which starts off like this
PHP Notice: Undefined offset: 6...
. This becomes understandable when we go through what is happening on each line of thefisherYates
implementation.Let's look at our example code from above in action assuming we didn't have the
array_key_exists
conditional with a focus on the first loop iteration:Clearly in this example 6 is not a valid offset in the array since arrays are indexed from 0 and thus the array in this example can only go up to offset 5. We still need the
$index + 1
to be passed to therand()
function call though to increase the chances and possibilities for a swap, especially at lower indexes.To resolve this we add the check for
array_key_exists
for the minority of cases where therand()
call returns an invalid offset and move to allow iteration to continue without a notice being triggered.From my testing this case with the notice appearing happens around once in every 4
fisherYates
calls if thearray_key_exists
conditional isn't part of the implementation but ofcourse it's hard to accurately averange since it depends on the output of therand()
call which is inherintly random. Either way with the implementation I wrote it will never happen anyway but hopefully you understand why it is there now.Hopefully that makes sense and was helpful to you!
I see, indeed that makes totally sense, thanks for the clarification.
I still don't understand why we need to increase the index to increase our chances but there must be something I'm missing in the Fisher Yates' algorithm.
Don't worry, that's a simple one too!
Imagine the following:
In this case only index 0 or 1 would be valid to come back but if we do
$index + 1
instead then the possibile indexes for swapping are 0, 1 and 2.This way we have more possible indexes that can be used to swap with the current item and more than that it also helps us to make the output a little more random during the shuffle since more indexes can be made available for the swap during each iteration.
Thank you. That was a great explanation.
Anytime, glad to of been able to clear that up for you 😊. Thanks for your comments and for stopping by to read the article! 📚