Continuing our recent dive into search algorithms, we will now be covering the Jump Search algorithm. Similar to Binary Search the Jump Search algorithm is a searching algorithm for pre-sorted arrays.
The basic idea is to check fewer elements than Linear Search by jumping ahead by a fixed increment and backtracking when a larger value than the one searched for is found, running a Linear Search from there to find if the value exists. This may sound strange at first but by the end of this article, all will be clear.
This algorithm saves time and is thus faster when compared to a Linear Search since we do not look at all elements in the collection. It is however slower than Binary Search since the Jump Search algorithm runs with a best case Big O of O(โn)
.
Jump search has one advantage over Binary Search however and that is that we only go backwards once whereas Binary Search can jump back as many as O(Log n)
times. This being the case, in situations where Binary Search seems to be running slow due to backward lookups, Jump Search might be able to help you in such a scenario to speed things up.
Tests
from main import jumpSearch;
def test_strings():
assert jumpSearch(["Dave", "James"], "Dave") == 0;
assert jumpSearch(["Dave", "James"], "abc") == -1;
def test_integers():
assert jumpSearch([1,2,3,4], 3) == 2;
assert jumpSearch([1,2,3,4], 5) == -1;
def test_floats():
assert jumpSearch([3.14], 3.14) == 0;
assert jumpSearch([3.141], 3.14) == -1;
def test_booleans():
assert jumpSearch([False, True], True) == 1;
assert jumpSearch([False, True], False) == 0;
def test_lists():
assert jumpSearch([[3.14]], [3.14]) == 0;
assert jumpSearch([[3.141]], [3.14]) == -1;
def test_nested_lists():
assert jumpSearch([[3.14, [1]]], [3.14, [1]]) == 0;
assert jumpSearch([[3.141]], [3.14]) == -1;
def test_sets():
assert jumpSearch([set([1])], set([1])) == 0;
assert jumpSearch([set([1])], set([1,2])) == -1;
def test_tuples():
assert jumpSearch([tuple([1])], tuple([1])) == 0;
assert jumpSearch([tuple([1])], tuple([1, 2])) == -1;
You may notice that these are similar to the tests for Linear Search and Binary Search that we wrote in previous articles and this is on purpose.
The difference in tests comes, as with binary search, in the comparison operators we are using. In our case the tests for dict
were removed and the test for boolean
vs None
have also been removed. These are not possible to have due to these not being able to compare to one another using the less than (<
) operator which is used in the Jump Search algorithm implementation.
You can run and explore through the tests via the repl below:
Implementation
from math import sqrt;
from typing import List, TypeVar;
from operator import eq as deep_equals, ge as greater_than_or_equal, ne as not_equal, lt as less_than;
T = TypeVar('T')
def linearSearch(collection: List[T], value: T) -> int:
for index, item in enumerate(collection):
if deep_equals(item, value): return index;
return -1;
def jumpSearch(collection: List[T], value: T) -> int:
n = len(collection);
step = sqrt(n);
prev = 0;
while less_than(collection[int(min(step, n) - 1)], value):
prev = step;
step *= 2;
if greater_than_or_equal(prev, n): return -1;
foundIndex = linearSearch(collection[int(prev):], value);
if not_equal(foundIndex, -1): foundIndex += int(prev);
return foundIndex;
Note 1: Since Jump Search utilises Linear Search, we will use the implementation we built previously in this article series for that.
For the jumpSearch
function, we require a collection
and a value
to be provided and will return an int
(integer) representing the index of the value
if it is found and -1
if it is not. The value
itself is what we will be looking for in the collection
.
Note 2: The
collection
should be aList
of items matching the same type as thevalue
we are searching for and thus, if thevalue
is of typestr
(string) then we expect thecollection
to be aList[str]
for example.
Whence we enter the jumpSearch
function body, we declare 3 variables:
- The variable
n
to represent the amount of items in thecollection
- The variable
step
to be the value of how far we should jump each time, this is defined as the square root ofn
- The variable
prev
to represent the last index visited when the laststep
was taken
From here we run a while
loop as long as the current item at int(min(step, n) - 1)
is less than the value
we are searching for. You may be wondering why we use int(min(step, n) - 1)
. Let's break it down from the inside out. Let's use the first iteration of the while
loop for a Pseudocode example:
collection = [1, 2, 3];
value = 2;
n = len(collection); # 3
step = sqrt(n); # 1.73205080757
prev = 0; # 0
nextIndex = min(step, n) - 1; # 0.73205080757
indexAsInteger = int(nextIndex) # 0
itemAtIndex = collection[indexAsInteger]; # 1
condition = less_than(itemAtIndex, value); # True
while condition is True: # now we run the loop since the condition is True
That is the essential breakdown of the while
loops condition and hopefully that makes sense as to why we run int(min(step, n) - 1)
in the implementation now.
Inside the loop we set the prev
to equal the current step
and then add the step
to itself for the next iteration of the while
loop. During the loop execution we check that the prev
variable is not greater than or equal to the length of the collection
which we stored in the variable n
. If it is, we haven't found the value
in the collection
and return -1
to represent that it wasn't found.
If we exit the loop and didn't return -1
, we must have the value
still in the collection
and so we then run the linearSearch
function with a new array created from a slice of the collection
running from the prev
step index to the end of the collection and pass in the value
we are looking for also. If the linearSearch
run doesn't return -1
, we have found the value
and all we need to do is add the prev
step index value to the foundIndex
.
Note 3: We do this because if
-1
wasn't returned and thus thevalue
was found. If you remember we took a slice of thecollection
beginning from theprev
step to the end of thecollection
for thelinearSearch
to run over and thus if the value was found bylinearSearch
at index1
then it would be in thecollection
at indexprev + 1
.
This works well for us since we return the foundIndex
either way and so if the value
is not found we will just return the -1
anyway.
Conclusions
Jump Search is useful in places where Binary Search is potentially slow due to backwards lookups. It is faster than a regular Linear Search and tries to maximise efficiency by using jumps through the sorted collection.
I hope you found some value in this article and if you have any questions or comments, feel free to write them below!
Top comments (0)