When we want to filter data to a subset of that data, we can use something like a filter function.
What happens though if you just want to search a collection see if a value exists in the first place?
To this end, today we will be implementing the most basic search algorithm there is, the linear search algorithm.
Tests
For the tests I will be using the PyTest runner. If you haven't used it before I highly recommend it if you are a python developer.
from main import linearSearch;
def test_strings():
assert linearSearch(["Dave", "James"], "Dave") == 0;
assert linearSearch(["Dave", "James"], "abc") == -1;
def test_integers():
assert linearSearch([1,2,3,4], 3) == 2;
assert linearSearch([1,2,3,4], 5) == -1;
def test_floats():
assert linearSearch([3.14], 3.14) == 0;
assert linearSearch([3.141], 3.14) == -1;
def test_booleans():
assert linearSearch([False, True], True) == 1;
assert linearSearch([False, True], None) == -1;
def test_lists():
assert linearSearch([[3.14]], [3.14]) == 0;
assert linearSearch([[3.141]], [3.14]) == -1;
def test_nested_lists():
assert linearSearch([[3.14, [1]]], [3.14, [1]]) == 0;
assert linearSearch([[3.141]], [3.14]) == -1;
def test_sets():
assert linearSearch([set([1])], set([1])) == 0;
assert linearSearch([set([1])], set([1,2])) == -1;
def test_tuples():
assert linearSearch([tuple([1])], tuple([1])) == 0;
assert linearSearch([tuple([1])], tuple([1, 2])) == -1;
def test_dictionaries():
assert linearSearch([{"a": 1}], {"a": 1}) == 0;
assert linearSearch([{"a": 2}], {"a": 1}) == -1;
Since we want to find items of different types in our collections, we need to cover our bases and run checks for each type that we can envision coming up during a search run and to make sure that false positives shouldn't come up, heed the -1
cases in each test.
You can find the test repl here to run and explore the tests:
Implementation
Initially for this article I had planned to go with the naive approach and use the simplest form of the linear search algorithm that exists:
from typing import List;
def linearSearch(collection: List, value) -> int:
for index, item in enumerate(collection):
if item is value: return index;
return -1;
In theory this is fine but in practice it is not the best way we could handle comparisons because the is
operator in python doesn't cover all types nor does it run deep equality checks such as in the case of nested arrays. To make sure we are able to handle these scenarios I have updated the solution to work with multiple types for use during a search run:
from typing import List;
from operator import eq as deep_equals;
def linearSearch(collection: List, value) -> int:
for index, item in enumerate(collection):
if deep_equals(item, value): return index;
return -1;
Simple, right? With this version of the implementation we can handle pretty much every type there is and thanks to the power of the eq
function exposed in the operator
module. With this function we can handle things such as deep equality checks, type comparisons and some other things too.
For those of you who want to be really strict about collections holding the same shape of values and their types, you could also write this implementation as:
from typing import List, TypeVar;
from operator import eq as deep_equals;
T = TypeVar('T')
def linearSearch(collection: List[T], value: T) -> int:
for index, item in enumerate(collection):
if deep_equals(item, value): return index;
return -1;
This is great when data is consistently formatted as you will get a swath of benefits that static types can provide. If data is not uniform however, this may not be so easy to rectify but if you know what you are doing it will be worth using in the end.
Conclusions
Knowing how to search through our data is imperative as developers and the linear search algorithm is the simplest implementation of a search algorithm you can have in your toolbelt.
In future articles we will look at more advanced and use-case specific algorithms for searching through different data structures.
I hope you found value in this post and learned something new perhaps too!
Top comments (0)