Task description
An isogram is a word that has no repeating letters, consecutive or non-consecutive. Implement a function that determines whether a string that contains only letters is an isogram. Assume the empty string is an isogram. Ignore letter case.
isIsogram("Dermatoglyphics")
returnstrue
isIsogram("aba")
returnsfalse
isIsogram("moOse")
returnsfalse
Task solution
Tests
Using the pytest framework we will write tests to cover the following cases:
- Does a
TypeError
raise for invalid input - Do lower case, upper case and mixed case strings pass as expected?
- What if an empty string is provided?
These conditions accounted for, the tests are as follows:
import pytest;
from isograms import is_isogram; # ./isograms.py
def invalid_input_throws():
with pytest.raises(TypeError):
is_isogram(1);
def happy_path_tests():
assert is_isogram("Dermatoglyphics") == True;
assert is_isogram("isogram") == True;
assert is_isogram("aba") == False;
assert is_isogram("moOse") == False;
assert is_isogram("isIsogram") == False;
assert is_isogram("") == True;
Implementation
Originally I went for a simple enough approach whereby I add each character into the variable cache
. From here I loop over the string and for each character in the string I convert it to lowercase and if it is not in the cache, we add it in, otherwise if it is already in the cache, it cannot be an isogram and thus we return False
. Naturally, if the loop ends without returning, we return True
since it must be an isogram afterall. The implementation at first was like so:
def is_isogram(string: str) -> bool:
cache = {};
for char in string:
lowerChar = char.lower();
if not lowerChar in cache: cache[lowerChar] = True;
else: return False;
return True;
This implementation is basically pseudo-code code but we can do better. Since we have covered the red and green portions of the red, green, refactor cycle, it is time for the refactor part. For this I have implemented the following:
def is_isogram(string: str) -> bool:
characters = [char.lower() for char in string];
uniqueCharacters = set(characters);
return True if len(characters) == len(uniqueCharacters) else False;
This implementation does exactly the same as the first version but in half the lined of code. I also think it is a bit more readable personally. In short, we are creating a characters
array made up of the lower cased characters in the input string. Next we convert the array to a set, if you have never used a set before, it is basically just a collection or unique values and in this case, that means if our array has duplicate values, they will be removed. Finally, we do exactly as the last line of the function says: If the characters
array is the same length as the uniqueCharacters
set, then it must be an isogram and we must return True
. Otherwise return False
because it cannot be an isogram in the first place.
Conclusions
Python is a language I have a love/hate relationship with but one thing I love about it is that it is an expressive language. I especially love the way you can write conditionals like the one we have in the refactored implementation which are so human-readable that its almost silly. On the flip side, the module system and the inconsistency of parts of the languages implementation does irk me somewhat but overall it is a powerful language with a good community.
The challenge at hand was pretty simple to resolve but I enjoyed completing it never the less, especially when I had chance to look into the timeit library for execution timing tests and could see some variability in speed on each run but in almost all cases both implementations were pretty close, usually 7 seconds up to 8 seconds to run each version 1,000,0000 times.
See you in the next one!
Top comments (1)
Hi Andy, I totally agree with your sentiment about
set
s but noone will use them or understand them if noone else shows how they work, so avoidance isn't the answer I would say. I have seen lots of examples of code that creates arrays of unique values which totally bypass usingset
s which is ashame since it is a common thing in most languages. I am glad you noticed such a detail about the naming too, I always try to implement code in a readable, testable and understandable manner more than a short, one line hot-shot one even though as you said, doing so would be simple enough. 🤷♂️