Problem statement : Write a function to Compress a given string. Only return the compressed string if it saves space.
Difficulty level: Easy
Test Cases
- 'AAABAACCDDDD' ---> 'A3BA2C2D4'
- 'BAAACCDDDD' ---> 'BA3C2D4'
- 'AABBCC' --> 'AABBCC'
- 'BBBAACCCCDD' --> 'B3A2C4D2'
Algorithm
- Setup a pointer at first character of string, initialise an empty string and a counter.
- Iterate through the string and check if character is equal to the first character of the string.
- if both are equal increment counter.
-
Else,
- add the first character and it's count to result string.
- if count is greater than 1 then only add count to the result string.
- Else, add an empty string to the result string.
- now, the first character will be the current character.
- reset the counter value to 1.
- add the first character and it's count to result string.
Add the last character and it's count to result string.
Return the resultant compressed string if it's length is less than the given string, else return the given string.
Time and Space Complexity
- Time complexity: O(n)
- Space complexity: O(n)
Code
class CompressString(object):
def compress(self, input):
if input is None or not input:
return input
result = ''
count = 0
prev_char = input[0]
for char in input:
if char == prev_char:
count += 1
else:
result += prev_char + (str(count) if count > 1 else '')
prev_char = char
count = 1
result += prev_char + (str(count) if count > 1 else '')
return result if len(result) < len(input) else input
More precise compression
class CompressString(object):
def compress(self, input):
if input is None or not input:
return input
result = ''
count = 0
prev_char = input[0]
for char in input:
if char == prev_char:
count += 1
else:
if count > 2:
result += prev_char + (str(count))
prev_char = char
count = 1
elif count == 1:
result += prev_char
count = 1
prev_char = char
else:
result += prev_char
result += prev_char
count = 1
prev_char = char
result += prev_char + (str(count) if count > 2 else prev_char)
return result if len(result) < len(input) else input
Test
import unittest
from compressString import CompressString
class TestCompress(unittest.TestCase):
def testCompress(self, func):
self.assertEqual(func(None), None)
self.assertEqual(func(''), '')
self.assertEqual(func('AABBCC'), 'AABBCC')
self.assertEqual(func('AAABCCDDDDE'), 'A3BC2D4E')
self.assertEqual(func('BAAACCDDDD'), 'BA3C2D4')
self.assertEqual(func('AAABAACCDDDD'), 'A3BA2C2D4')
print('Success: testCompress')
def main():
test = TestCompress()
compress_string = CompressString()
test.testCompress(compress_string.compress)
if __name__ == '__main__':
main()
Top comments (3)
Simple improvement proposal: you could remove the last check by replacing chars only if there are more than 2 in a row.
Replacing 'CC' with 'C2' never saves space. Replacing more than 2 characters always saves space. So you would either return the same input string or return compressed string that's shorter than original.
I could do something like if count is more than 2 than add the character with count, if count is 1, than only add the character , and lastly if count is exactly 2, then add the character 2 times, so it would remain same as original string.
I didn't know that, looks really neat. Thanks for the tip. π