As a current Computer Science undergraduate, I often feel that topics that are taught in school are not very applicable to the work we do in the “real world”. Time Complexity was one of these topics.
That was until I encountered a situation back in February that changed my perspective forever.
Context
Back then, during an internship experience, I was tasked to create and enhance a script that would process a CSV with an expected number of 100–200,000+ rows of data. This was to be written in Javascript, more specifically NodeJs. At that time, the existing script was able to process about 17,000 rows of data within about 830 seconds. That’s impressive, I thought — less than 0.05 seconds per row?
Process of optimizations
Immediately, I began looking at the current script to see where I could add on some enhancements, in order to speed it up and process the data even quicker. I Googled around and found suggestions such as using a faster existing CSV writer/reader library, using streams and even running concurrent workers to process the same file asynchronously.
Throughout implementing all of these suggestions and constant trial and error, I was met with meagre gains in performance — total processing time was improved to 827s. Yeah, not great.
Eureka moment
After about a week of investigating these optimizations, I decided to start over from scratch and scrutinize the code a bit more. Then, I noticed code that looked similar to this:
for (const x of arr) {
if (firstOperation(x)) { // O(n) operation
secondOperation(x); // O(n) operation
}
}
I dug a bit deeper and found out that the first and second operations were running in approximately O(n) time! (They use for loop(s)) I then realized, one way we could significantly improve the runtime of this script, was to reduce the time complexity of this bunch of code here, or at least, reduce the tendency of entering this bunch of codes here. In particular, I tackled the if condition involved.
Approach
For some context, firstOperation took in an argument, and returned True if the argument was present in an internal list.
To simplify this and reduce the time complexity of this operation, I went ahead with using a Map to store the internal list of values, by assigning each of these values to their respectively generated keys.
The beauty of using a Map, or more particularly a Hash Map, was that access of the key-pair values was in constant, or O(1) time! This would significantly reduce the runtime necessary for at least one of the crucial determinants of the script.
Results
After implementing this approach, I was shocked when I opened the logs after executing my script. The result of processing 17,000+ rows came up to be a mere 32 seconds!!
That was leagues better than the processing previously, and all it took was knowledge of time complexity and improving an existing method of data retrieval as well as for loop optimizations. No need for any complex methods such as concurrent processes or additional workers xD.
Takeaways
I learnt quite a bit from this experience, and am so glad I have found a practical use case of applying something learnt from school in “real world”
work! I also learnt to take a step away from a difficult problem, and often analyzing it from a less complicated lens could produce a more simple, elegant solution!
Thanks for taking the time to read about my experience, and I wish you the best in your upcoming problem-solving endeavours!
Top comments (2)
nice! In addition to the obvious benefit of being faster for end users, time and space complexity greatly impact scaling of large systems. Imagine this same CSV processing script running thousands of times per day on a server somewhere as part of a product. A 10x faster solution means up to 10x less server resources to pay for and manage!
Hey Micah, that is an excellent pointer that you brought up! Thanks for sharing the insights regarding this article and carrying on the conversation after, this is something I neglected to put in the article, so thanks for bringing it up!
Are there any other concepts tied to time and space complexity that you have found helped you in the working world, other than the scenario you brought up?