Here we go, beginning a new adventure in computer science through the lens of data structures , this will be a relatively long series of posts which will discuss some famous data structures regarding their specifications, analysis, design and implementation in which I will try to enforce the best practices.
Why writing this series
- I have been studying data structures and algorithms for almost 3 years now and I am very enthusiast about understanding the theories and mathematics behind them.
- I believe writing deepens your knowledge and understanding, and makes you face new challenges that you would never deal with otherwise.
- Although, data structures blogs, articles, tutorials .. etc are everywhere, I believe that every person has his unique identity and could represent the topic from his own perspective and that's what I am trying to do in this series, show you why I am in love with the study of data structures and algorithms.
- I believe that the person can't truly understand a certain topic without knowing about it from various resources not just depending on a single resource, because this will reveal to him different perspectives about this topic which will build a complete understanding about it.
This series will mainly focus on the analysis of a data structure, how it meets the required specification for solving a given problem and illustrated that in code and mathematical analysis if needed.
We will start by explaining simple concepts and building blocks, the reader should have basic knowledge in at least one OOP language (Java, C++, C# ..) and should be familiar with mathematical notations.
We won't dive deep in driving the mathematical models, algorithmic thinking or techniques such as greedy algorithms, divide and conquer or dynamic programming. but, that being said, I will explain briefly what those terms mean, why we need them and how they will be used in our implementation when needed.
Brief Overview
Before setting off on our series, we need to know why it became more and more important to learn about data structures, as the software systems grow bigger every year and the hardware capabilities as well (e.g RAM, CPU clock speed, etc..), this came with the responsibility to solve bigger and more complicated problems, and process more data than before, it has been a demand to make sure that our programs take full advantages of those capabilities and run more efficiently.
For example, a program that was used to search a database with thousands of records in the 90th - as the memory would be able to store back then -, now, it's used to search millions or even billions, this is an indicator that we need to develop a program that is able to scale with the problem size, that's where data structures and algorithms take place.
Data structure is a way in which we store and organize data to make it more efficient while being processed by an algorithm.
Efficiency is a general term that can serve a lot of meanings depending on the situation, in our case it can mean :
- Easier to traverse or search data, for example Binary Search Tree, BTS for abbreviation.
- Easier to store, retrieve and modify data, e.g symbol table.
- Faster to sort or compare the stored data such as the heap.
Data structures are built above simpler ones or built-in data types, so basically data structures can be considered as data types but in most cases they are user defined, to understand what that means let's take a little detour and talk about data types.
Data Types
Data type is a set of values and a set of operations on those values such as integer :
- Values : -170 , 0 , 290, etc ...
- Operations : addition '+' , multiplication '*', etc..
Data types are usually divided into :
- Primitive data types In which values immediately map to machine representations "memory words" and operations to machine instructions like integer, double, float, etc...
- Non-primitive data types which can be derived from the primitive ones and builtin the language (e.g Array, reference and pointer in c++.), or they can be user defined types (e.g classes or structs) and that's exactly what we need to extend the builtin types and create our own types.
In this series we will implement our data structures using what is called an Abstract Data type or ADT, it is a data type whose representation is hidden from the client code -usually the main function- .
This will help the user to be only concerned about the functionality of the data structure not the implementation which in most cases could be complex.
ADT implementation leads to an style of programming known as Object Oriented Programming, I am not going to discuss what is OOP because it is a very broad subject and we only interact with a narrow area of object orientation in this series, I am expecting the reader to be familiar with some concepts such as what is a class, what is an object; what do we mean by abstraction.
ADT in C++
In this series we will implement the data structures in c++ in the form of ADT, we will break the code into three main parts:
- Client code "main" : where the object will be instantiated and the user will be able to put the data structure functionality into use.
- Header file ".h" : defines the data structure API which is the Border line between the implementation and the user "client", in the API the data structure functionality would be described.
- Source file ".cpp" : includes the actual implementation of the class methods "functionalities".
Reinvent The Wheel
Of course most modern programming languages provide wide range of most used data structures to lessen the overhead on the programmer when writing complex programs (STL in c++), but it's very useful for the programmer to learn how to write his own data structure and know how the popular ones are implemented because; it makes him aware about the scenarios that each data structures will be most suited, and in some cases the problem at hand will require specific tweaks that needed to be implemented.
Although if you are a minimalist, most of those libraries offer a huge number of operations for each data structure that you might not need in your program, so it would be a better solution to write your own data structure with just the required operations.
Finally, learning about the implementation of sophisticated data structures will make you a better programmer and provide an intuition about how to write more organized, performant and efficient code.
Note of Appreciation
I am a huge fan of Professor Robert Sedgewick and this series is motivated by his Coursera Algorithms, Part I , Algorithms, Part II
Top comments (0)