Data Structure
Data structure is a way to store data which makes it easier and efficient to retrieve data from persistent store in-memory and to perform additional functions on data. Simple Array is a linear data structure and so is:
- Linked List
- Stack
- Queue
Collections
As the name suggests it is a collection of records related or unrelated data which is held in memory temporarily, some of the different types of collections are Queues, Stacks, Dictionaries, Lists and Hash Tables.
In C# System.Collections Namespace contains interfaces and classes that define various collections of objects mentioned above. What is more important when thinking about collections of records is to keep in mind what kind of collection we should use to store our records in memory.
Now the question is how do we decide what kind of collection to use? (list, queues, dictionaries) That depends on what we want to do with our collection of records, do we just want to read through those records in our collection or we may need to insert, update, remove or even filter them in future. It is always a good idea to take your time and think what kind of collection is best suited to retrieve and store data temporarily in memory.
To do that, I think first we should try to understand different kind of collections and what it means to iterate through a collection. We can basically divide collections in two sections one which only store a value for each element in a collection and second where every item stores both a Key and a Value.
Collections which store only Value.
Array
-
ArrayList
-
List<T>
-
Queue
-
ConcurrentQueue<T>
-
Stack
-
ConcurrentStack<T>
-
LinkedList<T>
Collection which store Key and Value.
-
Hashtable
Recommended to use Generic Disctionary SortedList<TKey,TValue>
Dictionary<TKey,TValue>
ConcurrentDictionary<TKey,TValue>
In C# we got different interfaces to handle collections for example IList<T>, ICollection<T>, IDictionary<TKey,TValue>, IEnumerable<T>
etc. They all implement Iterator Pattern to iterate on a set of rows or collection of rows, so what is iteration and iterator.
Iterators
To understand iterators we need to understand what iteration means. In its most basic form, iteration means taking each element at a time in a collection of elements and going through them one after another. At any time we use a loop to go through a collection of rows it is called an iteration.
// Simple array is used to store city names
//which contain 3 element. We can iterate through
//an array to get value of each item.
string[] citiesArray = new string[3]{ "Paris", "London", "Rome" };
//We going to loop through an array
for(int i = 0; i < citiesArray .Length; i++)
{
Console.WriteLine(citiesArray[i]);
}
Now what is an iterator? Simply put it is an object which defines how to iterate over a collection and what is the next element we can iterate to (Move Next). Iterator Design Pattern is used in every modern programming language to step through data structures or collections to provide some kind of iterator for example in C# foreach()
that allows programmers or developers to iterate through its objects, without knowing underlying implementation.
In C# IEnumerable is the base interface for all collections that can be iterated on. To use generic version of IEnumerable we can use IEnumerable<T>
. IEnumerable contains only one method, GetEnumerator, which returns an IEnumerator or IEnumerator<T>
which supports a simple iteration over a non-generic and generic collection respectively. IEnumerator provides the ability to iterate through the collection by exposing.
- Current (Property)
- MoveNext and Reset (Methods)
Iterating over a collection is made simple in c# with the use of foreach
keyword which enumerates a collection, executing the embedded statement once for each element in the collection:
Basic Differences Between IEnumerable and IEnumerator
IEnumerable has only one method (GetEnumerator), whereas IEnumerator has two methods (MoveNext and Reset). IEnumerable can only return IEnumerator, IEnumerable cannot retain the cursor's current state. But IEnumerator is a read-only, forward-only cursor over a collection of values and can retain the current cursor state.
IEnumerable
IEnumerable object is the logical representation of a collection of values, it is only suitable for iterating through collections because we cannot Add or Remove items which makes it read only. IEnumerable can not be instantiated from Enumerable, so it should be instantiated from other types of collection for example: List, Dictionary, stack etc.
//create IEnumerable of string (Give you error)
//IEnumerable<string>' does not contain a definition for 'Add'
IEnumerable<string> Weekdays = new List<string>();
Weekdays.Add("Monday");
Weekdays.Add("Tuesday");
Weekdays.Add("Wednesday");
Weekdays.Add("Thursday");
Weekdays.Add("Friday");
Weekdays.Add("Saturday");
Weekdays.Add("Sunday");
//create a list
IList<string> WeekdaysList = new List<string>();
WeekdaysList.Add("Monday");
WeekdaysList.Add("Tuesday");
WeekdaysList.Add("Wednesday");
WeekdaysList.Add("Thursday");
WeekdaysList.Add("Friday");
WeekdaysList.Add("Saturday");
WeekdaysList.Add("Sunday");
//create IEnumerable
//IEnumerable object WeekdaysList is
//logical representation of a collection
IEnumerable<string> EnumerableOfWeekdaysList =
(IEnumerable<string>)WeekdaysList;
//OR
//because List<T> is already an IEnumerable<T>
IEnumerable<string> EnumerableOfWeekdaysList = WeekdaysList;
//Retrieve all the items from this IEnumerable object
//WeekdaysList we are using foreach loop
foreach(string DaysOfWeek in EnumerableOfWeekdaysList)
{
Console.WriteLine(DaysOfWeek);
}
//create IEnumerator of string.
IEnumerator<string> EnumeratorOfWeekDays =
WeekdaysList.GetEnumerator();
//Loop over the collection and display
//Monday
while (EnumeratorOfWeekDays.MoveNext())
{
if (EnumeratorOfWeekDays.Current == "Monday")
{
Console.WriteLine(EnumeratorOfWeekDays.Current);
}
}
IEnumerable is used to hold data in memory from DB server or some other data source. Point to keep in mind when handling data in collections is not to have a lot of records in the memory as it will add a lot of overhead and reduce the amount of memory available for other tasks. Good practice is to only fetch records which are needed and hold them in memory for a short period of time.
ICollection
ICollection is the base interface for non-generic collections in system.collection
namespace and to use generic interface for collections in C# we can use ICollection<T>
, both ICollection and ICollection<T>
extends or implements IEnumerable
and IEnumerable<T>
respectively but collections like IList, IDictionary, SortedList and others extends ICollection interface. Now the extra functions ICollection
interface provide which are the ability to add, remove, update, delete and filter elements in memory. ICollection
also holds the total number of elements and can be retrieved by using Count property.
Even though ICollection
extend IEnumerable
but one major difference between them other than extra functions exposed by ICollection
interface is that we can get an index of an element in IEnumerable[i]
but ICollection
is not index based so we can't get ICollection[i]
.
//ICollection extends IDictionary<TKey,TValue> Explicitly
ICollection<KeyValuePair<int, int>>
dict = new Dictionary<int, int>();
//Default implementations of collections in the
//System.Collections.Generic namespace are not synchronized.
//Enumerating through a collection is intrinsically not a
//thread-safe procedure. Even when a collection is synchronized,
//other threads can still modify the collection, which can cause
//the enumerator to throw an exception. To guarantee thread safety
//during enumeration, you can either lock the collection during
//the entire enumeration or catch the exceptions resulting from
//changes made by other threads.
bool sync = dict.IsSynchronized;
//Collection object can not be instantiated from ICollection, so
// it should be instantiated from List
System.Collections.Generic.ICollection<string> strICollection =
new List<string>();
strICollection.Add("Madrid");
strICollection.Add("Spain");
//Countable***
int ICollectionCount=strICollection.Count;
//ICollection
ICollection<Company> listOfCompaniesCollection = new
List<Company>();
listOfCompaniesCollection =
(from cmp in _context.Companies
select cmp).ToList();
var companyObj = listOfCompaniesCollection.Where(i =>
i.CompanyName == "Apple").FirstOrDefault();
//You can not get index of object, if you clear comment below
//code it will give you an error.
//int indexofICollection = listOfCompaniesCollection.IndexOf();
int iCollectionCount = listOfCompaniesCollection.Count;
IList
IList extend ICollection
public interface IList : System.Collections.ICollection
and IList<T>
extends
public interface IList<T> : System.Collections.Generic.ICollection<T>
Microsoft encourage the use of Generic collections and reason for example take an ArrayList
which is similar to List
but if we use a generic implementation List<T>
one major difference is that List<T>
is more efficient as compared to an ArrayList. Because List<T>
is a collection of strongly typed objects which means compiler will know its Type at compile time instead of runtime which makes it run faster and as there is no need for boxing and unboxing which makes it more efficient.IList<T>
extends ICollections<T>
so it inherits all the functionality from ICollection and few more which are not implemented by ICollection for example: sorting, searching, Index[i] and modifying items in a the List<T>
.
*Why do we need List? *
We may want to use IList<T>
when we need to insert or remove an element in the middle of a list using an index[i] or if we need to order our list, because ICollection does not support indexing or sorting.
//IList***
System.Collections.Generic.IList<string> strIList = new List<string>();
//List of Company Object Type
System.Collections.Generic.IList<Company> listOfCompanies =
new List<Company>();
listOfCompanies = (from cmp in _context.Companies
select cmp).ToList();
company companyObj = listOfCompanies.Where(c => c.Name ==
"Microsoft").FirstOrDefault();
foreach (var item in listOfCompaniesCollection)
{
int indexofICollection = listOfCompaniesCollection.IndexOf(item);
}
int contOfList = listOfCompaniesCollection.Count;
IDictionary
IDictionary
The IDictionary is base interface for generic collections which stores key/value pairs in no perticular order and it is also an interface for Dictionary same as ICollection or IList explained above. Dictionary Key must be unique and cannot be null but we can have Value as null or even duplicate and we can access the value by simply passing the key. Dictionary is generic version of [HashTable]. Dictionary is GENERIC but Hashtable is NON GENERIC. There is a tiny difference between them, dictionary will throw an exception if it cannot find specific key but HashTable will returns null instead of throwing an exception.
//Dictionary can create an object IDictionary<string, string>
//countryCityNames = new Dictionary<string, string>();
//Create Dictionary
IDictionary<string, string> countryCityNames = new
Dictionary<string, string>();
//adding a key/value using the Add() method
countryCityNames.Add("UK", "London");
countryCityNames.Add("Spain", "Madrid");
countryCityNames.Add("India", "Mumbai, New Delhi");
// update value of India key
cities["India"] = "Darjeeling, Asam";
//The code below throws run-time exception: key already added.
//numberNames.Add("UK", "London");
// removes UK key
cities.Remove("UK");
//prints values of India key.
Console.WriteLine(countryCityNames["India"]);
Array
Arrays remind me of days when I was studying computer science and my introduction to Single Dimension, Two Dimensional and Three Dimensional arrays. Just a quick and short example is below
//If we Add array elements at time of declaration then size is
//optional because compiler will be able to decide its size based
//on the number of elements as in the case of "Single Dimensional
//Array" below.
//Single Dimensional Array c++
int Test1[]={5,4,8,10};
Two Dimensional Array
//Defining size C++
int twoDemnsionalArray[2][3] = {{1, 3, 0}}, {-1, 5, 9}};
Three Dimensional Array
//Three Dimensional Array c#
int [, , ] threeDemnsionalArray = new int[3, 3, 3]{
{ { 1, 2, 3}, {4, 5, 6}, {4, 5, 6} },
{ { 7, 8, 9}, {10, 11, 12}, {4, 5, 6} },
{ { 7, 8, 9}, {10, 11, 12}, {4, 5, 6} }
};
Arrays with predefined size and type will perform better as compared to when type and size is not defined for example. Let's assume we want an array of 3 elements of type Int. Because compiler already knows what is the type and exact size of an array it will perform better.
//Let us define string array and we can name it fruitArray
//which contain 3 element
string[] fruitArray = new string[3]{ "Mango", "Apple", "Lemon" };
//loop through an array
for(int i = 0; i < citiesArray .Length; i++)
{
Console.WriteLine(citiesArray[i]);
}
If you will like to read more about Arrays or Multidimensional Arrays
ArrayList
ArrayList is a non-generic collection of objects and it is the same as an Array except ArrayList size can increases or decrease dynamically. We should only use ArrayList when we encounter a situation where we are not sure about the size of our data and ArrayType (int, string, decimal etc). Now it is quite tempting to use ArrayList where it can increase and shrink in size dynamically at runtime. But But But it does come with some penalties, it will take longer as at runtime, compiler will determine size and Type of ArrayList.
//Belong to system.collection
using System.Collections;
ArrayList citylist = new ArrayList();
//using ArrayList.Add() method
ArrayList dynamicObjectlist = new ArrayList();
dynamicObjectlist .Add("Madrid"); //string
dynamicObjectlist .Add(true); //bool
dynamicObjectlist .Add("London"); //String
dynamicObjectlist .Add(2587); //int
dynamicObjectlist .Add(4.5); //double
dynamicObjectlist .Add(null); //null
foreach (var item in dynamicObjectlist){
Console.Write(item + "- ");
}
//output
Madrid- True- London- 2587- 4.5- -
//Insert in ArrayList
//1 is index position and we want to insert "New York"
//we are pushing true bool to index position 2.
dynamicObjectlist.Insert(1, "New York");
//output
Madrid- New York- True- London- 2587- 4.5- -
//Let us remove first occurrence of null from ArrayList
dynamicObjectlist.Remove(null);
//output
Madrid- New York- True- London- 2587- 4.5-
//Let us check if element exist in our ArrayList
dynamicObjectlist.Contains(Madrid)
//output
True Or False
Hash Table
The Hash Table is a non-generic type of collection which stores key-value pairs for each data block. It performs better when looking up data because rather than sorting the data according to a key, it computes the location of the data from the key. It implements a hashing function to improves data access times.
Hash Table can be created without setting its type or size which means it will be allocated memory at run time. We can say Hash Tables are kind of data structure which are non generic in nature and it allow us the freedom not to set the type and size.
//Lets create HasTable
Hashtable cityRanking= new Hashtable();
Add Key Value
cityRanking.Add(1,"London");
cityRanking.Add(2,"Paris");
cityRanking.Add(3,"Rome");
foreach(DictionaryEntry de in cityRanking){
Console.WriteLine("Key: {0}, Value: {1}", de.Key, de.Value);
}
//out put
//Key: 3, Value: Rome
//Key: 2, Value: Paris
//Key: 1, Value: London
//lets update a value in HashTable for key[2]
cityRanking[2]="Athens";
//out put
//Key: 3, Value: Rome
//Key: 2, Value: Athens
//Key: 1, Value: London
//Remove value from HashTable
if(cityRanking.ContainsKey("2")){
cityRanking.Remove(2);
}
//out put
//Key: 3, Value: Rome
//Key: 1, Value: London
Stack
Stack is a kind of Data Structure which follows (Last In First Out) LIFO principal which means when we push new data it is inserted on top of the stack and when we remove a data item from the stack it will be the last one we inserted on top of the stack. So whatever we inserted last will be removed first. Stack is normally not strongly typed or either need fixed size but I always recommend if possible, to declare a Stack with a type (Stack<T>
). By declaring Type of a Stack<T>
it makes compile-time type checking possible and remove the need to perform boxing-unboxing because when we specify type for example (Stack numberStack = new Stack();) makes it non-generic.
Stack - Push
https://askanydifference.com/difference-between-stack-and-heap-with-table/
//Lets create a stack
Stack<int> numberStack = new Stack<int>();
//Lets push value in stack
numberStack.Push(110);
numberStack.Push(32);
foreach (var item in numberStack)
{
Console.Write(item + ",");
}
//out put
//32,110
Stack - Pop
//Remove the last element from stack
numberStack.Pop();
//out put
//110
Stack - Peek
//Lets push value back on stack
numberStack.Push(32);
//Peek() returns the last value added
if(numberStack.Count > 0){
Console.WriteLine(numberStack.Peek());
}
//out put
//32
Stack - Contains
numberStack.Contains(22);
//out put
//False
numberStack.Contains(32);
//out put
//true
Stack - Count
//Returns total number of items in stack
numberStack.Count
Console.Write("Count...." + numberStack.Count);
//out put
//Count....2
//Remove all items from stack
numberStack.Clear()
** Here is a more realife example**
It creates an empty list to hold the subfolder paths, and a stack to hold the subdirectories that need to be processed. It pushes the root folder path onto the stack.
It then enters a while loop that continues until the stack is empty. Inside the loop, it pops a folder path from the stack, adds it to the subfolders list, and gets the subdirectories of that folder using Directory.GetDirectories. It then pushes each subdirectory onto the stack.
After looping through subdirectories, it returns the subfolders list.
public List<string> GetAllSubFoldersWithStack(string rootFolderPath)
{
var subFolders = new List<string>();
var stack = new Stack<string>();
stack.Push(rootFolderPath);
while (stack.Count > 0)
{
var folderPath = stack.Pop();
subFolders.Add(folderPath);
var subDirectories = Directory.GetDirectories(folderPath);
foreach (var subDirectory in subDirectories)
{
stack.Push(subDirectory);
}
}
return subFolders;
}
Queue
Queue follows (First In First Out) FIFO principal which is opposite to Stack that is it :). In C# it can be generic Queue() and non-generic. It is not strongly typed and not of fixed size.
Queue
//Lets create a Queue
Queue<int> numberQueue = new Queue<int>();
Queue - Enqueue
//Lets push value in Queue
numberQueue .Enqueue(110);
numberQueue .Enqueue(32);
foreach (var item in numberQueue )
{
Console.Write(item + ",");
}
//out put
//32,110
Queue - Dequeue
//Remove the last element from Queue
numberQueue.Dequeue();
//out put
//110
Queue - Peek
//Lets push value back on stack
numberQueue.Enqueue(32);
//Peek() returns the first item from a queue
if(numberQueue.Count > 0){
Console.WriteLine(numberQueue.Peek());
}
//out put
//32
Queue - Contains
//Returns true if item is present
numberQueue.Contains(22);
//out put
//False
numberQueue.Contains(32);
//out put
//true
Queue - Count
//Returns total number of items in Queue
numberQueue.Count
Console.Write("Count...." + numberQueue.Count);
//out put
//Count....2
//Remove all items from stack
numberQueue.Clear()
Top comments (0)