DEV Community

Mirza Leka


Getting Started with LINQ (3/3)

Language Integrated Query or LINQ is a C# feature that allows you to query different data sources using a unified language syntax.

In the first two parts we learned what LINQ is and when it is used, and then went through everyday operations: projection, filtering, sorting, set operations, aggregation, etc. If you're not familiar, I suggest taking a peek.

In this final part we'll go over methods to generate, partition, compare and join collections. Once again, we'll be working with a List imported from JSON. Here, you can acquire the data as well as learn how to import it into C#.


GENERATORS

Empty Collection

Let's kick things off with creating empty collections. This is one way to declare an empty Students collection:



var emptyCollection = Enumerable.Empty<Student>();



Alternatively you can do this too:



var emptyCollection = new List<Student>();



In this example, imagine you have to return an empty collection from a method. Here is one way to do it:



    public IEnumerable<Student> GetStudents()
    {
        return Enumerable.Empty<Student>();
    }



Or declare a new empty List when the return type is List<T>:



    public List<Student> GetStudents()
    {
        return new List<Student>();
    }



If you're using C# 12 (which ships with .NET 8) or later, you can shorten this with a collection expression, []. This works for both IEnumerable<T> and List<T>:



    public IEnumerable<Student> GetStudents()
    {
        return [];
    }

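    // Alternative version with a List<T> return type
    // (the two methods can't live in the same class: they differ only by return type)
    public List<Student> GetStudents()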
    {
        return [];
    }



This also works when declaring empty collections:



List<Student> emptyCollection = [];



Range

The Range operator generates an IEnumerable of consecutive integers. It takes the first value and the number of values to generate:



IEnumerable<int> fiveNumbers = Enumerable.Range(1, 5);
// [1, 2, 3, 4, 5]
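
One detail that is easy to miss: the second argument of Range is a count, not the end of the range. A minimal sketch (the variable name fromFive is just for illustration):

```csharp
using System;
using System.Linq;

// Range(start, count): the second argument is how many values to produce,
// not the last value of the range.
int[] fromFive = Enumerable.Range(5, 3).ToArray();

Console.WriteLine(string.Join(", ", fromFive)); // 5, 6, 7
```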



We can also use Range to generate a new collection of students:



IEnumerable<Student> newStudents = Enumerable.Range(1, 3).Select(count =>
{   // creates new Student for each count (1 - 3)
    return new Student()
    {
        ID = count,
        Name = "Placeholder",
        Country = "Placeholder",
        Age = count * 10
    };
});



[Image: students-range]

Repeat

The Repeat operator generates a sequence that contains one repeated value. For example, let's create a sequence with the same student repeated five times:



var repeatTimes = 5;

var csharpStudent = Enumerable.Repeat(new Student()
{
    ID = 1,
    Name = "C#",
    Country = string.Empty
}, repeatTimes);
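
Keep in mind that when the repeated value is a reference type, Repeat does not create copies: every element refers to the same instance. The sketch below uses a List<string> as a stand-in for any class (such as Student), so the names here are illustrative only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// List<string> stands in for any reference type, such as Student.
List<List<string>> copies = Enumerable.Repeat(new List<string>(), 3).ToList();

// Mutating the first element is visible through every other element,
// because all three slots point at the same List instance.
copies[0].Add("changed");

bool sameInstance = object.ReferenceEquals(copies[0], copies[2]);
Console.WriteLine(sameInstance);    // True
Console.WriteLine(copies[2].Count); // 1
```

If you need separate instances instead, combining Range with Select (as in the Range section) creates a new object for each element.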



[Image: repeat-example]

PARTITION

Take, TakeLast, TakeWhile

The Take operator returns the specified number of items from the start of the collection. TakeLast does the same from the end.



IEnumerable<string> firstThreeNames = students.Select(s => s.Name).Take(3);  
// ["Mirza", "Armin", "Alan"]

IEnumerable<string> lastThreeNames = students.Select(s => s.Name).TakeLast(3);
// ["Eddy", "Abdurahman", "Amy"]



If the number specified is greater than the number of elements in the collection, Take simply returns all the elements (without throwing errors).

The TakeWhile operator returns elements from the start of the collection for as long as the condition holds. It stops at the first element that fails the condition and ignores everything after it. In this example, we sort by age first and then limit the collection to students under 20 years old.



IEnumerable<string> teens = students
    .OrderBy(s => s.Age) // order from youngest to oldest
    .TakeWhile(s => s.Age < 20) // take younger than 20. ignore the rest
    .Select(s => s.Name); // take only names
// ["Mirza", "Farook", "Alan", "Eddy", "Abdurahman"]
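
TakeWhile is not a filter: it stops at the first element that fails the condition, which is why the example above sorts by age first. A small sketch with made-up ages shows how it differs from Where on unsorted data:

```csharp
using System;
using System.Linq;

// Made-up, unsorted ages used only for illustration.
int[] ages = { 18, 19, 25, 17 };

// TakeWhile stops at 25 and never reaches 17.
int[] takenWhile = ages.TakeWhile(a => a < 20).ToArray();

// Where checks every element, so 17 is included.
int[] filtered = ages.Where(a => a < 20).ToArray();

Console.WriteLine(string.Join(", ", takenWhile)); // 18, 19
Console.WriteLine(string.Join(", ", filtered));   // 18, 19, 17
```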



Skip, SkipLast, SkipWhile

The Skip operator skips the specified number of elements from the start and returns the rest. SkipLast does the same from the end.



IEnumerable<string> lastFive = students.Skip(5).Select(s => s.Name);
// ["Raj", "Nihad", "Eddy", "Abdurahman", "Amy"] (remaining five)

IEnumerable<string> firstFive = students.SkipLast(5).Select(s => s.Name);
// ["Mirza", "Armin", "Alan", "Seid", "Farook"]



SkipWhile skips elements for as long as the condition holds and returns the rest. Here we're ignoring all the students that are under 20.



IEnumerable<string> adults = students
    .OrderBy(s => s.Age)
    .SkipWhile(s => s.Age < 20)
    .Select(s => s.Name);
// ["Armin", "Raj", "Nihad", "Seid", "Amy"]



The Skip and Take operators are commonly used when creating API pagination.
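
As a rough sketch of that pagination idea, assuming a 1-based page number and a fixed page size (pageNumber, pageSize and items are illustrative names, not part of any API):

```csharp
using System;
using System.Linq;

// Ten placeholder items standing in for rows returned by an API.
int[] items = Enumerable.Range(1, 10).ToArray();

int pageNumber = 2; // 1-based page index (an assumption of this sketch)
int pageSize = 3;

// Skip the items that belong to earlier pages, then take one page.
int[] page = items
    .Skip((pageNumber - 1) * pageSize)
    .Take(pageSize)
    .ToArray();

Console.WriteLine(string.Join(", ", page)); // 4, 5, 6

// The last page may be shorter; Take does not throw when it runs out of items.
int[] lastPage = items.Skip(3 * pageSize).Take(pageSize).ToArray();
Console.WriteLine(string.Join(", ", lastPage)); // 10
```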

EQUALITY

SequenceEqual

The SequenceEqual operator compares two collections and returns true only if they contain equal elements in the same order. Let's demonstrate that with a simple example:



string[] countries = { "Bosnia", "UK", "Turkey" }; 
string[] countries2 = { "Bosnia", "UK", "Turkey" };

var isEqual = countries.SequenceEqual(countries2);
// true



If we changed the order in either collection, the result would be false.



string[] countries = { "Bosnia", "UK", "Turkey" }; 
string[] countries2 = { "Turkey", "Bosnia", "UK" };

var isEqual = countries.SequenceEqual(countries2);
// false
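
If the order should not matter, one possible approach (a sketch, not the only option) is to sort both collections the same way before comparing them:

```csharp
using System;
using System.Linq;

string[] countries = { "Bosnia", "UK", "Turkey" };
string[] countries2 = { "Turkey", "Bosnia", "UK" };

// Order matters for SequenceEqual, so this is false.
bool sameOrder = countries.SequenceEqual(countries2);

// Sorting both sides first compares the contents regardless of order.
bool sameItems = countries
    .OrderBy(c => c, StringComparer.Ordinal)
    .SequenceEqual(countries2.OrderBy(c => c, StringComparer.Ordinal));

Console.WriteLine(sameOrder); // False
Console.WriteLine(sameItems); // True
```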



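Back to the students collection: we can select students from the same country in two different ways and compare the results. Without an Equals override or a custom comparer, SequenceEqual compares class instances by reference: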



// I created this to avoid writing the predicate `s => s.Country == "Bosnia"` twice
Func<Student, bool> isFromBosnia = s => s.Country == "Bosnia";

var studentsFromBosnia = students.TakeWhile(isFromBosnia);
var studentsFromBosnia2 = students.Where(isFromBosnia);

var isEqual = studentsFromBosnia.SequenceEqual(studentsFromBosnia2);
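// true, but only when every student from Bosnia sits at the start of the list,
// because TakeWhile stops at the first student who fails the predicate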



JOINS

In this section we'll look at various ways to combine collections.

Zip

The Zip operator in LINQ pairs elements from two collections based on their positions (indexes). Let's create a new collection that will be paired with the countries collection:



int[] countryCodes = { 387, 44, 20, 90, 91, 86, 1 };
var distinctCountries = students.DistinctBy(s => s.Country).Select(s => s.Country);



Now let's merge the two using Zip:



var countriesMerge = countryCodes.Zip(distinctCountries);



[Image: zip-preview]
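
To make the shape of the Zip result concrete, here is a small self-contained sketch. The countryNames array is a hand-written stand-in for the distinct countries, and it is one element shorter on purpose to show that Zip stops at the end of the shorter collection:

```csharp
using System;
using System.Linq;

int[] countryCodes = { 387, 44, 20, 90 };
string[] countryNames = { "Bosnia", "UK", "Egypt" }; // one element shorter on purpose

// Without a result selector, Zip produces tuples with First and Second elements.
var pairs = countryCodes.Zip(countryNames).ToList();

foreach (var (code, name) in pairs)
{
    Console.WriteLine($"{name}: +{code}");
}

// With a result selector, you decide what each pair turns into.
var labels = countryCodes
    .Zip(countryNames, (code, name) => $"{name} (+{code})")
    .ToList();

Console.WriteLine(labels[0]); // Bosnia (+387)
```

The tuple-returning overload is available in .NET Core 3.0 and later; the result-selector overload also works on older frameworks.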

Concat

The Concat operator is used to concatenate (join) multiple collections together.



int[] nums = { 1, 2, 3, 4, 5 };
int[] newNums = { 100, 2, 300, 4, 500 };

var totalNums = nums.Concat(newNums);



[Image: concat-example]

The Concat operator seems similar to Union, since both operators join collections. However, Concat behaves differently:

  • Duplicate elements are not removed
  • The order is preserved
  • The second collection is added to the end of the first
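
The differences listed above can be seen side by side with the same two arrays:

```csharp
using System;
using System.Linq;

int[] nums = { 1, 2, 3, 4, 5 };
int[] newNums = { 100, 2, 300, 4, 500 };

// Concat keeps every element, including the repeated 2 and 4.
int[] concatenated = nums.Concat(newNums).ToArray();

// Union keeps only the first occurrence of each value.
int[] united = nums.Union(newNums).ToArray();

Console.WriteLine(string.Join(", ", concatenated)); // 1, 2, 3, 4, 5, 100, 2, 300, 4, 500
Console.WriteLine(string.Join(", ", united));       // 1, 2, 3, 4, 5, 100, 300, 500
```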

Let's concatenate the students' ages with another collection of randomly generated ages.



var studentsAges = students.Select(s => s.Age);
// One Random instance is enough; no need to create a new one per element
var random = new Random();
int minAge = 65;
int maxAge = 100;

var eldersAges = Enumerable.Range(1, 10)
    // Generate a random age between 65 and 100 for each of the ten elements
    // (the upper bound of Next is exclusive, hence maxAge + 1)
    .Select(_ => random.Next(minAge, maxAge + 1));

IEnumerable<int> combinedAges = studentsAges.Concat(eldersAges);
int totalAges = combinedAges.Count(); // 20



SelectMany

In the students.json file we're using, each student object has a Classes property, which holds an array of objects. How can we access those?



  {
    "ID": 1,
    "Name": "Mirza",
    "Age": 18,
    "Country": "Bosnia",
    "Classes": [
      {
        "ID": 1,
        "Title": "CAD"
      },
      {
        "ID": 2,
        "Title": "IT"
      }
    ]
  }



Bad way

The first choice would be to use the Select() projection operator:



var classesList = students.Select(s => s.Classes);



Since s.Classes is itself a collection, classesList is a collection of collections (IEnumerable<List<Classes>>). To get a flat list of titles, we'd need to loop through the outer collection and then through each inner collection:



var classesList = new List<string>();

foreach (var stud in students)
{
    foreach (var cl in stud.Classes)
    {
        classesList.Add(cl.Title);
    }
}



However, there is a simpler way.

Better way

Using the SelectMany() projection operator we can drill into the inner array with ease.



IEnumerable<string> classTitles = students
    .SelectMany(s =>
        // a distinct parameter name (c) avoids shadowing the outer s
        s.Classes.Select(c => c.Title)
    );



[Image: select-many]

SelectMany() flattens the result into a single collection: it acts like a join between each outer element and its inner array.

If we expand our student object with a new Hobbies property that contains a List:



public class Student
{
    // ... (existing properties)
    public List<string> Hobbies { get; set; } = [];
}



And then add a few to one of our students:



students.First().Hobbies = new List<string> { "Games", "Hiking", "Blogging" };



We can easily extract them, once again using SelectMany():



var hobbies = students.SelectMany(s => s.Hobbies);



As opposed to doing:



var hobbiesList = new List<string>();

foreach (var stud in students)
{
    foreach (var hob in stud.Hobbies)
    {
        hobbiesList.Add(hob);
    }
}



Alternative SelectMany

SelectMany() also has an overload that accepts two parameters:

  • The first is again a selector for the inner collection we're trying to extract
  • The second is a function that receives each element of the original collection together with each element of its inner collection


var data = students.SelectMany(
    s => s.Hobbies,
    // (element of the original collection, element of the inner collection)
    (student, hobby) => ...
);



Let's use this to create a combination of student names and hobbies:



var studentHobbies = students.SelectMany(
    s => s.Hobbies,
    // hobby is a single hobby of the current student
    (student, hobby) => new { Name = student.Name, Hobby = hobby }
);



The output is the name of the student followed by their hobby, with one item per hobby.

[Image: select-many-alternative]

Join

The Join operator combines two collections by matching their elements on a shared key. For this example I created a new countries collection that we'll join with the students collection.



public class Country
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string CapitalCity { get; set; }
    public string Continent { get; set; }
}




var countries = new List<Country>
{
    new Country() { ID = 1, Name = "Bosnia", CapitalCity = "Sarajevo", Continent = "Europe" },
    new Country() { ID = 2, Name = "UK", CapitalCity = "London", Continent = "Europe" },
    new Country() { ID = 3, Name = "Egypt", CapitalCity = "Cairo", Continent = "Africa" },
    new Country() { ID = 4, Name = "Turkey", CapitalCity = "Ankara", Continent = "Asia" },
    new Country() { ID = 5, Name = "India", CapitalCity = "New Delhi", Continent = "Asia" },
    new Country() { ID = 6, Name = "China", CapitalCity = "Beijing", Continent = "Asia" },
    new Country() { ID = 7, Name = "USA", CapitalCity = "Washington", Continent = "North America" },
    // Countries below have no students:
    new Country() { ID = 8, Name = "Croatia", CapitalCity = "Zagreb", Continent = "Europe" },
    new Country() { ID = 9, Name = "Serbia", CapitalCity = "Belgrade", Continent = "Europe" },
};



All students have a Country property and we'll use that to link the two collections. Here is a basic join:



var studentsCountriesJoin = students.Join(
    countries,
    student => student.Country,
    country => country.Name,
    ((student, country) => ( student: student.Name, continent: country.Continent ))
);



Let's clarify what happened here.

  • students is the outer collection and countries is the inner collection it is joined with. That's the students.Join(countries) part
  • Then we choose the property to join on. Here we match the two on the country name:


    student => student.Country,
    country => country.Name,

// SQL equivalent
ON Student.Country = Country.Name


  • Then the result selector receives each matching pair (student, country)
  • And then we decide what we're going to return. In this case it's a tuple with two named elements, student and continent:


( student: student.Name, continent: country.Continent )



The outcome of the join



var studentsCountriesJoin = students.Join(
    countries,
    student => student.Country,
    country => country.Name,
    ((student, country) => ( student: student.Name, continent: country.Continent ))
);



is the following collection:

[Image: first-join]

LINQ also allows joining on multiple properties as well as chaining multiple Joins. More on that in the video.
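
One common way to join on multiple properties is to return an anonymous type from both key selectors. The sketch below is only an illustration: the data is hypothetical, and tuples stand in for richer classes (the City element and the campuses collection are not part of the students.json data).

```csharp
using System;
using System.Linq;

// Hypothetical data: tuples stand in for richer Student and Campus classes.
var students = new[]
{
    (Name: "Student A", Country: "Bosnia", City: "Sarajevo"),
    (Name: "Student B", Country: "UK", City: "London"),
    (Name: "Student C", Country: "UK", City: "Leeds"),
};

var campuses = new[]
{
    (Title: "Main Campus", Country: "Bosnia", City: "Sarajevo"),
    (Title: "City Campus", Country: "UK", City: "London"),
};

// Both key selectors return an anonymous type with the same property names
// and types, so two keys are equal only when Country and City both match.
var matches = students.Join(
    campuses,
    student => new { Country = student.Country, City = student.City },
    campus => new { Country = campus.Country, City = campus.City },
    (student, campus) => (Student: student.Name, Campus: campus.Title)
).ToList();

foreach (var match in matches)
{
    Console.WriteLine($"{match.Student} -> {match.Campus}");
}
// Student A -> Main Campus
// Student B -> City Campus
```

Student C has no match because only the country matches, not the city.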

Join & Group

Let's join students and countries again, and then group the students by the continent they're from. The desired structure looks like this:



{
  "Europe": [List of students where continent is "Europe"],
  "Africa": [List of students where continent is "Africa"],
  ...
}




var studentsByContinents = students
    .Join(
        countries,
        student => student.Country,
        country => country.Name,
        // Note: we don't need to write { Name = student.Name, Continent = country.Continent };
        // C# infers the property names for us
        ((student, country) => new { student.Name, country.Continent })
    )
    // Now comes the grouping part
    .GroupBy(g => g.Continent)
    .ToDictionary(
        // The continent is the key
        g => g.Key,
        // Value is the list of student names
        g => g.Select(sc => sc.Name).ToList()); 



[Image: join-then-group]

GroupJoin

The GroupJoin operator is used to group elements from the second sequence (right side) that match each element from the first sequence (left side). It produces a hierarchical result set.

To get started, let's look again at our join of students and countries.



var studentsCountriesJoin = students.Join(
    countries,
    student => student.Country,
    country => country.Name,
    (student, country) => new { student.Name, Country = country.Name, country.Continent }
);



[Image: join-again]

We know that the output here is a collection in which each item contains a student's name, their country, and the continent that country belongs to. Now let's see the GroupJoin:



var studentsGroupedByCountry = countries.GroupJoin(
    students,
    country => country.Name,
    student => student.Country,
    (country, studentsGroup) => new
    {
        country.Continent,
        Country = country.Name,
        Students = studentsGroup.Select(s => s.Name)
    }
);



[Image: group-join-example]

Let's analyze what happened here:

  • First of all, we can see that we grouped students by country while joining
  • Second, Students is a nested collection for each country (shown with a Count), rather than a flat list of student names
  • Most importantly, we have countries without students. There are no students from the countries at the bottom, and GroupJoin() shows that with empty groups.

In SQL terms,

  • The Join() operator corresponds to an INNER JOIN, as the result includes only the matching elements from both sequences (only the students and the countries with students).
  • The GroupJoin() operator behaves like a LEFT OUTER JOIN, as it produces all records from the outer (left) collection (countries) together with the matching records from the inner collection (students). As we can see, all countries are in the result, even those without matching students.
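
GroupJoin() returns one group per outer element rather than flat rows. To get rows shaped like a SQL LEFT OUTER JOIN, a common pattern is to follow it with SelectMany() and DefaultIfEmpty(). The sketch below uses small hypothetical tuples instead of the Student and Country classes, and the "(no students)" placeholder is an arbitrary choice:

```csharp
using System;
using System.Linq;

// Hypothetical, trimmed-down data: tuples stand in for Country and Student.
var countries = new[]
{
    (Name: "Bosnia", Continent: "Europe"),
    (Name: "Croatia", Continent: "Europe"),
};

var students = new[]
{
    (Name: "Mirza", Country: "Bosnia"),
};

var leftJoin = countries
    .GroupJoin(
        students,
        country => country.Name,
        student => student.Country,
        (country, matched) => new { country, matched })
    .SelectMany(
        // DefaultIfEmpty supplies a placeholder when a country has no students,
        // so that country still produces one row.
        x => x.matched.Select(s => s.Name).DefaultIfEmpty("(no students)"),
        (x, studentName) => (Country: x.country.Name, Student: studentName))
    .ToList();

foreach (var row in leftJoin)
{
    Console.WriteLine($"{row.Country}: {row.Student}");
}
// Bosnia: Mirza
// Croatia: (no students)
```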

If we applied grouping again, we'd get the same continent groups we had above with Join & Group, this time as a list of objects rather than a dictionary:



var studentsGroupedByCountry = countries.GroupJoin(
    students,
    country => country.Name,
    student => student.Country,
    (country, studentsGroup) => new { country.Continent, Students = studentsGroup.Select(s => s.Name) }
);

var groupedByContinent = studentsGroupedByCountry
    .GroupBy(x => x.Continent)
    .Select(g => new
    {
        Continent = g.Key,
        Students = g.SelectMany(x => x.Students).ToList()
    })
    .ToList();




[Image: group-join-then-group]

Wrapping Up

That's all I wanted to share on LINQ. If you learned something new, don't forget to hit the follow button. Also, follow me on Twitter to stay up to date with my upcoming content.

Bye for now 👋
