Other Important LINQ Methods

In this section, we will explore some additional LINQ methods: Distinct, GroupBy, SelectMany, and Join. These methods provide more complex querying capabilities, such as combining data from multiple sources, grouping like objects, or flattening nested collections.

Distinct

Distinct is a LINQ method used to remove duplicate elements from a sequence, returning only unique values. It is particularly useful when you want to ensure that the resulting collection contains no repeated items.

var numbers = new[] { 1, 2, 3, 2, 4, 4, 5 };

var distinctNumbers = numbers.Distinct();

foreach (var number in distinctNumbers)
{
    Console.WriteLine(number);
}

//output: 
// 1
// 2
// 3
// 4
// 5

This also works naturally with anonymous types and records, because both compare by value rather than by reference:

var employees = new[]
{
    new { Name = "John", Department = "HR" },
    new { Name = "Jane", Department = "Finance" },
    new { Name = "Mike", Department = "IT" },
    new { Name = "John", Department = "HR" }, // Duplicate
    new { Name = "Sara", Department = "HR" }
};

var distinctEmployees = employees.Distinct();

foreach (var employee in distinctEmployees)
{
    Console.WriteLine($"{employee.Name} from {employee.Department}");
}

//output:
//John from HR
//Jane from Finance
//Mike from IT
//Sara from HR
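
Here is a minimal sketch of the same idea with a record. EmployeeRecord is a hypothetical type made up for this illustration, and the snippet assumes C# 9 or later with top-level statements.

```csharp
using System;
using System.Linq;

var employees = new[]
{
    new EmployeeRecord("John", "HR"),
    new EmployeeRecord("Jane", "Finance"),
    new EmployeeRecord("John", "HR") // same values as the first entry
};

// Records get value-based Equals/GetHashCode from the compiler,
// so Distinct treats the second "John" as a duplicate.
foreach (var employee in employees.Distinct())
{
    Console.WriteLine($"{employee.Name} from {employee.Department}");
}

// output:
// John from HR
// Jane from Finance

// Hypothetical record used only for this illustration.
record EmployeeRecord(string Name, string Department);
```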

Using Distinct with a Custom Class (Employee) and IEqualityComparer<T>

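When working with a custom class (like Employee), Distinct compares instances by reference by default, so two objects with identical values still count as different. To change that, you need to define how two instances of that class should be compared. This is where IEqualityComparer<T> comes into play.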

public class EmployeeEqualityComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        // Same reference (or both null) counts as equal
        if (ReferenceEquals(x, y))
            return true;

        if (x == null || y == null)
            return false;

        // Compare based on Name and Department
        return x.Name == y.Name && x.Department == y.Department;
    }

    public int GetHashCode(Employee obj)
    {
        if (obj == null)
            return 0;

        // Combine Name and Department into one hash code. Unlike string
        // concatenation, this does not force ("ab", "c") and ("a", "bc")
        // to hash identically.
        return HashCode.Combine(obj.Name, obj.Department);
    }
}

Now we'll define the Employee class and pass the comparer to Distinct:

public class Employee
{
    public string Name { get; set; }
    public string Department { get; set; }
}

var employees = new[]
{
    new Employee { Name = "John", Department = "HR" },
    new Employee { Name = "Jane", Department = "Finance" },
    new Employee { Name = "Mike", Department = "IT" },
    new Employee { Name = "John", Department = "HR" }, // Duplicate
    new Employee { Name = "Sara", Department = "HR" }
};

// Use Distinct with a custom comparer for Employee objects
var distinctEmployees = employees.Distinct(new EmployeeEqualityComparer());

foreach (var employee in distinctEmployees)
{
    Console.WriteLine($"{employee.Name} from {employee.Department}");
}

//output:
//John from HR
//Jane from Finance
//Mike from IT
//Sara from HR

🌶️🌶️🌶️ I rarely bother implementing my own IEqualityComparer<T>. If I need to generate equality members (which I often don't) I use the tooling built into Rider to autogen the code.

Last note: there is a DistinctBy method (added in .NET 6) that lets you specify a key, such as a single property, to use for the distinct comparison:

var employees = new[]
{
    new Employee { Name = "John", Department = "HR" },
    new Employee { Name = "Jane", Department = "Finance" },
    new Employee { Name = "Mike", Department = "IT" },
    new Employee { Name = "John", Department = "HR" }, // Duplicate
    new Employee { Name = "Sara", Department = "HR" }
};
var distinctEmployees = employees.DistinctBy(e => e.Name);
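
To make the effect easier to see, here is a small sketch that keys on Department instead of Name. It assumes .NET 6 or later, and it relies on the LINQ to Objects behavior of yielding the first element seen for each key.

```csharp
using System;
using System.Linq;

var employees = new[]
{
    new { Name = "John", Department = "HR" },
    new { Name = "Jane", Department = "Finance" },
    new { Name = "Mike", Department = "IT" },
    new { Name = "Sara", Department = "HR" } // same Department as John
};

// Only the key (Department) is compared, not the whole object.
var onePerDepartment = employees.DistinctBy(e => e.Department);

foreach (var employee in onePerDepartment)
{
    Console.WriteLine($"{employee.Name} from {employee.Department}");
}

// output:
// John from HR
// Jane from Finance
// Mike from IT
```

Sara is dropped here even though she is a different person, because her key matches John's. Keep that in mind when picking the key.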

GroupBy

The GroupBy method is very valuable when you need to group similar objects together. It allows you to categorize elements into groups.

Example

var employees = new[]
{
    new { Name = "John", Department = "HR" },
    new { Name = "Jane", Department = "Finance" },
    new { Name = "Mike", Department = "IT" },
    new { Name = "Sara", Department = "HR" }
};

var grouped = employees.GroupBy(e => e.Department);

foreach (var group in grouped)
{
    Console.WriteLine($"Department: {group.Key}");
    foreach (var employee in group)
    {
        Console.WriteLine($"- {employee.Name}");
    }
}

//output: 
//Department: HR
//- John
//- Sara
//Department: Finance
//- Jane
//Department: IT
//- Mike

In this example, employees are grouped by their department. The GroupBy method returns a collection of groups, where each group contains elements that share the same key (in this case, the department).
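
Groups are often projected into a summary right away. As a minimal sketch (the headcounts name is just for illustration), this counts the employees in each department:

```csharp
using System;
using System.Linq;

var employees = new[]
{
    new { Name = "John", Department = "HR" },
    new { Name = "Jane", Department = "Finance" },
    new { Name = "Mike", Department = "IT" },
    new { Name = "Sara", Department = "HR" }
};

// Turn each group into its key plus a count of its members.
var headcounts = employees
    .GroupBy(e => e.Department)
    .Select(g => new { Department = g.Key, Count = g.Count() });

foreach (var entry in headcounts)
{
    Console.WriteLine($"{entry.Department}: {entry.Count}");
}

// output:
// HR: 2
// Finance: 1
// IT: 1
```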

SelectMany

The SelectMany method is used to flatten a collection of collections into a single sequence. This is particularly useful when you have objects that contain other collections, and you want to merge those inner collections into one flat list.

Example

var managers = new[]
{
    new { ManagerName = "John", Employees = new[] { "Mike", "Jane" } },
    new { ManagerName = "Sara", Employees = new[] { "Peter", "Chris" } }
};

var allEmployees = managers.SelectMany(m => m.Employees);

foreach (var employee in allEmployees)
{
    Console.WriteLine(employee);
}

//output:
//Mike
//Jane
//Peter
//Chris

In this example, SelectMany flattens the employee arrays from each manager into a single sequence of employees.
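
Flattening like this loses track of which manager each employee came from. If you need that, SelectMany has an overload that takes a second lambda to combine each parent with each child. A minimal sketch, with the pairs name and the output wording chosen just for illustration:

```csharp
using System;
using System.Linq;

var managers = new[]
{
    new { ManagerName = "John", Employees = new[] { "Mike", "Jane" } },
    new { ManagerName = "Sara", Employees = new[] { "Peter", "Chris" } }
};

// First lambda picks the inner collection, second lambda builds
// one result per (manager, employee) pair.
var pairs = managers.SelectMany(
    m => m.Employees,
    (m, employee) => new { m.ManagerName, Employee = employee });

foreach (var pair in pairs)
{
    Console.WriteLine($"{pair.Employee} reports to {pair.ManagerName}");
}

// output:
// Mike reports to John
// Jane reports to John
// Peter reports to Sara
// Chris reports to Sara
```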

Join

The Join method is used to combine data from two sequences based on a common key.

Example

var employees = new[]
{
    new { EmployeeId = 1, Name = "John", DepartmentId = 1 },
    new { EmployeeId = 2, Name = "Jane", DepartmentId = 2 },
    new { EmployeeId = 3, Name = "Mike", DepartmentId = 3 }
};

var departments = new[]
{
    new { DepartmentId = 1, Department = "HR" },
    new { DepartmentId = 2, Department = "Finance" },
    new { DepartmentId = 3, Department = "IT" }
};

//query syntax
var result = from e in employees
             join d in departments on e.DepartmentId equals d.DepartmentId
             select new { e.Name, d.Department };

foreach (var item in result)
{
    Console.WriteLine($"{item.Name} works in {item.Department}");
}

//lambda syntax (same join; a different variable name so both versions compile in one scope)
var lambdaResult = employees.Join(departments,
                                  e => e.DepartmentId,
                                  d => d.DepartmentId,
                                  (e, d) => new { e.Name, d.Department });

foreach (var item in lambdaResult)
{
    Console.WriteLine($"{item.Name} works in {item.Department}");
}
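
One thing worth knowing: Join is an inner join, so anything without a match on the other side silently drops out. Here is a small sketch of that, where the "Alex" row and the unused IT department are made up for illustration:

```csharp
using System;
using System.Linq;

var employees = new[]
{
    new { Name = "John", DepartmentId = 1 },
    new { Name = "Jane", DepartmentId = 2 },
    new { Name = "Alex", DepartmentId = 99 } // no department has this id
};

var departments = new[]
{
    new { DepartmentId = 1, Department = "HR" },
    new { DepartmentId = 2, Department = "Finance" },
    new { DepartmentId = 3, Department = "IT" } // no employee points here
};

// Only pairs whose keys match on both sides make it into the result.
var matched = employees.Join(departments,
                             e => e.DepartmentId,
                             d => d.DepartmentId,
                             (e, d) => new { e.Name, d.Department });

foreach (var item in matched)
{
    Console.WriteLine($"{item.Name} works in {item.Department}");
}

// output:
// John works in HR
// Jane works in Finance
```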

🌶️🌶️🌶️ I almost never use Join in LINQ because I usually don't have to (and because the syntax to do joins frankly sucks) - more on that when we get to the next course. (Joins in LINQ are also maybe the best use case for query syntax... but I still don't use them!)