Other Important LINQ Methods
In this section, we will explore some additional LINQ methods: Distinct, GroupBy, SelectMany, and Join. These methods provide more complex querying capabilities, such as combining data from multiple sources, grouping like objects, or flattening nested collections.
Distinct
Distinct is a LINQ method used to remove duplicate elements from a sequence, returning only unique values. It is particularly useful when you want to ensure that the resulting collection contains no repeated items.
var numbers = new[] { 1, 2, 3, 2, 4, 4, 5 };
var distinctNumbers = numbers.Distinct();
foreach (var number in distinctNumbers)
{
    Console.WriteLine(number);
}
//output:
// 1
// 2
// 3
// 4
// 5
This also works naturally with anonymous types and records, because both compare by value (property by property) rather than by reference:
var employees = new[]
{
    new { Name = "John", Department = "HR" },
    new { Name = "Jane", Department = "Finance" },
    new { Name = "Mike", Department = "IT" },
    new { Name = "John", Department = "HR" }, // Duplicate
    new { Name = "Sara", Department = "HR" }
};
var distinctEmployees = employees.Distinct();
foreach (var employee in distinctEmployees)
{
    Console.WriteLine($"{employee.Name} from {employee.Department}");
}
//output:
//John from HR
//Jane from Finance
//Mike from IT
//Sara from HR
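The example above uses anonymous types; the same idea can be sketched with a record. EmployeeRecord is a hypothetical type made up for this illustration, and the snippet assumes C# 9 or later (records and top-level statements).

```csharp
using System;
using System.Linq;

var employees = new[]
{
    new EmployeeRecord("John", "HR"),
    new EmployeeRecord("Jane", "Finance"),
    new EmployeeRecord("John", "HR") // Duplicate by value
};

// Records get value-based Equals/GetHashCode from the compiler,
// so the parameterless Distinct() collapses the duplicate.
var distinctEmployees = employees.Distinct().ToList();

foreach (var employee in distinctEmployees)
{
    Console.WriteLine($"{employee.Name} from {employee.Department}");
}
//output:
//John from HR
//Jane from Finance

// Hypothetical record type, declared only for this illustration.
public record EmployeeRecord(string Name, string Department);
```

If Employee were an ordinary class instead, the same call would keep both "John" entries, which is what the next section addresses.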
Using Distinct with a Custom Class (Employee) and IEqualityComparer<T>
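When working with a custom class (like Employee), you need to define how two instances of that class should be compared. By default, Distinct compares class instances by reference, so two separate Employee objects with identical property values are not treated as duplicates. This is where IEqualityComparer<T> comes into play.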
public class EmployeeEqualityComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        // Same instance (or both null) counts as equal
        if (ReferenceEquals(x, y))
            return true;
        // Exactly one side is null
        if (x == null || y == null)
            return false;
        // Compare based on Name and Department
        return x.Name == y.Name && x.Department == y.Department;
    }
    public int GetHashCode(Employee obj)
    {
        if (obj == null)
            return 0;
        // Combine Name and Department into one hash code.
        // Equal objects must produce equal hashes; hashes are not guaranteed unique.
        return HashCode.Combine(obj.Name, obj.Department);
    }
}
Now we'll pass it to Distinct so it can use it:
public class Employee
{
    public string Name { get; set; }
    public string Department { get; set; }
}
var employees = new[]
{
    new Employee { Name = "John", Department = "HR" },
    new Employee { Name = "Jane", Department = "Finance" },
    new Employee { Name = "Mike", Department = "IT" },
    new Employee { Name = "John", Department = "HR" }, // Duplicate
    new Employee { Name = "Sara", Department = "HR" }
};
// Use Distinct with a custom comparer for Employee objects
var distinctEmployees = employees.Distinct(new EmployeeEqualityComparer());
foreach (var employee in distinctEmployees)
{
    Console.WriteLine($"{employee.Name} from {employee.Department}");
}
//output:
//John from HR
//Jane from Finance
//Mike from IT
//Sara from HR
🌶️🌶️🌶️ I rarely bother implementing my own IEqualityComparer<T>. If I need to generate equality members (which I often don't) I use the tooling built into Rider to autogen the code.
Last note: there is a DistinctBy method, introduced in .NET 6, that lets you specify a key (such as a single property) to use for the distinct comparison:
var employees = new[]
{
    new Employee { Name = "John", Department = "HR" },
    new Employee { Name = "Jane", Department = "Finance" },
    new Employee { Name = "Mike", Department = "IT" },
    new Employee { Name = "John", Department = "HR" }, // Duplicate
    new Employee { Name = "Sara", Department = "HR" }
};
// Keeps the first employee seen for each Name: John, Jane, Mike, Sara
var distinctEmployees = employees.DistinctBy(e => e.Name);
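Because DistinctBy only looks at the key you give it, the choice of key changes which elements survive. The sketch below assumes .NET 6 or later and uses anonymous types instead of the Employee class so that it stays self-contained; the sample data is invented for illustration.

```csharp
using System;
using System.Linq;

var employees = new[]
{
    new { Name = "John", Department = "HR" },
    new { Name = "John", Department = "IT" },      // Same name, different department
    new { Name = "Jane", Department = "Finance" },
    new { Name = "John", Department = "HR" }       // Full duplicate
};

// Keyed on Name only: the first "John" wins, the other two are dropped.
var byName = employees.DistinctBy(e => e.Name).ToList();

// Keyed on a tuple of both properties: only the full duplicate is dropped.
var byNameAndDepartment = employees
    .DistinctBy(e => (e.Name, e.Department))
    .ToList();

Console.WriteLine(byName.Count);              // 2
Console.WriteLine(byNameAndDepartment.Count); // 3
```

Keying on a tuple is one way to get "distinct by several properties" without writing a comparer.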
GroupBy
The GroupBy method is very valuable when you need to group similar objects together. It allows you to categorize elements into groups.
Example
var employees = new[]
{
    new { Name = "John", Department = "HR" },
    new { Name = "Jane", Department = "Finance" },
    new { Name = "Mike", Department = "IT" },
    new { Name = "Sara", Department = "HR" }
};
var grouped = employees.GroupBy(e => e.Department);
foreach (var group in grouped)
{
    Console.WriteLine($"Department: {group.Key}");
    foreach (var employee in group)
    {
        Console.WriteLine($"- {employee.Name}");
    }
}
//output:
//Department: HR
//- John
//- Sara
//Department: Finance
//- Jane
//Department: IT
//- Mike
In this example, employees are grouped by their department. The GroupBy method returns a collection of groups, where each group contains elements that share the same key (in this case, the department).
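Since each group is itself a sequence with a Key, a common follow-up is to project every group into a summary. A minimal sketch that reuses the sample data above and counts employees per department (the headcounts name is just for illustration):

```csharp
using System;
using System.Linq;

var employees = new[]
{
    new { Name = "John", Department = "HR" },
    new { Name = "Jane", Department = "Finance" },
    new { Name = "Mike", Department = "IT" },
    new { Name = "Sara", Department = "HR" }
};

// Project each group into its key plus the number of elements sharing it.
var headcounts = employees
    .GroupBy(e => e.Department)
    .Select(g => new { Department = g.Key, Count = g.Count() })
    .ToList();

foreach (var entry in headcounts)
{
    Console.WriteLine($"{entry.Department}: {entry.Count}");
}
//output:
//HR: 2
//Finance: 1
//IT: 1
```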
SelectMany
The SelectMany method is used to flatten a collection of collections into a single sequence. This is particularly useful when you have objects that contain other collections, and you want to work with all of the inner items as one flat list.
Example
var managers = new[]
{
    new { ManagerName = "John", Employees = new[] { "Mike", "Jane" } },
    new { ManagerName = "Sara", Employees = new[] { "Peter", "Chris" } }
};
var allEmployees = managers.SelectMany(m => m.Employees);
foreach (var employee in allEmployees)
{
    Console.WriteLine(employee);
}
//output:
//Mike
//Jane
//Peter
//Chris
In this example, SelectMany flattens the employee arrays from each manager into a single sequence of employees.
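Flattening this way drops the link back to the manager. SelectMany also has an overload that takes a second lambda (a result selector) receiving both the parent and each child, which lets you keep that link. A minimal sketch using the same sample data (the pairs name is just for illustration):

```csharp
using System;
using System.Linq;

var managers = new[]
{
    new { ManagerName = "John", Employees = new[] { "Mike", "Jane" } },
    new { ManagerName = "Sara", Employees = new[] { "Peter", "Chris" } }
};

// First lambda picks the inner collection; the second combines
// each manager with each of their employees into one flat result.
var pairs = managers
    .SelectMany(m => m.Employees,
                (m, employee) => new { m.ManagerName, Employee = employee })
    .ToList();

foreach (var pair in pairs)
{
    Console.WriteLine($"{pair.Employee} reports to {pair.ManagerName}");
}
//output:
//Mike reports to John
//Jane reports to John
//Peter reports to Sara
//Chris reports to Sara
```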
Join
The Join method is used to combine data from two sequences based on a common key, producing one result for each pair of elements whose keys match.
Query Syntax and Lambda Syntax
var employees = new[]
{
    new { EmployeeId = 1, Name = "John", DepartmentId = 1 },
    new { EmployeeId = 2, Name = "Jane", DepartmentId = 2 },
    new { EmployeeId = 3, Name = "Mike", DepartmentId = 3 }
};
var departments = new[]
{
    new { DepartmentId = 1, Department = "HR" },
    new { DepartmentId = 2, Department = "Finance" },
    new { DepartmentId = 3, Department = "IT" }
};
//query syntax
var queryResult = from e in employees
                  join d in departments on e.DepartmentId equals d.DepartmentId
                  select new { e.Name, d.Department };
foreach (var item in queryResult)
{
    Console.WriteLine($"{item.Name} works in {item.Department}");
}
//lambda syntax (same result as the query above)
var lambdaResult = employees.Join(departments,
    e => e.DepartmentId,
    d => d.DepartmentId,
    (e, d) => new { e.Name, d.Department });
foreach (var item in lambdaResult)
{
    Console.WriteLine($"{item.Name} works in {item.Department}");
}
//output (for each loop):
//John works in HR
//Jane works in Finance
//Mike works in IT
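One detail worth knowing: Join behaves like an inner join, so elements without a matching key on the other side are left out of the result. A minimal sketch with invented sample data, where one employee and one department have no match:

```csharp
using System;
using System.Linq;

var employees = new[]
{
    new { Name = "John", DepartmentId = 1 },
    new { Name = "Jane", DepartmentId = 2 },
    new { Name = "Mike", DepartmentId = 99 }  // No matching department
};
var departments = new[]
{
    new { DepartmentId = 1, Department = "HR" },
    new { DepartmentId = 2, Department = "Finance" },
    new { DepartmentId = 3, Department = "IT" }  // No matching employee
};

// Only pairs with equal keys are produced; Mike and IT are both dropped.
var matched = employees.Join(departments,
        e => e.DepartmentId,
        d => d.DepartmentId,
        (e, d) => new { e.Name, d.Department })
    .ToList();

foreach (var item in matched)
{
    Console.WriteLine($"{item.Name} works in {item.Department}");
}
//output:
//John works in HR
//Jane works in Finance
```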
🌶️🌶️🌶️ I almost never use Join in LINQ because I usually don't have to (and because the syntax to do joins frankly sucks) - more on that when we get to the next course. (Joins in LINQ are also maybe the best use case for query syntax... but I still don't use them!)