Collections Overview

Refresher on Array Syntax

Arrays are one of the most basic data structures in C#. They provide a way to store a fixed-size collection of elements of the same type. Here’s a quick refresher on how to declare, initialize, and access arrays:

Example

// Declaring an array of integers with 5 elements
int[] numbers = new int[5];
numbers[0] = 1;
numbers[1] = 2;
numbers[2] = 3;
numbers[3] = 4;

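var theDefaultValue = numbers[4]; // 0, because unassigned int elements default to 0
var notPossible = numbers[5]; // throws an IndexOutOfRangeException (valid indexes are 0 to 4)

// Declaring and initializing an array with values in one step
int[] moreNumbers = { 1, 2, 3, 4, 5 };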

Limitations of Array

Arrays, while useful, have several limitations that make them less desirable than other collection types.

  • Fixed Size: Once created, the size of an array cannot change (the elements themselves can still be overwritten).
  • No Resizing: Arrays do not dynamically resize themselves. You must create a new array if you need more space.
  • No Adding/Removing Elements: You cannot add or remove elements from an array directly. You must copy the array to a new one if you need to adjust the size.
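To make the "create a new array" point concrete, here is a minimal sketch of growing an array by hand and with Array.Resize. The variable names are only for illustration.

```csharp
using System;

int[] numbers = { 1, 2, 3 };

// "Growing" an array by hand: allocate a bigger array and copy the old values over.
int[] bigger = new int[numbers.Length + 1];
Array.Copy(numbers, bigger, numbers.Length);
bigger[3] = 4;

// Array.Resize does the same allocate-and-copy for you and reassigns the variable.
Array.Resize(ref numbers, 5);

Console.WriteLine(string.Join(", ", bigger));  // 1, 2, 3, 4
Console.WriteLine(string.Join(", ", numbers)); // 1, 2, 3, 0, 0
```

Either way a brand new array is allocated; the original array never changes size.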

Old Collection Types

Before the introduction of generics in .NET, developers used older collection types that lacked type safety and flexibility. Some of these types are still in use in legacy code but are generally avoided in favor of the newer generic collections.

Descriptions

  • ArrayList: An ArrayList is a non-generic collection that can hold any type of object. However, this flexibility comes at the cost of type safety and performance, as you need to cast elements when retrieving them.
ArrayList arrayList = new ArrayList();
arrayList.Add(1);
arrayList.Add("two"); // No type safety
  • Hashtable: A Hashtable is a non-generic collection that maps keys to values. It uses a hash code for fast lookups.
Hashtable hashtable = new Hashtable();
hashtable["one"] = 1;
hashtable[2] = "two";
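The type-safety cost is easiest to see when reading values back out. The following is a small sketch, assuming the same ArrayList contents as above, of a cast that compiles fine but fails at run time.

```csharp
using System;
using System.Collections;

ArrayList arrayList = new ArrayList();
arrayList.Add(1);
arrayList.Add("two");

// Elements come back as object, so every read needs a cast.
int first = (int)arrayList[0];

// The compiler cannot catch a wrong cast; it only fails at run time.
bool castFailed = false;
try
{
    int second = (int)arrayList[1]; // "two" is a string, not an int
}
catch (InvalidCastException)
{
    castFailed = true;
}

Console.WriteLine(first);      // 1
Console.WriteLine(castFailed); // True
```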

IEnumerable<T> as Our First Generic Collection Type

The IEnumerable<T> interface is one of the most commonly used generic types in .NET. It provides a way to iterate over a collection of items of type T. Every generic collection in .NET, whether it is a class or an interface, implements or extends IEnumerable<T>.

Example

public class Example
{
    public void PrintItems(IEnumerable<string> items)
    {
        //Remember foreach?
        foreach (var item in items)
        {
            Console.WriteLine(item);
        }
    }
}

You will almost never implement IEnumerable<T> directly, but it's important to understand that all generic collections implement it and the implications of its existence. More on this later!

PS, IEnumerable<T> is often called a "sequence".

Lazy evaluation

Lazy evaluation is an important concept in .NET collections, especially when working with IEnumerable<T>. It means that for certain types of operations, values or sequences of values are not generated or fetched until they are actually needed, which can significantly improve performance and reduce memory usage, especially when working with large data sets or expensive computations.

We'll cover this more when we get to LINQ.
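As a small preview, here is a sketch that counts how often some work actually runs. It borrows Select and ToList from LINQ; the Twice helper and the counter are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int callCount = 0;

// Hypothetical "expensive" operation that records each call.
int Twice(int n)
{
    callCount++;
    return n * 2;
}

int[] source = { 1, 2, 3 };

// Building the sequence does not call Twice yet.
IEnumerable<int> doubled = source.Select(n => Twice(n));
int callsBeforeIterating = callCount;

// Iterating (here via ToList) is what triggers the work.
List<int> results = doubled.ToList();
int callsAfterIterating = callCount;

Console.WriteLine(callsBeforeIterating); // 0
Console.WriteLine(callsAfterIterating);  // 3
```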

yield keyword

The yield keyword lets a method hand back the elements of a sequence one at a time. A method that uses yield return is called an iterator method, and it behaves much like the lazily evaluated sequences you will meet in LINQ.

public IEnumerable<int> GetNumbers()
{
    yield return 1;
    yield return 2;
}

This is a fairly complex topic, but for now it's important to understand that yield is a way to return a sequence of values from a method that returns IEnumerable<T>. yield methods are also lazily evaluated.
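To see the laziness in action, here is a sketch of the same GetNumbers method written as a local function that logs each step. The log list is an assumption added purely for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<string> log = new List<string>();

IEnumerable<int> GetNumbers()
{
    log.Add("producing 1");
    yield return 1;
    log.Add("producing 2");
    yield return 2;
}

// Calling the method runs none of its body yet.
IEnumerable<int> numbers = GetNumbers();
int entriesAfterCall = log.Count;

// Asking for only the first element runs the body up to the first yield return.
int first = numbers.First();
int entriesAfterFirst = log.Count;

Console.WriteLine(entriesAfterCall);  // 0
Console.WriteLine(first);             // 1
Console.WriteLine(entriesAfterFirst); // 1
```

The second log entry is never written because nothing ever asked for the second element.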

🌶️🌶️🌶️ Spencer uses yield pretty often, but it's important to understand the implications of its use - there is a fair bit of complexity going on under the hood. See more here.

Introducing the .NET Developer's Favorite Collection Type: List<T>

List<T> is a generic collection in .NET that addresses many of the limitations of arrays. It is flexible, resizable, and provides various methods for adding, removing, and modifying elements.

List<int> numbers = new List<int> { 1, 2, 3, 4, 5 };

The line above uses a collection initializer. In C# 12 and later you can also use the shorter collection expression syntax:

List<int> numbers = [1, 2, 3, 4, 5];

Common Methods for List

  • Add/AddRange
    • Add: Adds a single element to the list. numbers.Add(6);
    • AddRange: Adds multiple elements to the list at once. numbers.AddRange(new int[] { 7, 8, 9 });
  • Remove/RemoveAt
    • Remove: Removes the first occurrence of a specific element. numbers.Remove(4);
    • RemoveAt: Removes an element at the specified index. numbers.RemoveAt(2);
  • Insert: Adds an element at the specified index. numbers.Insert(1, 10);
  • IndexOf: Returns the index of the first occurrence of an element. int index = numbers.IndexOf(5);
  • Count: A property (not a method) that returns the number of elements in the list. int count = numbers.Count;
  • Sort: Sorts the elements in the list in ascending order. numbers.Sort();
  • Reverse: Reverses the order of elements in the list. numbers.Reverse();
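Run in order against one list, the calls above play out as follows. This is just the bullet examples strung together, with the list contents after each step shown in comments.

```csharp
using System;
using System.Collections.Generic;

List<int> numbers = new List<int> { 1, 2, 3, 4, 5 };

numbers.Add(6);                           // 1, 2, 3, 4, 5, 6
numbers.AddRange(new int[] { 7, 8, 9 });  // 1, 2, 3, 4, 5, 6, 7, 8, 9
numbers.Remove(4);                        // removes the value 4: 1, 2, 3, 5, 6, 7, 8, 9
numbers.RemoveAt(2);                      // removes index 2 (the value 3): 1, 2, 5, 6, 7, 8, 9
numbers.Insert(1, 10);                    // 1, 10, 2, 5, 6, 7, 8, 9
int index = numbers.IndexOf(5);           // 3
int count = numbers.Count;                // 8
numbers.Sort();                           // 1, 2, 5, 6, 7, 8, 9, 10
numbers.Reverse();                        // 10, 9, 8, 7, 6, 5, 2, 1

Console.WriteLine(string.Join(", ", numbers));
```

Note the difference between Remove (takes a value) and RemoveAt (takes an index); with a List<int> they are easy to mix up.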

Things to Avoid about List<T>

  • ForEach method: The ForEach method should generally be avoided in favor of more readable and maintainable alternatives such as foreach loops or LINQ expressions.

    // Avoid using List<T>.ForEach like this
    numbers.ForEach(number => Console.WriteLine(number)); //🤢
    
    // Instead, use a standard foreach loop
    foreach (var number in numbers) //😎
    {
        Console.WriteLine(number);
    }
    

Overusing List<T>

While List<T> is a powerful and versatile collection type, my experience is that .NET devs overuse it to the point of absurdity. Most of the time, Spencer uses immutable arrays in normal code, and will use IEnumerable<T> for methods that take in collections as parameters or return collections.
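Here is a sketch of why IEnumerable<T> works well as a parameter type. SumAll is a hypothetical helper, not something from a library; because it only asks for IEnumerable<int>, callers can pass whatever collection they already have.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

// Hypothetical helper: it only needs to iterate, so it asks for IEnumerable<int>.
int SumAll(IEnumerable<int> values)
{
    int total = 0;
    foreach (var value in values)
    {
        total += value;
    }
    return total;
}

int fromArray = SumAll(new[] { 1, 2, 3 });
int fromList = SumAll(new List<int> { 1, 2, 3 });
int fromImmutable = SumAll(ImmutableArray.Create(1, 2, 3));
int fromSet = SumAll(new HashSet<int> { 1, 2, 3 });

Console.WriteLine(fromArray);     // 6
Console.WriteLine(fromImmutable); // 6
```

Had the parameter been List<int>, three of those four callers would have needed to copy their data into a list first.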

Other amazing generic collection types

  • T[] (plain arrays)
  • Dictionary<TKey, TValue>
  • HashSet<T>
  • ImmutableArray<T>
  • Queue<T>
  • Stack<T>

Yes, plain ol' arrays are a generic collection type too! An int[] implements IEnumerable<int>, just like the other types in this list.

int[] numbers = { 1, 2, 3, 4, 5 };

An array itself can never become resizable, but you can copy its contents into a List<T> with ToList() (from System.Linq):

int[] numbers = { 1, 2, 3, 4, 5 };
List<int> numbersAsList = numbers.ToList();

Introducing Dictionary<TKey, TValue>

A Dictionary<TKey, TValue> is a collection that maps keys to values, with the keys and values having a specific type. It uses a hash code for fast lookups.

Setting values:

Dictionary<string, int> dictionary = new Dictionary<string, int>();

//set the values
dictionary["one"] = 1;
dictionary[2] = "two";  //this doesn't compile!

Getting values:

Dictionary<string, int> dictionary = new Dictionary<string, int>();

//set the values
dictionary["one"] = 1;
dictionary["two"] = 2;

//get the values
int thisValue = dictionary["one"];
int thatValue = dictionary["two"];

int doesNotExistValue = dictionary["three"]; //KeyNotFoundException

It has a fair number of useful methods:

  • Add - add a key-value pair to the dictionary. Example: dictionary.Add("three", 3); Will throw an exception if the key already exists in the dictionary.
  • Remove - remove a key-value pair from the dictionary. Example: dictionary.Remove("one");
  • ContainsKey - check if the dictionary contains a key. Example: dictionary.ContainsKey("two");
  • ContainsValue - check if the dictionary contains a value. Example: dictionary.ContainsValue(3);
  • TryGetValue - get the value for a key. Example: dictionary.TryGetValue("three", out int value); Will return true if the key exists in the dictionary, and false otherwise.
  • Clear - remove all key-value pairs from the dictionary. Example: dictionary.Clear();
  • Count - get the number of key-value pairs in the dictionary. Example: int count = dictionary.Count;
  • Keys - get a collection of the keys in the dictionary. Example: IEnumerable<string> keys = dictionary.Keys;
  • Values - get a collection of the values in the dictionary. Example: IEnumerable<int> values = dictionary.Values;
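A short sketch of the two members that most often trip people up: TryGetValue as the safe way to read a key that may be missing, and Add versus the indexer when the key already exists.

```csharp
using System;
using System.Collections.Generic;

Dictionary<string, int> dictionary = new Dictionary<string, int>
{
    ["one"] = 1,
    ["two"] = 2,
};

// TryGetValue reports whether the key was found instead of throwing.
bool foundTwo = dictionary.TryGetValue("two", out int two);       // true, two is 2
bool foundThree = dictionary.TryGetValue("three", out int three); // false, three is 0 (the default)

// Add throws on a duplicate key; the indexer overwrites instead.
bool addThrew = false;
try
{
    dictionary.Add("one", 100);
}
catch (ArgumentException)
{
    addThrew = true;
}
dictionary["one"] = 100; // replaces the existing value

Console.WriteLine(addThrew);          // True
Console.WriteLine(dictionary["one"]); // 100
```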

Next, we look at HashSet<T>

A HashSet<T> is a collection that stores a set of unique values. It uses hash codes for fast lookups, and because it is a set, it does not allow duplicate values.

HashSet<int> hashSet = new HashSet<int>();
hashSet.Add(1);
hashSet.Add(2);
hashSet.Add(2); // This will not be added because it is a duplicate
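You don't have to take the duplicate handling on faith: Add returns a bool telling you whether the value actually went in. A minimal sketch:

```csharp
using System;
using System.Collections.Generic;

HashSet<int> hashSet = new HashSet<int>();

bool addedFirst = hashSet.Add(2);  // true: 2 was not in the set yet
bool addedAgain = hashSet.Add(2);  // false: duplicate, the set is unchanged
bool hasTwo = hashSet.Contains(2); // fast lookup by value
int count = hashSet.Count;

Console.WriteLine(addedAgain); // False
Console.WriteLine(count);      // 1
```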

ImmutableArray<T>

An ImmutableArray<T> is a collection that stores a fixed-size array of values. It is immutable, meaning that once it is created, its size and contents cannot be changed.

ImmutableArray<int> immutableArray = ImmutableArray.Create(1, 2, 3, 4, 5);

Adding to the array returns a new ImmutableArray<T> with the added value.

ImmutableArray<int> newImmutableArray = immutableArray.Add(6);
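The important part is that the original is left untouched. A quick sketch to confirm it (variable names are illustrative):

```csharp
using System;
using System.Collections.Immutable;

ImmutableArray<int> original = ImmutableArray.Create(1, 2, 3);

// Add does not modify 'original'; it returns a new array with the extra value.
ImmutableArray<int> extended = original.Add(4);

Console.WriteLine(original.Length); // 3
Console.WriteLine(extended.Length); // 4
```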

A couple of quick honorable mentions: Queue<T> and Stack<T>

A Queue<T> is a collection that stores a first-in, first-out (FIFO) collection of values.

Queue<int> queue = new Queue<int>();
queue.Enqueue(1);
queue.Enqueue(2);
queue.Enqueue(3);

int first = queue.Dequeue(); // 1
int second = queue.Dequeue(); // 2

A Stack<T> is a collection that stores a last-in, first-out (LIFO) collection of values.

Stack<int> stack = new Stack<int>();
stack.Push(1);
stack.Push(2);
stack.Push(3);

int top = stack.Pop(); // 3
int second = stack.Pop(); // 2
int first = stack.Pop(); // 1

🌶️🌶️🌶️ They're worth mentioning, but Stack<T> and Queue<T> are rarely used in practice.

When to use which collection type

  • Use ImmutableArray<T> when you need a collection that won't change. (My favorite collection type!)
  • Use List<T> when you need a resizable collection that allows fast access to elements by index.
  • Use Dictionary<TKey, TValue> when you need a collection that maps keys to values and allows fast lookups by key.
  • Use HashSet<T> when you need a collection that stores a set of values and allows fast lookups by value.
  • Use Queue<T> when you need a collection that stores a first-in, first-out (FIFO) collection of values.
  • Use Stack<T> when you need a collection that stores a last-in, first-out (LIFO) collection of values.

🌶️🌶️🌶️ Spencer uses ImmutableArray<T> a bunch because he believes in immutable objects. Besides that, he uses Dictionary<TKey, TValue> and List<T> a ton.