Collections Overview
Refresher on Array Syntax
Arrays are one of the most basic data structures in C#. They provide a way to store a fixed-size collection of elements of the same type. Here’s a quick refresher on how to declare, initialize, and access arrays:
Example
// Declaring an array of integers with 5 elements
int[] numbers = new int[5];
numbers[0] = 1;
numbers[1] = 2;
numbers[2] = 3;
numbers[3] = 4;
var theDefaultValue = numbers[4]; //0
var notPossible = numbers[5]; //throws an IndexOutOfRangeException
// Initializing an array with values (a different name avoids redeclaring numbers)
int[] moreNumbers = { 1, 2, 3, 4, 5 };
Limitations of Array
Arrays, while useful, have several limitations that make them less desirable than other collection types.
- Fixed Size: Once created, the size of an array cannot change.
- No Resizing: Arrays do not dynamically resize themselves. You must create a new array if you need more space.
- No Adding/Removing Elements: You cannot add or remove elements from an array directly. You must copy the array to a new one if you need to adjust the size.
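Because an array's length is fixed, "adding" an element really means allocating a bigger array and copying the old values across. A minimal sketch of that idea follows; ArrayGrowthSketch and Append are hypothetical names used only for illustration.

```csharp
using System;

public static class ArrayGrowthSketch
{
    // Hypothetical helper: returns a NEW array that is one element longer.
    // The source array is left untouched because its length can never change.
    public static int[] Append(int[] source, int value)
    {
        var result = new int[source.Length + 1];
        Array.Copy(source, result, source.Length); // copy the existing elements
        result[source.Length] = value;             // place the new value at the end
        return result;
    }
}
```

Calling ArrayGrowthSketch.Append(new[] { 1, 2, 3 }, 4) gives back a four-element array, while the original still has three elements. This copy-on-every-change cost is what resizable collections hide from you.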
Old Collection Types
Before the introduction of generics in .NET, developers used older collection types that lacked type safety and flexibility. Some of these types are still in use in legacy code but are generally avoided in favor of newer generic collections.
Descriptions
ArrayList: An ArrayList is a non-generic collection that can hold any type of object. However, this flexibility comes at the cost of type safety and performance, as you need to cast elements when retrieving them.
ArrayList arrayList = new ArrayList();
arrayList.Add(1);
arrayList.Add("two"); // No type safety
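To make the casting cost concrete, here is a small sketch of reading values back out of an ArrayList. ArrayListCastSketch and SumInts are hypothetical names; the point is only that every element comes back as object and has to be checked or cast at runtime.

```csharp
using System.Collections;

public static class ArrayListCastSketch
{
    // Hypothetical helper: adds up only the elements that really are ints.
    public static int SumInts(ArrayList items)
    {
        var total = 0;
        foreach (object item in items)
        {
            // The compiler only knows this is an object, so we test the type.
            if (item is int number)
            {
                total += number;
            }
        }
        return total;
    }
}
```

With an ArrayList containing 1, "two", and 3, SumInts returns 4 and silently skips the string. A direct cast such as (int)arrayList[1] would compile but throw an InvalidCastException at runtime, which is exactly the kind of mistake generic collections catch at compile time.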
Hashtable: A Hashtable is a non-generic collection that maps keys to values. It uses a hash code for fast lookups.
Hashtable hashtable = new Hashtable();
hashtable["one"] = 1;
hashtable[2] = "two";
IEnumerable<T> as Our First Generic Collection Type
The IEnumerable<T> interface represents a sequence of elements that you can iterate over with foreach.
Example
public class Example
{
public void PrintItems(IEnumerable<string> items)
{
//Remember foreach?
foreach (var item in items)
{
Console.WriteLine(item);
}
}
}
You will almost never implement IEnumerable<T> directly, but it's important to understand that all generic collections implement it and the implications of its existence. More on this later!
PS, IEnumerable<T> is often called a "sequence".
Lazy evaluation
Lazy evaluation is an important concept in .NET collections, especially when working with IEnumerable<T>. It means that for certain types of operations, values or sequences of values are not generated or fetched until they are actually needed, which can significantly improve performance and reduce memory usage, especially when working with large data sets or expensive computations.
We'll cover this more when we get to LINQ.
yield keyword
The yield keyword is used to return elements one at a time from a method. It is commonly used to produce sequences that are then consumed by foreach loops or LINQ queries.
public IEnumerable<int> GetNumbers()
{
yield return 1;
yield return 2;
}
This is a fairly complex topic, but for now it's important to understand that yield is a way to return a sequence of values from a method that returns IEnumerable<T>. yield methods are also lazily evaluated.
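One way to see the laziness is to count how many values a yield method has actually produced. The sketch below is illustrative only: LazySketch, CountTo, and the Produced counter are hypothetical names, and the public static counter exists purely to make the behavior observable.

```csharp
using System.Collections.Generic;

public static class LazySketch
{
    // Demo-only counter: how many values the iterator has produced so far.
    public static int Produced;

    // Hypothetical iterator method: yields 1, 2, 3, ... up to max.
    public static IEnumerable<int> CountTo(int max)
    {
        for (var i = 1; i <= max; i++)
        {
            Produced++;       // runs only when the caller asks for the next value
            yield return i;
        }
    }
}
```

Calling LazySketch.CountTo(1000000) on its own runs none of the loop, so Produced stays at 0. If a foreach over the result breaks when it reaches 3, Produced ends at 3: the remaining values are never generated.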
🌶️🌶️🌶️ Spencer uses yield pretty often, but it's important to understand the implications of its use - there is a fair bit of complexity going on under the hood. See more here.
Introducing the .NET Developer's Favorite Collection Type: List<T>
The List<T> class is a resizable collection that allows fast access to elements by index.
List<int> numbers = new List<int> { 1, 2, 3, 4, 5 };
You can even declare it with collection expression syntax in C# 12 and later:
List<int> numbers = [1, 2, 3, 4, 5];
Common Methods for List
Add / AddRange
- Add: Adds a single element to the list. Example: numbers.Add(6);
- AddRange: Adds multiple elements to the list at once. Example: numbers.AddRange(new int[] { 7, 8, 9 });
Remove / RemoveAt
- Remove: Removes the first occurrence of a specific element. Example: numbers.Remove(4);
- RemoveAt: Removes the element at the specified index. Example: numbers.RemoveAt(2);
Other common members
- Insert: Adds an element at the specified index. Example: numbers.Insert(1, 10);
- IndexOf: Returns the index of the first occurrence of an element. Example: int index = numbers.IndexOf(5);
- Count: Returns the number of elements in the list. Example: int count = numbers.Count;
- Sort: Sorts the elements in the list in ascending order. Example: numbers.Sort();
- Reverse: Reverses the order of elements in the list. Example: numbers.Reverse();
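It can help to see what the list looks like after each of the calls listed above when they run in order, starting from 1, 2, 3, 4, 5. The sketch below traces that; ListWalkthrough and Run are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;

public static class ListWalkthrough
{
    // Hypothetical helper: applies the example calls in order and returns the result.
    public static List<int> Run()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5 };

        numbers.Add(6);                          // 1, 2, 3, 4, 5, 6
        numbers.AddRange(new int[] { 7, 8, 9 }); // 1, 2, 3, 4, 5, 6, 7, 8, 9
        numbers.Remove(4);                       // 1, 2, 3, 5, 6, 7, 8, 9 (removes the value 4)
        numbers.RemoveAt(2);                     // 1, 2, 5, 6, 7, 8, 9 (removes index 2, the value 3)
        numbers.Insert(1, 10);                   // 1, 10, 2, 5, 6, 7, 8, 9
        int index = numbers.IndexOf(5);          // 3
        int count = numbers.Count;               // 8
        numbers.Sort();                          // 1, 2, 5, 6, 7, 8, 9, 10
        numbers.Reverse();                       // 10, 9, 8, 7, 6, 5, 2, 1

        return numbers;
    }
}
```

Note the difference between Remove, which takes a value, and RemoveAt, which takes an index. With a List<int> the two are easy to mix up because both arguments are integers.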
Things to Avoid about List<T>
ForEach method
The ForEach method should generally be avoided in favor of more readable and maintainable alternatives such as foreach loops or LINQ expressions.
// Avoid using List<T>.ForEach like this
numbers.ForEach(number => Console.WriteLine(number)); //🤢
// Instead, use a standard foreach loop
foreach (var number in numbers) //😎
{
    Console.WriteLine(number);
}
Overusing List<T>
While List<T> is a powerful and versatile collection type, my experience is that .NET devs overuse it to the point of absurdity. Most of the time, Spencer uses immutable arrays in normal code, and will use IEnumerable<T> for methods that take in collections as parameters or return collections.
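The payoff of accepting IEnumerable<T> as a parameter is that callers can pass whatever collection they already have. A minimal sketch, where SequenceSketch and Total are hypothetical names:

```csharp
using System.Collections.Generic;

public static class SequenceSketch
{
    // Hypothetical helper: works with any sequence of ints, not just List<int>.
    public static int Total(IEnumerable<int> values)
    {
        var sum = 0;
        foreach (var value in values)
        {
            sum += value;
        }
        return sum;
    }
}
```

The same method accepts an int[], a List<int>, or a HashSet<int> without any conversion, because all of them implement IEnumerable<int>. Had the parameter been declared as List<int>, array callers would be forced to copy their data first.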
Other amazing generic collection types
T[] (plain arrays)
Dictionary<TKey, TValue>
HashSet<T>
ImmutableArray<T>
Queue<T>
Stack<T>
Yes, plain ol' arrays count too! A T[] implements IEnumerable<T> just like the other collections here.
int[] numbers = { 1, 2, 3, 4, 5 };
If you need a dynamically resizable collection, you can copy the array into a List<T> instead:
int[] numbers = { 1, 2, 3, 4, 5 };
List<int> numbersAsList = numbers.ToList();
Introducing Dictionary<TKey, TValue>
A Dictionary<TKey, TValue> is a collection that maps keys to values, with the keys and values having a specific type. It uses a hash code for fast lookups.
Setting values:
Dictionary<string, int> dictionary = new Dictionary<string, int>();
//set the values
dictionary["one"] = 1;
dictionary[2] = "two"; //this doesn't compile!
Getting values:
Dictionary<string, int> dictionary = new Dictionary<string, int>();
//set the values
dictionary["one"] = 1;
dictionary["two"] = 2;
//get the values
int thisValue = dictionary["one"];
int thatValue = dictionary["two"];
int doesNotExistValue = dictionary["three"]; //KeyNotFoundException
It has a fair number of useful methods:
- Add: add a key-value pair to the dictionary. Example: dictionary.Add("three", 3); Will throw an ArgumentException if the key already exists in the dictionary.
- Remove: remove a key-value pair from the dictionary. Example: dictionary.Remove("one");
- ContainsKey: check if the dictionary contains a key. Example: dictionary.ContainsKey("two");
- ContainsValue: check if the dictionary contains a value. Example: dictionary.ContainsValue(3);
- TryGetValue: get the value for a key. Example: dictionary.TryGetValue("three", out int value); Will return true if the key exists in the dictionary, and false otherwise.
- Clear: remove all key-value pairs from the dictionary. Example: dictionary.Clear();
- Count: get the number of key-value pairs in the dictionary. Example: int count = dictionary.Count;
- Keys: get a collection of the keys in the dictionary. Example: IEnumerable<string> keys = dictionary.Keys;
- Values: get a collection of the values in the dictionary. Example: IEnumerable<int> values = dictionary.Values;
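Since the indexer throws a KeyNotFoundException for a missing key, TryGetValue is the usual way to look something up when you are not sure it is there. A small sketch of that pattern follows; DictionaryLookupSketch and GetOrDefault are hypothetical names for illustration.

```csharp
using System.Collections.Generic;

public static class DictionaryLookupSketch
{
    // Hypothetical helper: returns the stored value, or a fallback if the key is missing.
    public static int GetOrDefault(Dictionary<string, int> source, string key, int fallback)
    {
        // TryGetValue does the existence check and the read in a single lookup.
        return source.TryGetValue(key, out var value) ? value : fallback;
    }
}
```

For a dictionary holding "one" and "two", GetOrDefault(dictionary, "one", -1) returns 1, and GetOrDefault(dictionary, "three", -1) returns -1 instead of throwing.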
Next, we look at HashSet<T>
A HashSet<T> is a collection that stores a set of values. It uses hash codes for fast lookups and, being a set, does not allow duplicate values.
HashSet<int> hashSet = new HashSet<int>();
hashSet.Add(1);
hashSet.Add(2);
hashSet.Add(2); // This will not be added because it is a duplicate
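Add does not throw on a duplicate. It returns false to tell you the value was already present. That return value makes duplicate detection a one-liner, as in this sketch (HashSetSketch and CountDistinct are hypothetical names):

```csharp
using System.Collections.Generic;

public static class HashSetSketch
{
    // Hypothetical helper: counts how many different values appear in a sequence.
    public static int CountDistinct(IEnumerable<int> values)
    {
        var seen = new HashSet<int>();
        var distinct = 0;
        foreach (var value in values)
        {
            // Add returns true only the first time a value is seen.
            if (seen.Add(value))
            {
                distinct++;
            }
        }
        return distinct;
    }
}
```

Passing 1, 2, 2 to CountDistinct gives 2, matching the example above where the second hashSet.Add(2) is ignored.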
ImmutableArray<T>
An ImmutableArray<T> is a collection that stores a fixed-size array of values. It is immutable, meaning that once it is created, its size and contents cannot be changed.
ImmutableArray<int> immutableArray = ImmutableArray.Create(1, 2, 3, 4, 5);
Adding to the array returns a new ImmutableArray<T> with the added value.
ImmutableArray<int> newImmutableArray = immutableArray.Add(6);
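The key consequence is that the original array is untouched after Add. The sketch below checks that by comparing lengths; ImmutableSketch and AddLeavesOriginalAlone are hypothetical names, and the code assumes the System.Collections.Immutable namespace is available in your project.

```csharp
using System.Collections.Immutable;

public static class ImmutableSketch
{
    // Hypothetical helper: returns the lengths of the original and the updated array.
    public static (int OriginalLength, int UpdatedLength) AddLeavesOriginalAlone()
    {
        ImmutableArray<int> original = ImmutableArray.Create(1, 2, 3);
        ImmutableArray<int> updated = original.Add(4); // builds a new array

        // original still has 3 elements; only updated has 4.
        return (original.Length, updated.Length);
    }
}
```

A common slip is to call original.Add(4) and discard the result, which changes nothing. The return value is the only place the new element exists.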
A couple of quick honorable mentions: Queue<T> and Stack<T>
A Queue<T> is a collection that stores values in first-in, first-out (FIFO) order.
Queue<int> queue = new Queue<int>();
queue.Enqueue(1);
queue.Enqueue(2);
queue.Enqueue(3);
int first = queue.Dequeue(); // 1
int second = queue.Dequeue(); // 2
A Stack<T> is a collection that stores values in last-in, first-out (LIFO) order.
Stack<int> stack = new Stack<int>();
stack.Push(1);
stack.Push(2);
stack.Push(3);
int top = stack.Pop(); // 3
int second = stack.Pop(); // 2
int first = stack.Pop(); // 1
🌶️🌶️🌶️ They're worth mentioning, but Stack<T> and Queue<T> are rarely used in practice.
When to use which collection type
- Use ImmutableArray<T> when you need a collection that won't change. (My favorite collection type!)
- Use List<T> when you need a resizable collection that allows fast access to elements by index.
- Use Dictionary<TKey, TValue> when you need a collection that maps keys to values and allows fast lookups by key.
- Use HashSet<T> when you need a collection that stores a set of values and allows fast lookups by value.
- Use Queue<T> when you need a collection that stores a first-in, first-out (FIFO) collection of values.
- Use Stack<T> when you need a collection that stores a last-in, first-out (LIFO) collection of values.
🌶️🌶️🌶️ Spencer uses ImmutableArray<T> a bunch because he believes in immutable objects. Besides that, he uses Dictionary<TKey, TValue> and List<T> a ton.