Querying with EF Core Basics

Up to this point, we've:

Created a DbContext class to represent our database connection.
Created a DbSet property for each entity we want to query.
Created our migrations
Seeded our database

Now we want to learn how to query our database.

Querying with LINQ

The main way we'll be querying our database is by using LINQ.

Quick refresher on LINQ: we can use LINQ to transform and filter data.

var highEarners = employees
    .Where(e => e.Salary > 100000)
    .OrderBy(e => e.Name)
    .Select(e => e.Name);

In this example, we're:

Filtering the employees list to only include employees with a salary greater than 100,000.
Ordering the filtered list by the employee's name.
Selecting the name of each employee.

Entity Framework Core allows us to write LINQ queries to query our database.

So, let's start by discussing our existing repo pattern with EF Core.

The Repository Pattern, EF Core, and DbContext

Remember that our IRepository interface is our reflection of the Repository pattern in our code.

public interface IRepository<T>
{
    T? GetById(int id);
    IEnumerable<T> GetAll();
    void Create(T entity);
    void Update(T entity);
    void Delete(T entity);
}

But what does this actually look like as apps get more complex?

Consider an example where we want to be able to find employees not just by ID, but their SSN. Already that breaks our IRepository interface as we can't make that a method in our interface - it doesn't make sense as a general repository method.

So we could create an IEmployeeRepository interface:

public interface IEmployeeRepository : IRepository<Employee>
{
    Employee? GetBySSN(string ssn);
}

And then we could create an EmployeeRepository class that implements that interface:

public class EmployeeRepository : IEmployeeRepository
{
    private readonly AppDbContext _context;

    public EmployeeRepository(AppDbContext context)

But consider the impact of doing this on our code. We have at least 3 separate files for representing a single repository for employees.

This rabbit hole goes a lot deeper, but let's just skip to DbContext and Spencer's opinion.

DbContext is actually a combination of the Repository pattern and the Unit of Work pattern. We'll dive more into the Unit of Work pattern later - just know for now that our DbContext class is already a repository.

It's Spencer's VERY SPICY 🌶️🌶️🌶️ opinion that we should use DbContext as our repository and remove our IRepository interface altogether. Why is this considered spicy? Because the community seems to be split on this, almost 50/50 (by Spencer's non-scientific polling he did once on Twitter).

What ends up happening is that people end up wrapping their DbContext in their own repository interface that usually ends up being a very thin wrapper over the DbContext. And like most abstractions in software development, abstractions leak (e.g. fail to meet all use cases while adding unnecessary complexity).

There are reasons to wrap DbContext but we're not going to do it because I believe that the complexity of the architecture should reveal itself. Start with simpler, not more complex, and ship software! </end soap box>

Now that Spencer is off his soap box

We have a lot of refactoring to do. I like to straight up remove old code/interfaces and let the compiler tell me where we need to fix things.

We'll delete the IRepository<T> and the EmployeeRepository files first (and the Abstractions folder altogether - why not).

In our EmployeeController, we'll add AppDbContext to our constructor and set it to a private variable.

We'll also comment out our POST and PUT endpoints for now, as well as the benefits endpoint - we'll fix those back in the next section. (Failing tests, oh my!)

Let's change the GetAllEmployees endpoint first by changing it to use the Employee DbSet.

/// <summary>
/// Get all employees.
/// </summary>
/// <returns>An array of all employees.</returns>
[HttpGet]
[ProducesResponseType(typeof(IEnumerable<GetEmployeeResponse>), StatusCodes.Status200OK)]
[ProducesResponseType(StatusCodes.Status500InternalServerError)]
public async Task<IActionResult> GetAllEmployees()
{
    var employees = await _dbContext.Employees.ToArrayAsync();

    return Ok(employees.Select(EmployeeToGetEmployeeResponse));
}

And then the single endpoint:

/// <summary>
/// Gets an employee by ID.
/// </summary>
/// <param name="id">The ID of the employee.</param>
/// <returns>The single employee record.</returns>
[HttpGet("{id:int}")]
[ProducesResponseType(typeof(GetEmployeeResponse), StatusCodes.Status200OK)]
[ProducesResponseType(StatusCodes.Status404NotFound)]
[ProducesResponseType(StatusCodes.Status500InternalServerError)]
public async Task<IActionResult> GetEmployeeById(int id)
{
    var employee = await _dbContext.Employees.SingleOrDefaultAsync(e => e.Id == id);
    if (employee == null)
    {
        return NotFound();
    }

    var employeeResponse = EmployeeToGetEmployeeResponse(employee);
    return Ok(employeeResponse);
}

Note the use of SingleOrDefault, not FirstOrDefault. There is also a FindAsync method that takes in the key (or keys, if you have multiple ID keys on your object) - but I use SingleOrDefault out of preference and habit.

Let's run our API (forget the tests for a moment) and see what we got.

You'll see our seeded data is there... but wait, it seems like it's missing something. Where are the bennies?

[
  {
    "firstName": "John",
    "lastName": "Doe",
    "address1": "123 Main St",
    "address2": null,
    "city": "Anytown",
    "state": "NY",
    "zipCode": "12345",
    "phoneNumber": "555-123-4567",
    "email": "john.doe@example.com",
    "benefits": []
  },
  {
    "firstName": "Jane",
    "lastName": "Smith",
    "address1": "456 Elm St",
    "address2": "Apt 2B",
    "city": "Othertown",
    "state": "CA",
    "zipCode": "98765",
    "phoneNumber": "555-987-6543",
    "email": "jane.smith@example.com",
    "benefits": []
  }
]

It's because by default, EF Core won't pull them back unless we specifically request them. Let's alter our call to get all employees to include benefits:

/// <summary>
/// Get all employees.
/// </summary>
/// <returns>An array of all employees.</returns>
[HttpGet]
[ProducesResponseType(typeof(IEnumerable<GetEmployeeResponse>), StatusCodes.Status200OK)]
[ProducesResponseType(StatusCodes.Status500InternalServerError)]
public async Task<IActionResult> GetAllEmployees()
{
    var employees = await _dbContext.Employees
        .Include(e => e.Benefits)
        .ToArrayAsync();

    return Ok(employees.Select(EmployeeToGetEmployeeResponse));
}

The Include call tells EF Core to query those objects from the database as well. Ahh, much better:

[
  {
    "firstName": "John",
    "lastName": "Doe",
    "address1": "123 Main St",
    "address2": null,
    "city": "Anytown",
    "state": "NY",
    "zipCode": "12345",
    "phoneNumber": "555-123-4567",
    "email": "john.doe@example.com",
    "benefits": [
      {
        "id": 1,
        "employeeId": 1,
        "benefitType": 0,
        "cost": 100
      },
      {
        "id": 2,
        "employeeId": 1,
        "benefitType": 1,
        "cost": 50
      }
    ]
  },
  {
    "firstName": "Jane",
    "lastName": "Smith",
    "address1": "456 Elm St",
    "address2": "Apt 2B",
    "city": "Othertown",
    "state": "CA",
    "zipCode": "98765",
    "phoneNumber": "555-987-6543",
    "email": "jane.smith@example.com",
    "benefits": [
      {
        "id": 3,
        "employeeId": 2,
        "benefitType": 0,
        "cost": 120
      },
      {
        "id": 4,
        "employeeId": 2,
        "benefitType": 2,
        "cost": 30
      }
    ]
  }
]

Check the logger and see how the SQL changed when we added our Include call.

Next, we'll change our GetAll employee implementation, with a TWIST! Let's add some query string properties to filter and paginate this list.

Filtering and pagination

There are problems with the GetAllEmployees code. For one, that essentially does a SELECT * on that entire table, and if you have thousands of records... that's a lot of data that isn't all going to be displayed by the caller anyways. (You don't need to paginate EVERY object - but when the list goes from 0..n and we don't know the upper bound of n, we should be cautious and paginate.)

Let's create a new class representing our incoming request for paging. We can also add a couple of cool properties for filtering by first and last name.

public class GetAllEmployeesRequest
{
    public int? Page { get; set; }
    public int? RecordsPerPage { get; set; }
    
    public string? FirstNameContains { get; set; }
    public string? LastNameContains { get; set; }
}

🌶️🌶️🌶️ Pagination is sometimes done by using what's called limit and offset, where we use limit to limit the number of records and offset to skip that many records in the list. I don't care personally, just be consistent in your API whatever you choose!

Let's define our default page number as page number 1 and our number of records as 100. We should also create a validator to ensure no one is passing in invalid values (like negative number of records or pages).

public class GetAllEmployeesRequestValidator : AbstractValidator<GetAllEmployeesRequest>
{
    public GetAllEmployeesRequestValidator()
    {
        RuleFor(r => r.Page).GreaterThanOrEqualTo(1).WithMessage("Page number must be set to a positive non-zero integer.");
        RuleFor(r => r.RecordsPerPage)
            .GreaterThanOrEqualTo(1).WithMessage("You must return at least one record.")
            .LessThanOrEqualTo(100).WithMessage("You cannot return more than 100 records.");
    }
}

🌶️🌶️🌶️ 100 is a fairly arbitrary choice. The actual max number of records really depends on your use case. And anyways, who cares? It's easy enough to change while your API is young.

Now we can change our GetAllEmployees implementation to add the paging and the filters:

/// <summary>
/// Get all employees.
/// </summary>
/// <returns>An array of all employees.</returns>
[HttpGet]
[ProducesResponseType(typeof(IEnumerable<GetEmployeeResponse>), StatusCodes.Status200OK)]
[ProducesResponseType(StatusCodes.Status500InternalServerError)]
public async Task<IActionResult> GetAllEmployees([FromQuery] GetAllEmployeesRequest? request)
{
    int page = request?.Page ?? 1;
    int numberOfRecords = request?.RecordsPerPage ?? 100;

    IQueryable<Employee> query = _dbContext.Employees
        .Include(e => e.Benefits)
        .Skip((page - 1) * numberOfRecords)
        .Take(numberOfRecords);

    if (request != null)
    {
        if (!string.IsNullOrWhiteSpace(request.FirstNameContains))
        {
            query = query.Where(e => e.FirstName.Contains(request.FirstNameContains));
        }
        
        if (!string.IsNullOrWhiteSpace(request.LastNameContains))
        {
            query = query.Where(e => e.LastName.Contains(request.LastNameContains));
        }
    }

    var employees = await query.ToArrayAsync();

    return Ok(employees.Select(EmployeeToGetEmployeeResponse));
}

We use the FromQuery attribute to mark our request class as coming from the query string.
We capture our employee get in a query variable so we can alter it as we add filtering.
We do some simple if/then statements to change our query as we go.
Finally, our original repo didn't support async (by design), so we're using async here.

Try it out now inside of Swagger/Postman and see how the request is changed.

🌶️🌶️🌶️ If your search gets complicated, you'll almost certainly need to convert it to a POST request. That's a-ok with me!

🌶️🌶️🌶️ It's not always ideal to bring your entire object graph into memory before returning to the client, a la: var employees = await query.ToArrayAsync();. This is a bit of a waste of memory, but the problem is EF Core can't use EmployeeToGetEmployeeResponse directly. You can get around this using libraries like AutoMapper (and the magical ProjectTo!) or by simply constructing the object yourself inline, but we'll just do this to keep things simple.

Altering our tests

Of course we have some failing tests, but it doesn't mean we can't change our get all test to use our new query string. But we got a tiny bit of cleanup to do first.

Let's first get SQLite running in memory - YES, it has an in-memory implementation!

🌶️🌶️🌶️ There is an in-memory provider for EF Core. Pretty great, right? Yeah, definitely not. AVOID IT AT ALL COSTS. Even Microsoft doesn't recommend you use it. We'll discuss an even better option than SQLite for testing a bit further into this course. But for now, SQLite it is.

Let's ensure that our test API server doesn't try and use employees.db - we can do that by creating a CustomWebApplicationFactory that overrides the ConfigureWebHost method:

public class CustomWebApplicationFactory : WebApplicationFactory<Program>
{
    protected override void ConfigureWebHost(IWebHostBuilder builder)
    {
        builder.ConfigureServices(services => 
        {
            var dbContextDescriptor = services.SingleOrDefault(
                d => d.ServiceType ==
                    typeof(DbContextOptions<AppDbContext>));

            services.Remove(dbContextDescriptor);

            var dbConnectionDescriptor = services.SingleOrDefault(
                d => d.ServiceType ==
                    typeof(DbConnection));

            services.Remove(dbConnectionDescriptor);

            // Create open SqliteConnection so EF won't automatically close it.
            services.AddSingleton<DbConnection>(container =>
            {
                var connection = new SqliteConnection("DataSource=:memory:");
                connection.Open();

                return connection;
            });

            services.AddDbContext<AppDbContext>((container, options) =>
            {
                var connection = container.GetRequiredService<DbConnection>();
                options.UseSqlite(connection);
            });

        });
    }
}

This looks like a lot of stuff, but boiling it down, we're just removing the existing DbContext and DbConnection services and replacing them with our own that uses the SQLite in-memory provider.

Then we replace our WebApplicationFactory<Program> with our new CustomWebApplicationFactory in our test class.

Then we run our tests and... exception: Employees table does not exist?! Let's fix this right quick by adding migrations inside of our Seed method (and rename it to MigrateAndSeed for posterity). (MigrateAndSeed runs because the stuff we have in Program.cs runs before our tests, which is good - it gives us as close a test of a real system as we can hope for).

public static void MigrateAndSeed(IServiceProvider serviceProvider)
{
    var context = serviceProvider.GetRequiredService<AppDbContext>();
    context.Database.Migrate();

    if (!context.Employees.Any())
    {
        context.Employees.AddRange(
            new Employee
            {
                FirstName = "John",
                LastName = "Doe",
                SocialSecurityNumber = "123-45-6789",
                Address1 = "123 Main St",
                City = "Anytown",
                State = "NY",
                ZipCode = "12345",
                PhoneNumber = "555-123-4567",
                Email = "john.doe@example.com",
                Benefits = new List<EmployeeBenefits>
                {
                    new EmployeeBenefits { BenefitType = BenefitType.Health, Cost = 100.00m },
                    new EmployeeBenefits { BenefitType = BenefitType.Dental, Cost = 50.00m }
                }
            },
            new Employee
            {
                FirstName = "Jane",
                LastName = "Smith",
                SocialSecurityNumber = "987-65-4321",
                Address1 = "456 Elm St",
                Address2 = "Apt 2B",
                City = "Othertown",
                State = "CA",
                ZipCode = "98765",
                PhoneNumber = "555-987-6543",
                Email = "jane.smith@example.com",
                Benefits = new List<EmployeeBenefits>
                {
                    new EmployeeBenefits { BenefitType = BenefitType.Health, Cost = 120.00m },
                    new EmployeeBenefits { BenefitType = BenefitType.Vision, Cost = 30.00m }
                }
            }
        );

        context.SaveChanges();
    }
}

We'll then add a new test to test our query string:

public class BasicTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly int _employeeId = 1;
    private readonly WebApplicationFactory<Program> _factory;

    public BasicTests(WebApplicationFactory<Program> factory)
    {
        _factory = factory;
    }

    [Fact]
    public async Task GetAllEmployees_ReturnsOkResult()
    {
        var client = _factory.CreateClient();
        var response = await client.GetAsync("/employees");

        if (!response.IsSuccessStatusCode)
        {
            var content = await response.Content.ReadAsStringAsync();
            throw new Exception($"Failed to get employees: {content}");
        }

        var employees = await response.Content.ReadFromJsonAsync<IEnumerable<GetEmployeeResponse>>();
        Assert.NotEmpty(employees);
    }

    [Fact]
    public async Task GetAllEmployees_WithFilter_ReturnsOneResult()
    {
        var client = _factory.CreateClient();
        var response = await client.GetAsync("/employees?FirstNameContains=John");

        response.EnsureSuccessStatusCode();

        var employees = await response.Content.ReadFromJsonAsync<IEnumerable<GetEmployeeResponse>>();
        Assert.Single(employees);
    }

    [Fact]
    public async Task GetEmployeeById_ReturnsOkResult()
    {
        var client = _factory.CreateClient();
        var response = await client.GetAsync("/employees/1");

        response.EnsureSuccessStatusCode();
    }
}

🌶️🌶️🌶️ In practice, it's impractical to test every permutation of a query string request.

← Previous Next →