Only get the data you need

And the key reason for your slow query is it outputs too much data. Consider adding: Take(200), Skip() to take only the data you need or the current page requires. Use a pager to generate the report. This might helps a lot.

Consider the following code (BAD Example)

var myArticles = await _dbContext
    .Articles
    .Where(t => t.OwnerId == me.Id)
    .ToListAsync();

var theFirstArticle = myArticles[0];
var theSecondArticle = myArticles[1];

The system will load all data to get only your second article.

var myArticles = await _dbContext
    .Articles
    .Where(t => t.OwnerId == me.Id)
    .Take(2) // Only get two items.
    .ToListAsync();

var theFirstArticle = myArticles[0];
var theSecondArticle = myArticles[1];

Consider about IQueryable and IEnumerable

When we are writing a query to our database, we need to be careful about what we are currently working with. For most LINQ operations supports both IQueryable and IEnumerable.

IQueryable is a query definition. It will not be loaded if not necessary. Extend this object only changes how EF generates the SQL.

IEnumerable is a collection of data that can be enumerated. It will be loaded once you access it. For example, if you add a filter to it and iterate the collection, the system will try to get all data before running the filter.

Consider the following code (BAD Example):

        public IEnumerable<Topic> GetTopics()
        {
            // Return an IQueryable.
            return _dbContext.Topics;
        }

        public object SomeAction()
        {
            var myTopics = GetTopics()
                .Where(t => t.SummitId == 5); // This `Where` method comes from `System.Linq`. Not from `System.Linq.Queryable`

            return myTopics;
        }

The code calls GetTopics. While the code defines an EF query and returns it, the upper level can't understand that this is a query definition. But be treated as a collection of data.

Change that to:

        public IQueryable<Topic> GetAllTopicsQuery()
        {
            var myTopics = await _dbContext
                .Topics // Topics is an IQueryable. So the query will only happed when necessary.
                .Where(t => t.SummitId == 5); // This `Where` method comes from `System.Linq.Queryable`. It defines the query.

            return myTopics;
        }

Call ToList immediately when your query can load all you need

Calling the ToList method from an IQueryable object will do the query job and loads all data in the memory immediately. So when you can make sure that your current query loads everything you need, just call ToList to prevent system lazy load.

If you are not trying to get a collection of data, but only one item, call SingleOrDefault or FirstOrDefault instead.

Consider the following code (BAD Example):

            var allTopics = _dbContext
                .Topics
                .Where(t => t.SummitId == summitId);
            foreach (var topic in allTopics) // Not called ToList(). So we are iterating an IQueryable.
            {
                // do something.
                bigName += topic.Title;
            }

The system will load the topic one by one while executing your logic. This gonna be slow.

Change that to:

            var allTopicTitles = await _dbContext
                .Topics
                .Where(t => t.SummitId == summitId)
                .Select(t => t.Title)
                .ToListAsync();
            foreach (var topicTitle in allTopicTitles) // allTopicTitles is a List<string>. This loop happens in the memory.
            {
                // do something.
                bigName += topicTitle;
            }

The first two tips indicate that:

  • If you are building a query and not everything in the query is needed, write code with IQueryable.
  • If your query is built fine and everything you gonna query is necessary for the logic, make it a list immediately to prevent further IO operations.

Prevent tracking

Consider adding AsNoTracking for this makes query performance better if you don't need to save the changes.

Reference: https://docs.microsoft.com/en-us/ef/core/querying/tracking#no-tracking-queries

Prevent Include

Include generates SQL to select multiple tables. Which greatly increased complexity. You can only select the data you need and prevent writing the Include function.

For example, if you only want to get the last ball in the box, consider writing like this:

public class Box
{
    public int Id { get; set; }
    public IEnumerable<Ball> Balls { get; set; }
}

public class Ball
{
    public int Id { get; set; }

    public int BoxId { get; set; }
    public Box Box { get; set; }
}
    var boxes = await Boxes
                // DO NOT Call Include(t => t.Balls) here!
                .Where(somecondition)
                .Select(t => new Box(){
                  Id = t.Id,
                  Balls = t.Balls.OrderByDescending(x => x.CreationTime)
                             .Take(1) // Only get what you need
                })               
                .ToListAsync()

Also when we use Select we can remove .Include because it won’t have any effect here.

Will the projection property be null if I dont use include above? No, It won’t! Select will take care of it. Give it a try!

Prevent client-side evaluation

In the old version of EF, client-side evaluation is supported. So if you are trying to do some complicated calculation with DateTime or TimeSpan, EF will load all the data and do the evaluation locally. (Fixed in .NET Core 3.0)

Client-side evaluation means that EF will query all data in the database and typically very slow. Consider the following code(BAD Example):

            var viewers = await _dbContext
                .Events
                .Where(t => t.CreationTime + TimeSpan.FromMinutes(1) > DateTime.UtcNow)
                .Select(t => t.Fingerprint)
                .Distinct()
                .CountAsync();

EF can't translate the timespan calculation to SQL. So it has to load all data from the database and query from the client-side.

To fix this, evaluate the time first in code and build the query then.

            var startCountTime = DateTime.UtcNow - TimeSpan.FromMinutes(1);
            var viewers = await _dbContext
                .Events
                .Where(t => t.CreationTime > startCountTime)
                .Select(t => t.Fingerprint)
                .Distinct()
                .CountAsync();