What is an N+1 query?
The N+1 query issue arises when your code executes N number of extra
database queries to fetch some data that could have been fetched in a single
query.
This is a major performance bottleneck. Let us understand it better and see how
it can bring down application performance with the help of an example:
In the above example, each review belongs to a movie and each movie can have
multiple reviews.
Now, consider a scenario where you are required to fetch a list of all movies
along with the reviews for each movie in the list.
This can be done by querying the movies and then looping over the movies to get
the reviews for each movie like this:
As you can see in the above example, the number of queries made appear to be
N+1, where one query fetches the list of movies and then N number of queries
fetch the reviews for N number of movies.
Hence, the number of network calls appear to be proportional to the number of
movie records in this case. What if the movie list contained a thousand or more
records? Surely making that many calls to the database for a single operation is
not efficient.
In the coming sections we will see how to overcome this issue.
Query with includes method
Rails provides an ActiveRecord method called includes which can be used to
specify associations to be included in the result set. It loads associated
records beforehand using techniques like eager loading or preloading, and
thus reducing the number of database calls and improving the application
performance significantly.
Let us take the previous example and try to implement it using includes method
to see how it works:
As you can see from the above example, specifying the reviews association
inside includes method has led to a reduced number of queries to fetch the
same information that previously took N+1 queries. This can provide a
significant boost to the application's performance.
Now, when you call the reviews method on a movie object from the list of
movies, your application won't make an additional database call to get you the
information. This is because the information was preloaded.
Under the hood the includes method either uses the preload method or
eager_load method depending upon the type of query to load up the association
data in advance.
Preload
By default includes delegates to preload method which makes two queries like
we saw in the previous example.
- First query loads up the data from the parent model.
- Second query loads up data from the specified association.
Previous example would look the exact same if we were to replace includes with
preload like this:
Eager load
In case an includes query contains a where clause which contains an
attribute from the specified association, then includes delegates to the
eager_load method. eager_load method loads all association data along with
the data from parent model in a single query using a LEFT OUTER JOIN on the
associated table. For example:
We will see how joins work later in this chapter.
includes method in Rails cannot use the preload method when a query contains
the where clause because preload always generates two SQL queries and
where clause cannot be used in such a case.
For example:
As you can see, using preload in a query with the where clause when the
conditions inside where clause contain attributes from the association table
leads to an error because Rails applies the conditions inside where clause on
the parent model which is the Movie class in this case.
This means that you can still use preload with a where clause if the where
clause contains attributes which are not from the specified association. For
example:
In the above example, the where clause doesn't contain any attributes from the
association table i.e. the reviews table.
Specifying multiple associations in includes method
In Rails, you can specify multiple associations inside the includes method.
Consider the following example where the Movie class also has an association
with the Actor class:
You can query the reviews and actors along with the movies using the
includes method like this:
There is an additional query to load up the information for the actors
association in the above example. Thus a separate query will be made for each
association specified inside the includes method.
Loading nested associations with includes
Suppose each movie has many actors and each actor belongs to a certain region.
Can you visualize the nested associations in this case? We can use includes to
load nested associations like this:
Query with joins method
A join clause is used to combine records from two tables by using values common
to each.
Active Record provides two finder methods for specifying JOIN clauses on the
resulting SQL namely joins and left_outer_joins.joins method should be
used for INNER JOIN whereas the left_outer_joins method is used for queries
using LEFT OUTER JOIN.
An INNER JOIN query performed on two tables, compares each row of table_1 with
each row of table_2 to find all pairs of rows which satisfy the join condition,
then combines the rows that match the join condition to produce a single table.
For example, consider an example where all movie records have to be fetched
where the reviews contain less than and equal to 3 stars:
In the above example, the joins method is performing an INNER JOIN on
movies and reviews table based on the common movie_id value in both tables
and producing a table which contains all rows where the review contains 3 or
less than 3 stars.
A LEFT OUTER JOIN is an extension on INNER JOIN, where an INNER JOIN join
is performed first, then for each row in table_1 which does not satisfy the join
condition with any row in table_2, a joined row is added with null values in
columns of table_2. Thus, the joined table always has at least one row for each
row in table_1.
Query with nested joins
To see how we can make query with nested joins let's update the schema like this:
In the above example, each review belongs to a booking and each booking belongs
to a movie. One movie can have multiple bookings and each booking can have
multiple reviews. Now, the movie and review are not directly connected.
Let's consider a scenario where we want to fetch a list of all movies where the
review has 3 or more stars:
In the above query, the joins method is performing an INNER JOIN on movies
and bookings table based on common value movie_id. Then INNER_JOIN is
performed on the bookings and reviews table based on the common value
booking_id. This produces a table containing all the movies having at least
one booking and rating. Then on this table, another query is performed to get
the movie with a rating of 3 or more stars. Most importantly, this nested join
statement allowed us to ensure that only a single DB query was invoked.
To read more about querying and joins please refer following links:
joins vs includes
You should use joins when you are merely filtering records based on values in
associations and you should use includes when you have to access the values in
associations because joins method does not load the association data.
For example, if you have to filter out movie records based on reviews, then use
joins and if you have to access the reviews and render them then you should
use includes.
Common pitfalls leading to N+1 Queries
1. Accessing associated records in view templates
Let’s say you have a Post model that belongs to a Author:
And you have a view that displays a list of posts, along with the author’s name for each post:
If you load the posts using a simple @posts = Post.all query in your controller, this will result in an N+1 query issue. This means one query is executed to load all the posts, and N additional queries are executed to load the author's name for each post. This issue can be easily overlooked because the associated records are being accessed in the view template.
The same consideration should be taken when rendering partials or usage of helper methods that access associated records, especially when the code is separated across different architectural layers.
2. Using delegates can hide N+1 queries.
By looking at the code in this example, it appears to be perfectly fine, with no obvious issues. However, behind the scenes, Rails’s delegate method calls user on each post individually, leading to another N+1 query problem.
3. Using methods that do not return an ActiveRecord::Relation within loops
Active Record implements "method chaining" which allow us to use multiple Active Record methods together. You can chain methods in a statement when the previous method returns an ActiveRecord::Relation, like all, where, includes, joins and order. You can’t chain Active Record methods, after a method that does not return an ActiveRecord::Relation, like to_a, find or last. You need to put those methods at the end of the statement.
If you try each link of the chain, you will see that just the call to last does not return an ActiveRecord::Relation.
This is what happens when you try to fetch the list:
At the moment you call each on the ActiveRecord::Relation, it will execute the query. But then for each post it will execute a new query, because although you already have the comments loaded, with the code order(:created_at).last you are building a new query to fetch the latest comment with a different order.
This is why when you try to fetch a list of posts, the method seems to ignore the includes, and runs a query for each post to get the latest_comment resulting in N+1 query.
How to detect N+1 queries ?
1. Use the bullet gem
The bullet gem can automatically detect and alert you to N+1 queries in your application.
2. Usage of strict_loading
By default, Rails will not raise an error if you try to access an association that has not been loaded. However, you can manually use the strict_loading option to cause an error to be raised if you try to access an unloaded association. This can help prevent N+1 queries by alerting you about cases where you accidentally executed unnecessary queries. You can read more about it in this blog.
3. Using performance monitoring tools
Leverage performance monitoring tools like New Relic to assess your application's performance and identify N+1 queries. Additionally, you can use tools such as rack-mini-profiler and stackprof locally to generate flame graphs, which can help you detect N+1 queries.