Ruby
The GraphQL Way: Limited Edition
This post shares a powerful way to implement filtering, sorting and pagination with GraphQL in Ruby. Whichever GraphQL node exposes the data, we want to apply the same logic. The goal, as always, is a simple implementation that is easy to maintain and does not introduce performance issues such as N+1 queries.
This post is the third part of a series of [GraphQL] posts. If you have not read the previous ones, the links are listed below for reference:
- [GraphQL Way Post] => Introduction to GraphQL and reasons to use it;
- [GraphQL Rails Post] => Connect Rails models as GraphQL nodes with no N+1 queries issue.
The models
In this post we will reuse the models we created in the previous post. As a reminder, here's what we created using rails generate:
rails generate model category name
rails generate model product category:references name color size price:decimal{10,2}
rails generate model user email
rails generate model order user:references ordered_at:datetime
rails generate model order_item product:references order:references
rails db:migrate
As you can see we are creating a basic version of an e-commerce domain model.
Filtering
Our first challenge is: how do we create filters in GraphQL, and how can we implement them in an elegant and reusable way? Filtering in GraphQL makes sense in two places: in a root query, and in a collection field. We'll focus here on the Products collection, but the same solution applies to any ActiveRecord collection you have mapped.
In our e-commerce model we are exposing the products collection in the QueryType (as a root query), in the CategoryType (as a has_many relation) and finally in the OrderType (as a has_many through relation).
Let's work on filtering a Product by its name. As consumers of the GraphQL API, we expect to be able to use that filter in any of the places previously mentioned. Let's use the auto-generated Types::BaseObject to define our new field_products method:
# app/graphql/types/base_object.rb
module Types
  class BaseObject < GraphQL::Schema::Object
    field_class Types::BaseField

    def self.field_products
      field :products, [Types::ProductType], null: false do
        argument :name, String, required: false
        argument :color, String, required: false
        argument :size, String, required: false
        # Divide by a Float: Integer division would truncate 50 / 100 to 0.
        argument :min_price_cents, Integer, required: false, prepare: ->(value, _ctx) { value && value / 100.0 }, as: :min_price
        argument :max_price_cents, Integer, required: false, prepare: ->(value, _ctx) { value && value / 100.0 }, as: :max_price
      end
    end
  end
end
This code defines a products field with five filters. Although it is straightforward, I'd like to highlight the prepare: option, which lets us manipulate the input before it reaches the resolver. For min_price_cents we convert from cents to dollars, and because of that we alias the argument from min_price_cents to min_price.
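One detail to watch when converting cents inside a prepare: lambda is that Ruby's Integer#/ truncates, so 50 / 100 is 0 rather than 0.5. The sketch below is plain Ruby and does not depend on graphql-ruby. The constant name CENTS_TO_DOLLARS is made up for illustration, and the lambda only mimics the (value, ctx) shape of a prepare: callable.

```ruby
# Hypothetical stand-in for a `prepare:` lambda; it takes (value, ctx).
# Dividing by a Float keeps the fractional part, and nil (argument not
# given) passes through untouched.
CENTS_TO_DOLLARS = ->(value, _ctx) { value && value / 100.0 }

puts 50 / 100                          # Integer division truncates
puts CENTS_TO_DOLLARS.call(50, nil)    # Float division keeps the cents
puts CENTS_TO_DOLLARS.call(1999, nil)
p CENTS_TO_DOLLARS.call(nil, nil)
```

If exact decimal arithmetic matters for your prices, converting to BigDecimal instead of Float would be the safer choice.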
The three places where we use field_products, each with its own products implementation, are:
# app/graphql/types/query_type.rb
class Types::QueryType < Types::BaseObject
  ...
  field_products

  def products(**query_options)
    Product.graphql_query(query_options)
  end
end

# app/graphql/types/category_type.rb
class Types::CategoryType < Types::BaseObject
  ...
  field_products

  def products(**query_options)
    Loaders::HasManyLoader
      .for(Product, :category_id, query_options)
      .load(object.id)
  end
end

# app/graphql/types/order_type.rb
class Types::OrderType < Types::BaseObject
  ...
  field_products

  def products(**query_options)
    Loaders::HasManyLoader
      .for(OrderItem, :order_id, {product: query_options})
      .load(object.id)
      .then do |order_items|
        product_ids = order_items.map(&:product_id)
        Loaders::BelongsToLoader
          .for(Product)
          .load_many(product_ids)
      end
  end
end
The graphql library passes the field_products arguments as Ruby keyword arguments, so we use a double splat (**) to collect them into a Ruby hash. Finally, we pass query_options to the new model method graphql_query.
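The double splat can be tried in isolation. This is a minimal sketch in plain Ruby; resolve_products is a hypothetical stand-in for a resolver method and simply returns the hash it collected.

```ruby
# Stand-in for a resolver method: every keyword argument the caller
# passes is collected into a single Hash.
def resolve_products(**query_options)
  query_options
end

p resolve_products(name: "aero", min_price: 0.5)
p resolve_products # no arguments given, so the Hash is empty
```

Because omitted arguments are simply absent from the hash, looking them up later with options[:name] yields nil, which is what the presence checks in the model rely on.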
OrderType differs from the other two: its relation to products goes through the intermediate model OrderItem, because Order and Product have a many-to-many relationship. This means we need to implement the graphql_query method in both Product and OrderItem. Don't worry, that code will also be reusable, and we'll see how to do that in a moment!
But before that, let's examine a small change we need to make to the HasManyLoader:
# app/graphql/loaders/has_many_loader.rb
class Loaders::HasManyLoader < GraphQL::Batch::Loader
  def initialize(model, column, query_options)
    @model = model
    @column = column
    @query_options = query_options
  end

  def perform(relation_ids)
    query = @model.graphql_query(@query_options).where({@column => relation_ids})
    records_by_relation_id = query.group_by { |result| result.public_send(@column) }
    relation_ids.each do |id|
      fulfill(id, records_by_relation_id[id] || [])
    end
  end
end
We change it to call the graphql_query method on the @model with the @query_options.
We created the HasManyLoader in the previous blog post, so check that out to improve your familiarity with GraphQL DataLoaders!
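The grouping step at the end of perform can be explored without a database. The following is a plain-Ruby sketch: ProductRow, SAMPLE_ROWS and group_for_ids are hypothetical names, and the returned hash stands in for the fulfill calls that graphql-batch expects.

```ruby
# Plain-Ruby sketch of the grouping done in `perform`.
ProductRow = Struct.new(:id, :category_id, :name)

# Hypothetical helper: maps every requested id to its records,
# using [] for ids that matched nothing (mirrors `|| []` in the loader).
def group_for_ids(records, column, relation_ids)
  by_id = records.group_by { |record| record.public_send(column) }
  relation_ids.to_h { |id| [id, by_id.fetch(id, [])] }
end

SAMPLE_ROWS = [
  ProductRow.new(1, 10, "Aero Jacket"),
  ProductRow.new(2, 10, "Aero Cap"),
  ProductRow.new(3, 20, "Trail Shoe"),
].freeze

group_for_ids(SAMPLE_ROWS, :category_id, [10, 20, 30]).each do |id, rows|
  puts "#{id}: #{rows.map(&:name).inspect}"
end
```

Category 30 has no products, yet it still receives an empty array, so every requested id gets an answer.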
Our final bit of code is our graphql_query method. For this one we use plain old ActiveRecord scopes to chain every filter, each applied only if its value is present:
# app/models/product.rb
class Product < ApplicationRecord
  belongs_to :category
  has_many :order_items
  has_many :orders, through: :order_items

  scope :by_name, ->(name) { where("name ILIKE ?", "#{name}%") if name }
  scope :by_color, ->(color) { where(color: color) if color }
  scope :by_size, ->(size) { where(size: size) if size }
  scope :by_min_price, ->(price) { where("price >= ?", price) if price }
  scope :by_max_price, ->(price) { where("price <= ?", price) if price }

  def self.graphql_query(options)
    Product.all
      .by_name(options[:name])
      .by_color(options[:color])
      .by_size(options[:size])
      .by_min_price(options[:min_price])
      .by_max_price(options[:max_price])
  end
end
And as mentioned before, here's the graphql_query method for the join model OrderItem:
# app/models/order_item.rb
class OrderItem < ApplicationRecord
  belongs_to :product

  def self.graphql_query(options)
    query = OrderItem.all
    if options[:product]
      product_query = Product.graphql_query(options[:product])
      query = query.joins(:product).merge(product_query)
    end
    query
  end
end
OrderItem has no knowledge of which filters the Product model supports, so it delegates the query options to Product.graphql_query and uses the magic of the ActiveRecord merge method. One small trick here is to wrap the product query options under the :product key. We do this because OrderItem is used in both directions between Order and Product. And that's it for our OrderItem model!
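When the body of a scope returns nil, ActiveRecord falls back to the unfiltered relation. Using scopes with conditions that check for the presence of an argument therefore lets us chain them arbitrarily, which is exactly what we want here. Score!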
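To see why this chaining is safe, here is an in-memory sketch of the same idea. It is not ActiveRecord: ItemList, Item, CATALOG and filter_items are hypothetical names, and each filter returns the list unchanged when its argument is nil, which imitates how a scope behaves when its body returns nil.

```ruby
Item = Struct.new(:name, :color, :price, keyword_init: true)

# Hypothetical in-memory stand-in for a relation (not ActiveRecord).
class ItemList
  attr_reader :items

  def initialize(items)
    @items = items
  end

  # Each filter returns a new list, or self when its argument is nil.
  def by_name(name)
    return self unless name
    ItemList.new(items.select { |i| i.name.downcase.start_with?(name.downcase) })
  end

  def by_color(color)
    return self unless color
    ItemList.new(items.select { |i| i.color == color })
  end

  def by_min_price(price)
    return self unless price
    ItemList.new(items.select { |i| i.price >= price })
  end
end

CATALOG = ItemList.new([
  Item.new(name: "Aero Jacket", color: "red", price: 120.0),
  Item.new(name: "Aero Cap", color: "blue", price: 0.4),
  Item.new(name: "Trail Shoe", color: "red", price: 80.0),
])

# Same shape as graphql_query: every filter is always called.
def filter_items(list, options)
  list
    .by_name(options[:name])
    .by_color(options[:color])
    .by_min_price(options[:min_price])
end

p filter_items(CATALOG, { name: "aero", min_price: 0.5 }).items.map(&:name)
p filter_items(CATALOG, {}).items.size
```

The caller never needs an if statement: absent options are nil, and a nil argument leaves the list as it was.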
Once we have this in place, adding a new filter means changing only two places: the definition of the GraphQL field in the BaseObject, and the graphql_query method in our ActiveRecord model. Quite maintainable, wouldn't you say?
Now the GraphQL client can execute a query like this:
query {
  categories {
    name
    products (name: "aero", minPriceCents: 50) {
      name
      color
    }
  }
}
To translate the above query: we fetch all categories, and for each category we return the products whose name starts with aero and whose price is at least 50 cents.
Sorting
Implementing sorting in our GraphQL Schema is a continuation of the same pattern we used for filtering. The only difference here is that for sorting we want the sort options to be a hardcoded list. In order to do that we will use GraphQL Enums, as they play really well with the generated documentation. Bonus!
First let's add a new argument to field_products:
# app/graphql/types/base_object.rb
module Types
  class BaseObject < GraphQL::Schema::Object
    field_class Types::BaseField

    def self.field_products
      field :products, [Types::ProductType], null: false do
        ...
        argument :sort, [Types::Enum::ProductSort], required: false, default_value: []
      end
    end
  end
end
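We define the sort argument as an array of ProductSort enum values. Accepting several sort keys lets the client break ties: when two products have the same name, the next key decides their order.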
The graphql gem needs a class to define the values of the enum, so we create it:
# app/graphql/types/enum/product_sort.rb
class Types::Enum::ProductSort < Types::BaseEnum
  value "nameAsc", value: [:name, :asc]
  value "nameDesc", value: [:name, :desc]
  value "colorAsc", value: [:color, :asc]
  value "colorDesc", value: [:color, :desc]
  value "sizeAsc", value: [:size, :asc]
  value "sizeDesc", value: [:size, :desc]
  value "priceAsc", value: [:price, :asc]
  value "priceDesc", value: [:price, :desc]
end
We use the value option to map the value from the GraphQL query to the value used in our Ruby code. The value is also in a format that can be passed to the [ActiveRecord order method] once the pairs are converted to a hash. So what happens to our model code?
# app/models/product.rb
class Product < ApplicationRecord
  ...
  def self.graphql_query(options)
    Product.all
      .by_name(options[:name])
      .by_color(options[:color])
      .by_size(options[:size])
      .by_min_price(options[:min_price])
      .by_max_price(options[:max_price])
      .order(options[:sort].to_h)
  end
end
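The conversion from enum values to an order hash is plain Array#to_h, so it can be checked without Rails. In this sketch PRODUCT_SORT is a hypothetical hash that imitates a subset of the enum mapping, and order_hash is an illustrative helper name.

```ruby
# Hypothetical subset of the enum: GraphQL name => [column, direction].
PRODUCT_SORT = {
  "nameAsc"  => [:name, :asc],
  "nameDesc" => [:name, :desc],
  "colorAsc" => [:color, :asc],
}.freeze

# Array#to_h turns the list of pairs into a Hash such as
# { name: :desc, color: :asc }, keeping the order of the keys.
def order_hash(sort_names)
  sort_names.map { |name| PRODUCT_SORT.fetch(name) }.to_h
end

p order_hash(["nameDesc", "colorAsc"])
p order_hash([])
p order_hash(["nameAsc", "colorAsc", "nameDesc"])
```

One consequence of going through a hash: if a client sends both nameAsc and nameDesc, the later direction wins while the column keeps its first position. An empty list gives an empty hash, which adds no ordering at all.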
With that in place, we can query as follows:
query {
  categories {
    name
    products (name: "aero", sort: [nameDesc, colorAsc]) {
      name
      color
    }
  }
}
Fantastic!
Paginating
I have to start this section by saying that there may be cases where you want to wrap your collection in a structure and add a count field to it. I don't feel that's necessary for this post, but you may want to consider that option depending on your problem. Another option is to add a sibling field with the total count.
That said, let's see how to set a limit and a page for our query. The first issue we want to address is adding min and max constraints to these arguments. Reasoning: we don't want our frontend developers accidentally requesting millions of records in a single GraphQL query. To avoid this, we can create two reusable arguments in the autogenerated BaseField class:
# app/graphql/types/base_field.rb
module Types
  class BaseField < GraphQL::Schema::Field
    argument_class Types::BaseArgument

    def argument_limit(default: 10, min: 1, max: 100)
      prepare = ->(value, _ctx) {
        if value&.between?(min, max)
          value
        else
          message = "'limit' must be between #{min} and #{max}"
          raise GraphQL::ExecutionError, message
        end
      }
      argument(:limit, Integer, required: false, default_value: default, prepare: prepare)
    end

    def argument_page(default: 1, min: 1)
      prepare = ->(value, _ctx) {
        if value && value >= min
          value
        else
          message = "'page' must be greater than or equal to #{min}"
          raise GraphQL::ExecutionError, message
        end
      }
      argument(:page, Integer, required: false, default_value: default, prepare: prepare)
    end
  end
end
The idea here is the same as creating reusable fields, but this time we are creating reusable arguments. With that we can validate min and max values for our limit and page arguments. Nice! We choose to raise an error with a friendly message so that the consumer of this GraphQL Schema understands why the query can't be performed.
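The validation logic itself is independent of GraphQL and can be exercised on its own. In the sketch below, ExecutionError is a local stand-in for GraphQL::ExecutionError so that the code runs without the graphql gem, and limit_validator is a hypothetical helper that builds the same kind of lambda.

```ruby
# Local stand-in for GraphQL::ExecutionError (assumption for this sketch).
class ExecutionError < StandardError; end

# Hypothetical helper returning a lambda shaped like a `prepare:` callable.
def limit_validator(min: 1, max: 100)
  ->(value, _ctx) {
    if value&.between?(min, max)
      value
    else
      raise ExecutionError, "'limit' must be between #{min} and #{max}"
    end
  }
end

validate = limit_validator
p validate.call(25, nil)

[0, 101, nil].each do |bad|
  begin
    validate.call(bad, nil)
  rescue ExecutionError => e
    puts "#{bad.inspect} rejected: #{e.message}"
  end
end
```

Note that nil is rejected as well, because the safe navigation operator makes the condition falsy when no number is present.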
Next, we need to update our BaseObject:
# app/graphql/types/base_object.rb
module Types
  class BaseObject < GraphQL::Schema::Object
    field_class Types::BaseField

    def self.field_products
      field :products, [Types::ProductType], null: false do
        argument :name, String, required: false
        argument :color, String, required: false
        argument :size, String, required: false
        # Divide by a Float: Integer division would truncate 50 / 100 to 0.
        argument :min_price_cents, Integer, required: false, prepare: ->(value, _ctx) { value && value / 100.0 }, as: :min_price
        argument :max_price_cents, Integer, required: false, prepare: ->(value, _ctx) { value && value / 100.0 }, as: :max_price
        argument :sort, [Types::Enum::ProductSort], required: false, default_value: []
        argument_limit
        argument_page
      end
    end
  end
end
Then we need to chain limit and page at the end of graphql_query in our model:
# app/models/product.rb
class Product < ApplicationRecord
  belongs_to :category
  has_many :order_items
  has_many :orders, through: :order_items

  scope :by_name, ->(name) { where("name ILIKE ?", "#{name}%") if name }
  scope :by_color, ->(color) { where(color: color) if color }
  scope :by_size, ->(size) { where(size: size) if size }
  scope :by_min_price, ->(price) { where("price >= ?", price) if price }
  scope :by_max_price, ->(price) { where("price <= ?", price) if price }
  # Assumed definition: the listing calls `page` without showing it.
  # Pages are 1-based, so page 1 starts at offset 0.
  scope :page, ->(limit, page) { offset((page - 1) * limit) if limit && page }

  def self.graphql_query(options)
    Product.all
      .by_name(options[:name])
      .by_color(options[:color])
      .by_size(options[:size])
      .by_min_price(options[:min_price])
      .by_max_price(options[:max_price])
      .order(options[:sort].to_h)
      .limit(options[:limit])
      .page(options[:limit], options[:page])
  end
end
Wow, this is so simple and elegant, and it even kind of works. If we now run the root query of products we can filter, sort and limit our results.
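The arithmetic behind a page scope is small: assuming 1-based page numbers, page N skips (N - 1) * limit rows. Here is a plain-Ruby sketch with hypothetical helper names, where an Array stands in for the table and drop/first stand in for OFFSET/LIMIT.

```ruby
# Hypothetical helper: number of rows to skip for a 1-based page.
def page_offset(limit, page)
  (page - 1) * limit
end

# In-memory equivalent of LIMIT/OFFSET, for illustration only.
def paginate(rows, limit:, page:)
  rows.drop(page_offset(limit, page)).first(limit)
end

NUMBERS = (1..25).to_a.freeze

p page_offset(10, 1)
p paginate(NUMBERS, limit: 10, page: 1)
p paginate(NUMBERS, limit: 10, page: 3) # last, partial page
p paginate(NUMBERS, limit: 10, page: 4) # past the end
```

Offset pagination only gives stable pages when the sort order is deterministic, which is one more reason to let clients pass several sort keys.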
But not so fast.
If we try a query like:
query {
  categories {
    name
    products (name: "aero", sort: [nameDesc], limit: 3) {
      name
      color
    }
  }
}
We notice that our query returns only 3 products across all categories, while our goal is to limit the products returned per category. This happens because we are using DataLoaders to avoid N+1 queries: the products of every category are loaded in one batched query, and the limit applies to that whole query. Is there a way out of this trap?
HasManyLoader - part III
To solve our issue, we need to change our HasManyLoader in such a way that the limit is applied to each "parent" relation. In other words, in this example, we want to limit products within each category.
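The difference between the two behaviours is easy to reproduce in memory. This sketch uses hypothetical names (Row, ALL_ROWS and both helpers) and plain arrays instead of SQL; it only illustrates the order of "limit" and "group".

```ruby
Row = Struct.new(:category_id, :name)

ALL_ROWS = [
  Row.new(1, "Aero A"), Row.new(1, "Aero B"),
  Row.new(1, "Aero C"), Row.new(1, "Aero D"),
  Row.new(2, "Aero E"), Row.new(2, "Aero F"),
].freeze

# What one batched query with LIMIT 3 does: limit first, then group.
def limit_then_group(rows, limit)
  rows.first(limit).group_by(&:category_id)
end

# What we want: group first, then limit inside each group.
def group_then_limit(rows, limit)
  rows.group_by(&:category_id).transform_values { |group| group.first(limit) }
end

p limit_then_group(ALL_ROWS, 3).transform_values { |g| g.map(&:name) }
p group_then_limit(ALL_ROWS, 3).transform_values { |g| g.map(&:name) }
```

With the first approach category 2 gets nothing at all, because category 1 already used up the three rows.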
One way to accomplish this is to wrap the generated SQL query on the products table in a [PostgreSQL Lateral subquery], which runs the products query once per category id and applies the limit only inside each "group". The nice thing about this solution is that our HasManyLoader is generic enough that it works from Categories to Products, from Users to Orders, or even from Orders to Products through the OrderItem model!
Presenting the final version of our HasManyLoader:
class Loaders::HasManyLoader < GraphQL::Batch::Loader
  def initialize(model, column, query_options)
    @model = model
    @column = column
    @query_options = query_options
  end

  def perform(relation_ids)
    query = @model.graphql_query(@query_options)

    if query.limit_value || query.offset_value
      # Run the filtered, limited query once per parent id via a lateral join.
      sub_query = query.where("#{@model.table_name}.#{@column} = tmp_relation_ids")
      query = @model
        .select("tmp_lat_join_tab.*")
        .from("UNNEST(ARRAY[#{relation_ids.join(",")}]) tmp_relation_ids")
        .joins("JOIN LATERAL (#{sub_query.to_sql}) tmp_lat_join_tab ON TRUE")
    else
      query = query.where({@column => relation_ids})
    end

    records_by_relation_id = query.group_by { |result| result.public_send(@column) }
    relation_ids.each do |id|
      fulfill(id, records_by_relation_id[id] || [])
    end
  end
end
We use the limit_value and offset_value ActiveRecord methods to see whether the call to @model.graphql_query(@query_options) sets a limit or a page, and if so, we wrap the generated query as a nested SQL query using a lateral join. We also have all the "category_ids" in the relation_ids variable, collected by the DataLoader gem, so we can build a temporary table using UNNEST on our array of category_ids.
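To make the shape of the final statement easier to picture, the sketch below assembles a similar SQL string in plain Ruby. Everything in it is an assumption for illustration: lateral_sql is a hypothetical helper, SUB_QUERY only approximates what sub_query.to_sql might return, and the Integer() coercion is an addition of this sketch rather than something the loader does.

```ruby
# Hypothetical helper mirroring the string building in `perform`.
# Integer() raises for anything that is not an integer, so only
# numbers are interpolated into the SQL.
def lateral_sql(relation_ids, sub_query_sql)
  ids = relation_ids.map { |id| Integer(id) }.join(",")
  "SELECT tmp_lat_join_tab.* " \
    "FROM UNNEST(ARRAY[#{ids}]) tmp_relation_ids " \
    "JOIN LATERAL (#{sub_query_sql}) tmp_lat_join_tab ON TRUE"
end

# Assumed, simplified shape of the inner query for a name filter,
# a descending sort by name and a limit of 3.
SUB_QUERY = "SELECT products.* FROM products " \
  "WHERE (name ILIKE 'aero%') " \
  "AND (products.category_id = tmp_relation_ids) " \
  "ORDER BY products.name DESC LIMIT 3"

puts lateral_sql([1, 2, 3], SUB_QUERY)
```

Read from the inside out: for every id produced by UNNEST, PostgreSQL evaluates the inner query with that id, so the LIMIT applies per category instead of to the whole result.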
This query might look scary, but it is less complicated than it looks at first glance. If you are unfamiliar with PostgreSQL lateral subqueries, I recommend taking some time to read their docs. I analyzed and ran this query with a bunch of data on my machine, and I am quite satisfied with its performance so far.
All this code is public and shared as a [project on Github], so go have a look, and star it if you like it! As a bonus, there's an example of an RSpec test file that I used to cover the features I wanted to build for this post.
Conclusion
Although the GraphQL specification is not huge, libraries tend to have massive documentation because there are so many topics to cover, which steepens the learning curve.
With this series of posts my goal is to help flatten that curve a bit by showing example code that serves a GraphQL Schema, while keeping it refined and organized so that it stays easy to maintain.
We also preemptively avoided any big performance issues in the query by using DataLoaders, and guarding against N+1 queries. While we may be able to delay performance tuning until later, it's best to address performance problems right away.
I hope you find this blog series relevant to your work, and I hope you have enjoyed following along! Thanks for reading!
[GraphQL]: https://graphql.org/
[GraphQL Way Post]: https://hashrocket.com/blog/posts/graphql-way
[GraphQL Rails Post]: https://hashrocket.com/blog/posts/graphql-way-rails-edition
[project on Github]: https://github.com/hashrocket/graphql_way_rails
[ActiveRecord order method]: https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-order
[PostgreSQL Lateral subquery]: https://www.postgresql.org/docs/current/queries-table-expressions.html
Photo by Clint McKoy on Unsplash