The GraphQL Way: Limited Edition

By Vinicius Negrisolo

This post shares a powerful way to implement GraphQL in Ruby with filtering, sorting and paginating. No matter which GraphQL node the data hangs off, we want to apply the same logic. The goal is always a simple implementation that is easy to maintain and does not introduce performance issues such as N+1 queries.

This post is the third part of a series of GraphQL posts. If you have not read the previous ones, consider starting there, since this post reuses their models and loaders.

The models

In this post we will reuse the models we created in the previous post. As a reminder, here's what we created using rails generate:

rails generate model category name
rails generate model product category:references name color size price:decimal{10,2}
rails generate model user email
rails generate model order user:references ordered_at:datetime
rails generate model order_item product:references order:references
rails db:migrate

As you can see we are creating a basic version of an e-commerce domain model.

Filtering

Our first challenge is: how do we create filters in GraphQL, and how can we implement them in an elegant and reusable way? Well, filtering in GraphQL makes sense in two places: in a root query, and as a collection field. We'll be focusing here on the Products collection, however the same solution can apply to any ActiveRecord collection you have mapped.

In our e-commerce model we are exposing the products collection in the QueryType (as a root query), in the CategoryType (as a has_many relation) and finally in the OrderType (as a has_many through relation).

Let's work on filtering a Product by its name. As consumers of the GraphQL API we expect to be able to use that filter in any of the places previously mentioned. Let's use the auto-generated Types::BaseObject to define our new field_products method:

# app/graphql/types/base_object.rb
module Types
  class BaseObject < GraphQL::Schema::Object
    field_class Types::BaseField

    def self.field_products
      field :products, [Types::ProductType], null: false do
        argument :name, String, required: false
        argument :color, String, required: false
        argument :size, String, required: false
        # Divide by 100.0: Integer / Integer truncates in Ruby, so 50 / 100 would be 0.
        argument :min_price_cents, Integer, required: false, prepare: ->(value, _ctx) { value && value / 100.0 }, as: :min_price
        argument :max_price_cents, Integer, required: false, prepare: ->(value, _ctx) { value && value / 100.0 }, as: :max_price
      end
    end
  end
end

This code defines a products field with five filters. Although it is straightforward, I'd like to highlight the prepare: option, which lets us transform an argument's value before it reaches the resolver. For min_price_cents and max_price_cents we convert from cents to dollars, and because of that we alias the arguments to min_price and max_price with as:.
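One detail worth checking in a cents-to-dollars conversion is the division itself: in Ruby, dividing one Integer by another truncates. The following is a minimal sketch in plain Ruby, outside of the graphql gem; CENTS_TO_DOLLARS is a hypothetical lambda that only borrows the (value, context) signature of prepare:.

```ruby
# Hypothetical stand-in for a `prepare:` lambda. It takes the same
# (value, context) pair but runs without the graphql gem.
CENTS_TO_DOLLARS = ->(value, _ctx) { value && value / 100.0 }

# Integer division truncates, which would turn 50 cents into 0 dollars.
TRUNCATED = 50 / 100

puts TRUNCATED                                  # integer division result
puts CENTS_TO_DOLLARS.call(50, nil)             # float division result
puts CENTS_TO_DOLLARS.call(nil, nil).inspect    # optional argument not given
```

Dividing by 100.0 keeps the fractional part, and the `value &&` guard lets an omitted argument pass through as nil.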

The three places that we are using this field_products with its products implementation are:

# app/graphql/types/query_type.rb
class Types::QueryType < Types::BaseObject
  ...

  field_products

  def products(**query_options)
    Product.graphql_query(query_options)
  end
end
# app/graphql/types/category_type.rb
class Types::CategoryType < Types::BaseObject
  ...

  field_products

  def products(**query_options)
    Loaders::HasManyLoader
      .for(Product, :category_id, query_options)
      .load(object.id)
  end
end
# app/graphql/types/order_type.rb
class Types::OrderType < Types::BaseObject
  ...

  field_products

  def products(**query_options)
    Loaders::HasManyLoader
      .for(OrderItem, :order_id, {product: query_options})
      .load(object.id)
      .then do |order_items|
        product_ids = order_items.map(&:product_id)

        Loaders::BelongsToLoader
          .for(Product)
          .load_many(product_ids)
      end
  end
end

The graphql library passes the field_products arguments as Ruby keyword arguments, so we use a double splat (**) to collect them into a Ruby hash. Finally, we pass the query_options to the new model method graphql_query.
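As a quick illustration of the double splat, here is a plain-Ruby sketch. The products method below is a hypothetical stand-in for a resolver and simply returns what it receives.

```ruby
# Hypothetical resolver-like method: `**query_options` gathers every keyword
# argument into a single Hash with symbol keys.
def products(**query_options)
  query_options
end

p products(name: "aero", min_price: 0.5)
p products # no arguments given, so the Hash is empty
```

Because omitted arguments are simply absent from the hash, the model can look each one up with options[:key] and treat nil as "no filter".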

The only exception is that in OrderType our relation to products uses an intermediate model OrderItem because Order and Product have a many-to-many relationship. This necessitates that we implement the graphql_query method in both Product and OrderItem. Don't worry because that code will be also reusable, and we'll see how to do that in a moment!

But before that let's examine a small change we need to make to the HasManyLoader:

# app/graphql/loaders/has_many_loader.rb
class Loaders::HasManyLoader < GraphQL::Batch::Loader
  def initialize(model, column, query_options)
    @model = model
    @column = column
    @query_options = query_options
  end

  def perform(relation_ids)
    query = @model.graphql_query(@query_options).where({@column => relation_ids})

    records_by_relation_id = query.group_by { |result| result.public_send(@column) }

    relation_ids.each do |id|
      fulfill(id, records_by_relation_id[id] || [])
    end
  end
end

We change it to call the graphql_query method on the @model with the @query_options.
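The grouping step inside perform can be tried in isolation. This is a sketch under stated assumptions: ProductRow is a hypothetical Struct standing in for ActiveRecord results, and a returned Hash stands in for the calls to fulfill.

```ruby
# Plain-Ruby sketch of the grouping step in `perform`.
ProductRow = Struct.new(:id, :category_id, :name)

def group_for_parents(records, column, relation_ids)
  records_by_relation_id = records.group_by { |record| record.public_send(column) }

  # Every requested parent id gets an answer, even when it has no children.
  relation_ids.to_h { |id| [id, records_by_relation_id[id] || []] }
end

ROWS = [
  ProductRow.new(1, 10, "aero bottle"),
  ProductRow.new(2, 10, "aero cap"),
  ProductRow.new(3, 20, "trail sock")
].freeze

group_for_parents(ROWS, :category_id, [10, 20, 30]).each do |id, rows|
  puts "#{id}: #{rows.map(&:name).inspect}"
end
```

The `|| []` fallback matters: a parent without matching children still has to be fulfilled, otherwise its field would never resolve.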

We created the HasManyLoader in the previous blog post, so check that out to improve your familiarity with GraphQL DataLoaders!

Our final bit of code is our graphql_query. For this one we will use plain old ActiveRecord in order to chain every filter applied, if they are present:

# app/models/product.rb
class Product < ApplicationRecord
  belongs_to :category
  has_many :order_items
  has_many :orders, through: :order_items

  scope :by_name, ->(name) { where("name ILIKE ?", "#{name}%") if name }
  scope :by_color, ->(color) { where(color: color) if color }
  scope :by_size, ->(size) { where(size: size) if size }
  scope :by_min_price, ->(price) { where("price >= ?", price) if price }
  scope :by_max_price, ->(price) { where("price <= ?", price) if price }

  def self.graphql_query(options)
    Product.all
      .by_name(options[:name])
      .by_color(options[:color])
      .by_size(options[:size])
      .by_min_price(options[:min_price])
      .by_max_price(options[:max_price])
  end
end

And as mentioned before, here's the graphql_query method for OrderItem, the join model in the middle:

# app/models/order_item.rb
class OrderItem < ApplicationRecord
  belongs_to :product

  def self.graphql_query(options)
    query = OrderItem.all

    if options[:product]
      product_query = Product.graphql_query(options[:product])
      query = query.joins(:product).merge(product_query)
    end

    query
  end
end

OrderItem has no knowledge of which filters exist on the Product model, so it delegates the query options to the same Product.graphql_query and relies on ActiveRecord's merge method to combine the two relations. One small trick here is to wrap the product query options under the :product key. We do this because OrderItem is used in both directions between Order and Product. And that's it for our OrderItem model!

Using scopes with conditions that check for the presence of an argument lets us chain them arbitrarily, which is exactly what we want here, score!
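The chaining works because an ActiveRecord scope whose body returns nil falls back to the unfiltered relation. The sketch below imitates that behaviour in memory; ProductList and Item are hypothetical classes that only mirror the shape of the scopes, not ActiveRecord itself.

```ruby
# Minimal in-memory imitation of chainable, optional filters.
Item = Struct.new(:name, :color, :price)

class ProductList
  attr_reader :items

  def initialize(items)
    @items = items
  end

  # Each filter returns a new list, or `self` when its argument is nil,
  # so the chain never breaks on a missing filter.
  def by_name(name)
    return self unless name

    ProductList.new(items.select { |i| i.name.downcase.start_with?(name.downcase) })
  end

  def by_color(color)
    return self unless color

    ProductList.new(items.select { |i| i.color == color })
  end

  def by_min_price(price)
    return self unless price

    ProductList.new(items.select { |i| i.price >= price })
  end
end

CATALOG = ProductList.new([
  Item.new("Aero bottle", "red", 12.0),
  Item.new("Aero cap", "blue", 0.4),
  Item.new("Trail sock", "red", 8.0)
])

def filtered_names(options)
  CATALOG
    .by_name(options[:name])
    .by_color(options[:color])
    .by_min_price(options[:min_price])
    .items.map(&:name)
end

p filtered_names({ name: "aero", min_price: 0.5 })
p filtered_names({})
```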

Once we have this in place, adding a new filter means making changes in only two places: the definition of the GraphQL field in the BaseObject, and the graphql_query method in our ActiveRecord model. Quite maintainable, wouldn't you say?

Now the GraphQL client can execute a query like this:

query {
  categories {
    name
    products (name: "aero", minPriceCents: 50) {
      name
      color
    }
  }
}

To translate the above query: we fetch all categories, and for each category we return the products whose name starts with aero and whose price is at least 50 cents.

Sorting

Implementing sorting in our GraphQL Schema is a continuation of the same pattern we used for filtering. The only difference here is that for sorting we want the sort options to be a hardcoded list. In order to do that we will use GraphQL Enums, as they play really well with the generated documentation. Bonus!

First let's add a new argument to the field_products:

# app/graphql/types/base_object.rb
module Types
  class BaseObject < GraphQL::Schema::Object
    field_class Types::BaseField

    def self.field_products
      field :products, [Types::ProductType], null: false do
        ...

        argument :sort, [Types::Enum::ProductSort], required: false, default_value: []
      end
    end
  end
end

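We define the sort argument as an array of ProductSort enum values. Accepting several sort keys lets the client break ties: when two products share the same name, the next key decides their order.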

The graphql gem needs a class to define the values in the enum, so we create it:

# app/graphql/types/enum/product_sort.rb
class Types::Enum::ProductSort < Types::BaseEnum
  value "nameAsc", value: [:name, :asc]
  value "nameDesc", value: [:name, :desc]
  value "colorAsc", value: [:color, :asc]
  value "colorDesc", value: [:color, :desc]
  value "sizeAsc", value: [:size, :asc]
  value "sizeDesc", value: [:size, :desc]
  value "priceAsc", value: [:price, :asc]
  value "priceDesc", value: [:price, :desc]
end

We use the value option to map the value from the GraphQL query to the value used in our Ruby code. Each value is a [column, direction] pair, so an array of them converts with to_h into a hash that can be passed directly to ActiveRecord's order method. So what happens to our model code?

# app/models/product.rb
class Product < ApplicationRecord
  ...

  def self.graphql_query(options)
    Product.all
      .by_name(options[:name])
      .by_color(options[:color])
      .by_size(options[:size])
      .by_min_price(options[:min_price])
      .by_max_price(options[:max_price])
      .order(options[:sort].to_h)
  end
end
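To see what options[:sort].to_h produces and how a second key breaks ties, here is a plain-Ruby sketch. SortRow and sort_in_memory are hypothetical helpers that imitate ORDER BY on an in-memory list; ActiveRecord does the real work in the database.

```ruby
SortRow = Struct.new(:name, :color)

# What the resolver receives for `sort: [nameDesc, colorAsc]`.
SORT = [[:name, :desc], [:color, :asc]].freeze
ORDER_HASH = SORT.to_h

# Compare by each column in turn; later columns only matter on a tie.
def sort_in_memory(records, order_hash)
  records.sort do |a, b|
    order_hash.reduce(0) do |cmp, (column, direction)|
      next cmp unless cmp.zero?

      result = a[column] <=> b[column]
      direction == :desc ? -result : result
    end
  end
end

SORT_ROWS = [
  SortRow.new("aero", "red"),
  SortRow.new("bolt", "blue"),
  SortRow.new("aero", "blue")
].freeze

p ORDER_HASH
p sort_in_memory(SORT_ROWS, ORDER_HASH).map(&:to_a)
p nil.to_h # a missing sort option still converts to an empty hash
```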

With that in place, we can query as follows:

query {
  categories {
    name
    products (name: "aero", sort: [nameDesc, colorAsc]) {
      name
      color
    }
  }
}

Fantastic!

Paginating

I have to start this section by saying that there might be cases where you want to wrap your collection in a structure and add a count field to it. I don't feel that's necessary for this post, but you may want to consider that option depending on your problem. Or maybe just add a new sibling field with the total count.

That said, let's see how to limit and set a page for our query. The first issue we want to address is to add a min and max configuration to these fields. Reasoning: we don't want our frontend developers accidentally requesting millions of records in a single GraphQL query. To avoid this, we can create two reusable arguments in the autogenerated BaseField class:

# app/graphql/types/base_field.rb
module Types
  class BaseField < GraphQL::Schema::Field
    argument_class Types::BaseArgument

    def argument_limit(default: 10, min: 1, max: 100)
      prepare = ->(value, _ctx) {
        if value&.between?(min, max)
          value
        else
          message = "'limit' must be between #{min} and #{max}"
          raise GraphQL::ExecutionError, message
        end
      }

      argument(:limit, Integer, required: false, default_value: default, prepare: prepare)
    end

    def argument_page(default: 1, min: 1)
      prepare = ->(value, _ctx) {
        if value && value >= min
          value
        else
          message = "'page' must be greater than or equal to #{min}"
          raise GraphQL::ExecutionError, message
        end
      }

      argument(:page, Integer, required: false, default_value: default, prepare: prepare)
    end
  end
end

The idea here is the same as creating reusable fields, yet this time we are creating reusable arguments. With that we can validate min and max values for our limit and page arguments. Nice! We choose to raise an error with a friendly message so that the consumer of this GraphQL Schema will understand why that query can't be performed.
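The validation logic itself can be exercised without the graphql gem. In this minimal sketch, limit_validator is a hypothetical builder, and ArgumentError stands in for GraphQL::ExecutionError.

```ruby
# Hypothetical builder for a range-checking lambda with the same
# (value, context) signature as `prepare:`. ArgumentError stands in for
# GraphQL::ExecutionError so the sketch runs without the graphql gem.
def limit_validator(min: 1, max: 100)
  lambda do |value, _ctx|
    return value if value&.between?(min, max)

    raise ArgumentError, "'limit' must be between #{min} and #{max}"
  end
end

VALIDATE_LIMIT = limit_validator(max: 50)

puts VALIDATE_LIMIT.call(10, nil)

begin
  VALIDATE_LIMIT.call(500, nil)
rescue ArgumentError => e
  puts e.message
end
```

The safe navigation operator (&.) means an explicit null from the client is rejected with the same message instead of raising a NoMethodError.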

Next, we need to update our BaseObject:

# app/graphql/types/base_object.rb
module Types
  class BaseObject < GraphQL::Schema::Object
    field_class Types::BaseField

    def self.field_products
      field :products, [Types::ProductType], null: false do
        argument :name, String, required: false
        argument :color, String, required: false
        argument :size, String, required: false
        # Divide by 100.0: Integer / Integer truncates in Ruby, so 50 / 100 would be 0.
        argument :min_price_cents, Integer, required: false, prepare: ->(value, _ctx) { value && value / 100.0 }, as: :min_price
        argument :max_price_cents, Integer, required: false, prepare: ->(value, _ctx) { value && value / 100.0 }, as: :max_price
        argument :sort, [Types::Enum::ProductSort], required: false, default_value: []
        argument_limit
        argument_page
      end
    end
  end
end

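Then we chain limit and page onto our model query. limit is ActiveRecord's built-in method, while page is a small scope that turns a page number into an offset:

# app/models/product.rb
class Product < ApplicationRecord
  belongs_to :category
  has_many :order_items
  has_many :orders, through: :order_items

  scope :by_name, ->(name) { where("name ILIKE ?", "#{name}%") if name }
  scope :by_color, ->(color) { where(color: color) if color }
  scope :by_size, ->(size) { where(size: size) if size }
  scope :by_min_price, ->(price) { where("price >= ?", price) if price }
  scope :by_max_price, ->(price) { where("price <= ?", price) if price }
  # Assumed definition (the listing calls .page without showing it):
  # pages start at 1 and map to an OFFSET of (page - 1) * limit.
  scope :page, ->(limit, page) { offset((page - 1) * limit) if limit && page }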

  def self.graphql_query(options)
    Product.all
      .by_name(options[:name])
      .by_color(options[:color])
      .by_size(options[:size])
      .by_min_price(options[:min_price])
      .by_max_price(options[:max_price])
      .order(options[:sort].to_h)
      .limit(options[:limit])
      .page(options[:limit], options[:page])
  end
end

Wow, this is so simple and elegant, and it even kind of works. If we now run the root query of products we can filter, sort and limit our results.
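For reference, here is the arithmetic behind page-based pagination as a plain-Ruby sketch. It assumes pages start at 1 and map to an offset of (page - 1) * limit; page_offset and paginate are hypothetical helpers working on an in-memory list.

```ruby
# Assumption: page numbers start at 1.
def page_offset(limit, page)
  (page - 1) * limit
end

# In-memory equivalent of OFFSET ... LIMIT ... on an already sorted list.
def paginate(records, limit:, page:)
  records.drop(page_offset(limit, page)).first(limit)
end

NAMES = %w[a b c d e f g].freeze

p paginate(NAMES, limit: 3, page: 1)
p paginate(NAMES, limit: 3, page: 3)
p paginate(NAMES, limit: 3, page: 4) # past the end, so nothing is left
```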

But not so fast.

If we try a query like:

query {
  categories {
    name
    products (name: "aero", sort: [nameDesc], limit: 3) {
      name
      color
    }
  }
}

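We notice that our query returns only 3 products across all categories, while our goal is to limit the products returned per category. This happens because we are using DataLoaders to avoid the N+1 queries issue: the loader fetches the products of every category in a single SQL query, so the LIMIT applies to that whole query rather than to each category. Is there a way out of this trap?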
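The difference between the two behaviours is easy to reproduce in memory. In this sketch, LimitRow and both helper methods are hypothetical; they imitate a batched query with a single LIMIT versus a limit applied to each parent.

```ruby
LimitRow = Struct.new(:category_id, :name)

SORTED_ROWS = [
  LimitRow.new(1, "aero a"), LimitRow.new(1, "aero b"),
  LimitRow.new(1, "aero c"), LimitRow.new(1, "aero d"),
  LimitRow.new(2, "aero e"), LimitRow.new(2, "aero f")
].freeze

# What one batched query with LIMIT gives us: `limit` rows in total.
def limit_whole_batch(rows, limit)
  rows.first(limit).group_by(&:category_id)
end

# What we actually want: up to `limit` rows for each parent.
def limit_per_parent(rows, limit)
  rows.group_by(&:category_id).transform_values { |group| group.first(limit) }
end

p limit_whole_batch(SORTED_ROWS, 3).transform_values { |g| g.map(&:name) }
p limit_per_parent(SORTED_ROWS, 3).transform_values { |g| g.map(&:name) }
```

With the batch-wide limit, category 2 ends up with no products at all, even though it has two matching rows.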

HasManyLoader - part III

To solve our issue, we need to change our HasManyLoader in such a way that the limit will be applied to each "parent" relation. In other words, and in this example, we want to limit grouped products by their categories.

One way to accomplish this is to wrap our generated SQL query on the products table and use a PostgreSQL LATERAL subquery to group products by category, applying the limit only inside each "group". The nice thing about this solution is that our HasManyLoader is generic enough that it will work from Categories to Products, or from Users to Orders, or even from Orders to Products through the OrderItem model!

Presenting our final version of our HasManyLoader:

class Loaders::HasManyLoader < GraphQL::Batch::Loader
  def initialize(model, column, query_options)
    @model = model
    @column = column
    @query_options = query_options
  end

  def perform(relation_ids)
    query = @model.graphql_query(@query_options)

    if query.limit_value || query.offset_value
      sub_query = query.where("#{@model.table_name}.#{@column} = tmp_relation_ids")

      query = @model
        .select("tmp_lat_join_tab.*")
        .from("UNNEST(ARRAY[#{relation_ids.join(",")}]) tmp_relation_ids")
        .joins("JOIN LATERAL (#{sub_query.to_sql}) tmp_lat_join_tab ON TRUE")
    else
      query = query.where({@column => relation_ids})
    end

    records_by_relation_id = query.group_by { |result| result.public_send(@column) }

    relation_ids.each do |id|
      fulfill(id, records_by_relation_id[id] || [])
    end
  end
end

We use the limit_value and offset_value ActiveRecord methods to check whether the call to @model.graphql_query(@query_options) sets a limit or a page. If it does, we wrap the generated query as a nested SQL query using a lateral join. We also have all the "category_ids" in the relation_ids variable collected by the loader, so we can build a temporary table using UNNEST on our array of category_ids.
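To make the shape of the resulting SQL easier to picture, the sketch below assembles it as plain text. It is illustrative only: lateral_sql is a hypothetical helper, the sub-query string is hand-written rather than produced by ActiveRecord (whose quoting and column lists will differ), and it assumes integer ids.

```ruby
# Hypothetical helper that only assembles the SQL text. Names such as
# tmp_relation_ids and tmp_lat_join_tab follow the loader above.
def lateral_sql(relation_ids, sub_query_sql)
  # Assumes integer ids: Integer() raises on anything else before interpolation.
  ids = relation_ids.map { |id| Integer(id) }.join(",")

  <<~SQL.strip
    SELECT tmp_lat_join_tab.*
    FROM UNNEST(ARRAY[#{ids}]) tmp_relation_ids
    JOIN LATERAL (#{sub_query_sql}) tmp_lat_join_tab ON TRUE
  SQL
end

# Hand-written stand-in for `sub_query.to_sql`.
SUB_QUERY = "SELECT products.* FROM products " \
            "WHERE products.category_id = tmp_relation_ids " \
            "ORDER BY products.name DESC LIMIT 3"

puts lateral_sql([1, 2, 3], SUB_QUERY)
```

The sub-query runs once per unnested id, so its LIMIT applies to each category separately. Coercing the ids before interpolating them is a cheap safeguard if the loader could ever receive values that did not come from the database.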

This query might look scary, but it is less complicated than it looks at first glance. If you are unfamiliar with PostgreSQL LATERAL subqueries, I recommend taking some time to read their docs. I analyzed and ran this query with a bunch of data on my machine and I am quite satisfied with the performance of this SQL query so far.

All this code is public and shared as a project on GitHub, so go have a look, and star it if you like it! As a bonus there's an example of an RSpec test file that I used to cover the features I wanted to build for this post.

Conclusion

Although the GraphQL specification is not huge, libraries tend to have massive documentation because there are so many topics to cover, which steepens the learning curve.

With this series of posts my goal is to help flatten the curve a bit by showing example code that serves a GraphQL Schema, while keeping it refined and organized so that it stays easy to maintain.

We also preemptively avoided any big performance issues in the query by using DataLoaders, and guarding against N+1 queries. While we may be able to delay performance tuning until later, it's best to address performance problems right away.

I hope you find this blog series relevant to your work, and I hope you have enjoyed following along! Thanks for reading!


Photo by Clint McKoy on Unsplash