
Ruby

The GraphQL Way: Rails Edition

By Vinicius Negrisolo

Let's build a GraphQL implementation on Ruby on Rails that is easy to maintain and performs well overall.

This post is part of a series of GraphQL posts, and here are the links for reference:

Setup

Here are the Ruby and Rails versions I am using in this post:

Package Version
ruby 2.7.0
rails 6.0.2.2

I created a project on GitHub in case you want to follow the final implementation of this post.

The following snippet is my initial setup for this project. We'll start by creating a new rails application, then we'll add and configure the GraphQL gems:

rails new graphql_way_rails
cd graphql_way_rails
bundle && rails db:create && rails db:migrate

echo "gem 'graphql'" >> Gemfile
bundle

rails generate graphql:install
bundle

echo "gem 'graphql-batch'" >> Gemfile
bundle

# add `use GraphQL::Batch` into `app/graphql/graphql_way_rails_schema.rb`

After those steps you'll see a bunch of code generated for you, including a new route to /graphql, GraphqlController, GraphqlWayRailsSchema, Types::QueryType, Types::MutationType and so on.

These are great libraries, among many others that the community has produced. If you need more information about the gems used here, please go to the following:

Models

In this post we'll use a typical e-commerce data model with categories, products, orders and users. Here's the model generation:

rails generate model category name
rails generate model product category:references name color size price:decimal{10,2}
rails generate model user email
rails generate model order user:references ordered_at:datetime
rails generate model order_item product:references order:references
rails db:migrate

Then I changed some of the generated migrations to add null: false to some fields, such as the name ones. I also added the missing has_many relations to the generated models.

Then I created some factories with factory_bot_rails and faker just to generate some random data for testing the queries. If you want to see how I am generating my data, please check the db/seeds.rb file in the created project. I ran this locally and here are the total counts:

Table Count
categories 25
products 658
users 500
orders 12_492
order_items 68_094

GraphQL Types

I'll use the provided generators to scaffold the GraphQL Types here:

rails generate graphql:object category name:String!
rails generate graphql:object product name:String! color:String size:String price_cents:Int!
rails generate graphql:object user email:String!
rails generate graphql:object order ordered_at:GraphQL::Types::ISO8601DateTime!

The last generator did not work very well for me, as I had to manually fix the type of ordered_at to GraphQL::Types::ISO8601DateTime.

I decided to implement a custom field to understand how to decorate models with GraphQL Types. Here it is:

# app/graphql/types/product_type.rb
class Types::ProductType < Types::BaseObject
  ...

  field :price_cents, Integer, null: false

  def price_cents
    (100 * object.price).to_i
  end
end

As you can see it's pretty straightforward to decorate your models using GraphQL Types.
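One detail worth checking in a field like price_cents is the numeric type behind object.price. A Rails decimal column is read back as a BigDecimal, so the multiplication is exact. The sketch below is plain Ruby outside of Rails, and the price_cents helper is a hypothetical stand-in for the field method that takes the price as an argument instead of reading it from object:

```ruby
require "bigdecimal"

# Hypothetical stand-in for Types::ProductType#price_cents, taking the price
# directly instead of reading it from `object`.
def price_cents(price)
  (100 * price).to_i
end

# A decimal(10,2) column is read back as BigDecimal, so the product is exact.
puts price_cents(BigDecimal("19.99")) # 1999
puts price_cents(BigDecimal("0.07"))  # 7

# With a binary Float the product can land just below the expected integer,
# and to_i truncates. Rounding first is a safer habit if floats can sneak in.
puts((100 * 19.99).round) # 1999
```

If the price ever arrives as a Float (for example from a hand-built object in a test), rounding before converting avoids ending up one cent short.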

Note as well that I have not created a type for OrderItem, as it does not make sense to expose this entity. In the same way, you may have a bloated model that you want to split into different GraphQL Types, and it is easy to figure out how to do that.

GraphQL Queries

I'll start by exposing one "get all" query per GraphQL Type created so far. We're going to change this later, but for now it helps us explore the graph:

# app/graphql/types/query_type.rb
class Types::QueryType < Types::BaseObject
  field :categories, [Types::CategoryType], null: false
  field :products, [Types::ProductType], null: false
  field :users, [Types::UserType], null: false
  field :orders, [Types::OrderType], null: false

  def categories
    Category.all
  end

  def products
    Product.all
  end

  def users
    User.all
  end

  def orders
    Order.all
  end
end

To test some queries you can run rails server and open http://localhost:3000/graphiql. This opens the already configured GraphiQL visual tool. You can also use another tool, such as the Altair Chrome extension. Altair introspects the GraphQL server to get the documentation by using the real URL: http://localhost:3000/graphql. Watch out, as the two URLs differ by just 1 character.

Once you have the tool set up, run this AllPossibleDataQuery query:

query AllPossibleDataQuery {
  categories {
    name
  }
  products {
    name
    priceCents
    color
    size
  }
  users {
    email
  }
  orders {
    orderedAt
  }
}

With that you can fetch every Category, Product, User and Order. So far this is not very useful, but it shows how to access the root GraphQL queries and that we can call multiple root queries at once. The response is a JSON document with all the requested data under a node called data.

So far we have produced simple end-to-end code, from the database to a GraphQL query. Next, let's connect the graph.
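To make the shape of that response concrete, here is a minimal sketch that parses a hypothetical, heavily trimmed response body. The values are made up for illustration; only the keys mirror the fields requested in AllPossibleDataQuery:

```ruby
require "json"

# Hypothetical, heavily trimmed response body for AllPossibleDataQuery.
# The values are made up for illustration only.
body = <<~JSON
  {
    "data": {
      "categories": [{ "name": "Shoes" }],
      "products": [{ "name": "Sneaker", "priceCents": 4999, "color": "red", "size": "M" }],
      "users": [{ "email": "jane@example.com" }],
      "orders": [{ "orderedAt": "2020-04-01T12:00:00Z" }]
    }
  }
JSON

data = JSON.parse(body).fetch("data")

# Each root field of the query becomes a key under "data".
puts data.keys.inspect
puts data["products"].first["priceCents"]
```

Note that the snake_case field price_cents shows up as priceCents in the response, matching the name used in the query.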

Connect the Graph

To connect the graph we can just expose the relations as any other field:

# app/graphql/types/category_type.rb
class Types::CategoryType < Types::BaseObject
  ...

  field :products, [Types::ProductType], null: false
end
# app/graphql/types/product_type.rb
class Types::ProductType < Types::BaseObject
  ...

  field :category, Types::CategoryType, null: false
  field :orders, [Types::OrderType], null: false
end
# app/graphql/types/user_type.rb
class Types::UserType < Types::BaseObject
  ...

  field :orders, [Types::OrderType], null: false
end
# app/graphql/types/order_type.rb
class Types::OrderType < Types::BaseObject
  ...

  field :user, Types::UserType, null: false
  field :products, [Types::ProductType], null: false
end

Then you could execute this query for example:

query AllUsersWithOrdersQuery {
  users {
    email
    orders {
      orderedAt
      products {
        name
        category {
          name
        }
      }
    }
  }
}

This query takes no arguments yet, so it is still not useful in production, but it exercises every type of relation in this data model: belongs_to, has_many and has_many, through: mid_table.

The problem is that the performance of this query is already terrible. I checked the logs, and for a single GraphQL AllUsersWithOrdersQuery call I counted 18_400 SQL queries. And my local database is still small: there are 12_492 orders with an average of 5.45 products per order. The response time is 35_841ms on average. There is a section at the bottom of this post with a performance comparison as we evolve the way we fetch model relations.

N+1 Query Issue

What are we missing? Well, we need to understand and implement data loaders to lazily load all data that comes from ActiveRecord relations.

Data Loaders are also helpful to batch the calculation of expensive fields, but I'll leave this out of this post as you may get the picture after these examples.
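The difference between the two access patterns can be sketched without Rails at all. The FakeTable class below is a hypothetical in-memory stand-in for a database table that simply counts how many lookups it serves:

```ruby
# Toy in-memory "database" that counts how many lookups it serves.
class FakeTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # One "SQL query" per call, no matter how many ids are requested.
  def where_ids(ids)
    @query_count += 1
    @rows.select { |row| ids.include?(row[:id]) }
  end
end

categories = [{ id: 1, name: "Shoes" }, { id: 2, name: "Hats" }]
products = [
  { id: 10, category_id: 1 },
  { id: 11, category_id: 1 },
  { id: 12, category_id: 2 },
  { id: 13, category_id: 2 }
]

# N+1 style: one lookup per product.
naive = FakeTable.new(categories)
products.each { |product| naive.where_ids([product[:category_id]]) }

# Batched style: collect the ids first, then look them up once.
batched = FakeTable.new(categories)
batched.where_ids(products.map { |product| product[:category_id] }.uniq)

puts "naive: #{naive.query_count} queries, batched: #{batched.query_count} query"
```

The naive count grows with the number of products, while the batched count stays at one per relation, which is the behavior the data loaders in the next sections aim for.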

Batching with Data Loaders

Let's bring in Data Loaders to the rescue here. The graphql-batch gem uses Ruby promises to lazily load the records, similar to JavaScript promises. If you are not familiar with this concept, please take a look at the GraphQL Batch documentation.
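If the idea of lazy loading through promises feels abstract, the toy below may help. It is not the graphql-batch implementation, and ToyLoader is a hypothetical class written only for this illustration: it hands out placeholders first and resolves every pending key in one batch when the first placeholder is called.

```ruby
# Toy batching loader, NOT the graphql-batch implementation: it only mimics the
# idea of handing out placeholders first and resolving all of them at once.
class ToyLoader
  attr_reader :batches

  def initialize(&fetch)
    @fetch = fetch # block that receives all pending keys at once
    @pending = []
    @results = {}
    @batches = 0
  end

  # Returns a lambda acting as a very small "promise" for the given key.
  def load(key)
    @pending << key unless @results.key?(key) || @pending.include?(key)
    -> { resolve_pending; @results[key] }
  end

  private

  # Fetches every pending key in a single call to the fetch block.
  def resolve_pending
    return if @pending.empty?

    @batches += 1
    @results.merge!(@fetch.call(@pending))
    @pending = []
  end
end

names = { 1 => "Shoes", 2 => "Hats" }
loader = ToyLoader.new { |ids| ids.to_h { |id| [id, names[id]] } }

# Nothing is fetched while the placeholders are being collected...
promises = [1, 2, 1].map { |id| loader.load(id) }
# ...and the first call resolves every pending key in a single batch.
puts promises.map(&:call).inspect
puts loader.batches
```

Three loads end up in a single batch here. The real gem does the collecting and resolving for you while the GraphQL query executes.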

BelongsToLoader

I'll use another rails generator here:

rails generate graphql:loader belongs_to_loader

And I'll change it to:

# app/graphql/loaders/belongs_to_loader.rb
class Loaders::BelongsToLoader < GraphQL::Batch::Loader
  def initialize(model)
    @model = model
  end

  def perform(ids)
    @model.where(id: ids.uniq).each do |record|
      fulfill(record.id, record)
    end

    ids.each do |id|
      fulfill(id, nil) unless fulfilled?(id)
    end
  end
end

I was inspired by these great Data Loaders examples. I copied record_loader.rb over, then renamed and simplified it for use in this post. And here's how to use it:

# app/graphql/types/product_type.rb
class Types::ProductType < Types::BaseObject
  ...

  field :category, Types::CategoryType, null: false

  def category
    Loaders::BelongsToLoader.for(Category).load(object.category_id)
  end
end

and:

# app/graphql/types/order_type.rb
class Types::OrderType < Types::BaseObject
  ...

  field :user, Types::UserType, null: false

  def user
    Loaders::BelongsToLoader.for(User).load(object.user_id)
  end
end

Now, if you run the same AllUsersWithOrdersQuery again, you'll already notice a decent performance improvement.

HasManyLoader

There's another rails generator call:

rails generate graphql:loader has_many_loader

And I'll change it to:

# app/graphql/loaders/has_many_loader.rb
class Loaders::HasManyLoader < GraphQL::Batch::Loader
  def initialize(model, column)
    @model = model
    @column = column
  end

  def perform(relation_ids)
    records_by_relation_id = @model.where({ @column => relation_ids.uniq }).group_by do |result|
      result.public_send(@column)
    end

    relation_ids.each do |id|
      fulfill(id, records_by_relation_id[id] || [])
    end
  end
end

This code was inspired by another of the Data Loaders examples, in particular association_loader.rb.
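The core of that perform method is the group_by step plus the fallback to an empty list. Here is a minimal sketch of just that part in plain Ruby, where ProductRow is a hypothetical Struct standing in for an ActiveRecord model:

```ruby
# Plain Struct standing in for an ActiveRecord model (assumption for the demo).
ProductRow = Struct.new(:id, :category_id)

rows = [ProductRow.new(10, 1), ProductRow.new(11, 1), ProductRow.new(12, 2)]
requested_ids = [1, 2, 3]

# Same shape as the loader's perform: filter once, then bucket by the foreign key.
records_by_category_id = rows
  .select { |row| requested_ids.include?(row.category_id) }
  .group_by { |row| row.public_send(:category_id) }

# Every requested id gets an answer; ids without rows fall back to [].
results = requested_ids.to_h { |id| [id, records_by_category_id[id] || []] }

puts results.transform_values { |list| list.map(&:id) }.inspect
```

The fallback matters because every key that was loaded needs a result, including the parents that have no children at all.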

And now my types using the HasManyLoader:

# app/graphql/types/category_type.rb
class Types::CategoryType < Types::BaseObject
  ...

  field :products, [Types::ProductType], null: false

  def products
    Loaders::HasManyLoader.for(Product, :category_id).load(object.id)
  end
end

and:

# app/graphql/types/user_type.rb
class Types::UserType < Types::BaseObject
  ...

  field :orders, [Types::OrderType], null: false

  def orders
    Loaders::HasManyLoader.for(Order, :user_id).load(object.id)
  end
end

So far this works just fine for a regular has_many relation. But it won't work for has_many relations that use an intermediate N-to-N table such as order_items.

Combination of BelongsToLoader and HasManyLoader

At this point we could create another custom Data Loader to deal with the middle table of an N-to-N relation, but that would be boring, and there might be a way to reuse what we already have.

The solution is very simple, and it relies on the Ruby promises I've already mentioned. These are the changes I had to make to the types:

# app/graphql/types/order_type.rb
class Types::OrderType < Types::BaseObject
  ...

  field :products, [Types::ProductType], null: false

  def products
    order_items.then do |order_item_list|
      product_ids = order_item_list.map(&:product_id)
      Loaders::BelongsToLoader.for(Product).load_many(product_ids)
    end
  end

  private

  def order_items
    Loaders::HasManyLoader.for(OrderItem, :order_id).load(object.id)
  end
end

and:

# app/graphql/types/product_type.rb
class Types::ProductType < Types::BaseObject
  ...

  field :orders, [Types::OrderType], null: false

  def orders
    order_items.then do |order_item_list|
      order_ids = order_item_list.map(&:order_id)
      Loaders::BelongsToLoader.for(Order).load_many(order_ids)
    end
  end

  private

  def order_items
    Loaders::HasManyLoader.for(OrderItem, :product_id).load(object.id)
  end
end

As you may have noticed, I had to call then on the promise returned by the first HasManyLoader. That method yields all the returned mid-table model instances, in this case OrderItem instances. With them we can collect the ids of the other relation and pass them down to a BelongsToLoader, this time using the load_many method.

As you can see, the promises returned by Data Loaders make them very easy to reuse. With all that in mind, how would you return the list of Product colors for an Order, let's say product_colors inside OrderType? Or even some aggregation like products_count in CategoryType? Data Loaders definitely open up a broad range of ways to expose your data without running into N+1 performance problems.
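For the products_count question, one possible direction is a loader whose batch step returns one count per category id instead of a list of records. The sketch below is plain Ruby with made-up data, and batch_product_counts is a hypothetical helper. In a real loader the tally could likely be replaced by a grouped count in SQL, for example something along the lines of Product.where(category_id: ids).group(:category_id).count.

```ruby
# Hypothetical batch function: given many category ids, return one count per id.
# In a real loader the tally could be replaced by a grouped COUNT in SQL.
def batch_product_counts(product_category_ids, requested_ids)
  counts = product_category_ids.tally
  requested_ids.to_h { |id| [id, counts.fetch(id, 0)] }
end

# category_id of every product row (made-up data).
product_category_ids = [1, 1, 2, 1]

counts = batch_product_counts(product_category_ids, [1, 2, 3])
puts counts.inspect
```

As with the has_many case, every requested id gets a value, so categories without products report 0 instead of being left unanswered.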

Performance Notes

In this post I used the AllUsersWithOrdersQuery to compare the performance before and after the data loaders. These are the results I got with the data generated by my db/seeds.rb:

Metrics          No Data Loaders    Using Data Loaders
SQL queries      18_400             6
Allocations      30_618_967         2_011_010
Active Record    1_625.26ms         78.6ms
Views            592.26ms           586.12ms
Response Time    35_841ms           3_750ms

As you can see, the N+1 queries are gone. There's a great performance improvement overall: the GraphQL query under test went from 35.8 seconds to 3.8 seconds, a reduction of about 90% in response time. Another point to observe is that Rails allocates a lot of objects for us under the hood, so now that we run far fewer queries we also save a lot of memory allocation, around 93%.

Although this is not a big data set, the GraphQL query in this post fetches basically all the data, as I am not applying any filter. It returns 4 levels of GraphQL Types and involves 5 different database tables. Even if you never run this kind of query in your own solution, it is easy to see that starting with a well-performing solution helps you keep it that way in the medium and long run.
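For anyone who wants to double check the percentages, the arithmetic can be reproduced from the figures reported in this post. The reduction_percent helper is a hypothetical name used only for this check:

```ruby
# Figures taken from the tables in this post.
before_ms = 35_841
after_ms = 3_750
before_allocations = 30_618_967
after_allocations = 2_011_010

# Percentage saved relative to the "before" value.
def reduction_percent(before, after)
  ((before - after) / before.to_f * 100).round(1)
end

puts reduction_percent(before_ms, after_ms)                   # 89.5
puts reduction_percent(before_allocations, after_allocations) # 93.4

# Average number of order_items per order in the seeded data.
puts((68_094 / 12_492.0).round(2)) # 5.45
```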

Conclusion

In this post we've seen how easy it is to set up GraphQL for a Rails application. We connected the graph and ran into many N+1 issues, so we used DataLoaders to fetch relational data in batches and avoid the N+1 problems.

Performance is a big deal in every application, so watch out for issues that could come with your GraphQL solutions and consider writing your own DataLoaders to solve them.

I pushed the code produced for this blog post as a project on GitHub, so take a look at this repo for reference.

Finally, writing an API solution this flexible in such a short period of time is impressive. So do it right and enjoy it.

I hope you have enjoyed it, and thanks for reading!
