Heading image for post: Advanced Queries with ActiveRecord's From Method

Ruby

Advanced Queries with ActiveRecord's From Method

Profile picture of Josh Branchaud

For those of us that write database-backed Rails apps, ActiveRecord is a familiar beast. We know its various query building methods well: where, order, limit, and even group. The most important SQL clause is conspicuously missing from our repertoire: the from clause.

Chances are you can't think of many, if any, times that you've used ActiveRecord's from method. This is to be expected. The from clause in SQL specifies the target table of a query. In the context of ActiveRecord, the target table is explicitly expressed by the model from which you query.

Consider the times you use one-liners such as Developer.first, Developer.all, and Developer.count. With each of these, the developers table as the target of the query is expressed by the use of the Developer model.

So, where does the from method come into the picture?

The docs describe the from method as such.

Specifies table from which the records will be fetched.

Specifying the table of a query against an ActiveRecord model is going to feel a bit redundant.

> Developer.from('developers').to_sql
=> "SELECT \"developers\".* FROM developers"

As you've probably noticed we get this same behavior with nothing more than the all method.

> Developer.all.to_sql
=> "SELECT \"developers\".* FROM \"developers\""

So, again, where does the from method come into the picture?

Our from clauses aren't always going to look as simple as from developers. We sometimes need the "table" from which we are fetching to be a collection of computed values. The key insight here is that the result of a query is itself a table. That means that we can query against the result of another query. This is ostensibly how subqueries work.

The from method comes into play when we are building up a complex query in SQL and need to transition back to the land of ActiveRecord and Rails.

Let's work with the developer_scores.sql query from the hr-til codebase.

select
 developers.id,
 username,
 posts,
 likes,
 round(likes::numeric / posts, 2) as avg_likes,
 round(log(2, (1022 * ((score - min(score) over ()) / ((max(score) over ()) - (min(score) over ()))) + 2)::numeric), 1) as hotness
from developers
join (
  select developer_id as id, sum(score) as score
  from hot_posts
  group by developer_id
) developer_scores using(id)
join (
  select
  developer_id as id,
  count(*) as posts,
  sum(likes) as likes
  from posts
  group by developer_id
) stats using(id)

This query is used to compute both the average number of likes and the hotness score for each developer's posts. Because of the complexity of the query, we are better off writing it in SQL rather than using ActiveRecord's DSL. This query does a complex computation that results in a bunch of data. It would be nice if we could further refine the results as needed with ActiveRecord's DSL. And herein lies the challenge. How do we transition from this SQL query to something compatible with ActiveRecord?

The from method -- the topic of this post -- is, of course, the answer. We can treat the above query as a "table" that we can query against (i.e. a subquery). This allows us to then use order, limit, group, etc. to further refine the result set.

We can create a class (app/models/developer_ranker.rb) to see how this can work.

class DeveloperRanker
  RANKED_DEV_SQL = <<-SQL
      (select
       developers.id,
       username,
       posts,
       likes,
       round(likes::numeric / posts, 2) as avg_likes,
       round(log(2, (1022 * ((score - min(score) over ()) / ((max(score) over ()) - (min(score) over ()))) + 2)::numeric), 1) as hotness
      from developers
      join (
        select developer_id as id, sum(score) as score
        from hot_posts
        group by developer_id
      ) developer_scores using(id)
      join (
        select
        developer_id as id,
        count(*) as posts,
        sum(likes) as likes
        from posts
        group by developer_id
      ) stats using(id)
      ) as developers
    SQL

  def self.top_developers(count=1)
    Developer.from(RANKED_DEV_SQL)
      .order(hotness: :desc)
      .limit(count)
  end

  def self.bottom_developers(count=1)
    Developer.from(RANKED_DEV_SQL)
      .order(hotness: :asc)
      .limit(count)
  end
end

We treat the complex query as a subquery by wrapping it in (...) as developers assigning it as a string to a constant. This constant is then available to any methods that want to build on this query without duplicating all the core query logic.

As you can see in DeveloperRanker.top_developers, we hydrate an ActiveRecord::Relation with the results of the query. We can then chain those familiar ActiveRecord query methods on the end. In this case, we order the results with the developers most on fire at the top and then limiting by the given count. Note that because we are querying against the Developer model, it is essential that the subquery's name matches the name that ActiveRecord uses for that model's table.

We create DeveloperRanker.bottom_developers to perform a similar query with very little duplication.

An additional benefit of this approach is that the Developer objects that are being hydrated are decorated with the additional values returned from the query. Not only do we know the id and username of each developer, we can also access the avg_likes and hotness attributes. These hydrated Developer objects can be passed to the view layer like any other decorated object would be.

This technique is great for generating reports and stats that require lots of computation or complex joins, but are ultimately tied to particular domain concepts (e.g. a developer).

Ultimately, this frees you to write some SQL without completely cutting yourself off from the world of ActiveRecord. Enjoy!


Cover photo by Samuel Zeller on Unsplash.com

  • Adobe logo
  • Barnes and noble logo
  • Aetna logo
  • Vanderbilt university logo
  • Ericsson logo

We're proud to have launched hundreds of products for clients such as LensRentals.com, Engine Yard, Verisign, ParkWhiz, and Regions Bank, to name a few.

Let's talk about your project