Large common tables

Ruby PostgreSQL

Writable Common Table Expressions

Image 100x100 jack christensen

Writable common table expressions are a powerful feature of PostgreSQL that is rarely used when building applications with an ORM like ActiveRecord. It allows inserts, updates, and deletes to be performed in the WITH clause of an SQL statement. Here are a few techniques that I have found useful.

Writable common table expressions documentation

Write Master-Detail Records in a Single Statement

Creating 1 master record and 3 detail records in ActiveRecord makes 6 round trips to the database server -- 4 inserts, 1 begin, and 1 commit.

question = text: "What is your favorite text editor?" text: 'vim' text: 'emacs' text: 'Sublime Text'! # this does 6 SQL statements

This can be reduced to a single complex statement that returns the question id with a writable CTE:

with q as (
  insert into questions(text)
  values('What is your favorite text editor?')
  returning *
), a as (
  insert into answers(question_id, text)
  select, t
  from q
    cross join (values ('vim'), ('emacs'), ('Sublime Text')) t
select id
from q;

Let's take a closer look at that SQL. On line 4, returning * makes the insert return a result set just like a select does. Therefore q becomes a result set that can be referenced later so the answers know the question_id foreign key. The rows to insert into answers are producted by cross joining the inserted question with a values list of answer texts. After the question and answers are inserted, we select question_id as the result of the entire statement.

Unfortunately, we need to build this SQL by hand. Let's look at how we do that:

class Question < ActiveRecord::Base
  has_many :answers

  def self.insert_via_cte(attributes)
    sql = <<-SQL
      with q as (
        insert into questions(text) values(?) returning *
      ), a as (
        insert into answers(question_id, text)
        select, t.text
        from q cross join (values
    args = [attributes[:question]]

    sql << attributes[:answers].map do |a|
      args << a
    end.join(", ")

    sql << ") t(text)) select id from q"

    connection.select_value sanitize_sql([sql, *args])

Building the SQL string and the arguments array by hand is much more complex than the original ActiveRecord version. In particular, whenever building SQL strings care should be taken to avoid SQL injection vulnerabilities. The sanitize_sql method can help.

A simple benchmark against a local PostgreSQL server shows the CTE version is 8x faster. With a remote server the round trip times would be much higher, meaning that the gains could be even more dramatic. If you need every last bit of write performance this can make a substantial difference and the increased complexity may be worth it.


Sometimes you want to update a record if it exists and create it if it doesn't. In traditional ActiveRecord this could be done like this:

  product = Product.find_or_initialize_by(id: pid)
  product.counter += 1!
rescue ActiveRecord::RecordNotUnique, ActiveRecord::StaleObjectError

Provided Product has a lock_version column this will be correct. However, depending on the how bad the concurrency collisions are it could be fairly slow.

In PostgreSQL we can do this:

with upd as (
  update products
  set counter = counter + 1,
    lock_version = lock_version + 1 -- play nice with ActiveRecord
  where id = 42
  returning *
insert into products(id, counter)
select 42, 1
where not exists(select * from upd);

This does have a small race condition that will cause uniqueness errors on the insert. Provided the above SQL was wrapped in a Product.merge_increment class method this could be resolved by retrying on ActiveRecord::RecordNotUnique.

rescue ActiveRecord::RecordNotUnique

In an admittedly contrived benchmark of very heavy concurrent updates, the PostgreSQL version was over 3x faster.

Delete and Archive

This tip is more for database maintenance than for application code. Sometimes you need to delete some records, but you would like to keep them archived somewhere else. Writable CTEs make this easy.

with deleted as (
  delete from products where id < 10 returning *
insert into deleted_products
select *
from deleted;

Source Code

Source code for samples is on Github.

More posts about Ruby PostgreSQL