Ruby PostgreSQL
Writable Common Table Expressions
Writable common table expressions are a powerful feature of PostgreSQL that is rarely used when building applications with an ORM like ActiveRecord. It allows inserts, updates, and deletes to be performed in the WITH clause of an SQL statement. Here are a few techniques that I have found useful.
Writable common table expressions documentation
Write Master-Detail Records in a Single Statement
Creating 1 master record and 3 detail records in ActiveRecord makes 6 round trips to the database server -- 4 inserts, 1 begin, and 1 commit.
question = Question.new text: "What is your favorite text editor?"
question.answers.new text: 'vim'
question.answers.new text: 'emacs'
question.answers.new text: 'Sublime Text'
question.save! # this does 6 SQL statements
This can be reduced to a single complex statement that returns the question id with a writable CTE:
with q as (
insert into questions(text)
values('What is your favorite text editor?')
returning *
), a as (
insert into answers(question_id, text)
select q.id, t
from q
cross join (values ('vim'), ('emacs'), ('Sublime Text')) t
)
select id
from q;
Let's take a closer look at that SQL. On line 4, returning *
makes the insert return a result set just like a select does. Therefore q
becomes a result set that can be referenced later so the answers know the question_id
foreign key. The rows to insert into answers
are producted by cross joining the inserted question with a values list of answer texts. After the question and answers are inserted, we select question_id
as the result of the entire statement.
Unfortunately, we need to build this SQL by hand. Let's look at how we do that:
class Question < ActiveRecord::Base
has_many :answers
def self.insert_via_cte(attributes)
sql = <<-SQL
with q as (
insert into questions(text) values(?) returning *
), a as (
insert into answers(question_id, text)
select q.id, t.text
from q cross join (values
SQL
args = [attributes[:question]]
sql << attributes[:answers].map do |a|
args << a
"(?)"
end.join(", ")
sql << ") t(text)) select id from q"
connection.select_value sanitize_sql([sql, *args])
end
end
Building the SQL string and the arguments array by hand is much more complex than the original ActiveRecord version. In particular, whenever building SQL strings care should be taken to avoid SQL injection vulnerabilities. The sanitize_sql
method can help.
A simple benchmark against a local PostgreSQL server shows the CTE version is 8x faster. With a remote server the round trip times would be much higher, meaning that the gains could be even more dramatic. If you need every last bit of write performance this can make a substantial difference and the increased complexity may be worth it.
Merge
Sometimes you want to update a record if it exists and create it if it doesn't. In traditional ActiveRecord this could be done like this:
begin
product = Product.find_or_initialize_by(id: pid)
product.counter += 1
product.save!
rescue ActiveRecord::RecordNotUnique, ActiveRecord::StaleObjectError
retry
end
Provided Product
has a lock_version
column this will be correct. However, depending on the how bad the concurrency collisions are it could be fairly slow.
In PostgreSQL we can do this:
with upd as (
update products
set counter = counter + 1,
lock_version = lock_version + 1 -- play nice with ActiveRecord
where id = 42
returning *
)
insert into products(id, counter)
select 42, 1
where not exists(select * from upd);
This does have a small race condition that will cause uniqueness errors on the insert. Provided the above SQL was wrapped in a Product.merge_increment
class method this could be resolved by retrying on ActiveRecord::RecordNotUnique
.
begin
Product.merge_increment(pid)
rescue ActiveRecord::RecordNotUnique
retry
end
In an admittedly contrived benchmark of very heavy concurrent updates, the PostgreSQL version was over 3x faster.
Delete and Archive
This tip is more for database maintenance than for application code. Sometimes you need to delete some records, but you would like to keep them archived somewhere else. Writable CTEs make this easy.
with deleted as (
delete from products where id < 10 returning *
)
insert into deleted_products
select *
from deleted;
Source Code
Source code for samples is on Github.