Hashrocket.com / blog

Deferring constraints in PostgreSQL

posted on and written by in

Database constraints are essential to ensuring data integrity, and you should use them. Allowing them to be deferrable during transactions makes them even more convenient. A common scenario in which the database can help us is in a sortable list implementation. This post outlines the how and why of deferring database constraints, using a sortable list domain as an example.

Modeling lists of sortable items

Imagine that you have an application with multiple lists. Each list has items that can be reordered with a drag-and-drop interaction. This can be modelled in a fairly straightforward manner.

Each list has_many list items, which are ordered by the position column. Each list’s items have a position beginning with 1 and incrementing with each subsequent item.

# app/models/list.rb
class List < ActiveRecord::Base
  has_many :items, -> { order :position }, class_name: "ListItem"
  validates_presence_of :name
end
# app/models/list_item.rb
class ListItem < ActiveRecord::Base
  belongs_to :list
  validates_presence_of :name, :list, :position

  before_validation :ensure_position

  def self.update_positions(ids)
    ids.each_with_index do |id, index|
      where(id: id).update_all(position: index + 1)
    end
  end

  private
  def ensure_position
    self.position ||= self.class.where(list_id: list_id).maximum(:position).to_i + 1
  end
end

A couple things are worth noting about the ListItem class. Firstly, we have update_positions, a class method that accepts an array of ids and updates each. This method will be called in a sort controller action as such:

class ItemsController < ApplicationController
  expose(:list)

  def sort
    # list item ids is an ordered array of ids
    list.items.update_positions(params[:list_item_ids])
    head :ok
  end
end

Secondly, new items don’t necessarily know what position they should have, so we put list items that don’t have position at the end of their respective list, just before validating that the position is present.

Here are the migrations that we used to create the models' database tables:

class CreateLists < ActiveRecord::Migration
  def change
    create_table :lists do |t|
      t.string :name
      t.timestamps
    end
  end
end
class CreateListItems < ActiveRecord::Migration
  def change
    create_table :list_items do |t|
      t.belongs_to :list
      t.integer :position
      t.string :name

      t.timestamps
    end
  end
end

Notice anything missing? If you said database constraints, you’re correct! Our application is enforcing presence for most attributes, but our corresponding columns are missing NOT NULL constraints. Also, the list_id column on list_items is missing a foreign key constraint.

But I’d like to focus on another missing constraint. Our domain model has an implicit requirement that we haven’t enforced with either validations or database constraints: each list item’s position should be unique per list. No two list items in a list should have the same position. That would make the ordering non-deterministic.

We could add a uniqueness validation for position, scoped to the list_id. However, as thoughtbot recently warned, application-level uniqueness validations are insufficient at best, and fail completely in concurrent environments.

The position column needs a database-level constraint.

Adding constraints

Adding the uniqueness constraint to position is fairly straightforward in PostgreSQL. We’ll just create a new migration with the following:

class AddUniquenessValidationOnListItems < ActiveRecord::Migration
  def up
    execute <<-SQL
      alter table list_items
        add constraint list_item_position unique (list_id, position);
    SQL
  end

  def down
    execute <<-SQL
      alter table list_items
        drop constraint if exists list_item_position;
    SQL
  end
end

Let’s wrap our UPDATE statements in a transaction so that any failed UPDATE of the position column will result in none of them being updated:

class ListItem < ActiveRecord::Base
  # ...
  def self.update_positions(ids)
    transaction do
      ids.each_with_index do |id, index|
        where(id: id).update_all(position: index + 1)
      end
    end
  end
end

This ensures at the database level that positions of items are unique per list; no two items in the same list can occupy the “1” position. With regard to data integrity, this is a huge improvement over our initial implementation. But it has one drawback: it doesn’t work.

To illustrate why, imagine a list with the following items:

id | position | name
13 | 1        | Eggs
18 | 2        | Milk
35 | 3        | Bread

To move Bread to the top of the list, we would pass an array of ids, [35,13,18] to the update_positions method. This method does a series of UPDATE statements to the database. For the first id, the one for Bread, we end up sending an update statement that would look like the following:

UPDATE list_items SET position=1 WHERE id=35;

After this statement is executed in the database, but before we move on to the next id in the list, Postgres will fail its constraint checks. At the moment that the UPDATE happens, the data would be:

id | position | name
13 | 1        | Eggs
18 | 2        | Milk
35 | 1        | Bread

With both Eggs and Bread occupying the same position, the UPDATE fails. Of course, we know that we want to change the position of Eggs as well, so that its position would be “2”, and that collision would not happen. But at the time that the constraint-check happens, the database doesn’t know this.

Even within a transaction, database uniqueness constraints are enforced immediately per row. It seems our dreams of data integrity are smashed. If only there were a way to enforce uniqueness constraints at the end of the transaction, rather than the end of each statement…

Deferrable Constraints

As mentioned before, constraints are immediately enforced. This behavior can be changed within a transaction by changing a constraints deferrable characteristics. In PostgreSQL, constraints are assumed to be NOT DEFERRABLE by default.

However, constraints can also behave as deferrable in one of two ways: DEFERRABLE INITIALLY IMMEDIATE or DEFERRABLE INITIALLY DEFERRED. The first part, DEFERRABLE, is what allows the database constraint behavior to change within transactions. The second part describes what the default behavior will be within a transaction.

With a constraint that is deferrable, but initially immediate, the constraint will by default behave just like a non-deferrable constraint, checking every statement immediately. A constraint that is initially deferred will, by default, defer its checks until the transaction is committed. Both of these can change their behavior per-transaction with a call to SET CONSTRAINTS (documentation).

With that information, let’s change the definition of the constraint we defined before:

class AddUniquenessValidationOnListItems < ActiveRecord::Migration
  def up
    execute <<-SQL
      alter table list_items
        add constraint list_item_position unique (list_id, position)
        DEFERRABLE INITIALLY IMMEDIATE;
    SQL
  end

  def down
    execute <<-SQL
      alter table list_items
        drop constraint if exists list_item_position;
    SQL
  end
end

The only thing we’ve changed from before is the DEFERRABLE INITIALLY IMMEDIATE bit. I think it is a good idea to use the INITIALLY IMMEDIATE option. This will ensure that other parts of our app, and other consumers of the database will not be surprised by the behavior of the constraint; it will continue to act a like a normal, non-deferrable constraint, until we explicitly opt in to the deferral.

We now need to change our transaction block. In our case, the first database statement within the transaction must be the SET CONSTRAINTS statement:

class ListItem < ActiveRecord::Base
  # ...
  def self.update_positions(ids)
    transaction do
      connection.execute "SET CONSTRAINTS list_item_position DEFERRED"
      ids.each_with_index do |id, index|
        where(id: id).update_all(position: index + 1)
      end
    end
  end
end

Having now opted in to deferring our uniqueness constraint, reordering the items now works as expected. The constraint still ensures that we don’t have two items that occupy the same position, but waits until the end of the transaction to do that check. We can have our data integrity cake and eat it too.

Bonus: deferrable introspection

Having to name the constraint in two places is a bit of a bummer, and introduces a coupling that could bite us if the constraint name ever changed. Knowing that, we leverage PostgreSQL’s introspective abilities to query the constraint names instead.

For example, we can add the following module to our codebase:

# lib/deferrable.rb
module Deferrable
  def deferrable_uniqueness_constraints_on(column_name)
    usage = Arel::Table.new 'information_schema.constraint_column_usage'
    constraint = Arel::Table.new 'pg_constraint'
    arel = usage.project(usage[:constraint_name])
    .join(constraint).on(usage[:constraint_name].eq(constraint[:conname]))
    .where(
      (constraint[:contype].eq('u'))
      .and(constraint[:condeferrable])
      .and(usage[:table_name].eq(table_name))
      .and(usage[:column_name].eq(column_name))
    )
    connection.select_values arel
  end

  def transaction_with_deferred_constraints_on(column_name)
    transaction do
      constraints = deferrable_uniqueness_constraints_on(column_name).join ","
      connection.execute("SET CONSTRAINTS %s DEFERRED" % constraints)
      yield
    end
  end
end

And now change our model to use it:

class ListItem < ActiveRecord::Base
  extend Deferrable
  # ...
  def self.update_positions(ids)
    transaction_with_deferred_constraints_on(:position) do
      ids.each_with_index do |id, index|
        where(id: id).update_all(position: index + 1)
      end
    end
  end
end

And, boom! Less coupling.

NOTE That’s a lot of Arel! Use at your own risk. ;-)

Example application

While writing this post, I created a sample Rails app to iterate quickly. I used TDD to write the initial approach, and reused the specs while I “refactored” the implementation to the subsequent approaches. Each commit on the master branch more or less follows the sections above.

Posted in Development and tagged with Ruby, PostgreSQL