Heading image for post: MySQL Has Transactions, Sorta

MySQL Has Transactions, Sorta

Profile picture of Josh Branchaud

I've used PostgreSQL almost exclusively for several years. I've chosen it again and again for my own projects and guided many of my clients to it for theirs. In the process I have definitely taken my share of jabs at MySQL's deficiencies. A recent brownfield project required me to use MySQL and so I took some time to revisit some of my assumptions about the tool. I hope to write about those soon. This post, however, focuses on a particular issue, that of transactions.

It's true. If you are using MySQL with the InnoDB engine, you'll have support for transactions when working with data. There is a catch though, an implicit one. One that can certainly suprise you if you aren't aware of it.

MySQL does not support transactions for DDL changes. To understand the implications of this, we can turn to the MySQL documentation on the Transaction Life Cycle.

The normal transaction is committed in a similar way (by going over all engines in thd->transaction.all list) but at different times:

  • When the user issues an SQL COMMIT statement

  • Implicitly, when the server begins handling a DDL statement or SET AUTOCOMMIT={0|1} statement

It is that second condition that should catch our eye.

Transactions with Data and DDL

If we are in the midst of a transaction -- perhaps we have deleted some data from one table and modified rows on another -- and we then execute any DDL statement, such as adding a table or changing a default value, the transaction will be implicitly and silently commited. Any data changes that we have made, regardless of whether or not we wanted them, will have been committed.

Let's look at an example of that.

mysql> create table persons (id serial primary key, first_name varchar(20) not null, last_name varchar(40) not null);
Query OK, 0 rows affected (0.02 sec)

mysql> insert into persons (first_name, last_name) values ('Liz', 'Lemon');
Query OK, 1 row affected (0.01 sec)

mysql> select * from persons;
+----+------------+-----------+
| id | first_name | last_name |
+----+------------+-----------+
|  1 | Liz        | Lemon     |
+----+------------+-----------+
1 row in set (0.00 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into persons (first_name, last_name) values ('Jack', 'Donaghy');
Query OK, 1 row affected (0.00 sec)

mysql> alter table persons modify column last_name varchar(40);
Query OK, 0 rows affected (0.12 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> rollback;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from persons;
+----+------------+-----------+
| id | first_name | last_name |
+----+------------+-----------+
|  1 | Liz        | Lemon     |
|  2 | Jack       | Donaghy   |
+----+------------+-----------+
2 rows in set (0.00 sec)

Though it is uncommon to be simultaneously making data and DDL changes in the same transaction, this example clearly illustrates the potential for a transaction to be unexpectedly committed. As you may have noticed, MySQL gave us no indication that the transaction was implicitly committed. In fact, when we issued the rollback; statement, it again gave us no signal that data was committed.

In this simple case, we can select all rows and clearly see that data changes were committed. However, in a typical database, there is far too much data to glance at. You'd have to trust that the rollback worked. MySQL is not gaining much of my trust here.

Transactional Migrations

Another place where this can trip us up - data migrations within a framework like Ruby on Rails. When executing migrations for an app, Rails will wrap the contents of an entire migration in a transaction. The reasoning is that you want all or nothing. If any part of a migration is invalid, the whole thing should be rolled back so that you can fix it.

Now, imagine we are putting together a migration to replace an author column with a reference to a new author table. Let's say we end up with the following lines of DDL in the up method of a migration.

--  assuming we already have the following posts table
--  > describe posts;
--  +--------+---------------------+------+-----+---------+----------------+
--  | Field  | Type                | Null | Key | Default | Extra          |
--  +--------+---------------------+------+-----+---------+----------------+
--  | id     | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
--  | title  | varchar(50)         | YES  |     | NULL    |                |
--  | body   | text                | YES  |     | NULL    |                |
--  | author | varchar(40)         | YES  |     | NULL    |                |
--  +--------+---------------------+------+-----+---------+----------------+
create table authors (id serial primary key, first_name varchar(20), last_name varchar(20));

alter table posts
  add column author_id bigint(20) unsigned not null,
  add constraint fk_author_id foreign key (author_id) references authors(id);

alter table posts
  remove column author;

The first couple statements are fine, but the third has a typo (remove should be drop). When you run that migration it executes the first couple statements and then fails. Rails does a rollback on that transaction. Or at least it attempts to. Because the migration is a series of DDL statements, each is committed implicitly irrespective of the surrounding transaction.

This is a cause of confusion and unnecessary friction. The migration, from Rails' perspective, is considered down. However, it is actually half up and half down. Fixing that third statement and running the migration again will result in a different error -- the authors table already exists and cannot be created again. You're going to have to rollback the first bits of the migration manually before proceeding.

Avoiding These Gotchas

First, I'd suggest not mixing data changes and DDL changes. Regardless of the database engine you are using, this is probably not a good practice. If you can remain disciplined on this front, you'll at least avoid unintentionally committing data changes.

When working with migrations, there aren't a lot of great options. You can be meticulous about sticking to single-statement migrations. This may make migrations confusing and hide intention. Alternatively, you can carefully piece together each part testing out statements from the MySQL shell as needed.

Ultimately, my recommendation is, if you have the choice, to move to a database with a more robust transaction system, such as PostgreSQL.


Cover image by chuttersnap on Unsplash.com

More posts about mysql