What is a transaction?
If you are asking this
question, you are probably used to website databases, where most often it does
not matter in which order you run transactions, and if one query fails, it has
no impact on others. If you are updating some sort of content, often you will
not care when the update is performed, as long as the reads are being taken
care of quickly. Similarly, if an update fails, the reads can still carry on
reading the old data in the meantime. However, there are times when it is
vitally important in which order queries run, and that all queries in a group
run, or none at all. The classic example is from the banking environment. An
amount of money is taken from one person’s account, and put into another, for
example as follows, a 500-unit transaction:
UPDATE account1 SET balance=balance-500; UPDATE account1 SET balance=balance+500;
Both queries must run, or
neither must run. You cannot have the money being transferred out of one
person’s account, and then ‘disappearing’ if for some reason the second query
fails. Both these queries form one transaction. A transaction is simply
a number of individual queries that are grouped together.
A small dose of ACID
For a long time, when
MySQL did not support transaction, its critics complained that it was not ACID
compliant. What they meant, is that MySQL did not comply with the four conditions
to which transactions need to adhere in order to ensure data integrity. These
four conditions are:
-
Atomicity: An
atom is meant to be the smallest particle, or something that cannot be divided.
Atomicity applies this principle to database transactions. The queries that
make up the transaction must either all be carried out, or none at all (as with
our banking example, above). -
Consistency:
This refers to the rules of the data. For example, an article body may have to
have an associated article heading. During the transaction, this rule may be
broken, but this state of affairs should never be visible from outside of the
transaction. -
Isolation:
Simply put, data being used for one transaction cannot be used by another
transaction until the first transaction is complete. Take this example below,
where an account balance starts at 900. There is a single deposit of 100, and a
withdrawal of 100, so the balance at the end should remain the same.
Connection 1: SELECT balance FROM account1;
Connection 2: SELECT balance FROM account1;
Connection 1: UPDATE account1 SET balance = 900+100;
Connection 2: UPDATE account1 SET balance = 900-100;The balance is now 800, so we have lost 100. These two
transactions should have been isolated, and the result supplied to Connection 2
only when the transaction from Connection 1 was complete. -
Durability:
Once a transaction has completed, its effects should remain, and not be reversible.
Down to work: InnoDB Transactions
Transactions are wrapped
in BEGIN and COMMIT statements. Let’s create a sample InnoDB
table, and see how transactions work:
mysql> CREATE TABLE t (f INT) TYPE=InnoDB;
Now let’s begin a
transaction, and insert a record:
mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)mysql> INSERT INTO t(f) VALUES (1);
Query OK, 1 row affected (0.01 sec)mysql> SELECT * FROM t;
+——+
| f |
+——+
| 1 |
+——+
1 row in set (0.02 sec)mysql> ROLLBACK;
Query OK, 0 rows affected (0.01 sec)mysql> SELECT * FROM t;
Empty set (0.00 sec)
Without a COMMIT
statement, the insert was not permanent, and was reversed with the ROLLBACK.
Note that the added record was visible during the transaction from the same connection
that added it.
Consistent reads
Let’s try
looking from a different connection. For this exercise, open two connections to
the database.
Connection 1:
mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)mysql> INSERT INTO t (f) VALUES (1);
Query OK, 1 row affected (0.00 sec)mysql> SELECT * FROM t;
+——+
| f |
+——+
| 1 |
+——+
1 row in set (0.00 sec)
Connection 2:
mysql> SELECT * FROM t; Empty set (0.02 sec)
The
important point is that running the same query from different connections (one
within the middle of a transaction, the other from without) produces different
results. Now, commit the transaction from the first connection, and run the
query again from connection 2.
Connection
1:
mysql> COMMIT; Query OK, 0 rows affected (0.00 sec)
Connection
2:
mysql> SELECT * FROM t; +------+ | f | +------+ | 1 | +------+ 1 row in set (0.00 sec)
This behavior is called consistent
reading. Any select returns a result up until the most recently completed
transaction, with the exception of the connection doing the updating, as we saw
above. By default, MySQL InnoDB tables perform consistent reads.
Automatic Commits
MySQL also automatically
commits statements that are not part of a transaction. The results of any
UPDATE or INSERT statement not preceded with a BEGIN will immediately be
visible to all connections. You can change this behavior, as follows:
mysql> SET AUTOCOMMIT=0; Query OK, 0 rows affected (0.00 sec)
Now, note what happens,
even if we do not specifically start a transaction with BEGIN.
Connection 1:
mysql> INSERT INTO t (f) VALUES (2); Query OK, 1 row affected (0.00 sec) mysql> SELECT * FROM t; +------+ | f | +------+ | 1 | | 2 | +------+ 2 rows in set (0.00 sec)
Connection 2:
mysql> SELECT * FROM t; +------+ | f | +------+ | 1 | +------+ 1 row in set (0.00 sec)
Commit the transaction from
the first connection, then reset AUTOCOMMIT to 1, and repeat the
exercise:
Connection 1:
mysql> COMMIT; Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO t (f) VALUES (3); Query OK, 1 row affected (0.00 sec)
Connection 2:
mysql> SELECT * FROM t; +------+ | f | +------+ | 1 | | 2 | | 3 | +------+ 3 rows in set (0.00 sec)
This time the transaction
is committed immediately, and is visible from another connection, even without
a specific COMMIT statement.