This tutorial briefly explains the concepts of
referential integrity, and looks at how MySQL enforces them with its definition
of foreign keys. You should be using a stable version of MySQL 4 to follow all
the examples, although some examples may work with earlier versions running InnoDB
tables.
What is referential integrity?
Simply put, referential integrity means that when a record in a table refers
to a corresponding record in another table, that corresponding record will
exist. Look at the following:
customer
customer_id |
name |
1 |
Nhlanhla |
2 |
Anton |
customer_sales
transaction_id |
amount |
customer_id |
1 |
23 |
1 |
2 |
39 |
3 |
3 |
81 |
2 |
There are 2 customers in the customer table, but 3 customer_id’s in the
customer sales table. Assuming the two tables are linked with the customer_id
field, you can tell that Nhlanhla has an amount of 23, and Anton 81. However,
there is no corresponding name for customer_id 3. Foreign key relationships are
described as parent/child relationships (customer being the parent, and customer_sales
the child), and the record is said to be orphaned when its parent is no longer
in existence.
A database in this sort of condition is referred to as having poor
referential integrity (there are other kinds of integrity problems too). This is
not necessarily a serious problem – one of the primary systems I work uses MyISAM
tables, and has loads of orphans: article blurbs and article bodies not linked
to any articles, but these don’t do much harm besides prickle my aesthetic
sensibility, and we’ve never needed to fix this. However, it is not good
design, and can sometimes lead to problems, so you should avoid a situation
like this where possible.
In the past, the MySQL DBMS could not enforce this, and the responsibility
passed to the code to do so. But this wasn’t good enough for serious systems,
and one of the most frequently requested features in later versions of MySQL
was that of foreign keys, enabling MySQL data to maintain referential
integrity. A foreign key is simply a field in one table that corresponds to a
primary key in another table. In the example above, customer_id would be
the primary key in the customer table, uniquely identifying each record,
and transaction_id would be the same in the customer_sales table.
In the customer_sales table, the customer_id field could be an
example of a foreign key, referring to its namesake in the customer
table. A transaction should not exist without an associated customer. The code
that generated these tables is clearly buggy!
Defining Foreign Keys in MySQL
Strictly speaking, for a field to be a foreign key, it needs to be defined
as such in the database definition. You can ‘define’ a foreign key in any MySQL
table type (including the default MyISAM table type), but they do not actually
do anything – they are only used to enforce referential integrity in InnoDB
tables.
In order to create a foreign key, you need the following:
- Both tables need to be InnoDB tables.
- To use the syntax FOREIGN KEY(fk_fieldname) REFERENCES table_name
(fieldname) - The field being declared a foreign key needs to be
declared as an index in the table definition
Here is how you would define the two tables above with a
foreign key:
CREATE TABLE customer
(
customer_id INT NOT NULL,
name VARCHAR(30),
PRIMARY KEY (customer_id)
) TYPE = INNODB;CREATE TABLE customer_sales
(
transaction_id INT NOT NULL,
amount INT,
customer_id INT NOT NULL,
PRIMARY KEY(transaction_id),
INDEX (customer_id),
FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
) TYPE = INNODB;
If you get the rather unhelpful error message:
ERROR 1005: Can't create table './test/customer_sales.frm' (errno: 150)
then check your foreign key definitions carefully –
something is wrong with the definition. Common causes are a table not being of
type InnoDB, a missing index on the same field (customer_id), or
attempting to set a field to NULL when it cannot be (see the ON DELETE SET NULL
clause below).
Referential integrity can be compromised in three situations: when creating
a new record, deleting a record or updating a record. The FOREIGN KEY (transaction_id)
REFERENCES customer (customer_id) clause ensures that when a new record is
created in the customer_sales table, it must have a corresponding record
in the customer table. After creating the above tables, insert the
following data, which we will use to demonstrate some of the concepts:
mysql> INSERT INTO customer VALUES(1,’Nhlanhla’),(2,’Anton’);
Query OK, 2 rows affected (0.00 sec)
mysql> INSERT INTO customer_sales VALUES(1,23,1),(3,81,2);
Query OK, 2 rows affected (0.00 sec)
Now insert the third record, referring to the non-existent
customer 3:
mysql> INSERT INTO customer_sales VALUES(2,39,3);
ERROR 1216: Cannot add or update a child row: a foreign key constraint fails
You cannot add the record, as customer_id 3 does not
exist. The constraint has ensured your data keeps its integrity! However, what
happens when we delete a record? Let’s add a customer 3, then add the customer_sales
record again, after which we delete the 3rd customer:
mysql> INSERT INTO customer VALUES(3,’Malvin’);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO customer_sales VALUES(2,39,3);
Query OK, 1 row affected (0.01 sec)
mysql> DELETE FROM customer WHERE customer_id=3;
ERROR 1217: Cannot delete or update a parent row: a foreign key constraint fails
So the constraint holds, and we would need to first delete
the record from the customer_sales table. There is a way we could have allowed
the delete to go ahead, which we will look at shortly, but first we will need
to drop and recreate the index.