Data-basics
We need databases because we need a way to structure a big
mess of data. The very simplest database is concerned with
only one category, or “field”, of data, such as a list of
client names. But a list of names on its own is not very
useful, unless you simply enjoy looking at a bunch of names.
What we need is more information to relate to these
names. For instance, the total bill on our last invoice. Now
we have a database with two fields–two fields which are
related to one another:
Loganberry Inc. | 50.00 |
Smart Corp. | 450.00 |
Arlene Von Smitten | 500.00 |
Sanford & Son | 75.00 |
B. Gates | 25000.00 |
Fanny Farmer | 55.50 |
If you read any article about databases you’ll undoubtedly
see the term
“relational database” — which is exactly what we are
building above. Relational databases are extremely common and
the most likely database you will want to query from a web
page. All of the major database software packages, including
Microsoft Access,
FileMaker Pro,
Oracle,
Informix,
and so on are geared towards creating and managing relational
databases.
In real life, our database above would likely contain even
more fields — more information we can relate, the richer
the database. For instance, it would make sense to store an
e-mail contact for the client, the invoice number, and
whether or not the client has paid up. In fact, we can
visually represent this database as a table … literally:
ClientName | ClientEmail | Invoice | Total | Paid |
Loganberry Inc. | logan@pluc.net | 099050101 | 50.00 | Yes |
Smart Corp. | smarties@pants.net | 099050102 | 450.00 | |
Arlene Von Smitten | arly@frau.net | 099060101 | 500.00 | Yes |
Sanford & Son | rfoxx@pickup.com | 099060102 | 75.00 | |
B. Gates | bill@me.com | 099060103 | 25000.00 | |
Fanny Farmer | ffarmer@sausagelinks.net | 099070101 | 55.50 | Yes |
The above is a table, both on the screen and in
database lingo. Each record in the database is called a
row, which makes a whole lot of sense. And, of course,
each row is made up of fields, which are in some way
(mathematical, conceptual, philosophical) related to one
another.
We can quickly see how we might formulate requests of this
database, for example:
- “Please list all client names and their e-mail
addresses”.
- “Please list all client names who were billed at least
$500.”
- “Please list all client names and e-mail addresses who
are unpaid.”
Pretty simple so far. Of course, the more fields in each row,
the more complex your requests could become. It shouldn’t be
tough to imagine applying this example to many types of
databases, such as basic retail inventory, or birthdays, and
so on.
Thus far we’ve created a basic database with one table — in
fact, a single database can contain several or more tables.
Because this is a relational database, the tables
should be related to each other somehow. Technically
speaking, the tables should relate to each other via a common,
unique field. Consider a second table that we might include
in this database:
Invoice | Hours | Rate |
099050101 | 1 | 50 |
099050102 | 9 | 50 |
099060101 | 12.5 | 40 |
099060102 | 1 | 75 |
099060103 | 50 | 500 |
099070101 | 3.7 | 15 |
The first table, billing, contains a unique invoice
code for each row, which relates the client name, address,
total, and paid status to that invoice. Yet, in the
invoices table, we store more detailed information
about each individual invoice … namely, the number of
hours worked and the rate of pay. We can not only relate
data fields within a table, but we can relate data across
two tables because they share a common unique key
(the invoice code). Now we can formulate requests such as:
- “Please list all invoices where I worked at least 10
hours.”
- “Please list all invoices where I charged at least $50
per hour.”
- “Please list the names of all unpaid clients for whom I
worked at least 10 hours.”
- “Please list the names and e-mail addresses of all
clients for whom I charged at least $70 per hour.”
There you have it — the basic ins and outs of your garden
variety relational database. Our example database, which we
could call clients, consists of two tables,
billing and invoices. We’ve seen the types of
requests you could foist upon this beast — but how to foist
them? Ultimately, we’re going to use Perl to request data
from the database, receive the data, and even format it for
output to a Web page. However to do that we need to
understand how general requests are made from a database,
Perl notwithstanding.