Structured Query Language and the science of databases is no different from other highly-evolved
schools of thought, in that there are many words which are specific to the discipline. In the case of SQL
some of these words are similar but actually refer to entirely different concepts. For example we use
several words and phrases that contain the term "data." There are some shades of meaning to these
terms that involve what layer of information or processes we are discussing.
Data: A set of information with some aspect in common
For example data for employees might include social security numbers: "123-45-6789 Abe Adams, 234-
56-7890 Beth Barrett, 345-67-8901 Chris Cao"
Metadata: Information about the structure and organization of data in a database
In database terms (which we will cover in Chapter 3), metadata typically contains descriptions of the
tables and their constraints, the fields and their sizes and rules, and the relationships between the tables.
Some people have phrased metadata as "data about the data". An example of metadata for an employee
database would be something like:
field count = 3
type=integer size= 9 digits (exactly)
type= string size <=15 characters
type=string size <=30 characters
This metadata describes that we have a set of data about employees that is organized into a table.
Within that table we hold three kinds of information about each employee, their ID number, and their
first and last names. The metadata goes on to describe how each type of information is maintained. By
the end of Chapter 7, you will have a lot of experience with these terms, but for now note that the
metadata does not include the data, only a description of how the data is organized.
Database: A database is the data and the metadata
In other words a database is a set of related information as well as a description of how that information
is organized. A database for employees would be the metadata we just looked at, plus the information
described above under Data.
Note that we frequently display data as being in a table. The table is only a logical
structure, not a representation of exactly how the data sits on the disk. Different
DBMS will physically store the data in different configurations, usually associated
with the order that the records were added. As you will see later, the storage
methodology is of no concern to us.
Datastore: A source of data
(for our purposes, a source of data that has a way of responding to SQL statements)
Since about 1985, the dominant form of database has been of a type called relational. We will study this
term later in the book, but for now it is enough to understand that relational databases follow the rules
for data organization established by Dr. E.F. "Ted" Codd. However, in the past few years more and
more data has accumulated in non-relational forms, primarily hierarchical forms. Think, for example, of
the amount of information in word-processing documents stored on your company's desktop hard
drives. Clearly your company has data in those documents, but it is organized into drives, folders,
subfolders and files; a hierarchical system with no rules about the data contained at each level. In an
effort to emphasize data techniques that work with any source of data (relational or not), the term
Datastore can be used instead of Database. The term datastore also includes older databases (generally
called "flat file") that predate the relational rules of Dr. Codd. In summary, whereas the term database
today generally means a modern relational database, the term datastore is much broader, including
information contained in non-relational databases. Software companies have created many tools to
allow SQL to talk with all types of datastores as well as proper relational databases.
Data Server: A Data Server manages data and is typically one or more of many servers in an environment
Modern business computing relies on one or more central servers to hold data and handle centralized
processing. In the past a single machine might perform data storage, security, print services and all
other functions of the office. But as the number of computing jobs has increased (e-mail, Web Site
Support, etc.), IT departments have split the job across multiple servers, each optimized to perform a
very specific job. There may also be servers handling the business rules, web or network services and
other functions. Any of the other servers can pass a SQL statement to the Data Server, which will
execute the statement and then pass the results back to the requesting server. In simpler cases Data
Servers pass information directly to clients such as desktop PCs.
The other point to note here is that a Data Server differs from a File Server. The latter holds files
organized into folders and subfolders, with the intention of keeping employee's documents on one
central server. A Data Server generally abandons the folders in favor of a disk organization scheme
more efficient for relational data. In a small shop a single machine may function as both a file and data
server by using different organization schemes on different disk areas.
Note that the terms "server" and "client" are relative in multiple server
environments. When a web server sends a SQL statement to a data server the web
server is considered to be the "client." However the web server is then considered the
"server" in relation to the browser. In other words, we refer to the client as the
machine or software that sent the SQL statement to the DBMS and we refer to the
server as the machine (or group of machines) that receives the statement and
Database Management System (DBMS): Software which handles most aspects of data
management including physical storage, reading and writing data, security, replication, error
correction and other functions
Common DBMS include IBM DB2, Microsoft SQL Server, Oracle, Sybase and Informix. Although it is
not a true DBMS and isn't designed for large numbers of users, Microsoft Access can also accept SQL
statements. Each DBMS has a way of receiving a SQL statement and forwarding the statement to its
Database Engine. The DBMS will then have a way to take a result from the engine and send it back out
to the requester. Larger and more complex DBMS will be closely integrated with a specific Operating
System and some actually contain an OS optimized for data management.