Describing Oracle Syntax
January 12, 2005
What do FORTRAN and Oracle have in common? The answer to this question is at the end of the article, so before we get there, allow me to use the study of mathematics as a model.
For math majors at many colleges and universities, a required course or elective covers the subject of math history. Manipulating formulas and applying concepts to solve practical and theoretical problems is at the heart of being a math major. However, a course on math history helps provide a student with some appreciation about his chosen field of study. Knowing some history behind the three L's (Laplace, Legendre, and Lagrange) or the roots of probability theory (and the interesting history behind the entire Bernoulli family) gives you an appreciation of what obstacles some (now, but not necessarily back then) famous mathematicians faced in their day. Things you take for granted today were not so obvious hundreds of years ago.
In virtually all of Oracle's documentation, you see two types of syntax: graphical and something called Backus-Naur Form syntax. The graphical syntax examples are quite obvious when they are in their, well, graphical format. Let's take a look at the CREATE TABLE syntax (using the SQL Reference Guide). Part of the graphical representation of the CREATE TABLE syntax is shown below.
Presenting data definition or data manipulation language (DML and DDL) syntax in this form offers an intuitively easy format for humans to read and follow. If you take note of what is underneath each path, you will see a link to "Text description of create table" and "Text description of relational table." Oracle documentation is replete with these syntax paths and their links to their text descriptions.
At the beginning of most documents, Oracle provides a section on "Conventions in Code Examples" and Oracle is not alone in this regard. Wherever there is code syntax, virtually all major vendors provide a section on how to "read" their code. Without knowing exactly why, you probably already know that optional items or keywords are enclosed in brackets and that they are separated by the "|" symbol.
Therefore, looking at the text description of the CREATE TABLE command confirms Oracle's adherence to following its own convention.
The conventions cover what bold, italics, uppercase, lowercase and combinations thereof mean or depict, and meanings are defined for symbols such as brackets, braces, vertical bars and ellipses. You have seen this syntax a million times, but do you know where it came from?
The text description of command syntax is known as Backus-Naur Form syntax, and is commonly referred to as BNF notation. As you may deduce from the names, Oracle did not invent this syntax, but it certainly follows it (or a modified form of it).
In fact, because Oracle's version of SQL is by and large ANSI compliant, you would be correct in assuming that Oracle's adherence to BNF notation closely follows what would be found in the ANSI version of a command such as the CREATE TABLE statement. This leads us to a "standards" type of Web site for SQL.
In case you didn't know, ANSI (American National Standards Institute) goes well beyond the IT field, and a visit to their Web store provides a glimpse into all that ANSI covers. The home page is at http://webstore.ansi.org/ansidocstore/default.asp. Enter "database language" in the search field and you will see the following page.
Clicking on the first item in the table shows the following description of what was the X3.135-1992 standard.
For $18.00, you can purchase the current SQL language standard and have access to all the gory details of the SQL language.
The standard itself uses BNF notation and provides a section on exactly how it uses it.
What does the CREATE TABLE syntax look like?
You can see the basis for Oracle's syntax and how it follows BNF notation.
Who or what is Backus-Naur?
John Backus and Peter Naur answer the "who" part of this question. John Backus developed a syntax or programming grammar for use with the ALGOL programming language in 1958 (referred to as ALGOL 58). By the 1960 release (ALGOL 60), Peter Naur had revised and expanded the rules and the syntax became known as Backus-Naur Form (BNF), and that answers the "what" part.
John Backus won the Turing Award in 1977 for his work in the field of computer science. Winning the Turing Award places one in rarefied company, and reading the list of previous winners is like reading the "Who's Who" of the computer science field. Among the many notable names is none other than E.F. Codd, whom, interestingly enough, has another "NF" acronym associated with his name, namely, normal form (1NF, 2NF, etc.).
Peter Naur, on the other hand, prefers that his name not be associated with the BNF syntax (call it Backus Normal Form instead). The hyphenated name combination stuck and that's why we have BNF.
Back to my original question
At the beginning of the article, I asked what do FORTRAN (an old programming language, also referred to as Fortran) and Oracle have in common? John Backus is credited with having developed Fortran, which was the first high-level programming language. And, as discussed, Oracle syntax, by way of following the ANSI standard for the SQL database language, complies with the conventions of the Backus-Naur Form syntax. Now you know the rest of the story.
In many fields, being able to illustrate an interesting historical perspective can serve to heighten a student's interest. Many IT subjects are extremely dry and uninteresting in of themselves, so when someone asks you what "Oracle" is (yes, there is a section titled, "What is Oracle?"), you can go beyond "An Oracle database is a collection of data treated as a unit." Imagine what the IT industry was like 50 years ago. For one, it was not even an industry, at least as we know it today. Given how much the IT field has changed in just the past ten years, how likely do you think it is that some procedural language or syntax invented today will still be in use 50 years from now? For example, the C programming language was replaced by C++. That's part of the joke behind "C++" as the name of the next version ("++" meaning is to increment by one). C# is just around the corner.
The fact that we still use the Backus-Naur Form syntax today, I think, is pretty amazing. It seems so simple and obvious that we should follow a standardized syntax, but what was there to work with 50 years ago? Just like in math, use of the definite integral today to calculate the area under a curve seems like a no-brainer, but it took literally hundreds of years to develop the theory that enables us to add an infinite number of Riemann sums. Think about what you know today, Oracle database or anything else in the IT field, and how much of it will be around in ten years, let alone 50 years?