The Epochs of Spatial Analysis
The most prevalent relational databases in today's marketplace have some sort of spatial capabilities, but it is not the focus of this article to differentiate among their implementations (though the differences are vast). However, to get a better feeling for the richness of the DB2 UDB implementation, it is important to understand the generations of spatial analysis.
The first generation of spatial analysis uses a geospatial information system (GIS) application that interacts with spatially-aware middleware to perform the analysis that, in turn, interacts with the local file systems for storage requirements. The spatial application's API calls are translated by a middle tier and tools to the proprietary file format on the operating system. Since the storage mechanism is a file system, it is not surprising that none of today's database vendors subscribe to this method of spatial analysis.
While this first generation of spatial analysis provides for rich spatial analysis features, the data is not integrated with the rest of the corporate data. There are many consequences to this type of implementation. First, it impedes corporate-wide decision making since it, by nature, fragments the single version of the truth. The data is transposed and stored away from the watchful eye of IT, which could lead to unique departmental interpretations of the data. Another problem is that issues such as access, performance, backup, security, privileges, concurrency, and more may arise - these are issues that databases are really good at addressing.
To help address the previously mentioned issues, some database vendors chose to support spatial analysis by substituting their relational database for the file system, as shown below:
This second generation addresses the data management issues of the first generation but does not solve another major issue: a 'translator' is required for the spatial application to talk to the database. In other words, the database does not know about spatial components. It does not understand about distances between two points, what a polygon is, what an intersect method is, and more. Loading up the translator with the hefty task of converting spatial dialect to a relational one is not going to help. For example, how do you store a polygon in a database that is not spatially aware? It has to be stored as a large object (LOB), where all input coordinates are detailed. This leads to performance issues, the requirement of specialized tools to work with the data (it isn't just SQL), and more.
All of these issues are solved in the third generation of spatial analysis, which is where DB2 UDB fits, and which is shown below:
The preceding figure shows that capabilities are added to DB2 UDB when the free DB2 UDB Spatial Extender is installed on the server. In this scenario, all tools and applications speak spatially to the database through the SQL API..
Using the DB2 UDB Spatial Extender and DB2 UDB
It is easy to work with spatial data in your DB2 UDB database, but it does require a couple of simple steps to get you going.
After installing the DB2 UDB Spatial Extender, you enable your database for spatial analysis (shown in the figure below). Enabling a DB2 UDB database for spatial analysis adds specific spatial objects to the schema. These objects include spatial abstract data types (ADTs), user-defined types (UDTs), user-defined functions (UDFs), methods, and a special 'grid' index extension for fast spatial searching.
Once the database is enabled for spatial analysis, you simply enable a table that you want to work spatially with. This action can also be performed from the Control Center or the Command Line Processor.
When you enable a table for spatial analysis, an extra column is added to the table to hold the data types and indicate how the spatial values are computed (for example, using a projection with a built-in geocoder). Finally, you load the data and query it.
Now, there is more to it than this. For example, leveraging your existing address information for spatial analysis involves geocoding LAT/LONG coordinates for those addresses (which would get stored as points in the added spatial column). The DB2 UDB Spatial Extender comes with a built-in geocoder, but you can also plug-in your own. However, it is outside the scope of this article to provide detailed setup instructions.
DB2 UDB Spatial Extender Queries
When the database and table are enabled for spatial analysis, and you have loaded data into your tables, you are ready to perform some queries. Since DB2 UDB is a spatially aware database, you can interact with it using your spatial tools or plain old SQL - which is the whole point. DB2 UDB understands the spatial "dialect" and the operations that you want to perform with it.
For example, a telematics application on a PDA may provide its user with a list of nearby Chinese restaurants that serve Peking duck based on the dynamic request of this user. In this case, after the hardware creates a location box using Global Positioning System (GPS) coordinates, it could generate SQL statements similar to these:
SELECT name, description, address FROM restaurants
WHERE Overlaps(location, box(getGPS(),2000,2000))
AND category = 'chinese' AND docContains(menu,
Note the interaction of spatial (box) and relational (category) data.
A geological application may ask DB2 UDB for a list of volcanic eruptions in a specific region. In this case, the SQL could look like this:
SELECT name, year, mag, location from volcano
WHERE ST_Within(location,'0 polygon((-125 43, -125 46,
-120 46, -120 43, -125 43))')
ORDER BY name, year;
It is not so important that you fully understand the SQL in the previous two examples. However, note the highlighted keywords after the WHERE clause for each example I provided. These are spatial functions that are provided by the DB2 UDB Spatial Extender - this is what I mean when I say that DB2 UDB is spatially aware. In this example, DB2 UDB knows that a data type of 'polygon' serves to envelope areas of interest, and the method ST_Within looks for objects within this area.
The DB2 UDB spatial capability becomes even more powerful when you mix it with the built-in federation capabilities found in any DB2 UDB server. For example, the following figure shows a distributed join performed by DB2 UDB:
This type of distributed join brings together data stores that may have been bound to separate line-of-business applications in the past, and presents the data graphically for corporate-wide decision making.
The Geodetic Spatial Extender
DB2 UDB Version 8.2 also introduced a new spatial add-on product (this one is chargeable) called the DB2 UDB Geodetic Extender. While the free DB2 UDB Spatial Extender is well suited for many types of geospatial applications, the curvature of the earth can present problems for certain types of applications.
With the DB2 UDB Spatial Extender, you 'flatten' the earth using different projections (for example, the Mercator projection) and then account for the margin of error they introduce. This may work fine when trying to locate a caller's house for a 911 response call (being off by 10 feet isn't likely to hurt). However, national defense applications such as a missile defense system need more precise measurements and must remove distortion caused by 'flattening' the earth's surface. This is where the DB2 UDB Geodetic Extender comes in. It is outside the scope of this article to discuss this extender in detail, but cartographers and geography students would certainly be interested in it. For example, I was surprised to learn that Greenland is 1/14 the size of Africa - which did not seem to be the case when I think back to my elementary school atlas! Achieving greater spatial precision is really the point of the Geodetic Extender.
Wrapping it All Up
The move to make the DB2 UDB Spatial Extender a free add-on product for all IBM, DB2 Universal Database for Linux, UNIX, and Windows servers opens new opportunities for companies of all sizes. From large to small, we all collect data, and it undoubtedly has some sort of spatial component to it. The question is, are you using it? Or are your competitors?
About the Author
Paul C. Zikopoulos, BA, MBA, is an award-winning writer and speaker with the IBM Database Competitive Technology team. He has more than nine years of experience with DB2 UDB and has written numerous magazine articles and books about it. Paul has co-authored the books: DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence and Database Administration). You can reach him at: firstname.lastname@example.org.