DB2 9 and Microsoft Excel 2007 - Part 1: Getting the Data...
July 3, 2007
Without a doubt, Microsoft Excel, is the worlds most ubiquitous reporting and data analysis tool. Quite often I see clients export massive amounts of data from their enterprise systems into another data server (guess which one) for the sole purpose of supporting Excel reporting.
The main problems associated with this approach are as follows:
1. There are too many copies of the data. Will they all be the same? Either way, it leads to redundant data costs.
2. There is too much latency with the data. It takes too long to get the data to the people who need it. In addition, everyone sees different data with inconsistent points in time with respect to when the data was captured.
3. The solution is too complex. This much data movement requires its own life cycle maintenance, batch jobs, and more.
4. The data is not secure. Data is left on minimally secured laptops. I dont need to go much further on this point.
5. The solution is too expensive. Numerous components lead to increased costs. These costs are often hidden in distributed organizations, but they can be massive -- core operating system licensing costs, additional data server license costs, network costs, patching costs, security costs for a whole new layer of data access, and more.
What if you could just leave all that data in your DB2 (including DB2 for z/OS data server) and just access it? After all, if your enterprise data warehouse (EDW) is built on DB2, it can pretty much scale limitlessly and is designed to handle all kinds of workloads. If you could do this, you could achieve the following benefits:
1. A dramatic reduction in the number of copies of the data with a Single View of the Business.
2. Reduced latency time with fewer input and access paths. Near real-time load and access tools providing up-to-date business intelligence.
3. A simplified architecture making the environment easier to manage for faster solution development and quicker return on investment (ROI)!
4. A more secure solution since data is locked in one place with enterprise-wide policies.
5. A less expensive solution. Fewer components lead to reduced costs in both infrastructure and human resources all this results in lower total cost of ownership (TCO).
The following figure summarizes these approaches:
In fact, the only good reason that I can see for the way things are typically done is easy access. Therefore, in this article, Im going to show you just how easy it is to access data from DB2 9 in your Excel worksheets.
Getting your DB2 9 data into Excel 2007
To get your data into Excel from a DB2 9 data server, you need to start with a database connection. You can work with database connections in Excel 2007 using the Data ribbon, as shown below:
Ribbons are a new usability feature in Office 2007, though Im not sure how usable they are; they take some getting used to. With that said, once you fumble your way around the interface, and learn where things are now located, the product becomes easier and nicer to use. If you read this article, you can leverage the learning curve Ive already traversed and save yourself some valuable time (and frustration).
From the Data ribbon, you can perform all the data-specific actions you need (outside of regular Excel operations such as formatting, and so on). The Data ribbon has a number of subcategories. For example, in the far left, you can see the Get External Data frame, which is obviously used to import external data sources into your Excel 2007 worksheet, and its exactly what you use to get your DB2 9 data into your Excel worksheet:
To import the contents of the STAFF table into your Excel 2007 worksheet, perform the following steps:
1. In the Data ribbon, select From Other Sources>From Data Connection Wizard from within the Get External Data frame:
In the previous figure, you can see that you can import data from a number of sources but for this article, since we are going after data that resides in a DB2 9 data server that isnt known to Excel 2007 yet, the From Data Connection Wizard option is the one we want to select.
2. The Data Connection window opens. Select Other/Advanced, and click Next.
3. The Data Link Properties window opens. Select the IBM OLE DB Provider for DB2 OLE DB provider, and click Next.
When you install a DB2 9 client on your workstation, the corresponding OLE DB provider is automatically installed as well, and this is what Excel 2007 uses to get data from a DB2 database.
The figure in Step 2 shows you the option to select an ODBC DSN. When you catalog a DB2 database, you can choose to configure the database as a user, system, or file DSN using the native Windows facility - the Data Source (ODBC) tool. Since my SAMPLE database is set up as a system DSN, I could have used that connection to retrieve the data as well:
4. Select the database that you want to connect to from the Specify the DB2 database box, ensure that you have the correct credentials for this database connection in the Log on information box, and click Test Connection:
If the database that you want to connect to isnt cataloged on the local machine where you are running Excel 2007, you can search for existing DB2 data servers and their databases (as well as DB2 Connect gateways if you are attempting to add a connection to a DB2 for System i or DB2 for System z data server) using the DB2 Discovery feature thats integrated with Excel. You can leverage this facility by selecting the Direct server connection radio button and select a database from the Select the database on the server drop-down list (you might need to click Refresh to update the server list or the database list is returns):
Its outside the scope of this article, but you should be aware that you can use the Advanced tab to set advanced properties of this connection such as the connection timeout allotment, or perhaps a hard-coded password:
5. Click OK if the connection is successful.
The DB2 OLE DB provider is well integrated into Excel 2007. For example, if Excel 2007 wasnt able to get a connection to the SAMPLE database because the database instance was down, it would display the following message:
If the database was currently in use, (perhaps it was in a quiesce state or a backup was being performed), Excel 2007 would display a message like this: