Database Journal

Posted Apr 28, 2004

Tips for using Tivoli Storage Manager with DB2 - Page 3

By Marin Komadina


Tip 5. TSM Communication problems


A period of successful TSM usage followed by a series of unsuccessful backup operations raises the standard question: "What went wrong if nothing has changed?" Rechecking the environment and the database configuration parameters led to the conclusion that only the network configuration between the database server and the TSM server had changed. The following error message appeared during a regular online database backup to TSM:


db2 => backup db artist online use tsm
SQL2025N  An I/O error "-50" occurred on media "TSM".
Listing 10: Backup error condition
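
When backups run from a script, it can help to pull the numeric code out of the SQL2025N message so that it can be looked up afterwards. The following is a minimal sketch, assuming the message has the same shape as in Listing 10; extract_io_error is a hypothetical helper name, not a DB2 utility.

```sh
#!/bin/sh
# extract_io_error MESSAGE
# Prints the quoted I/O error code from an SQL2025N message, or nothing
# if the text is not an SQL2025N message. Hypothetical helper, not part of DB2.
extract_io_error() {
    printf '%s\n' "$1" |
        sed -n 's|.*SQL2025N.*I/O error "\(-\{0,1\}[0-9][0-9]*\)".*|\1|p'
}

# The message from Listing 10.
msg='SQL2025N  An I/O error "-50" occurred on media "TSM".'
extract_io_error "$msg"
```

For the message shown in Listing 10 this prints -50.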

Return codes from Tivoli Storage Manager APIs describe this problem as a TCP/IP communications failure:


grep -e '-50' /opt/tivoli/tsm/client/api/bin/sample/dsmrc.h

#define DSM_RC_TCPIP_FAILURE       -50 /* TCP/IP communications failure      */
Listing 11: TSM error code explanation
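
The same lookup can be wrapped in a small function so that any return code can be checked without a partial match (for example, -50 also appearing inside a longer number). This is only a sketch: lookup_tsm_rc is a hypothetical name, the default path is the one used in this article, and it assumes each return code is defined on a single line in the form "#define NAME value".

```sh
#!/bin/sh
# lookup_tsm_rc CODE [HEADER]
# Prints the #define line(s) in the TSM API header whose value is exactly CODE.
# Hypothetical helper; the default header path is the one shown in this article.
lookup_tsm_rc() {
    code=$1
    header=${2:-/opt/tivoli/tsm/client/api/bin/sample/dsmrc.h}
    # Compare the third field as a whole, so -50 does not match -500.
    awk -v code="$code" '$1 == "#define" && $3 == code' "$header"
}
```

A call such as lookup_tsm_rc -50 should then print only the DSM_RC_TCPIP_FAILURE line from the header.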

Finding more detailed information about the error requires the TSM API trace files. TSM API tracing is enabled with the traceflags and tracefile entries in the dsm.opt configuration file.


# cat dsm.opt
SERVERNAME              TESTTSM001
traceflags             service api
tracefile              /tmp/artist_tracing.log
Listing 12: Enabling TSM API tracing

Possible sources of the problem might be:

  • a problem with some database configuration parameters
  • a password problem between the TSM server, TSM API and DB2
  • a problem with the TSM server configuration
  • a problem in the network infrastructure connecting the TSM server and the database server

After enabling communication tracing and running a series of connectivity tests, the cause emerged. During infrastructure changes on the network, the DB2 database server had been disconnected from a fast 100 Mbit/s segment and reconnected to a slower 10 Mbit/s segment. As a result, communication between the DB2 database and the TSM server took longer than before and the backup failed to finish. Luckily, there are parameters available for fine-tuning communication:

adsm> q opt
Server Option         Option Setting           
-----------------     --------------------     
CommTimeOut           900 (-> to 1800 ) 
Listing 13: Changing the TSM server CommTimeOut parameter

The CommTimeOut parameter was changed: its value was raised from 900 seconds to 1800 seconds. The TSM Administrator's Reference describes the CommTimeOut parameter as follows:

"CommTimeOut
- Specifies how long the server waits (in seconds) for an expected client
message during an operation that causes a database update. If the length of
time exceeds this time-out, the server ends the session with the client. You
may want to increase the time-out value to prevent clients from timing out if
there is a heavy network load in your environment or client will be backing up
large files. "
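
To get a feel for how much the move to the slower segment stretches a transfer, a rough calculation is enough. The sketch below assumes the two segments run at 100 and 10 megabits per second and uses a made-up 2048 MB image size. It ignores protocol overhead and competing traffic, and it does not model how the server applies CommTimeOut; transfer_seconds is a hypothetical helper name.

```sh
#!/bin/sh
# transfer_seconds SIZE_MB LINK_MBIT
# Idealized time in whole seconds to move SIZE_MB megabytes over a link of
# LINK_MBIT megabits per second (integer arithmetic, no overhead modeled).
transfer_seconds() {
    echo $(( $1 * 8 / $2 ))
}

# Hypothetical 2048 MB backup image, used only for illustration.
transfer_seconds 2048 100   # fast segment
transfer_seconds 2048 10    # slow segment
```

This prints 163 and 1638: the same data needs ten times as long on the slower segment, which is the kind of stretch that can push a wait past a fixed time-out.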

Tip 6. Checking Backup on the TSM server

In DB2 version 7.1, IBM introduced a new utility, db2ckbkp. This utility is used to:

  • test the integrity of a backup image and search for possible corruptions
  • display information that is stored in the backup header
  • display information about the objects and the log file header in the backup image

Detecting an unusable backup directly on the TSM server could save precious DBA time. However, the db2ckbkp utility has one limitation: it cannot check a backup that resides on the TSM server. The DBA has to restore the whole backup file from the TSM server to the local filesystem and then check it with db2ckbkp. Checking a restored TSM backup file for possible corruption with the db2ckbkp utility:

$ db2ckbkp ARTIST.0.artist.NODE0000.CATN0000.20040125010545.001
[1] Buffers processed:  ###############################################################################################
Image Verification Complete - successful.
Listing 14: db2ckbkp system utility

IBM has confirmed that the db2adutl system command should be used to check a database backup on the TSM server. An example of the TSM backup check:

$ db2adutl VERIFY FULL TAKEN AT 20040125010545.000

Query for database ARTIST

Retrieving FULL DATABASE BACKUP information.  Please wait.

   FULL DATABASE BACKUP image:

     ./ARTIST.0.artist.NODE0000.CATN0000.20040125010545.000, Node: 0

   Do you wish to verify this image (Y/N)?

Read 4194304 bytes, assuming we are at the end of the image
 
Image Verification Complete - successful.
Listing 15: Verifying a backup image on the TSM server with db2adutl

From the IBM documentation:

Verify option performs consistency checking on the backup copy that is on the server. This parameter causes the entire backup image to be transferred over the network.

The whole image is read from the TSM server into a local memory buffer (not all at once, but piece by piece). Only a temporary file is written to the local disk. I have tested this option, which is extremely useful, and did not find any problems even with large backup files. In my measurements, some large backup files (100 GB) required only 10 minutes to check. After testing with several files of different sizes, I doubt that a backup file is transferred in its entirety to the local filesystem, as the IBM documentation states. Nevertheless, this method is fully functional.
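
For scheduled checks, the success line shown in Listings 14 and 15 can be tested in a script rather than read by eye. The following is a minimal sketch, assuming the utilities print that line exactly as shown in those listings; verification_ok is a hypothetical helper name and is not part of DB2.

```sh
#!/bin/sh
# verification_ok
# Reads utility output on stdin and returns success only if the success line
# from Listings 14 and 15 is present. Hypothetical helper, not part of DB2.
verification_ok() {
    grep -q 'Image Verification Complete - successful\.'
}

# Possible use with the image name from Listing 14:
#   db2ckbkp ARTIST.0.artist.NODE0000.CATN0000.20040125010545.001 | verification_ok \
#       && echo "backup image verified"
```

The check relies only on the message text, so it makes no assumption about the exit status of either utility.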

Conclusion

The situations described here may shorten the TSM learning path. The TSM backup system is very powerful and well suited to DB2 backups. A DBA needs to test the TSM recovery process so that a real recovery, when it becomes necessary, is not the first attempt. Half of the battle is knowing what features are available, and the other half is testing.
