Here are the “best of the best” strategies that define a successful and competent database administrator (DBA). While I present these in the context of DB2 on z/OS, I strongly believe that they are applicable across all hardware platforms and database management systems.
Standardize Database Recovery Processes
Ensure data recoverability. If there’s one thing to get right, this is it. While other things (such as performance or security) may seem more urgent, ensuring data recoverability is the database administrator’s most important responsibility. (For more on this, see DB2 on z/OS Recovery — Get it Right).
Consider how the DBA begins implementing a database to support a critical production application. If a disaster occurs, will the data be available in the agreed-upon Recovery Time Objective (RTO)? If not, in the case of medical and financial data, this may breach contracts with vendors or violate audit guidelines. Data recoverability is another major consideration in some legislation. Here are some that affect financial institutions:
- Expedited Funds Availability (EFA) Act, 1989 requires federally chartered financial institutions to have a demonstrable business continuity plan to ensure prompt availability of funds.
- Federal Financial Institutions Examination Council (FFIEC) Handbook 2003-2004 (Chapter 10) specifies that directors and managers are accountable for organization-wide contingency planning and for “timely resumption of operations in the event of a disaster.”
- Basel II, Basel Committee on Banking Supervision, Sound Practices for Management and Supervision, 2003 requires that banks establish business continuity and disaster recovery plans to ensure continuous operation and limit losses.
Most IT shops use regularly scheduled standard backup procedures (e.g., DB2 image copies), but few have actually tested the recovery time of these objects and analyzed whether their backup procedures are sufficient (or necessary) for their recovery requirements.
You should ensure you have all of the following:
- A regularly scheduled process for determining (and documenting) the recovery status of all production objects
- Regular measurements of required recovery times for objects belonging to critical applications
- Development of alternative methods of backup and recovery for special situations (such as image copy of indexes, data replication to recovery site, and DASD mirroring)
- Regular development, improvement and review of data recoverability metrics
Implement Appropriate Automation
Automate reactive or simple reporting processes, freeing the DBA for higher-level work. Your first reaction might be, “Wait! I’ll automate myself out of a job!” Far from it. Implementing automation makes the DBA more valuable. IT management wants its knowledge workers doing tasks that add value. These might include detailed systems performance tuning, quality control, cost/benefit reviews of potential new applications and projects, and more. Management understands that a DBA spending time on trivial tasks represents a net loss of productivity.
The advantage of automation isn’t merely speed; automating tasks helps move the DBA away from reactive tasks such as reporting and analysis toward more proactive functions.
Here’s a typical list of processes many DBAs still manually perform:
- Executing an EXPLAIN process for SQL access path analysis
- Generating performance reports such as System Management Facility (SMF) accounting and statistics reports
- Verifying that new tables have columns with names and attributes that follow standard conventions and are compatible with the enterprise data model and data dictionary
- Verifying that access to production data is properly controlled through the correct authority GRANTs
- Monitoring application thread activity for deadlocks and timeouts
- Reviewing console logs and DB2 address space logs for error messages or potential issues.
Each of these tasks can be replaced by an automated reporting or a data gathering process of some kind. With such processes in place, DBAs now can schedule data gathering and report generation for later analysis, or guide requestors to the appropriate screens, reports or jobs. This removes the DBA from the reactive rut and generates time for proactive tasks such as projects, architecture, planning, systems tuning, and more.
Along with choosing specific tasks to automate, you’ll probably need to learn one or more automation tools or languages. REXX is an example of a popular language for online or batch access to DB2 data. There are many examples and ideas for automated processes in articles, presentations, and white papers.
Implement DBMS Autonomics
Implement DBMS-based processes so the system is self-healing and self-tuning. As our IT organizations have matured, we became smarter about our problems. We began to collect problem logs, and analyzed them looking for trends and patterns. We recognized frequent problems and devised strategies for automatically dealing with them or preventing them.
We’ve now reached the next logical step in this progression: engineering processes and process control to make systems and applications self-aware and self-healing. This is called autonomics. Autonomics, ranging from simple scripts to complicated processes, can be applied to applications, systems, or support software. For many DBAs, the idea of a self-healing database inspires visions of the database redesigning itself. What exactly would a self-healing database heal? One DB2 z/OS example that comes to mind is real-time statistics. DB2 dynamically generates these data distribution statistics. An example of their use is during execution of the Reorg utility, where real-time statistics may be queried and the results used to decide whether or not to execute the Reorg.
These and other examples of DB2 autonomics make it possible to “program in” a manner of self-tuning (or at least self-management) into the DBA’s support infrastructure. Review the literature and IBM documentation for possible implementation of autonomic processes.
Understand Management Needs
Give your management the data they need to measure your productivity and prioritize the team’s work. IT infrastructure management can be subjective, considering the lack of good productivity or quality measures. Faced with distributing and prioritizing work across a database support team, management usually creates or extends a spreadsheet or project plan to include task categories and team assignments.
In seeking to increase productivity, management needs an important piece of information: the variability of the DBA’s work. This usually translates to whether the tasks that need to be done have a fixed amount of effort and can be pre-scheduled or consist of a variable amount of work that unpredictably arrives. In other words, are the DBAs being reactive, active, or proactive?
Some work tends to have a fixed or well-known duration and can usually be easily assigned and scheduled. Other work tends to arrive at random, uncontrolled times; there may be no way to estimate the amount of effort required.
The DBA’s manager assigns the fixed work (usually projects, meetings, etc.) to DBA resources at a level that allows sufficient additional available hours for important variable work, such as production issues and emergencies. In more mature environments, this may be implemented as a Gantt chart showing the next several months of resource consumption. This can be extremely valuable for estimating the availability of DBA resources for future projects, or for pre-planning some discretionary time that may be required.
This method of managing task and resource allocation dovetails nicely with the DBA efforts discussed earlier to implement automation and autonomics and define standard processes that can be easily outsourced. DBAs will work to convert variable, non-standard work into fixed, scheduled work, moving away from reactive tasks. Management will see fewer variable tasks and more fixed tasks, and more easily be able to assign DBA resources.
Summary
While there are other tactical choices that may define a DBA’s daily life, the strategies mentioned here are the crucial ones: they support all aspects of the enterprise’s data; they support the idea of off-loading simple and repetitive work to the machine, freeing the DBA for more skill-based tasks; and they build on industry best practices such as recoverability, reliability and data availability.
Review the list and ask: Do we have these standard processes in place? Are they well-documented? Is the documentation centrally located and easily accessible? Have we done enough in each area that adds real value to our organization? The answers to these questions will assist you in determining where your time is best spent.