
Best Practices for a Sybase ASE Cluster Edition Proof of Concept

June 11, 2010

By Jeffrey Garbus

You’ve taken a quick look, and decided that Sybase ASE Cluster Edition is right for you. Managing the POC requires a great deal of forethought and planning. Jeffrey Garbus shares some best practices to get you started.

You’ve taken a quick look, and decided that Sybase ASE Cluster Edition is right for you. Chances are you’ve based your decision on one of a relatively tight list of reasons:

  • High availability
  • Workload consolidation
  • Resource utilization

ASE CE takes advantage of a shared-disk architecture and OS-level clustering to provide high-availability clustering. This means that if your applications point to a “cluster” containing many servers and one of them goes down, the failure is transparent to the end users: another server within the cluster takes over the workload, accessing the failed server’s databases on the SAN.

Workload distribution, via the workload manager, enables you to balance processing across multiple servers. Through the cluster entry point, and transparently to the users, ASE routes requests to the correct server (or to a server chosen for balancing purposes). This again makes a single-server failure a nonissue from the users’ perspective, and it allows less expensive resources to be used in a variety of situations.

Workload distribution has a side benefit: server consolidation. You can run individual servers at a higher utilization rate, for example several servers at 60%, knowing that the other servers in the cluster can absorb any peak load, rather than running multiple standalone servers each sized for its own peak.

So, decision made, it’s time to start thinking about the proof-of-concept (POC). As with many IT projects, your most critical success component will be planning.

Scoping the POC

The point of your POC is to demonstrate that ASE CE will do what you intend it to do, in your environment.

More specifically, you need to identify a simple, quantifiable set of requirements. These may relate to any of the areas mentioned above (high availability, workload management, connection migration), or may include ease of installation, backup/recovery, or performance.

In any case, you want to identify specific tests and specific success criteria. Put together detailed test cases, including POC objectives, functional specifications, and individual tests, all with success criteria, in a clear checklist format.
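
For example, the checklist can be captured in a simple structure that the team fills in and tracks as the POC runs. The sketch below is Python; the test names and criteria are taken from the use-case table later in this article, and the structure itself is just one possible way to organize the checklist, not a Sybase-provided tool.

# A minimal sketch of a POC test-case checklist. The structure is illustrative;
# fill it in with your own tests and success criteria.
from dataclasses import dataclass

@dataclass
class TestCase:
    name: str                 # e.g. "Power Off test"
    purpose: str              # what the test is meant to validate
    success_criteria: list    # each criterion is checked off individually
    passed: bool = False

poc_checklist = [
    TestCase(
        name="Power Off test",
        purpose="Validate database survival in case of an abnormal shutdown",
        success_criteria=[
            "Instances survive on non-shutdown nodes",
            "Applications transparently handle the failover",
            "Connections to the failed node transfer to non-failed nodes",
        ],
    ),
    TestCase(
        name="Large report performance",
        purpose="Load balancing",
        success_criteria=["Meet benchmark threshold of xxx"],
    ),
]

# Print the checklist as a simple status report.
for case in poc_checklist:
    print(f"{case.name}: {'PASS' if case.passed else 'PENDING'}")
    for criterion in case.success_criteria:
        print(f"  - {criterion}")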

There are a variety of ways of defining the scope, but here are a few basic levels:

Basic

You can check basic availability features. Use one node, and test the entry-level functions:

  • Installation
  • Configuration
  • Failover
  • Migration

Intermediate

A step up from basic availability takes you a bit further. With two nodes, you can test the storage architecture:

  • Shared-disk setup (raw devices)
  • Installation
  • Configuration
  • Failover
  • IO fencing

Advanced

The next step up would likely test the full shared-disk cluster. Now go up to four nodes, and work with:

  • Shared-disk and private-interconnect setup
  • Local installation and configuration
  • Logical Clustering
  • Failover
  • Load Balancing

Production

Pushing the POC out to the most specific level, perhaps a four-node test of your target production setup:

  • Database upgrade / migration
  • Client application upgrades
  • Failover benchmarking
  • Maintenance
  • Performance

Planning the test bed

Test resources are often at a premium, so you need to make your list early: are you testing on VMs, or on real, separate boxes? Once you’ve established your success criteria during scoping, you can work backwards to make sure you have the right equipment.

Define the complexity of your test bed based upon those criteria. You can run functional tests with multiple CE instances on a single physical node, even though you cannot be confident of the results of failover tests in that configuration. You can use partitioned nodes to work around limited resource availability.

A simple approach to a true multi-node cluster would be two nodes with a twisted-pair interconnect. If your criteria require more complexity, you can go to a four-node cluster with high-speed network interconnects and shared storage. Be sure to check the current supported hardware lists.

Use cases

Your use cases will become your checklist; your table should be organized along these lines:

Database Stability in a clustered configuration

  Test: Power Off test
  Purpose: Validate database survival in case of an abnormal shutdown
  Success Criteria:
    • Instances survive on non-shutdown nodes
    • Applications correctly (transparently) handle the failover
    • Connections to the failed node transfer to non-failed nodes

  Test: Unplug a network cable
  Purpose: CE isolation tests
  Success Criteria:
    • Connections migrate successfully
    • Correct nodes come offline

CE performance under a workload

  Test: Large report performance
  Purpose: Load balancing
  Success Criteria:
    • Meet benchmark threshold of xxx

Further rows to flesh out in the same Test / Purpose / Success Criteria format:

  • Backup and recovery
  • Replication
  • Operations and maintainability
  • Performance monitoring

Other test areas may include installation, high availability under a variety of circumstances, workload management, maintenance, job scheduling, and anything else that has driven your decision to evaluate CE.
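
For the power-off test above, for example, the “applications transparently handle the failover” criterion can be observed from a simple client-side loop that hammers a trivial query and records how long service is interrupted when a node is powered off. The sketch below is a hedged Python example using pyodbc; the DSN name and credentials are hypothetical placeholders for whatever cluster entry point you configure.

# Minimal sketch of a client-side failover observation loop for the power-off test.
# The DSN and credentials are hypothetical; point them at your cluster entry point.
# Stop the loop with Ctrl-C once the test is complete.
import time
import pyodbc

DSN = "ASE_CE_CLUSTER"

def connect():
    return pyodbc.connect(f"DSN={DSN};UID=poc_user;PWD=poc_password", timeout=10)

conn = connect()
outage_started = None
while True:
    try:
        conn.cursor().execute("select 1").fetchone()
        if outage_started is not None:
            print(f"Service restored after {time.time() - outage_started:.1f} seconds")
            outage_started = None
    except pyodbc.Error:
        if outage_started is None:
            outage_started = time.time()
            print("Connection lost; attempting to reconnect...")
        try:
            conn = connect()
        except pyodbc.Error:
            pass
    time.sleep(1)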

Hardware requirements

The supported hardware list changes on an occasional basis, so make sure you check the Sybase web site to see what’s currently available.

At this writing, supported platforms include:

  • x64 with RHEL or SUSE Linux
  • Sun SPARC with Solaris
  • IBM AIX
  • HP-UX

All cluster nodes must be on a single platform. Upgrades are supported from these ASE versions:

  • 12.5 to 12.5.3
  • 15.0 to 15.0.3

Shared SAN storage is a requirement. The SAN must be certified for raw devices, the SAN fabric must be multi-pathed, and all data devices must be visible on all participating nodes. The disk subsystem should support SCSI-3 persistent reservation.

Multiple network interface cards (NICs) are also mandatory; you need both public access (from the clients) and high-speed private access (for intra-node communications).

Across the cluster, the same OS and architecture are required on every node.

Best Practices

  • Take the time to validate the ecosystem on the cluster prior to beginning software installation
  • Storage
    • Use only raw devices, not file system devices
    • All CE devices must be multi-pathed so that they are visible on all nodes
    • Reserve additional space for the quorum device and for each instance’s local system tempdb
    • Do not use I/O fencing on the quorum device; it cannot share a LUN with other CE devices (SCSI I/O fencing is implemented at the LUN level)
    • Ensure the more complex I/O setup does not come at the expense of latency and throughput; if possible, benchmark non-CE latency and try to maintain a similar profile with CE
  • Network
    • Use at least one private network interface, in addition to the public interfaces. The private network is critical: configure it for maximum throughput and minimum latency, and perform visibility tests to ensure that the private network actually is private (see the reachability sketch after this list)
  • Installation
    • Use UAF agents on all nodes
    • Connect ‘sybcluster’ to all the UAF agents to test the network connectivity
    • Generate the XML configuration for the installation, for future reference and reuse
  • Post-installation
    • First, test for cluster stability
    • Make sure the cluster is stable under all conditions; any instability is likely due to ecosystem issues
    • Test any abnormal network, disk, or node outages
  • Startup nuances
    • The cluster startup writes out a new configuration file. Sometimes, due to timing, an individual node may need to be started up twice
  • Keep an eye on the Errorlog
    • The errorlog is going to be more verbose, because cluster events add a great deal of information
  • Validate client connectivity to all nodes
    • The connection migration and failover process is sensitive to network names; consistent DNS name resolution is on the critical path
    • Connect to all nodes from each client machine as part of the test (see the connectivity sketch after this list)
    • Inconsistent naming may cause silent redirection, leaving the workload less evenly distributed than you want
  • Database load segmentation
    • Load segmentation is key, and should be seamless with Workload Management
    • Use happy-case segmentation: keep writes to a database on a single node, and spread reads across nodes if an instance becomes saturated
    • Plan out failover scenarios and the consequent load management implications
    • Pay special attention to legacy APIs, which may not communicate with the cluster and may therefore need upgrading
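
For the private-network visibility test mentioned in the network bullet above, a quick reachability check can confirm that the private interconnect addresses answer from cluster nodes but not from ordinary client machines. This is a minimal sketch assuming Linux-style ping; the host names are hypothetical placeholders for your own public and private names.

# Minimal sketch of a network visibility check, assuming Linux-style ping.
# Host names are hypothetical; substitute your own public and private names.
import subprocess

PUBLIC_HOSTS = ["node1-pub", "node2-pub", "node3-pub", "node4-pub"]
PRIVATE_HOSTS = ["node1-priv", "node2-priv", "node3-priv", "node4-priv"]

def reachable(host: str) -> bool:
    """Return True if a single ping to the host succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

# Run this from a cluster node: both lists should be reachable.
# Run it again from a client machine: only the public list should be reachable.
for host in PUBLIC_HOSTS + PRIVATE_HOSTS:
    print(f"{host}: {'reachable' if reachable(host) else 'unreachable'}")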
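
The client connectivity check called out above can be scripted the same way, so that each client machine connects to every node by name and reports which instance actually served the connection. This sketch assumes one ODBC data source per node configured for ASE; the DSN names and credentials are placeholders, and @@servername is used to expose silent redirection.

# Minimal sketch of a client connectivity check across all cluster nodes.
# Assumes one ODBC data source per node; DSN names and credentials are placeholders.
import pyodbc

NODE_DSNS = ["ASE_CE_NODE1", "ASE_CE_NODE2", "ASE_CE_NODE3", "ASE_CE_NODE4"]

for dsn in NODE_DSNS:
    try:
        conn = pyodbc.connect(f"DSN={dsn};UID=poc_user;PWD=poc_password", timeout=10)
        cursor = conn.cursor()
        # @@servername reports which instance actually served the connection,
        # which exposes silent redirection caused by inconsistent name resolution.
        cursor.execute("select @@servername")
        print(f"{dsn}: connected to instance {cursor.fetchone()[0]}")
        conn.close()
    except pyodbc.Error as exc:
        print(f"{dsn}: connection failed ({exc})")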

Installation and configuration

Most ASE configuration parameters will behave as you expect; individual servers will still process individual queries. In general, you will configure at the cluster level, and those configuration options will propagate to the individual servers. You may, if you choose, configure individual instances, if, for example, the hardware doesn’t match across nodes. Each instance will also get its own local system tempdb, and the cluster as a whole uses a quorum device.

The quorum device contains the cluster configuration, which is in turn managed with the qrmutil utility.

There are other configuration parameters specific to CE; you’ll likely leave them at the defaults, but do take the time to read the release bulletin for your operating system for any changes you might consider making.
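
One simple post-configuration check is to pull a handful of parameters from each instance and confirm that the cluster-level settings really did propagate. The sketch below reuses the same hypothetical per-node DSNs as the earlier connectivity example and relies on sp_configure 'parameter name', which reports a parameter’s current values; the parameter list is illustrative, and since the exact columns returned vary by version, the comparison simply checks whether the full rows match.

# Minimal sketch: confirm that selected configuration parameters report the same
# values on every instance. DSNs, credentials, and the parameter list are
# illustrative; the exact sp_configure result columns vary by ASE version.
import pyodbc

NODE_DSNS = ["ASE_CE_NODE1", "ASE_CE_NODE2"]
PARAMETERS = ["max memory", "number of user connections"]

def config_row(cursor, parameter: str):
    cursor.execute(f"exec sp_configure '{parameter}'")
    row = cursor.fetchone()
    return tuple(row) if row else None

rows = {}
for dsn in NODE_DSNS:
    conn = pyodbc.connect(f"DSN={dsn};UID=poc_user;PWD=poc_password", timeout=10)
    cursor = conn.cursor()
    rows[dsn] = {p: config_row(cursor, p) for p in PARAMETERS}
    conn.close()

for parameter in PARAMETERS:
    seen = {dsn: rows[dsn][parameter] for dsn in NODE_DSNS}
    if len(set(seen.values())) == 1:
        print(f"{parameter}: consistent across instances")
    else:
        print(f"{parameter}: MISMATCH {seen}")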

Workload management

Workload management is a new concept for a lot of DBAs, because in a standard ASE environment the entire workload is directed at a single database server. A familiar example of workload management is directing read-only queries to a standby replicated server.

Active load management requires metrics, and you are going to have to take some time to determine how to measure and manage this activity; a simple per-instance sampling sketch follows. You should expect a significant performance improvement from an active load segmentation strategy.
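
As a crude starting metric, you can sample the number of sessions on each instance over the course of a workload test and watch how evenly they are distributed. The sketch below counts rows in master..sysprocesses on each instance, again using the hypothetical per-node DSNs from the earlier examples; a real strategy would add CPU and I/O measurements.

# Minimal sketch of per-instance load sampling. DSNs and credentials are
# hypothetical; counting sysprocesses rows is only a rough proxy for load.
import time
import pyodbc

NODE_DSNS = ["ASE_CE_NODE1", "ASE_CE_NODE2", "ASE_CE_NODE3", "ASE_CE_NODE4"]

def session_count(dsn: str) -> int:
    conn = pyodbc.connect(f"DSN={dsn};UID=poc_user;PWD=poc_password", timeout=10)
    cursor = conn.cursor()
    cursor.execute("select count(*) from master..sysprocesses")
    count = cursor.fetchone()[0]
    conn.close()
    return count

# Sample every 60 seconds during the workload test and log the distribution.
for _ in range(10):
    sample = {dsn: session_count(dsn) for dsn in NODE_DSNS}
    print(time.strftime("%H:%M:%S"), sample)
    time.sleep(60)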

Summary

ASE CE is a significant addition to the Sybase product suite. Managing the POC requires a great deal of forethought and planning; we’ve given you a starting point.

Additional Resources

Overview of Sybase ASE In-Memory Database Feature
Sybase Adaptive Server Enterprise Cluster Edition
Sybase ASE Cluster Edition Proof of Concept Strategies

A 20-year veteran of Sybase ASE database administration, design, performance, and scaling, Jeff Garbus has written over a dozen books and many dozens of magazine articles, and has spoken at dozens of user groups on the subject over the years. He is CEO of Soaring Eagle Consulting, and can be reached at Jeff Garbus.







