SHARE

Queries On Multiple Databases

Written By

Apr 22, 2001

4 minute read

I work with an accounting system that is currently comprised of 150 databases and growing
by three or four databases per year. My coworkers and I frequently need queries to run against some or all of the databases at once. Since the system’s report writer is slow and constrained, I’ve developed both scripts and stored
procedures, relying on either unions or cursors, as an alternative.

Unions

Unions are straightforward and well-documented in BOL. They can easily reach across multiple databases, and even multiple servers.

If I have three databases with similar schema, and I want a count of rows in a particular table, by database,
I could run the following query:

Click here for code example 1.

I specify the object I want by using its "fully qualified"
name,
more commonly called three-part; i.e., NJ1.dbo.apinpchg. NJ1 is the
database,
dbo is the object’s owner, and apinpchg is the object, a table in this case.
On
my server, the dbo owns every object except INFORMATION_SCHEMA’s views, so I
could refer to this table as NJ1..apinpchg. Don’t assume the dbo owns
everything
on your server(s) or that there aren’t identically named objects owned by
different users; there are several counter-examples elsewhere in my
company.

One point of
interest is MA1, which is on another server, srv2. I access it with
four-part
syntax. The SA would have had to set up
srv2 as a linked server and mapped my local login to a login on srv2 that
had
SELECT rights on MA1.dbo.apinpchg.

I recommend using ALL with the UNION operator
if
possible to increase performance. ALL tells the server
not to
remove duplicate rows from the union, an unproductive and potentially costly
operation.

The reader is referred to BOL for criteria for using UNION.
Note
that the SELECT statements can be as complex as you need. Also
note
that the tables within each SELECT can be drawn from multiple
databases.

Click here for code example 2.

Cursors

I do most of my work with cursors and dynamic SQL. Large unions are
tiresome
to code, read, and maintain. They may not run at all on 6.5, even if they’re
within the maximum batch size.

The following stored procedure returns every transaction in every company
for
the specified GL account for FY2000. It can dump the results into Excel via
Microsoft Query. The only way to do this with my system’s report writer is
to
run 150 transaction detail reports and copy and paste them together.

The SP is an example. It doesn’t have even rudimentary error checking,
and is
inflexible.

Click here for code example 3.

A few notes. The cursor needs a table of database names to work with. If
necessary, you can use sysdatabases in the master database and control the
rows
with a WHERE. I’ve also used derived tables occasionally.

This example uses exec. Microsoft recommends
sp_executesql
if your server is 7.0 or greater. I’m used to exec, and in any
event I doubt you will notice a performance difference between the
alternatives.
Like everything else in SQL Server, performance is mainly governed by
whether
the data is cached and the quality of your T-SQL, indexes, and
statistics.

Unless your users work in ISQL/W or Query Analyzer, you’ll likely want to
develop a cursor solution as a stored procedure. Be aware that the SP won’t
give
users rights to databases and objects that they don’t already have;
exec
checks the rights of the user who invoked the procedure, not the user who
created it.

Which to Use?

	Union	Cursor
Advantages	Easy to follow–not much more complex than one SELECT statement Easy to code. Mostly a copy-and-paste operation in Windows Works well with MSQuery and other programs that allow only one statement per batch Can be put in a view Server can show you the entire query plan if you need Good choice for fixed queries on a small number of databases	Very flexible. You can control the databases queried at runtime and you have the power of dynamic SQL available No meaningful limit on the number of databases that can be queried at once The logic is contained in a loop rather than sprawled out over hundreds of lines. Modifications only have to be done in one place Good choice for a large number of databases or ad hoc queries or both
Disadvantages	Very cumbersome syntax as the number of databases or the query complexity or both increases Inflexible–the particular databases queried are hardcoded Modifications may have to be made throughout the union In unusual circumstances, the server may not be able to handle the query, even though it’s within the maximum batch size. I had considerable problems with 6.5 giving me “DBPROCESS is dead” errors.	Requires some expertise to code and debug Must often be put into a stored procedure Creates a batch every time `exec` runs, which the server must compile More difficult to optimize than a union

Union

Cursor

Advantages

Easy to follow–not much more complex than one SELECT
statement
Easy to code. Mostly a copy-and-paste operation in Windows
Works well with MSQuery and other programs that allow only one
statement per batch
Can be put in a view
Server can show you the entire query plan if you need
Good choice for fixed queries on a small number of databases

Very flexible. You can control the databases queried at runtime
and
you have the power of dynamic SQL available
No meaningful limit on the number of databases that can be
queried
at once
The logic is contained in a loop rather than sprawled out over
hundreds of lines.
Modifications only have to be done in one place
Good choice for a large number of databases or ad hoc queries or
both

Disadvantages

Very cumbersome syntax as the number of databases or the query
complexity or both increases
Inflexible–the particular databases queried are hardcoded
Modifications may have to be made throughout the union
In unusual circumstances, the server may not be able to handle
the
query, even though it’s within the maximum batch size. I had
considerable problems with 6.5 giving me “DBPROCESS is dead”
errors.

Requires some expertise to code and debug
Must often be put into a stored procedure
Creates a batch every time exec runs, which the
server
must compile
More difficult to optimize than a union