Leveraging the Information_schema
January 10, 2003
A while back, I received a call from a colleague who was testing out a third party database and client application. She had a simple question - she wanted to know all of the tables and columns in the database that contained a specific string. My colleague did not want the names of the tables or columns containing the string; she needed to know where, in any row of data, the string appeared. Imagine looking for the word "Smith" anywhere in the PUBS database:
It might seem a strange query, but I knew why she was asking. A software house had supplied the database and the client, but no documentation to say how data in the database mapped onto what the end user saw on screen. My colleague had the client app, and an ODBC connection into SQL Server so she could view the data, but she needed this question answered in order to make sense of the results she was getting while testing.
Now the obvious question here is "Didn't the supplier provide this info in the documentation?"
Well, no they did not.
It's a big database, with a lot of tables, and many columns, and no documentation. The job of testing the application was not an enviable one, the task of checking every single character column, in every single user table in the database, to look for a particular string even less so. Fortunately - it's not necessary.
One thing that a relational database does, or should do, is describe itself. SQL Server does.
This description, beginning with SQL Server 7, is called the "Information Schema." Old hands like me sometimes still use outmoded terms like "system tables."
The Information Schema is a collection of tables in every database that describes the layout of the database (even describing the Information Schema itself) and one of these tables in particular was going to help turn this nasty little job into a very easy one.
Try running the
I know I am looking for a character string, so I can eliminate all the assorted columns in the database that contain binary or numeric information. There is no Text data type in pubs; it's only necessary to search the CHAR and VARCHAR datatypes, along with their UNICODE variants NCHAR and NVARCHAR. There is also no need to search the information_schema itself, as it contains no user data. A simple query against information_schema.columns will narrow down the search to a specific set of tables and columns:
use pubs go select table_name, column_name from information_schema.columns where data_type in ('char', 'varchar', 'nvchar', 'nvarchar') and table_schema = 'dbo'
Next, write a select statement for each table and column combination, to see which one contains the word "Smith." Of course, I don't intend to type out all of those select statements myself: A simple bit of dynamic SQL will do the job for me:
use pubs go select 'if exists (select 1 from ' + table_name + ' where ' + column_name + ' like ''%smith%'' ) print ''' + table_name + '/' + column_name + '''' + char(10) + 'go' from information_schema.columns where data_type in ('char', 'varchar', 'nvchar', 'nvarchar') and table_schema = 'dbo'
Wrapping the text of the "Select" statement inside the previously written SQL statement causes SQL Server to write the bulk of the SQL. All that remains to be done is cut & paste the results of the query above into a new Query window and the dynamically generated SQL will search the database for any "Smith" that's in there.
If the idea of writing Dynamic SQL grabs you, check out my Dynamic SQL for beginners tutorial.