Introduction
Full Text Searching is a free, optional component of MS SQL
2000. When installed, it offers a vast array of additional string querying
abilities. Full Text Searching allows for string comparisons similar to
internet search engines, returning both results and a matching score or
weight. With regular TSQL, string matching is usually limited to an exact
match, or a wildcard match with the keyword "LIKE." Full Text
Searching exceeds this by searching for phrases, groups of words, words near
one another, or different tenses of words, such as run, running, and ran. In
addition, if Microsoft Office Word or Excel documents are saved in the
database, their contents can be searched like a typical varchar field. Full
Text Searching is accomplished by installing a new service (Microsoft Search),
and using key words in TSQL designed specifically for text searching. This
article will demonstrate installing, configuring and using the Full Text Search
engine.
Microsoft Search
Unlike other SQL operations, Full Text Searching is not
managed by the MSSQL Server service. Instead, a new service, Microsoft Search,
must be installed and started. A primary reason for this is that Full Text
Searching is done against a special index contained in a Catalog. Catalogs that
are used to support searching are not part of the regular SQL server
installation; they are kept outside the MDF, and are stored in separate
physical files. The Microsoft Search service has two primary jobs, indexing support
and querying support. Indexing support includes the tasks of defining the
Catalogs, and indexes they contain, creating them and keeping the data up to
date as changes take place. When search queries are issued, the second job of
the Microsoft Search service begins, determining which Catalog index meets the
query requirements. Once determined, Microsoft Search returns the identity of
selected rows, along with a ranking value if requested, back to the SQL Server
service for query completion. This brings up two requirements that the
Microsoft Search service imposes. First, there must be a unique index on the
base table to which the Catalog refers. This is required because the Search
service only reports the identity of the selected rows back to the SQL service,
not the record set itself. Second, Full Text Searching is only supported on
Windows Server. Either NT 4 or 2000 Server can be used, but not Workstation,
Windows 98, or XP. These other clients can issue quires, but not host the
Search service.
Installing Search
Check in the services list for "Microsoft Search."
If not listed, insert the SQL Server media, and select "Upgrade, Add or
Remove Components." Click next for "Add Components." On the
next screen, select "Server Components," Full-Text Search, and then
continue with the rest of installation.
Creating a Catalog
In this first example, we will be creating a Full Text
Search on the pubs database. To begin, open Enterprise Manager and highlight
the local server.
1. From
the top menu, select Tools, Full Text Indexing.
2. Click
next on the opening screen.
3. On
the "Select a Database" screen, choose pubs.
4. For
the next screen, select "titles" as the table and click next.
5. Keep
the default unique index; there is only one on the pubs table.
6. In
the next screen, select Title and Notes as the fields we would like indexed.
7. On
the "Select a Catalog" screen, because this is the first Catalog on
the server, we are prompted for a name. In this example, "demo" is
used as the Catalog name.
8. This
next screen "Populate Catalogs," leave blank and click next.
9. Click
Finish, and the following success screen should appear:
10. From Enterprise Manager,
expand Full-Text Catalogs, the "demo" catalog should appear in the
right window.
11. Right click and select "Start
Full Population."
12. The catalog is now complete.
13. To verify in creation, open
Query Analyzer and select the pubs database.
14. Execute sp_help_fulltext_catalogs,
the new Catalog should appear.
15. Issue a test query of:
SELECT title, notes FROM titles WHERE CONTAINS (notes, ' "quick easy" ')
This should return one record with a title of "Silicon
Valley Gastronomic Treats."
One of the first items to notice about Full Text Searching
is that it is not automatic. Even after Microsoft Search is installed, it
still must be configured for each database, table, and field we want to
search. If a field will need to be searched, it must first be included in a Catalog.
Steps 1 through 4 and 6 accomplished this.
Step 5 demonstrates one the Full Text Search requirements,
that each table made available for searching contains a unique index. When a
search query is issued, Microsoft Search will determine which records meet the
query, and will report their unique index key back to the MSSQL Server service
for completion of the query.
The result of Step 7, "Select a Catalog," is the
creation of an entire file structure to support the Catalog. To find the name
of structure from Enterprise Manager, expand Full-Text Search, then right click
the "demo" Catalog and select properties. By default, Catalogs will
be saved under the default server directory, MSSQL, FTDATA.
For the "demo" Catalog, a new directory structure
named "SQL00000500005" as shown in the "Physical catalog"
field, has been created. This structure will store the indexes and data files
needed by the Search service. They are not text readable. The "Item
count" field shows the sum of the tables and rows indexed.
Conclusion
Full Text Searching brings advanced string querying support
to SQL. In this first article, a new Catalog containing an index to search on
two fields of the pubs database has been created. Future articles will expand
this Catalog to include searching for words near each other, returning the rank
or weight of the search, and indexing Microsoft Office documents. A number of
housekeeping tasks will also be introduced, such as keeping the Catalog up to
date when data on the base table changes, and determining whether a word should
be indexed or not (Noise words). Although Full Text Searching is far from
automatic, the results it provides make it worth the effort.