In the previous installment of our series dedicated to the most prominent
features available in SQL Server 2005 Express Edition, we have concluded the
coverage of its implementation of Reporting Services. There are a few remaining
topics that we have not discussed yet, whose importance warrants our attention.
One of them is Full Text Indexing and Search, whose characteristics we will be
exploring starting with this article.
You are likely quite familiar with standard T-SQL capabilities allowing you
to query character-based data according to full or partial matches (typically
relying on wildcards, such as %
,
_
, []
, or [^]
). There are, however, situations where these options
turn out to be too limited (or simply too inefficient from a performance
standpoint, resulting in full table scans), especially when dealing with
unstructured text containing inflections, diminutives, and misspellings, or
based on unsupported by LIKE
statements data types. Such needs are addressed by the Full Text Search
functionality, allowing you to delve deeper into a variety of lexical
arrangements, including locating phrases that are similar to each other. The
level of similarity can be adjusted by assigning a desired weigh factor, words
sharing a common base, or a combination of several words appearing near each
other (so called proximity searches). In addition, it becomes possible to
perform localization-specific queries, which take into consideration rules
pertinent to an arbitrarily chosen language.
In its original implementation, full text indexing relied on the Microsoft Search
service built into the
host operating system. Starting with the SQL Server 2005 platform, its
responsibilities have been relegated to the Full-Text Engine, implemented as
the MSFTESQL.EXE
process and
operating as SQL Server FullText Search (SQLEXPRESS)
service (where SQLEXPRESS
is the name of the SQL Server 2005 Express Edition instance where this feature
has been installed). This new component interacts with the SQL Server database
engine to facilitate creating, populating, and managing full-text indexes and
their catalogs, as well as to support execution of queries based on them. It is
aided by the Full-Text Engine Filter Daemon (running as MSFTEFD
process) that enhances indexing
and searching functionality through filters (making possible to analyze data in
formats typically not associated with SQL Server, such as Word, Excel, or
PowerPoint, stored in columns of image and varbinary(max)
types), word breakers (delimiting word boundaries) and stemmers (assisting with
organizing conjugated verbs). The resulting indexes are stored within catalogs
residing in the local file system (by default, in the Program FilesMicrosoft SQL ServerMSSQL.xMSSQLFTData
folder, where x
designates an integer assigned to a local SQL Server 2005 instance), rather
than SQL Server-hosted databases (keep this fact in mind when devising a backup
strategy). Full-Text Search, however, does keep some of its metadata in
internally managed tables (fulltext_index_map
and fulltext_catalog_freelist
).
Due to its potential performance implications, full text indexing is not
enabled by default in the SQL Server 2005 Express Edition. To make its
functionality available, you need first to explicitly add it during the setup
process. With the base component in place, you will have an option to enable
full-text indexing on a target database (note that this happens to be the
default applied when creating new user databases via CREATE DATABASE
statement, but not when
performing the same task via graphical interface within SQL Server Management
Studio Express) and create a full-text catalog associated with it (remember
that it is not possible to create full-text catalogs in master
, model
, or tempdb
system databases). The final steps consist of creating and populating full-text
indexes. It is important to realize that they are applicable only to columns of
specific data types (namely character-based including char
, nvarchar
,
and varchar
, as well as
formatted as binary type, such as image
,
and varbinary(max)
), and
that tables containing these columns must have a unique, single-column, non-nullable,
preferably clustered index (which is referenced during index creation). It is
also helpful to know that there are several factors affecting the process of
populating and querying indexes. One of them is the presence of so-called noise
words (such as pronouns, conjunctions, or prepositions), which are excluded
from indexing or search results. Another involves the use of thesaurus,
allowing you to incorporate synonyms into search criteria (language-specific
noise and thesaurus files reside in the Program
folder and
FilesMicrosoft SQL ServerMSSQL.xMSSQLFTData
can be customized to match your requirements). You should also be aware that
the index population mechanism depends on the value of its CHANGE_TRACKING
option (associated with
the index at its creation), which, in addition, determines whether SQL Server
maintains a list of changes to indexed data (we will discuss these topics in
more details in our next article). Finally, keep in mind that there is a limit
of a single full-text index per table.
Let’s walk through a sample implementation of a full-index catalog and
demonstrate a few searches that take advantage of its benefits. We will be
working with the AdventureWorks
database (refer to one
of our earlier articles for specific instructions regarding adding it to
your SQL Server 2005 Express Edition instance). Start by verifying whether the
Full-Text Index is present on your computer. One way to confirm whether this is
the case is to review the list of active services and look for SQL Server FullText Search (SQLEXPRESS)
entry, where SQLEXPRESS
matches the name of the local database engine instance. Another possibility is
to examine the outcome of SELECT fulltextserviceproperty('IsFulltexIinstalled')
T-SQL statement, which should yield 1
.
If it turns out that the Full-Text Search feature is missing, you will have
to extract the installation files included in the Microsoft SQL Server 2005
Express Edition with Advanced Services, available from the Microsoft Download
Center as a single, compressed executable SQLEXPR_ADV.EXE
. This can be accomplished by invoking the
executable with the -x
switch, after which you will be prompted to Choose
. You will need to point to the
Directory For Extracted Filessetup.exe
file at that location when
selecting the Change
option
from the context sensitive menu of SQL
entry of the listing presented in
Server 2005Add or Remove Programs
or Programs and Features
(depending on the
operating system version) Control Panel applet. On the other hand, if you are
installing a new instance of SQL Server 2005 Express Edition, simply make sure
to add the Full-Text Search
feature (via the Feature Selection
dialog box in the Microsoft SQL Server
wizard) to the list of other components you intend to
2005 Setup
use.
Once you have confirmed that the SQL
service is present and operational,
Server FullText Search
disable user instances (since they are not compatible with full-text indexing)
by running sp_configure 'user instances
via
enabled', '0'Query
. Next, verify that the indexing is enabled on a target
Editor
database. This can be accomplished either by checking the state of the Use full-text indexing
checkbox in the Files
section of the Database Properties
dialog box in the SQL Server Management Studio Express
interface or by running SELECT
,
DATABASEPROPERTY('database_name','IsFullTextEnabled')
where the 'database_name'
parameter is set in our case to 'AdventureWorks'
).
Modifying this option involves either selecting the checkbox or invoking sp_fulltext_database
stored procedure
with 'enable'
switch while
connected to the target database. (Similarly, when creating new databases, you
could simply check the Use full-text
checkbox in the
indexingNew
dialog box). Assuming that the outcome is successful,
Database
you are ready to create a full-text catalog by running the CREATE FULLTEXT CATALOG
T-SQL statement,
which, by default, will result in the creation of a directory structure under the
Program FilesMicrosoft SQL ServerMSSQL.xMSSQLFTData
folder (carrying out this command requires at the minimum CREATE FULLTEXT CATALOG
permissions). By
employing the optional parameters of this statement, you also have the ability
to alter the location of catalog files (by pointing to another local disk using
the IN PATH
attribute, which
is recommended if you expect their size to be significant), designate a catalog
as the database default (by including AS
clause) or accommodate accent-sensitive searches (by
DEFAULT
appending WITH ACCESS_SENSITIVITY=ON
).
In our example (with ftcAdventureWorks
catalog residing in D:AppDataSQLFTData
folder), this procedure consists of the following steps:
USE AdventureWorks GO EXEC sp_fulltext_database 'enable' GO CREATE FULLTEXT CATALOG 'ftcAdventureWorks' IN PATH 'D:AppDataSQLFTData' AS DEFAULT GO
Note that, unlike full-featured editions of SQL Server 2005, the graphical
interface of SQL Server Management Studio Express does not expose options for
creating new or viewing existing catalogs. However, you can simplify your
management tasks by taking advantage of templates available in SQL Server
Management Studio, by connecting it to your SQL Server 2005 Express Edition
instance and following the steps described in the Microsoft Knowledge Base
article 916784. In either
case, the end result is the same, yielding a full text catalog associated with
the AdventureWorks
database.
In the upcoming articles of our series, we will describe the remaining steps
necessary to create a full-text index on one of tables in the AdventureWorks
database as well as
present sample queries that demonstrate the power and flexibility of full-text
searches.