Full Text Catalog Configuration in SQL Server 2005 Express Edition

November 9, 2009

In the previous installment of our series dedicated to the most prominent features available in SQL Server 2005 Express Edition, we have concluded the coverage of its implementation of Reporting Services. There are a few remaining topics that we have not discussed yet, whose importance warrants our attention. One of them is Full Text Indexing and Search, whose characteristics we will be exploring starting with this article.

You are likely quite familiar with standard T-SQL capabilities allowing you to query character-based data according to full or partial matches (typically relying on wildcards, such as %, _, [], or [^]). There are, however, situations where these options turn out to be too limited (or simply too inefficient from a performance standpoint, resulting in full table scans), especially when dealing with unstructured text containing inflections, diminutives, and misspellings, or based on unsupported by LIKE statements data types. Such needs are addressed by the Full Text Search functionality, allowing you to delve deeper into a variety of lexical arrangements, including locating phrases that are similar to each other. The level of similarity can be adjusted by assigning a desired weigh factor, words sharing a common base, or a combination of several words appearing near each other (so called proximity searches). In addition, it becomes possible to perform localization-specific queries, which take into consideration rules pertinent to an arbitrarily chosen language.

In its original implementation, full text indexing relied on the Microsoft Search service built into the host operating system. Starting with the SQL Server 2005 platform, its responsibilities have been relegated to the Full-Text Engine, implemented as the MSFTESQL.EXE process and operating as SQL Server FullText Search (SQLEXPRESS) service (where SQLEXPRESS is the name of the SQL Server 2005 Express Edition instance where this feature has been installed). This new component interacts with the SQL Server database engine to facilitate creating, populating, and managing full-text indexes and their catalogs, as well as to support execution of queries based on them. It is aided by the Full-Text Engine Filter Daemon (running as MSFTEFD process) that enhances indexing and searching functionality through filters (making possible to analyze data in formats typically not associated with SQL Server, such as Word, Excel, or PowerPoint, stored in columns of image and varbinary(max) types), word breakers (delimiting word boundaries) and stemmers (assisting with organizing conjugated verbs). The resulting indexes are stored within catalogs residing in the local file system (by default, in the Program Files\Microsoft SQL Server\MSSQL.x\MSSQL\FTData folder, where x designates an integer assigned to a local SQL Server 2005 instance), rather than SQL Server-hosted databases (keep this fact in mind when devising a backup strategy). Full-Text Search, however, does keep some of its metadata in internally managed tables (fulltext_index_map and fulltext_catalog_freelist).

Due to its potential performance implications, full text indexing is not enabled by default in the SQL Server 2005 Express Edition. To make its functionality available, you need first to explicitly add it during the setup process. With the base component in place, you will have an option to enable full-text indexing on a target database (note that this happens to be the default applied when creating new user databases via CREATE DATABASE statement, but not when performing the same task via graphical interface within SQL Server Management Studio Express) and create a full-text catalog associated with it (remember that it is not possible to create full-text catalogs in master, model, or tempdb system databases). The final steps consist of creating and populating full-text indexes. It is important to realize that they are applicable only to columns of specific data types (namely character-based including char, nvarchar, and varchar, as well as formatted as binary type, such as image, and varbinary(max)), and that tables containing these columns must have a unique, single-column, non-nullable, preferably clustered index (which is referenced during index creation). It is also helpful to know that there are several factors affecting the process of populating and querying indexes. One of them is the presence of so-called noise words (such as pronouns, conjunctions, or prepositions), which are excluded from indexing or search results. Another involves the use of thesaurus, allowing you to incorporate synonyms into search criteria (language-specific noise and thesaurus files reside in the Program Files\Microsoft SQL Server\MSSQL.x\MSSQL\FTData folder and can be customized to match your requirements). You should also be aware that the index population mechanism depends on the value of its CHANGE_TRACKING option (associated with the index at its creation), which, in addition, determines whether SQL Server maintains a list of changes to indexed data (we will discuss these topics in more details in our next article). Finally, keep in mind that there is a limit of a single full-text index per table.

Let's walk through a sample implementation of a full-index catalog and demonstrate a few searches that take advantage of its benefits. We will be working with the AdventureWorks database (refer to one of our earlier articles for specific instructions regarding adding it to your SQL Server 2005 Express Edition instance). Start by verifying whether the Full-Text Index is present on your computer. One way to confirm whether this is the case is to review the list of active services and look for SQL Server FullText Search (SQLEXPRESS) entry, where SQLEXPRESS matches the name of the local database engine instance. Another possibility is to examine the outcome of SELECT fulltextserviceproperty('IsFulltexIinstalled') T-SQL statement, which should yield 1.

If it turns out that the Full-Text Search feature is missing, you will have to extract the installation files included in the Microsoft SQL Server 2005 Express Edition with Advanced Services, available from the Microsoft Download Center as a single, compressed executable SQLEXPR_ADV.EXE. This can be accomplished by invoking the executable with the -x switch, after which you will be prompted to Choose Directory For Extracted Files. You will need to point to the setup.exe file at that location when selecting the Change option from the context sensitive menu of SQL Server 2005 entry of the listing presented in Add or Remove Programs or Programs and Features (depending on the operating system version) Control Panel applet. On the other hand, if you are installing a new instance of SQL Server 2005 Express Edition, simply make sure to add the Full-Text Search feature (via the Feature Selection dialog box in the Microsoft SQL Server 2005 Setup wizard) to the list of other components you intend to use.

Once you have confirmed that the SQL Server FullText Search service is present and operational, disable user instances (since they are not compatible with full-text indexing) by running sp_configure 'user instances enabled', '0' via Query Editor. Next, verify that the indexing is enabled on a target database. This can be accomplished either by checking the state of the Use full-text indexing checkbox in the Files section of the Database Properties dialog box in the SQL Server Management Studio Express interface or by running SELECT DATABASEPROPERTY('database_name','IsFullTextEnabled'), where the 'database_name' parameter is set in our case to 'AdventureWorks'). Modifying this option involves either selecting the checkbox or invoking sp_fulltext_database stored procedure with 'enable' switch while connected to the target database. (Similarly, when creating new databases, you could simply check the Use full-text indexing checkbox in the New Database dialog box). Assuming that the outcome is successful, you are ready to create a full-text catalog by running the CREATE FULLTEXT CATALOG T-SQL statement, which, by default, will result in the creation of a directory structure under the Program Files\Microsoft SQL Server\MSSQL.x\MSSQL\FTData folder (carrying out this command requires at the minimum CREATE FULLTEXT CATALOG permissions). By employing the optional parameters of this statement, you also have the ability to alter the location of catalog files (by pointing to another local disk using the IN PATH attribute, which is recommended if you expect their size to be significant), designate a catalog as the database default (by including AS DEFAULT clause) or accommodate accent-sensitive searches (by appending WITH ACCESS_SENSITIVITY=ON). In our example (with ftcAdventureWorks catalog residing in D:\AppData\SQL\FTData folder), this procedure consists of the following steps:

USE AdventureWorks
GO
EXEC sp_fulltext_database 'enable'
GO
CREATE FULLTEXT CATALOG 'ftcAdventureWorks' IN PATH 'D:\AppData\SQL\FTData' AS DEFAULT
GO

Note that, unlike full-featured editions of SQL Server 2005, the graphical interface of SQL Server Management Studio Express does not expose options for creating new or viewing existing catalogs. However, you can simplify your management tasks by taking advantage of templates available in SQL Server Management Studio, by connecting it to your SQL Server 2005 Express Edition instance and following the steps described in the Microsoft Knowledge Base article 916784. In either case, the end result is the same, yielding a full text catalog associated with the AdventureWorks database. In the upcoming articles of our series, we will describe the remaining steps necessary to create a full-text index on one of tables in the AdventureWorks database as well as present sample queries that demonstrate the power and flexibility of full-text searches.

» See All Articles by Columnist Marcin Policht








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers