If you have written many
applications then you probably have run across situations where you need to
identify if a specific character or set of characters appears in a string. In
this article I will discuss using the CHARINDEX and PATINDEX functions to
search text columns and character strings. I will show you how each of these
functions operate, and explain the differences between them. Also provided are
some examples on how you might consider using these functions to solve a number
of different character search situations.
The CHARINDEX and PATINDEX
functions are used to search a character string for a character or set of
characters. If the character string being searched contains the characters
being searched for, then these functions return a non-zero integer value. This
integer value is the starting location of where the character string being
searched for is located within the string being searched. The PATINDEX
function allows for using wildcard syntax in the pattern being searched for, where
as the CHARINDEX function does not support wildcard character searches. Let’s
look at each one of these functions in a little more detail.
How to use the CHARINDEX
Function
The CHARINDEX function
returns the starting position of a character, or a string of characters within
another character string. The CHARINDEX function is called using the following
format:
CHARINDEX
( expression1 , expression2 [ , start_location
] )
Where expression1 is
the string of characters to be found in expression2, and start_location
is the position where the CHARINDEX function will start looking for expression1
in expression2.
The CHARINDEX function
returns an integer. The integer value returned is the position where the
characters being search for are located within the string being searched. If
the CHARINDEX does not find the characters you are searching for then the
function returns a zero integer value. Let say we execute the following
CHARINDEX function call:
CHARINDEX('SQL', 'Microsoft SQL Server')
This function call will return
the starting location of the character string “SQL”, in the string “Microsoft
SQL Server”. In this case the CHARINDEX function will return the number 11,
which as you can see is the starting position of “S” in string “Microsoft SQL
Server”.
Now say we have the
following CHARINDEX Command:
CHARINDEX('7.0', 'Microsoft SQL Server 2000')
In this example the
CHARINDEX function will return zero, since the character string “7.0” cannot be
found in the string “Microsoft SQL Server 2000”. Let go through a couple of
examples of how you might be able to use the CHARINDEX function to solve some
actual T-SQL problems.
For the first example say
you would like to display only the last name of the ContactName column, for the
first 5 records in the Northwind database Customer table. Here are the first 5
records.
ContactName
——————————
Maria Anders
Ana Trujillo
Antonio Moreno
Thomas Hardy
Christina Berglund
As you can see, the CustomerName
contains both the first and last name of the customer, where first and last
name are separated by a single space. I will use the CHARINDEX function to
identify the position of the space between the two names. This way we can used
the position of the space to parse the ContactName so we can display only the
last name portion of the column. Here is some T-SQL code to display only the
last name for the first 5 records in the Northwind Customer table.
select top 5 substring(ContactName,
charindex(‘ ‘,ContactName)+1 ,
len(ContactName)) as [Last Name]
from Northwind.dbo.customers
Here is the output from this
command:
Last Name
——————————
Anders
Trujillo
Moreno
Hardy
Berglund
The CHARINDEX function found
the space between the First and Last Name, so that the substring function could
split the ContactName, thus only the Last Name was displayed. I added 1 to the
integer value that CHARINDEX returned, so the Last Name displayed did not start
with a space.
For the second example, say
you want to count all the records from a table where a given column contains a
particular character string. The CHARINDEX function could be used to satisfy
your request. To count all of the Addresses in the Northwind.dbo.Customer
table where the Address column contains either the word ‘Road’ or an
abbreviation for road (‘Rd’), your SELECT statement would look like this:
select count(*) from Northwind.dbo.Customers where CHARINDEX('Rd',Address) > 0 or CHARINDEX('Road',Address) > 1
How Does the PATINDEX Function
Work?
The PATINDEX function
returns the starting position of a character or string of characters within
another string, or expression. As stated earlier the PATINDEX has additional functionality over CHARINDEX.
PATINDEX supports wildcard characters in
the search pattern string. This makes PATINDEX valuable for searching for
varying string patterns. The PATINDEX command takes the following form:
PATINDEX
( ‘%pattern%‘ , expression )
Where “pattern” is
the character string you are searching for and expression is the string
in which you are searching. Commonly the expression is a column in a
table. The “%” sign is needed on the front and back of the pattern, unless you
are searching for the pattern at the beginning and/or ending of the expression.
Like the CHARINDEX function,
the PATINDEX function returns the starting position of the pattern within the string
being searched. If you have a PATINDEX function call like so:
PATINDEX(‘%BC%’,’ABCD’)
Then the result of the
PATINDEX function call is 2, which is the same as the CHARINDEX function. The
%’s in the above command tells the PATINDEX command to find the position of the
“BC” string where the string might have zero or more characters in front of, or
after “BC”. The % sign is a wildcard character.
If you want to determine if
a string starts with a specific set of characters you would leave off the first
% sign, and your PATINDEX call would look like this:
PATINDEX(‘AB%’,’ABCD’)
In this case the PATINDEX
function returns a 1, indicating that the pattern ‘AB’ was found in the
expression ‘ABCD’.
Now with wildcard characters
you can create a much more complicated pattern matching situation then the simple
ones I have shown you so far. Say you want to determine if a character string
contains the letters A and Z, as well as any numeric number. Then your
PATINDEX function call might look like this.
PATINDEX(‘%[A,Z,0-9]%[A,Z,0-9]%[A,Z,0-9]%’,’XYZABC123′)
Note that the pattern I am
looking for in the above example uses a number of wildcard references. Review
SQL Server Books Online for other wildcard syntax information. Let’s go through
a couple of examples in how we might use the PATINDEX command in conjunction
with a SELECT statement.
Say you want to find all of
the records that contain the words “Bread”, or “bread” in the Description text
column in the Northwind.dbo.Categories table, then your select statement would
look like this:
select Description from Northwind.dbo.Categories where patindex('%[b,B]read%',description) > 0
Here I used wildcarding to
look for either a lower or upper case “b”. When I run this SELECT statement against
my Northwind database I get the following Description columns displayed:
Description
——————————————————–
Desserts, candies, and sweet breads
Breads, crackers, pasta, and cereal
Here is another example
where I used some additional wildcard references to find some records. This
example excludes the Description that has an ‘e’ as the second letter from the
result set in the above example.
select Description from Northwind.dbo.Categories
where patindex(‘%[b,B]read%’,description) > 0
and patindex(‘_[^e]%’,description) = 1
By adding an additional
PATINDEX function call, to the where statement, that used the ^ wildcard
symbol, I was able to exclude the “Dessert, candies, and sweet breads”
description. The above example returned the following single description.
Description
——————————————————–
Breads, crackers, pasta, and cereal
Conclusion
As you can see the CHARINDEX
and the PATINDEX perform similar kinds of pattern searches within a character
string. The PATINDEX function provides wildcard specifications, allowing it to
be used for much different kinds of pattern matches, while the CHARINDEX
function does not. Depending on what you need to do, these two functions are
great in helping you search, manipulate and parse character strings in SQL
Server.