Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Jan 31, 2007

Dealing With Upper and Lower Case Data

By Gregory A. Larsen

When working with data in SQL Server you might run across different needs to change the case of the data. For instance, you might want to convert all the characters in a varchar column to upper case, lower case, or change only the first character of a column value to upper case. In addition, you might want to compare different character strings and either have them be equal or not equal when the characters are the same but the case is different. In this article, I will show you different examples of how to deal with different situations related to the case of character strings.

What is Collation?

Each server, database, or character column has a default collation. A collation is a definition for how each character is represented, stored and compared. Part of the collation setting, known as case sensitivity, determines if an upper case “A” and a low case “a” when compared against either other are the same or different. If the collation is case-insensitive then a lower case “a” and an upper case “A” when compared are the same. So when using case insensitive data the following two strings are equal, “AaAaA” and “aaaaa”. On the other hand, if your collation setting is case-sensitive then the preceding two strings of A’s will not be equal. Therefore, when you are working with data you need to understand what the case sensitivity is for the character data you are dealing with.

UPPER and LOWER Function

Within SQL Server, two functions are provided to manage the case of character strings. These functions are UPPER and LOWER. The UPPER function accepts a character string as a parameter and returns the same character string where all the characters have been converted to upper case. If the character string “AbCd1234” is passed to the UPPER function, then this function returns “ABCD1234”. The LOWER function does just the opposite and converts all characters it receives to lower case, so if “AbCd1234” is passed to the LOWER function then it would return “abcd1234”.

Converting the First Character of Each Column to Upper Case

For my first example let me show you how to convert the first character of each column value to upper case and lower casing the rest of the column. Here is the T-SQL code for my example:

SET NOCOUNT ON 
IF exists (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[mytable]') 
 and OBJECTPROPERTY(id, N'IsUserTable') = 1)
DROP TABLE [mytable]
GO
CREATE TABLE [mytable] (
A CHAR(10))
go
INSERT INTO mytable VALUES('aAaA')
INSERT INTO mytable VALUES('bBbB')
INSERT INTO mytable VALUES('cCcC')
INSERT INTO mytable VALUES('dDdD')
INSERT INTO mytable VALUES('DDdD')
-- Uppercase first letter of each value in column
SELECT UPPER(SUBSTRING(A,1,1)) + LOWER(SUBSTRING(A,2,LEN(A))) 
 [Upper Case First Letter Only] 
FROM mytable
DROP TABLE mytable

In this example, I created a table “mytable” that contained a single column. I then insert five different records into this table. The column in each row contains a character string that has a mixed set of cases. In the SELECT statement above, I use the substring function to get the first character of column “A”. I then pass that first character to the UPPER function. This upper cases the first character of column “A”. I then use the concatenation operator (+) to append the rest of column “A” to that upper cased first character. Before the rest of column “A” is concatenated to the first character, it is lower cased by using the LOWER function on the substring of column “A” starting at second character of column “A”. Here is the output I get when I run the above script:

Upper Case First Letter Only
----------------------------
Aaaa 
Bbbb 
Cccc 
Dddd 
Dddd

Proper Casing a Persons Name

In this example, I will take a column that contains a persons name and proper case the name. By proper case I mean I will take the first character of each name (First, Middle, Last, etc.) and upper case it and then lower case the rest of each name.

The bulk of my proper case example is coded in a T-SQL function. The function handles proper casing any number of names. It also handles multiple spaces between names, Irish type names, and with apostrophe or parentheses within in the name. Here is the code for my proper case function:

IF OBJECT_ID (N'dbo.ProperCase', N'FN') IS NOT NULL
    DROP FUNCTION dbo.ProperCase
GO
CREATE FUNCTION ProperCase (@String VARCHAR(8000)) RETURNS VARCHAR(8000) AS BEGIN DECLARE @TempString VARCHAR(8000) DECLARE @PS VARCHAR(8000)
SET @PS = ''
-- lower case entire string
SET @TempString = lower(@String)
WHILE patindex('%[-( '']%',@TempString) > 0 BEGIN
  -- Check to see if first character of @TempString is whitespace
  IF (patindex('%[-( '']%',SUBSTRING(@TempString,1,1)) > 0)
  BEGIN
    SET @PS = @PS + SUBSTRING(@TempString,1,1)
  END
  ELSE -- @TempString starts with a Name
  BEGIN
   IF SUBSTRING(@TempString,1,2) = 'mc'
   BEGIN
     SET @PS = @PS + 'Mc'
     SET @TempString = SUBSTRING(@Tempstring,3,LEN(@TempString))
   END
   IF SUBSTRING(@TempString,1,3) = 'mac'
   BEGIN
     SET @PS = @PS + 'Mac'
     SET @TempString = SUBSTRING(@Tempstring,4,LEN(@TempString))
   END
      
    -- upper case first character and return string up to the next space
    SET @PS = @PS + UPPER(SUBSTRING(@TempString,1,1)) +
     SUBSTRING(@TempString,2,patindex('%[-( '']%',@TempString)-1)
    
    
  END
  -- truncation string that we have already processed
  
  SET @TempString = SUBSTRING(@TempString,
     patindex('%[-( '']%',@TempString)+1,LEN(@TempString))
  -- Trim off leading spaces
  SET @TempString = LTRIM(@TempString)
END
IF SUBSTRING(@TempString,1,2) = 'mc'
BEGIN
  SET @PS = @PS + 'Mc'
   SET @TempString = SUBSTRING(@Tempstring,3,LEN(@TempString))
END
IF SUBSTRING(@TempString,1,3) = 'mac'
BEGIN
  SET @PS = @PS + 'Mac'
  SET @TempString = SUBSTRING(@Tempstring,4,LEN(@TempString))
END
-- proper case last word/name
SET @PS = @PS + UPPER(SUBSTRING(@TempString,1,1)) +
SUBSTRING(@TempString,2,LEN(@TempString))
-- check for spaces in front of special characters
SET @PS = Replace(@PS,' -','-')
SET @PS = Replace(@PS,' ''','''')

RETURN (@PS)
END

Here is an example of how to execute my ProperCase function:

CREATE TABLE #temp (
PersonName VARCHAR(100))
INSERT INTO #temp VALUES('GREGory A. LARSEN')
INSERT INTO #temp VALUES('TODD JOShUA   SMITH')
INSERT INTO #temp VALUES('HARRY  JOnes')
INSERT INTO #temp VALUES('Danielle i. RobBins')
INSERT INTO #temp VALUES('mr. john d. johnson')
INSERT INTO #temp VALUES('Jon o''connor') 
INSERT INTO #temp VALUES('mARtiN mCGalpan')
INSERT INTO #temp VALUES('sARah   maCdonald')
INSERT INTO #temp VALUES('maRy   Smith-foster')
INSERT INTO #temp VALUES('Robert (bob)  smith')
SELECT dbo.ProperCase(PersonName) [Proper Case Names] FROM #temp
DROP TABLE #temp 

When I run the above code, I get the following output:

Proper Case Names
-------------------------------------
Gregory A. Larsen
Todd Joshua   Smith
Harry  Jones
Danielle I. Robbins
Mr. John D. Johnson
Jon O'Connor
Martin McGalpan
Sarah MacDonald
Mary Smith-Foster
Robert (Bob) Smith

My code above, that calls my proper case function, first creates and populates a temporary table that contains a PersonName column where the names are specified in a mix of upper and lower case characters. In the SELECT statement, I call the ProperCase function and pass it the PersonName column. The ProperCase function takes the passed parameter string and first converts all the characters to lower case using the LOWER function and sets a variable @TempString to the lower case value. The function then processes through the @TempString with a WHILE loop looking for special characters in the name (dash, parentheses, spaces or a quote). Each pass through the WHILE loop, the @TempString is evaluated to determine if the first character is one of these special characters, or is it the first character of a name. If a special character is found the special character is added to the @PS variable. When a new name is found, (first character is not a special character) then the case of the first character is converted to upper case, and then both the first character of the name and the rest of the name are added to the @PS string. The @TempString is shortened to only include the characters after the first space. The WHILE loop continues processing until the @TempString contains no more special character. Once the WHILE loop completes, the final part of the name is then proper cased and added to the @PS variable. The @PS variable is then returned to the calling SELECT statement.

The function makes it easy to implement a process to convert a character string to proper case. Having the proper case code embedded in a function simplifies the code for the SELECT statement to proper case a column.

Comparing Columns with Different Case

As stated earlier the collation of a character string will determine how SQL Server compares character strings. If you store your data using a case-insensitive format then when comparing the character string “AAAA” and “aaaa” they will be equal. If you want to be able to compare two case-insensitive strings and have the comparison be false when the two strings are the same except for the case of the characters then you need to use the COLLATE clause when comparing the two strings. To demonstrate look at the following code:

CREATE TABLE #mytable (
C CHAR(10) COLLATE Latin1_General_CI_AS)
INSERT INTO #mytable VALUES('aaaa')
SELECT C FROM #mytable WHERE C = 'AAAA' 
SELECT C FROM #mytable WHERE C = 'AAAA' COLLATE Latin1_General_CS_AS
DROP TABLE #mytable

When I run this code on my machine, I get the following output.

C
----------
aaaa      
(1 row(s) affected)
C
----------
(0 row(s) affected)

Here you can see the WHERE statement in the first SELECT statement returned the single record from #mytable. The WHERE statement evaluates to true even though the value of column C contains 4 lower case a’s and it was being compared to 4 upper case A’s. This WHERE statement was true because both the column C and the character string “AAAA” have a collation of Latin1_General_CI_AS, which is a case-insensitive collation. A character string is assigned the default collation of the database when the string is not associated with a COLLATE clause. To make sure the 4 upper case A’s don’t match the 4 lower case a’s in column C I need to specify a case-sensitive collation on one side of the equal operator. By specifically casting the character string “AAAA” to a case-sensitive collation using the “COLLATE Latin_General_CS_AS” clause in the second SELECT statement I forced the sting on the right of the equal operator to be case sensitive, which caused “aaaa” to not equal “AAAA”.

Conclusion

This article provided you with some examples of how to physically change the case of some column values and how to set the collation of a character string. Depending on your application’s needs you might need to convert or compare character strings that contain upper and lower case characters. The use of the UPPER and LOWER functions allows you to convert your character strings to whatever case you need. The COLLATE clause helps you explicitly convert a string of characters to the desired collation. When comparing character strings it is important to understand the case-sensitivity of all strings to make sure the outcome of your comparison is what you expect.

» See All Articles by Columnist Gregory A. Larsen



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM


















Thanks for your registration, follow us on our social networks to keep up-to-date