SQL Server and Collation

What is collation?

Collation refers to a set of rules that determine how data
is sorted and compared. Character data is sorted using rules that define the
correct character sequence, with options for specifying case-sensitivity, accent
marks, kana character types and character width.

Case sensitivity

If A and a, B
and b, etc. are treated in the same way then it is
case-insensitive. A computer treats A and a
differently because it uses ASCII code to differentiate the input. The ASCII
value of A is 65, while a is 97. The
ASCII value of B is 66 and b is 98.

Accent sensitivity

If a and á, o and ó are
treated in the same way, then it is accent-insensitive. A computer treats a and á differently because it uses ASCII code for
differentiating the input. The ASCII value of a is 97 and áis 225. The ASCII value of o is 111 and ó
is 243.

Kana Sensitivity

When Japanese kana characters
Hiragana and Katakana are treated differently, it is called Kana sensitive.

Width sensitivity

When a single-byte character
(half-width) and the same character when represented as a double-byte character
(full-width) are treated differently then it is width sensitive.

Database, Tables and columns with different collation

SQL Server 2000 allows the users
to create databases, tables and columns in different collations.

Databases with different collation


use master
go
create database BIN collate Latin1_General_BIN
go
create database CI_AI_KS collate Latin1_General_CI_AI_KS
go
create database CS_AS_KS_WS collate Latin1_General_CS_AS_KS_WS
go

Tables and columns with different collation


Create table Mytable (
[colu] char(10) COLLATE Albanian_CI_AI_KS_WS NULL,
[Maydate] [char] (8) COLLATE Korean_Wansung_Unicode_CS_AS_KS NOT NULL ,
[Risk_Rating] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
)

Comparing characters on the databases with different collation

When we run the below code in CI_AI_KS and CS_AS_KS_WS the results will be completely
different.


declare @Accentvar1 char(1)
declare @Accentvar2 char(1)
declare @Casevar1 char(1)
declare @Casevar2 char(1)
set @casevar1 =’A’
set @casevar2 =’a’
set @Accentvar1 =’a’
set @Accentvar2 =’á’

if @casevar1 = @casevar2
begin
print “A and a are treated same”
end
else
begin
print “A and a are not treated same”
end

if @Accentvar1 = @Accentvar2
begin
print “A and á are treated same”
end
else
begin
print “A and á are not treated same”
end

When we execute these statements
on a CI_AI_KS database, the results are similar to those shown below.


A and a are treated same
A and á are treated same

When we execute these statements
on a CS_AS_KS_WS database, the results are similar to those shown below.


A and a are not treated same
A and á are not treated same

Simulating case sensitivity in a case in-sensitive database

It is often necessary to simulate case
sensitivity in a case insensitive database. The example below shows how you can
achieve that.


Use CI_AI_KS
go
declare @var1 varchar(10)
declare @var2 varchar(10)
set @var1 =’A’
set @var2 =’a’
if ASCII(@var1) = ASCII(@var2)
print “A and a are treated same”
else
print “A and a are not same”

However, the function ASCII
cannot be used for words. In order to achieve the same functionality of
simulating case sensitiveness, we can use the varbinary data type.


Use CI_AI_KS
go
declare @var1 varchar(10)
declare @var2 varchar(10)
set @var1 =’Good’
set @var2 =’gooD’
if cast(@var1 as varbinary) = cast(@var2 as varbinary)
print “Good and gooD are treated same”
else
print “Good and gooD are not treated same”

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles