Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Feb 19, 2003

Beware of Mixing Collation with SQL Server 2000 - Part 1

By Gregory A. Larsen

A few months ago, I was running a stored procedure (SP) that performed a cross database join on our SQL Server 2000 box. This particular procedure was processing records from a table in one database (database A), that was originally created on our SQL Server 2000 box, while also processing records from a table in another database (database B), that was also on our SQL Server 2000 box, but was created by restoring a SQL Server 7.0 database backup. The SQL Server 7 machine had a different character set and sort order than our SQL Server 2000 database. The SELECT statement the SP was running was joining records from the table in database A with the table in database B on a varchar key field. Can you imagine what happened? The SP failed with a "collation error."

SQL Server 2000 finally solved the problem of taking database backups from servers with a different character set / sort order and restoring them, without getting a collation error. Before SQL Server 2000, backups from one server could only be restored on a different server if the same character set and sort order where used by both servers. With SQL Server 2000, you are able to set the collating sequence at the server, database, and even the column level. Now that you can have multiple collating sequences on a single database server, there are a few things you should understand about mixing collations.

When you have character data stored with different collation sequences, you need to be aware that comparing columns becomes more involved. Columns with the same collating sequences can be compared or joined without any additional considerations. Although when you compare columns that have different collating sequences, it requires some special coding considerations.

Let's look at an example that will demonstrate what I am talking about. The example below tries to join data from two different tables on columns that have different collating sequences.

create table #a (char_set1 varchar(50) collate Latin1_General_CI_AS)
create table #b(char_set2 varchar(50)collate Latin1_General_BIN)
insert into #a values ('collate')
insert into #b values ('collate')
select * from #a join #b on char_set1 = char_set2

When the code above is executed, the select statement returns the following error:

Cannot resolve collation conflict for equal to operation.

Why does the SELECT statement get this error? This error happens because char_set1 column in table #a has collation of Latin1_General_CI_AS, where as char_set2 column in table #b uses a collation of Latin1_General_BIN.

Of course, SQL Server allows you to write code so you can join the above two tables, even though the join columns each have a different collation setting. By Changing the SELECT statement so that one of the columns does an explicit collation conversion prior to joining, resolves the issue shown above. Below char_set2, which has a collation setting of Latin1_General_BIN is explicitly converted to Latin1_General_CI_AS using the COLLATE clause:

select * from #a join #b on char_set1 = char_set2 collate Latin1_general_CI_AS

By using this explicit conversion from one collation to another you can then join both tables. Now if you really have to store and compare columns with different collation, then SQL Server is great at providing you this functionality. Although if you have no business requirements to store data with different collations then by all means don't. Storing data of varying collations causes additional unnecessary programming considerations therefore choose your collation requirements wisely.

In our environment, we have many contractors doing development. These contractors typically develop our databases on their own SQL Server box. Their server and/or database collation setting may or may not be the same as ours. Occasionally a contractor will send me a database backup and tell me to restore it onto our test server. This is easily done with SQL Server 2000, regardless of the collation setting on the developer's machine. Every so often when I do this, I find that the restored database has a different collation setting then our server. When this happens I have to convert the data in the restored database to our standard collation. (This is another topic that I will discuss in part 2). I do this conversion because we require all our data to have the same collation setting. If you are restoring databases from other SQL Server machines, be aware that you might be restoring data that has a different collation setting then your server.

I'm going to leave you with one more example of why knowing your collating setting is extremely import.

The TEMPDB database is created each time the server is rebooted. TEMPDB has the same default collation sequence as the model DB. If for some reason you have a SQL Server 2000 database that has a different sort sequence than your current TEMPDB collation setting, then you potentially might have a problem when comparing data between your application and TEMPDB.

Here is an example of the TEMPDB comparison I am talking about. This example creates database test with a collation of Latin1_General_BIN, then creates and populates table b with data. Next, it builds table #temp in TEMPDB and inserts records into it from table b. The columns in TEMPDB use the model db collation sequence settings, which is in my case is Latin1_general_CI_AS.

create database test collate Latin1_General_BIN
go
use test
go
Create table b(char_set1 varchar(50))
insert into b values ('collate')
CREATE TABLE #TEMP (CHAR_SET2 VARCHAR(50))
insert into #temp select * from b
select * from #temp
select * from #temp t join b on t.char_set2 = b.char_set1

When I run the above code, I get the collation conflict error on the second SELECT statement. If the two databases had the same collation sequence this error would not occur. If I change the second SELECT statement to read as follows, the join will work without a collation conflict:

select * from #temp t join b on t.char_set2 collate Latin1_General_BIN = b.char_set1

There are a couple of different ways to determine your default sort order and character set. The sp_helpsort SP can be used to determine the character set / sort order of your server. For determining the database character set / sort order use sp_helpdb. Now if you also what to know the sort order / character set for each column in a database then use the information_schema.columns view to display this information.

Conclusion

Being able to restore databases onto a SQL Server 2000 box, without requiring the target server to have the same character set and sort order as the source server, is a big improvement over SQL Server 7.0. Although you need to be aware of the collation problems that might occur if your SQL Server 2000 machine contains databases and columns with varying collations. If you really do not have a business requirement to store data with different collations, then verify collation information every time you restore a database where you are unsure of the collation setting on the source server/database. The bottom line is you had better know your column collation settings in order to avoid the collation conflict error message.

» See All Articles by Columnist Gregory A. Larsen



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM


















Thanks for your registration, follow us on our social networks to keep up-to-date