The Globalization of Language in Oracle - The NLS_LANG variable
April 6, 2005
Let's zero in on the most important variable for implementing a global database in Oracle and some lessons learned.
Firstly I need to thank the professionals at Oracle for their help in assisting me in getting the facts straight on these globalization articles. If you haven't noticed, the facts and methods to setting the NLS_LANG variable, in the last article of this series, has been revised. I would encourage you to go back and read this last article as the examples are now more complete. This article is geared toward setting things straight around the NLS_LANG variable and some of the pitfalls I fell into.
If I have not already said it yet, and I will say it again in the future, if there were one variable that dictated how globalization in Oracle works, doesn't work, or just makes a plain mess of things, it would be the NLS_LANG variable. I have never had a single variable that could play such havoc on my testing methodologies and utterly confuse me at times. Just take a look at the next two exhibits from Oracle documentation. They seem quite harmless and straight to the point in how to set NLS_LANG. But after hours--yes hours--of scratching my head, and finally talking to the good people at Oracle I learned two things here. One, Windows does not natively support UTF8 (exhibit 1) and this may be considered an error in the Oracle documentation. In exhibit 2, even though I had set the NLS_LANG to a valid setting, if I ran SQL*Plus MS-DOS mode (sqlplus), I still had an issue with running .sql scripts into my database. Now let me say that exhibit 2 IS NOT WRONG but I just took it at face value for running SQL*Plus. If I had run SQL*Plus Windows Mode (sqlplusw) I would have had no problems. Actually I am very glad I was using SQL*Plus MS-DOS mode as I have truly learned more that way.
Truly, there does not have to be mass confusion when working with this NLS_LANG variable. Here are some of my lessons learned. I hope you benefit from them, I am sure there are more to come, but here is what I have to date. These lessons are from me running in a total Windows environment. Although, since I have discovered that all my lessons learned, do not rely upon the database at all, as NLS_LANG is really a client side parameter.
1. The Oracle character set is synonymous with the Windows code page. Basically, they mean the same thing.
2. For all intents and purposes, in most Windows' environments, the Windows command line mode has a different code page (uses a different character set) than the Windows GUI. This in particular was a huge thing for me to get use to as I have always used the DOS command mode SQL*Plus (sqlplus). When I switched to the Windows GUI SQL*Plus version (sqlplusw) all my problems went away. This is extremely important if you wish to run some scripts through sqlplus. Don't get into the trap of editing SQL commands through a GUI editor such as notepad and expect them to be recognized at the DOS command mode. You can easily verify this yourself by opening up notepad, typing in an international character such as 'Ä', saving the file, and then using the DOS command 'TYPE' to display the contents of the file. You will get something totally different. You can get around this by using the DOS command EDIT for your files but I don't know anyone who would want to. I would suggest to everyone to start using the Windows version of SQL*Plus. I have also been told that the web based iSQL*Plus is even better for globalization and representing international characters.
3. The NLS_LANG setting must reflect the character set (code page) that you are using on your client. You cannot switch to a different language, such as German, just by setting the NLS_LANG to GERMAN_GERMANY.
4. A client machine's locale is determined by the Regional options. My locale was set for English. So if you want to change to a different language you need to switch your locale at the operating system level to that region.
5. You can change your client locale very easily by going into the Regional options and changing to a different locale. Start -> Settings -> Control Panel -> Regional Options
6. NLS_LANG does not reflect the character set of the target database server or instance. Setting your NLS_LANG to the target database character set is a decision that will cause you issues down the road.
7. The NLS_LANG must reflect the character set (code page) that you are using on your client and is solely responsible for assisting Oracle in the conversion of characters from your client character set to the database server character set.
8. The LANGUAGE and TERRITORY parts of the NLS_LANG parameter do nothing for defining the character set. They are solely used for the presentation formats of such items as messages, dates, times, currencies, and numbers.
9. Read the manuals on this one. I have often been able to hack through setting Oracle parameters and quickly determining what is happening. Character sets, code pages, and getting a true understanding of the history and current implications is not trivial at times. You will no doubt need to visit not only Microsoft's web site for code page information but also try out www.unicode.org to assist you in your understanding.
In this ever-growing global economy and with the number of mergers and consolidations going on, it will become more and more important for us to provide for a more open database environment that allows for this consolidation and proper movement of data between disparate systems. The underling link for this to happen is opening up our databases for international character sets and allowing for the globalization of our databases. The two most important steps for this to happen is for us to first choose a character set at the database level that supports the regions we will be expected to support and secondly getting a handle on those different regions and verifying that we can set NLS_LANG such that those regions can provide and retrieve information from the database properly.