Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Slideshows
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

Oracle

Posted November 21, 2016

Storing Numbers As Strings In An Oracle Database

By David Fitzjarrell

The predominance of business applications designed with no particular database in mind is growing/ With that growth arises an interesting issue: designing tables to contain various types of data in a single column. It may be due to the nature of the data needing to be stored -- telephone numbers, Social Security numbers, ID numbers of various sorts all of which are called numbers but contain non-numeric components making them unsuitable for traditional numeric data types. Sometimes, though, 'convenience' can generate table designs that are flawed by allowing multiple types of data to be stored in a single column, such as character strings and numeric values. Oracle, like any other relational database, allows this and on the face of it this doesn't seem strange. Converting numbers to characters, though, produces results that can be unexpected, especially when sorting those values. Let's examine that situation and see what results it can produce.

Numbers are, well, numbers and sort accordingly. 1 is always 1, and always comes before 2, which comes before 3 and so on. Convert those numbers to characters and an entirely new order is produced, because the character code for '1' isn't the same as the numeric value of 1. A portion of the ASCII table for the character representations of the numbers 0 through 9 follows:

`

BinaryOctDecHexGlyph
196319651967
011 0000 060 48 30 0
011 0001 061 49 31 1
011 0010 062 50 32 2
011 0011 063 51 33 3
011 0100 064 52 34 4
011 0101 065 53 35 5
011 0110 066 54 36 6
011 0111 067 55 37 7
011 1000 070 56 38 8
011 1001 071 57 39 9

With ASCII representation there are only 10 'digits' represented; for multi-digit numbers each digit is replaced with its ASCII code. As such any 'number' represented in ASCII that starts with '1' will all sort together since it's not the display value, but the decimal code that's being sorted on (the value in the third column of the provided table). This results in '10' (ASCII codes 49,48) directly following '1' (ASCII code 49), and '2' (ASCII code 50) not appearing until after all of the 'numbers' starting with '1' have been displayed. To prove this let's look at an example where the lyrics to "Deck The Halls" are loaded into a table, along with their associated line numbers. Two tables will be used, the first where the line_no column is a character string, the second declaring line_no as the traditional NUMBER type. We begin:


SQL>
SQL> --
SQL> -- Create demo table
SQL> --
SQL> -- Make line_no a character string
SQL> --
SQL>
SQL> create table order_tst (line_no varchar2(5), verse varchar2(80));

Table created.

SQL>
SQL> --
SQL> -- Add primary key
SQL> --
SQL>
SQL> alter table order_tst add constraint ord_tst_pk primary key(line_no);

Table altered.

SQL>
SQL> --
SQL> -- Insert data, in order, with line numbers
SQL> --
SQL> -- Unfortunately the line 'numbers' are character
SQL> -- strings
SQL> --
SQL>
SQL> insert all
  2  into order_tst values('1	 ','Deck the halls with boughs of holly,')
  3  into order_tst values('2	 ','Fa la la la la la la la!')
  4  into order_tst values('3	 ','''Tis the season to be jolly,')
  5  into order_tst values('4	 ','Fa la la la la la la la!')
  6  into order_tst values('5	 ','Don we now our gay apparel,')
  7  into order_tst values('6	 ','Fa la la la la la la la!')
  8  into order_tst values('7	 ','Troll the ancient Yuletide carol,')
  9  into order_tst values('8	 ','Fa la la la la la la la!')
 10  into order_tst values('9	 ','************************************')
 11  into order_tst values('10	 ','See the blazing yule before us,')
 12  into order_tst values('11	 ','Fa la la la la la la la!')
 13  into order_tst values('12	 ','Strike the harp and join the chorus,')
 14  into order_tst values('13	 ','Fa la la la la la la la!')
 15  into order_tst values('14	 ','Follow me in merry measure,')
 16  into order_tst values('15	 ','Fa la la la la la la la!')
 17  into order_tst values('16	 ','While I tell of Yuletide treasure,')
 18  into order_tst values('17	 ','Fa la la la la la la la!')
 19  into order_tst values('18	 ','************************************')
 20  into order_tst values('19	 ','Fast away the old year passes,')
 21  into order_tst values('20	 ','Fa la la la la la la la!')
 22  into order_tst values('21	 ','Hail the new, year lads and lasses,')
 23  into order_tst values('22	 ','Fa la la la la la la la!')
 24  into order_tst values('23	 ','Sing we joyous all together!')
 25  into order_tst values('24	 ','Fa la la la la la la la!')
 26  into order_tst values('25	 ','Heedless of the wind and weather,')
 27  into order_tst values('26	 ','Fa la la la la la la la!')
 28  into order_tst values('27	 ','************************************')
 29  select * From dual;

27 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>

Notice that the data was loaded 'in order'; unfortunately the table is a heap table and no order is maintained as the data is inserted. Knowing this an ORDER BY clause is used, but since the line_no values are VARCHAR2 strings they sort in what seems to be a rather strange order:


SQL>
SQL> --
SQL> -- Retrieve data
SQL> --
SQL> -- It doesn't come out as one might expect
SQL> --
SQL> -- Even with an order by
SQL> --
SQL>
SQL> select *
  2  from order_tst
  3  order by 1;

LINE_ VERSE
----- --------------------------------------------------------------------------------
1     Deck the halls with boughs of holly,
10    See the blazing yule before us,
11    Fa la la la la la la la!
12    Strike the harp and join the chorus,
13    Fa la la la la la la la!
14    Follow me in merry measure,
15    Fa la la la la la la la!
16    While I tell of Yuletide treasure,
17    Fa la la la la la la la!
18    ************************************
19    Fast away the old year passes,
2     Fa la la la la la la la!
20    Fa la la la la la la la!
21    Hail the new, year lads and lasses,
22    Fa la la la la la la la!
23    Sing we joyous all together!
24    Fa la la la la la la la!
25    Heedless of the wind and weather,
26    Fa la la la la la la la!
27    ************************************
3     'Tis the season to be jolly,
4     Fa la la la la la la la!
5     Don we now our gay apparel,
6     Fa la la la la la la la!
7     Troll the ancient Yuletide carol,
8     Fa la la la la la la la!
9     ************************************

27 rows selected.

SQL>

More work is required since the string data needs to be converted to numeric data before the ORDER BY will perform as we expect:


SQL>
SQL> --
SQL> -- Retrieve data, and sort by line_no
SQL> -- as a number, not a character string
SQL> --
SQL> -- It is more work to convert numbers
SQL> -- stored as characters back to numbers
SQL> --
SQL>
SQL> select *
  2  from order_tst
  3  order by to_number(line_no);

LINE_ VERSE
----- --------------------------------------------------------------------------------
1     Deck the halls with boughs of holly,
2     Fa la la la la la la la!
3     'Tis the season to be jolly,
4     Fa la la la la la la la!
5     Don we now our gay apparel,
6     Fa la la la la la la la!
7     Troll the ancient Yuletide carol,
8     Fa la la la la la la la!
9     ************************************
10    See the blazing yule before us,
11    Fa la la la la la la la!
12    Strike the harp and join the chorus,
13    Fa la la la la la la la!
14    Follow me in merry measure,
15    Fa la la la la la la la!
16    While I tell of Yuletide treasure,
17    Fa la la la la la la la!
18    ************************************
19    Fast away the old year passes,
20    Fa la la la la la la la!
21    Hail the new, year lads and lasses,
22    Fa la la la la la la la!
23    Sing we joyous all together!
24    Fa la la la la la la la!
25    Heedless of the wind and weather,
26    Fa la la la la la la la!
27    ************************************

27 rows selected.

SQL>

The song lyrics are now in proper sequence, and it took an explicit conversion between string and numeric data to effect that. It is important to note that had there been mixed data, or phone numbers, Social Security numbers or any data that wasn't fully numeric the conversion would have failed with the following error:


ERROR at line 1:
ORA-01722: invalid number

...making it difficult to produce the 'proper' order for the output. That may be an acceptable trade-off to have the 'convenience' of having a 'universal' field to store actual numbers, pseudo-numbers (SSN, ID numbers, telephone numbers) and actual character data (alpha-numeric product 'numbers' and the like). It also can make more work for the database, since character strings will behave differently with indexed access. (Indexing strings goes back to the nature of an ASCII string, which is a concatenation of individual characters and causes Oracle to collate the strings and sort according to the leading character, next character, etc. to put the strings in proper order. Indexing numbers doesn't require such 'gyrations' as each number, regardless of length, is a single entity which can be ordered by magnitude.) Dropping the original table, re-creating it using the proper data type (NUMBER) for the line_no column and loading the modified data the query is run again, absent the ORDER BY since the data was loaded 'in order' and line_no is actually a number field. This fixes the additional work needed to convert strings to numbers but also reveals the inherent 'problem' of data order in a heap table (there isn't any):


SQL>
SQL> --
SQL> -- Drop existing table
SQL> --
SQL>
SQL> drop table order_tst;

Table dropped.

SQL>
SQL> --
SQL> -- Create demo table again
SQL> -- Make line_no a number
SQL> --
SQL>
SQL> create table order_tst (line_no number, verse varchar2(80));

Table created.

SQL>
SQL> --
SQL> -- Add primary key
SQL> --
SQL>
SQL> alter table order_tst add constraint ord_tst_pk primary key(verse, line_no);

Table altered.

SQL>
SQL> --
SQL> -- Insert data, again in order
SQL> --
SQL>
SQL> insert all
  2  into order_tst values(1,'Deck the halls with boughs of holly,')
  3  into order_tst values(2,'Fa la la la la la la la!')
  4  into order_tst values(3,'''Tis the season to be jolly,')
  5  into order_tst values(4,'Fa la la la la la la la!')
  6  into order_tst values(5,'Don we now our gay apparel,')
  7  into order_tst values(6,'Fa la la la la la la la!')
  8  into order_tst values(7,'Troll the ancient Yuletide carol,')
  9  into order_tst values(8,'Fa la la la la la la la!')
 10  into order_tst values(9,'************************************')
 11  into order_tst values(10,'See the blazing yule before us,')
 12  into order_tst values(11,'Fa la la la la la la la!')
 13  into order_tst values(12,'Strike the harp and join the chorus,')
 14  into order_tst values(13,'Fa la la la la la la la!')
 15  into order_tst values(14,'Follow me in merry measure,')
 16  into order_tst values(15,'Fa la la la la la la la!')
 17  into order_tst values(16,'While I tell of Yuletide treasure,')
 18  into order_tst values(17,'Fa la la la la la la la!')
 19  into order_tst values(18,'************************************')
 20  into order_tst values(19,'Fast away the old year passes,')
 21  into order_tst values(20,'Fa la la la la la la la!')
 22  into order_tst values(21,'Hail the new, year lads and lasses,')
 23  into order_tst values(22,'Fa la la la la la la la!')
 24  into order_tst values(23,'Sing we joyous all together!')
 25  into order_tst values(24,'Fa la la la la la la la!')
 26  into order_tst values(25,'Heedless of the wind and weather,')
 27  into order_tst values(26,'Fa la la la la la la la!')
 28  into order_tst values(27,'************************************')
 29  select * From dual;

27 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>

Retrieving the data from the table without an ORDER BY produces interesting results, as the expected 'order' doesn't exist. Oracle places rows where they will fit in the heap, so the fifth row inserted may end up in the 'second' slot in the data block because it fits better than any prior row, behavior which produces the results shown below:


SQL>
SQL> --
SQL> -- Retrieve data
SQL> --
SQL> -- It doesn't come out as one might expect
SQL> --
SQL>
SQL> select *
  2  from order_tst;

   LINE_NO VERSE
---------- --------------------------------------------------------------------------------
         3 'Tis the season to be jolly,
         9 ************************************
        18 ************************************
        27 ************************************
         1 Deck the halls with boughs of holly,
         5 Don we now our gay apparel,
         2 Fa la la la la la la la!
         4 Fa la la la la la la la!
         6 Fa la la la la la la la!
         8 Fa la la la la la la la!
        11 Fa la la la la la la la!
        13 Fa la la la la la la la!
        15 Fa la la la la la la la!
        17 Fa la la la la la la la!
        20 Fa la la la la la la la!
        22 Fa la la la la la la la!
        24 Fa la la la la la la la!
        26 Fa la la la la la la la!
        19 Fast away the old year passes,
        14 Follow me in merry measure,
        21 Hail the new, year lads and lasses,
        25 Heedless of the wind and weather,
        10 See the blazing yule before us,
        23 Sing we joyous all together!
        12 Strike the harp and join the chorus,
         7 Troll the ancient Yuletide carol,
        16 While I tell of Yuletide treasure,

27 rows selected.

SQL>

The ORDER BY is necessary, even for numeric data, but it doesn't require additional thought to produce the expected results:


SQL>
SQL> --
SQL> -- Retrieve data, and sort by line_no
SQL> --
SQL>
SQL> select *
  2  from order_tst
  3  order by 1;

   LINE_NO VERSE
---------- --------------------------------------------------------------------------------
         1 Deck the halls with boughs of holly,
         2 Fa la la la la la la la!
         3 'Tis the season to be jolly,
         4 Fa la la la la la la la!
         5 Don we now our gay apparel,
         6 Fa la la la la la la la!
         7 Troll the ancient Yuletide carol,
         8 Fa la la la la la la la!
         9 ************************************
        10 See the blazing yule before us,
        11 Fa la la la la la la la!
        12 Strike the harp and join the chorus,
        13 Fa la la la la la la la!
        14 Follow me in merry measure,
        15 Fa la la la la la la la!
        16 While I tell of Yuletide treasure,
        17 Fa la la la la la la la!
        18 ************************************
        19 Fast away the old year passes,
        20 Fa la la la la la la la!
        21 Hail the new, year lads and lasses,
        22 Fa la la la la la la la!
        23 Sing we joyous all together!
        24 Fa la la la la la la la!
        25 Heedless of the wind and weather,
        26 Fa la la la la la la la!
        27 ************************************

27 rows selected.

SQL>

Database design is an important aspect of application design, and choices that appear on their face to be logical may not be. There are still applications that are 'database agnostic', relying on 'standard' ANSI SQL so that no matter which RDBMS is chosen the application will function. Such a practice allows for more 'universal' code but it also ignores any RDBMS-specific modifications to that ANSI SQL that improve performance for that particular engine. Oracle provides functionality that may not be available in other RDBMS engines, or may not be implemented the same way, requiring RDBMS-specific functions and procedure calls that, with 'standard' ANSI SQL aren't provided in the standard. Along with that the practice of using character strings for storing numbers may be convenient but it isn't good practice if the data stored is actually numeric. Mixed types of data stored in a character field can, and often do, prevent any implicit conversion the database engine can perform. And, as illustrated, such decisions can produce results that aren't logical, such as the sort order, possibly confusing the end users.

There may not be much of a choice in how a vendor chooses to store data, but knowing such limitations can make it easier to produce 'correct' results even though it may take a bit more work.

See all articles by David Fitzjarrell



Oracle Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
Oracle Forum
Topic By Replies Updated
Oracle Data Mining: Classification jan.hasller 0 July 5th, 07:19 AM
Find duplicates - Unique IDs Lava 5 July 2nd, 08:30 AM
no matching unique or primary key rcanter 1 April 25th, 12:32 PM
Update values of one table based on condition of values in other table using Trigger Gladiator 3 February 29th, 06:01 PM