Beginning with later versions of Oracle8, Oracle has provided a means of generating random numbers. This built-in package, DBMS_RANDOM, is fairly simple to use and generates random numbers that are good enough for the needs of most users. If you need to generate a large amount of data without giving much thought to how random the data is, then DBMS_RANDOM will suit your needs.

If you need to encrypt sensitive data, then you should use Oracle9i’s DBMS_OBFUSCATION_TOOLKIT package. Oracle tells users, "Do not use DBMS_RANDOM as it is unsuitable for cryptographic key generation." Is there something wrong with DBMS_RANDOM? Aren’t the numbers returned random enough? Don’t you get the same output when given the same input? The answers are a combination of "yes" and "no."

By unwrapping the package Oracle uses to create the random number generator, we will learn quite a bit about how DBMS_RANDOM works and what its limitations are. Before looking at the package and some examples of how it can be used, the meaning of "random" needs to be clarified. Technically speaking, generating random numbers by a known method removes the potential for true randomness. Numbers generated in this manner are properly described as pseudo-random numbers. However, if the pseudo-random numbers pass several tests (chiefly, that the numbers are independent and identically distributed, or "iid"), then they are treated as random for practical purposes. Ideally, the distribution of the numbers is uniform over the interval from 0 to 1 (inclusive of the endpoints).

Knowing the parameters of the distribution helps us evaluate how random the numbers are. Conversely, observing the numbers and calculating the mean and variance helps identify the distribution. Given that our random numbers are (ideally) uniformly distributed over [0,1], we know that the mean should be 1/2 and the variance should be 1/12. Many other tests can be performed against the generated numbers. A mean of 1/2 and a variance of 1/12 are rough indicators of a good uniform distribution, but the real tests are more concerned with uniformity and independence. Your random numbers can have a mean of 1/2, for example, but not be uniformly distributed.
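To make that last point concrete, here is a minimal sketch using a made-up two-value sample (not output from DBMS_RANDOM). The values 0 and 1 average to 1/2, yet they are clearly not spread over [0,1], and the variance gives them away. VAR_POP is used instead of VARIANCE so that the result is the population variance.

```sql
-- Hypothetical sample for illustration only: just the two endpoints.
-- The mean is 1/2, but the population variance is 1/4 rather than 1/12.
SELECT AVG(x)     AS mean_x,
       VAR_POP(x) AS var_x
FROM  (SELECT 0 AS x FROM dual
       UNION ALL
       SELECT 1 AS x FROM dual);
```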

Oracle’s generator has the properties of a good random number generator: it is fast and portable, has a long enough cycle, gives replicable results, and produces uniformly distributed "iid" output. In fact, by using the same seed value used in the following examples, you should be able to produce the same results. Oracle’s SQL Reference Guide lists four procedures you can use with DBMS_RANDOM: initialize, seed, random, and terminate. Several points are missing from this documentation. First, the range of numbers is from -2^31 to +2^31, or +/- 2147483648. Second, the number of digits may be as many as ten, not eight. Lastly, there are other undocumented functions. One such function is DBMS_RANDOM.VALUE, which returns the type of value we are more interested in (a number between zero and one). The other hidden functions return normally distributed numbers and strings of varying length and case.
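As a rough illustration of the functions just mentioned, the query below calls each one once. It assumes a release in which the package functions can be called directly from SQL. NORMAL and STRING are the package's normal-distribution and string functions, and the column aliases are made up for this example.

```sql
-- Sketch only: one call to each of the functions discussed above.
SELECT DBMS_RANDOM.VALUE          AS u01,        -- number between 0 and 1
       DBMS_RANDOM.VALUE(1, 100)  AS u_range,    -- number >= 1 and < 100
       DBMS_RANDOM.NORMAL         AS z,          -- normally distributed number
       DBMS_RANDOM.STRING('U', 8) AS upper_str,  -- 8 uppercase letters
       DBMS_RANDOM.STRING('L', 8) AS lower_str   -- 8 lowercase letters
FROM   dual;
```

Because no seed is set here, the values will differ from run to run.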

Let’s look at some output from the DBMS_RANDOM package and see how Oracle’s random number generator performs. We will use a 6-digit seed number (123456) and start by generating 1,000 numbers, then increase by a factor of ten each time, up to ten million. The table is named RAND and has columns named LINE and RNO (for random number).
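The table definition is not shown here. A minimal sketch that would support the examples, assuming both columns are plain NUMBER columns (the datatypes are an assumption, not taken from the original):

```sql
-- Assumed definition: LINE holds the loop counter, RNO the generated number.
CREATE TABLE rand (
  line NUMBER,
  rno  NUMBER
);
```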

DECLARE
  v_rand NUMBER;
BEGIN
  -- Seed the generator so the run can be reproduced
  DBMS_RANDOM.INITIALIZE(123456);
  FOR i IN 1..1000 LOOP
    -- VALUE returns a number between 0 and 1
    v_rand := DBMS_RANDOM.VALUE;
    INSERT INTO rand VALUES (i, v_rand);
  END LOOP;
END;
/

PL/SQL procedure successfully completed.

Selecting the first 10 rows shows:

select * from rand where line < 11;

      LINE RNO
---------- ----------------------------------------
         1 0.9253168129811330987378779577193159262
         2 0.3703059867076638894717777425502136731
         3 0.8562787602662748879896983860530778367
         4 0.8747769791015347163677476210098089609
         5 0.8538887894283505001033221816233701639
         6 0.0139762421028966557398918466225500621
         7 0.6789827768885798969202524863427842743
         8 0.1219758197605125529485878115247706788
         9 0.6384861881298654042162612548721038633
        10 0.5060415527775185635522779058964300161

10 rows selected.

How did the average and variance "perform?"

select avg(rno), variance(rno) from rand;

  AVG(RNO) VARIANCE(RNO)
---------- -------------
.505209167    .081572912

The average and variance we would expect are .50000000 and .08333333. Continuing with the output from tables of 10,000 to 10,000,000 rows, we see an improvement in those indicators:

# of Rows  | Average    | Variance   | Time to generate
1,000      | .505209167 | .081572912 | 00:00:00.01
10,000     | .502495652 | .082522109 | 00:00:00.05
100,000    | .498821863 | .083579021 | 00:00:06.01
1,000,000  | .500032274 | .083360802 | 00:01:01.05
10,000,000 | .500036405 | .083323331 | 00:12:33.02

Up to a million rows, the average and variance both tended to converge to (but not actually reach) their expected values. At the ten-million-row mark, only the variance improved. Again, the mean and variance are not the true tests of uniformity and independence. Other tests, including the frequency, runs, autocorrelation, gap, and poker tests, could be used to test uniformity and independence. For example, if the numbers were uniformly distributed, we would expect to see roughly the same count of numbers in each equal-width interval we were interested in.
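A simple frequency check along these lines can be sketched against the RAND table, assuming RNO was populated with DBMS_RANDOM.VALUE. The choice of ten equal-width buckets is arbitrary, and the aliases bucket and cnt are made up for this example.

```sql
-- Count how many values fall into each tenth of the [0,1] interval.
-- TRUNC(rno * 10) maps 0.0-0.0999... to 0, 0.1-0.1999... to 1, and so on.
SELECT   TRUNC(rno * 10) AS bucket,
         COUNT(*)        AS cnt
FROM     rand
GROUP BY TRUNC(rno * 10)
ORDER BY bucket;
```

For a uniform generator the counts should be close to one tenth of the row count each, for example roughly 100,000 per bucket in the million-row table. A formal test would compare the counts with a chi-square statistic instead of judging them by eye.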

Using RANDOM instead of VALUE to populate the million-row table reflects the transformation of the Uniform(0,1) range of numbers to plus or minus 2147483648. You can see that the minimum and maximum numbers are close to -2147483648 and +2147483648, and that there is very little repetition. Out of a million generated numbers, 109 were duplicates (a rate of about .01%).

select min(rno), max(rno), count(distinct(rno)) from rand;

   MIN(RNO)   MAX(RNO) COUNT(DISTINCT(RNO))
----------- ---------- --------------------
-2147479960 2147480366               999891
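The duplicate count can be sanity-checked with the usual birthday-problem approximation: drawing n values uniformly from N possibilities gives about n^2/(2N) colliding pairs. Assuming RANDOM draws from 2^32 equally likely integers (an assumption based on the range given earlier), a million draws should produce roughly 116 collisions, which is in line with the 109 duplicates observed.

```sql
-- n = 1,000,000 draws, N = 2^32 possible values (assumed)
SELECT ROUND(POWER(1000000, 2) / (2 * POWER(2, 32)), 1) AS expected_dups
FROM   dual;
-- expected_dups is about 116.4
```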

Looking at the scripts behind DBMS_RANDOM shows how the numbers from DBMS_RANDOM.RANDOM are created. You can look at the scripts that create this package, or view the text selected from all_source. Here is the first part of the source:

select text from all_source where name = 'DBMS_RANDOM';

TEXT
--------------------------------------------------------------------------------
PACKAGE dbms_random AS

    ------------
    --  OVERVIEW
    --
    --  This package should be installed as SYS. It generates a sequence of
    --  random 38-digit Oracle numbers. The expected length of the sequence
    --  is about power(10,28), which is hopefully long enough.
    --
    --------
    --  USAGE
    --
    --  This is a random number generator. Do not use for cryptography.
    --  For more options the cryptographic toolkit should be used.
    --
    --  By default, the package is initialized with the current user
    --  name, current time down to the second, and the current session.
    --
    --  If this package is seeded twice with the same seed, then accessed
    --  in the same way, it will produce the same results in both cases.