Free Newsletters:
DatabaseDaily  
Database Journal
Search Database Journal:
 
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» DBA Jobs
» Sitemap

News Via RSS Feed


follow us on Twitter





Brocade Doubles Down on 16 Gbps Fibre Channel

Microsoft Wants iOS Apps to Run on WP7

Avaya Debuts New Virtual Services Switch
Database Journal |DBA Support |SQLCourse |SQLCourse2







Technical Specialist – Pre-sales (MA)
Next Step Systems
US-MA-Littleton

Justtechjobs.com Post A Job | Post A Resume

Featured Database Articles

MS SQL

July 17, 2000

Generate Test Data Quickly With Cross Joins

By Neil Boyle

Introduction

Need to rough up some bulk test data in a hurry? A carefully thought-out Cross Join could be the answer.

Take any SQL query that joins two or more tables, delete the joining clause, and what do you get? In SQL terms you get a Cross Join, in relational database theory you get a Cartesian Product. Whatever you call it, you usually end up with far more rows than you wanted, and most of them make no sense. Although Cross Join queries are not normally much use, with a bit of thought we can use them to quickly create large amounts of useful test data.

A simple example

Take the following query:

select * from
(
    select "Fred" as fName union 
    select "Wilma" union
    select "Barney" union
    select "Betty"
) as flintstones_1 CROSS JOIN
(
    select "Flintstone" as lName union
    select "Rubble" 
) as flintstones_2

This will produce 8 rows - the result of multiplying the four rows in the first derived table (flintstones_1) against the two rows in the second derived table (flintstones_2):

fName  lName 
------ ---------- 
Betty  Rubble
Betty  Flintstone
Barney Rubble
Barney Flintstone
Wilma  Rubble
Wilma  Flintstone
Fred   Rubble
Fred   Flintstone

(8 row(s) affected)

Needless to say, not all the above are real Flintstones, but that is not the point. The point is that we have a cheap and cheerful way of generating multiple unique names. For a small extra investment we can generate eighteen, not eight, unique names:

select * from
(
    select "Fred" as fName union 
    select "Wilma" union
    select "Barney" union
    select "Betty" union
    select "Al" union
    select "Peggy" 

) as characters_1 CROSS JOIN
(
    select "Flintstone" as lName union
    select "Rubble" union
    select "Bundy"
) as characters_2

As many tables as you need can be Cross Joined to generate exponentially-large amounts of test data. This simple query generates 27 mostly-fake politicians with middle names:

select * from
(
    select "Harry" as fName union 
    select "Winston" union 
    select "Vladimir" 
) as polit1 CROSS JOIN
(
    select "S " as mName union 
    select "Spencer" union 
    select "Ilich" 
) as polit2 CROSS JOIN
(
    select "Trueman" as lName union 
    select "Churchill" union 
    select "Lenin" 
) as polit3 

A more practical example

In the following query I have raided a few more US Sitcoms to make a simple query that will generate no less than 150 unique authors in the PUBS database. Note that I have serialised the two parts of the data that will make up the author ID (and the phone number) to keep them unique, but I have chosen -55- to be the center portion of all my generated IDs (010-55-0010 for example) There were none in the initial authors table that matched this pattern so this gives me an at-a-glance way of identifying my auto-generated authors.

insert authors 
select au_id1 + '-' + au_id2 as au_ud, 
    fName, 
    lName,
    au_id1 + ' 5' + au_id2 as phone,
    'Test address for ' + fName + ' ' + lName, 
    'London',
    'UK',
    '12345',
    1
from
(
    select '009' as au_id1, 'Fred' as fName union 
    select '010', 'Wilma' union
    select '012', 'Barney' union
    select '013', 'Betty' union
    select '014', 'Al' union
    select '015', 'Peggy' union
    select '016', 'Frasier' union
    select '017', 'Niles' union
    select '018', 'Homer' union
    select '019', 'Marge' union
    select '020', 'Hawkeye' union
    select '021', 'Trapper' union
    select '024', 'Sam' union
    select '025', 'Diane' union
    select '026', 'Rebecca'
) as test_authors_part_1 CROSS JOIN
(
    select '55-0010' as au_id2, 'Flintstone' as lName union
    select '55-0021', 'Rubble' union
    select '55-0022', 'Bundy' union
    select '55-0023', 'Crane' union
    select '55-0024', 'Simpson' union
    select '55-0025', 'Pierce' union
    select '55-0026', 'John' union
    select '55-0028', 'Malone' union
    select '55-0029', 'Chambers' union
    select '55-0030', 'Howe'
) as test_authors_part_2

Summary

The principle will work for any test data provided you construct your query carefully - you can generate multiple orders for multiple books across multiple stores for multiple dates. The data will exhibit a regular pattern, rather than real-world randomness, but in most cases that will not be a problem.


Tools:
Add databasejournal.com to your favorites
Add databasejournal.com to your browser search box
IE 7 | Firefox 2.0 | Firefox 1.5.x
Receive news via our XML/RSS feed

MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 



Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Script Component: Working with BLOB 0010 4 January 27th, 03:03 PM
Will an MS SQL db table trigger affect the value returned by scope_identity? wreade 2 December 19th, 04:48 PM
BULK UPDATE error benedec 1 December 14th, 08:39 AM
Toggling problem in Matrix report ssrs 2008 dev_ritesh 0 December 2nd, 02:17 PM