Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Apr 25, 2008

SqlCredit - Part 15: The Cost of Distribution

By Rob Garrison

Download the code for this article.

This is part 15 of a series. If you have not read part 14, you can find it here, but this article does not require any knowledge of the previous work in the series.

Distribution: Spreading Tables Across Multiple Databases

When you design a database from scratch, you probably put all the related tables in the same database. This allows for such database-level validations as foreign keys and simple two-part JOINs (schema.table). However, there are often reasons that you have to deal with tables that are related but spread across multiple databases.

History is often a factor, but there are other possibilities. Deployment, manageability, and security concerns can affect which data is on which database.

In this series’ current quest to understand costs of different architectures, we will look at what cost is paid when tables are separated into multiple databases on the same server. We will not look at separate-server scenarios because the number of variables is too high. However, the scripts that accompany this article could be modified to run separate-server scenarios if someone were looking for results for a specific set of servers. We’re also not recording the timing differences for the INSERT statements. These records may be created separately, or they may be created as part of a single transaction. Since all stored procedure calls are coming from outside of the core OLTP databases (similar to an application making calls into a database), the differences might be zero (where records are created one-at-a-time). If the records are created as a group, the differences would vary widely depending on the implementation.

Building the Tests

Test Introduction

In previous installments of this series, I introduced three tables that I will reuse here: Card, Vendor, and Purchase. The DDL for the tables is the same for each layout except

  • the name is different (Card1 for layout1, Card2 for layout2, etc.).
  • the database is different depending on the layout.
  • tables in the same database do have foreign keys where separated tables do not have foreign keys.
CREATE TABLE dbo.Card1 (
    CardID         int          NOT NULL  IDENTITY,
    CardNumber1    bigint       NOT NULL,
    CardNumber2    bigint       NOT NULL,
    CardNumber3    bigint       NOT NULL,
    CardNumber4    bigint       NOT NULL,
    SecurityCode1  smallint     NOT NULL,
    SecurityCode2  smallint     NOT NULL,
    SecurityCode3  smallint     NOT NULL,
    SecurityCode4  smallint     NOT NULL,
    SecureString1  varchar(36)  NOT NULL,
    SecureString2  varchar(36)  NOT NULL,
    SecureString3  varchar(36)  NOT NULL,
    SecureString4  varchar(36)  NOT NULL,
    CONSTRAINT Card1_PK PRIMARY KEY CLUSTERED (CardID)
)
CREATE TABLE dbo.Vendor1 (
    VendorID         int          NOT NULL  IDENTITY,
    BusinessName     varchar(40)  NOT NULL,
    Address1         varchar(40)  NOT NULL,
    Address2         varchar(40)  NOT NULL,
    City             varchar(40)  NOT NULL,
    [State]          char(2)      NOT NULL,
    ZipCode          varchar(10)  NOT NULL,
    DateCreated      datetime     NOT NULL
        CONSTRAINT Vendor1_DateCreated_DF     DEFAULT GETUTCDATE(),
    DateLastUpdated  datetime     NOT NULL
        CONSTRAINT Vendor1_DateLastUpdated_DF DEFAULT GETUTCDATE(),
    StatusCd         tinyint      NOT NULL
        CONSTRAINT Vendor1_StatusCd_DF        DEFAULT 1,
    CONSTRAINT Vendor1_PK PRIMARY KEY CLUSTERED (VendorID)
)
CREATE TABLE dbo.Purchase1 (
    PurchaseID   int       NOT NULL  IDENTITY,
    CardID       int       NOT NULL,
    VendorID     int       NOT NULL,
    Amount       money     NOT NULL,
    DateCreated  datetime  NOT NULL
        CONSTRAINT Purchase1_DateCreated_DF DEFAULT GETUTCDATE(),
    CONSTRAINT Purchase1_PK PRIMARY KEY CLUSTERED (PurchaseID),
    CONSTRAINT Purchase1_Card_FK FOREIGN KEY (CardID)
        REFERENCES dbo.Card1 (CardID),
    CONSTRAINT Purchase1_Vendor_FK FOREIGN KEY (VendorID)
        REFERENCES dbo.Vendor1 (VendorID)
)

The basic idea of the test here is that there are three “spreads” tested and compared:

  • Layout 1: All three tables on database DB1
  • Layout 2: Card and Purchase on DB1, Vendor on DBv
  • Layout 3: Purchase on DB1, Vendor on DBv, Card on DBc

Pictures help significantly here.

The test script was run from the “_test” database and called stored procedures in DB1, DBv, and DBc as appropriate.

Each run of the test script performs all of the following steps:

  • Truncate or DELETE FROM all tables and reset identity seeds where appropriate
  • Create 10,000 Card records
    • 10,000 records in SqlCreditDB1.dbo.Card1
    • 10,000 records in SqlCreditDB1.dbo.Card2
    • 10,000 records in SqlCreditDBc.dbo.Card3
  • Create 10,000 Vendor records
    • 10,000 records in SqlCreditDB1.dbo.Vendor1
    • 10,000 records in SqlCreditDBv.dbo.Vendor2
    • 10,000 records in SqlCreditDBv.dbo.Vendor3
  • Create 50,000 Purchase records
    • 50,000 records in SqlCreditDB1.dbo. Purchase1
    • 50,000 records in SqlCreditDB1.dbo. Purchase2
    • 50,000 records in SqlCreditDB1.dbo. Purchase3
  • DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS
  • Run UPDATE STATISTICS on all nine tables
  • Read 500,000 random Purchase1 records (JOINed to Card1 and Vendor1) for Layout 1 (pass 1: ignored)
  • Read 500,000 random Purchase1 records (JOINed to Card1 and Vendor1) for Layout 1 (pass 2: elapsed time recorded)
  • Read 500,000 random Purchase2 records (JOINed to Card2 and Vendor2) for Layout 2 (pass 1: ignored)
  • Read 500,000 random Purchase2 records (JOINed to Card2 and Vendor2) for Layout 2 (pass 2: elapsed time recorded)
  • Read 500,000 random Purchase3 records (JOINed to Card3 and Vendor3) for Layout 3 (pass 1: ignored)
  • Read 500,000 random Purchase3 records (JOINed to Card3 and Vendor3) for Layout 3 (pass 2: elapsed time recorded)

Given a standard workload (SELECTs only), what will the penalty be for moving the tables into separate databases? You probably have an idea, so write down your guesses now. Layout 1 is the baseline (call it “+0%”). Layout 2: +___%. Layout 3: +___%.

Results

Here are the results of four tests. The integers (times) are in milliseconds.

Test

Layout 1
(baseline)

Layout 2

Penalty

Layout 3

Penalty

1

21,470

22,780

5.75%

23,983

10.48%

2

21,550

22,923

5.99%

24,030

10.32%

3

21,580

22,860

5.60%

24,080

10.38%

4

21,580

22,923

5.86%

24,030

10.20%

Average

21,545

22,871

5.80%

24,031

10.35%

The times and penalties are very consistent from test to test.

Conclusion

Spreading tables across multiple databases is done for many reasons, usually very specific to a particular environment and set of applications. When tables are separated, there is a performance penalty, but the numbers are relatively small. However, if your tables are currently spread, consider the performance gain available if you were to pull the tables into a common database.

What are your thoughts here? What are the reasons that your particular application has multiple databases? Take a minute and drop a comment in the forum.

Next month: The Cost of Bloat.

Download the code for this article.

» See All Articles by Columnist Rob Garrison

SqlCredit - Developing a Complete SQL Server OLTP Database Project



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM