Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Aug 31, 2000

SQL Server 7.0: Merge Joins

By Alexander Chigrik


SQL Server 7.0 supports three types of join operations:

  • Nested-Loop joins
  • Merge joins
  • Hash joins
  • In this article I want to tell you about Merge joins and when SQL Server will choose this kind of join operation.

    This is a most effective method to join the tables. The merge join will be used, if both inputs are sorted on the merge columns. The best way, if the tables have a clustered index on the column that joins the tables.

    If you have two tables (Table1 and Table2), where n1 - the count of rows in the Table1 table, and n2 - the count of rows in the Table2 table, and there is no index on the column that joins this tables, then in the worse case (with Nested-Loop join) SQL Server will scan n1xn2 rows to return a results set (for each row from the outer table the inner table will be completely scanned).

    In the best case, if Table1 table has a clustered index on the column that joins the tables, and Table2 table has a clustered index on the column that joins the tables, and there is one-to-many relationship between Table1 and Table2, then Merge join will be used, and SQL Server will scan n1 + n2 rows to return a results set.

    This is the example:

    if object_id('dbo.Table1') is not null drop table Table1
    GO
    CREATE TABLE Table1 (Table1_id int primary key CLUSTERED, name char(10))
    GO
    if object_id('dbo.Table2') is not null drop table Table2
    GO
    CREATE TABLE Table2 (
      Table2_id int primary key NONCLUSTERED, 
      Table1_id int,
      name char(10))
    GO
    CREATE CLUSTERED INDEX indTable2 ON Table2 (Table1_id)
    GO
    
    DECLARE @i int
    SELECT @i = 1
    WHILE @i < 1000
      BEGIN
        INSERT INTO Table1 VALUES (@i, LTRIM(str(@i)))
        SELECT @i = @i + 1
      END
    GO
    
    DECLARE @i int
    SELECT @i = 1
    WHILE @i < 1000
      BEGIN
        INSERT INTO Table2 VALUES (@i, @i, LTRIM(str(@i)))
        SELECT @i = @i + 1
      END
    GO
    
    SET SHOWPLAN_TEXT ON
    GO
    SELECT a.Table1_id, b.Table1_id FROM Table1 a INNER JOIN Table2 b
      ON a.Table1_id = b.Table1_id
    GO
    SET SHOWPLAN_TEXT OFF
    GO
    

    These are the results:

    StmtText
    -------------------------------------------------------------------------------------------------------------------
      |--Merge Join(Inner Join, MERGE:([a].[Table1_id])=([b].[Table1_id]), RESIDUAL:([a].[Table1_id]=[b].[Table1_id]))
           |--Clustered Index Scan(OBJECT:([Test].[dbo].[Table1].[PK__Table1__1F4E99FE] AS [a]), ORDERED)
           |--Clustered Index Scan(OBJECT:([Test].[dbo].[Table2].[indTable2] AS [b]), ORDERED)
    

    This is a algorithm of the Merge join (description of its work in general case, for many-to-many relationship):

    while (not Table1.eof) and (not Table2.eof) do
      begin
        while Table2.Table1_id > Table1.Table1_id do Table1.MoveToNextRecord();
        value = Table1.Table1_id;
        while Table2.Table1_id < value do Table2.MoveToNextRecord();
        RID = Table1.RowID();
        while Table2.Table1_id = value do
          begin
            while Table1.Table1_id = value do
              begin
                < SELECT Table1.Table1_id, Table2.Table1_id > 
                Table1.MoveToNextRecord();
              end
            Table1.MoveTo(RID);
            Table2.MoveToNextRecord();
          end
      end
    

    NOTE. If the joined tables are small (contain only one data page for example), and at least one of the joined tables have index on the column that joins the tables, then SQL Server will use Nested-Loop join instead of Merge join or Hash join (usually).

    Because the query optimizer usually selects the best execution plan for a given select statement, it is not necessary to change the kind of join, but sometimes it can be useful. You can enforce the desirable join type with OPTION clause.

    This is the example to enforce Merge join:

    USE pubs
    GO
    SET SHOWPLAN_TEXT ON
    GO
    SELECT a.au_id FROM authors a JOIN titleauthor b
       ON a.au_id = b.au_id OPTION (MERGE JOIN)
    GO
    SET SHOWPLAN_TEXT OFF
    GO
    

    These are the results:

    StmtText
    ------------------------------------------------------------------------------------------------
    SELECT a.au_id FROM authors a JOIN titleauthor b
       ON a.au_id = b.au_id OPTION (MERGE JOIN)
    
    (1 row(s) affected)
    
    StmtText
    ---------------------------------------------------------------------------------------------------
      |--Merge Join(Inner Join, MERGE:([a].[au_id])=([b].[au_id]), RESIDUAL:([a].[au_id]=[b].[au_id]))
           |--Clustered Index Scan(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind] AS [a]), ORDERED)
           |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [b]), ORDERED)
    
    (3 row(s) affected)
    


    » See All Articles by Columnist Alexander Chigrik




    MS SQL Archives

    Comment and Contribute

     


    (Maximum characters: 1200). You have characters left.

     

     




    Latest Forum Threads
    MS SQL Forum
    Topic By Replies Updated
    SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
    Need help changing table contents nkawtg 1 August 17th, 03:02 AM
    SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
    SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM